Mircea Baja - 6 May 2025

# Coroutines problem domain
---

# Concurrent vs. parallel

- intuitively illustrated using abstract pictures
- for an intro into C++ standardese doublespeak see ["Forward Progress Guarantees in C++ - Olivier Giroux - CppNow 2023"](https://www.youtube.com/watch?v=g9Rgu6YEuqY&t=978s)

---

# Sequential
# Parallel

- there is additional overhead (not illustrated), uses more hardware resources

---

# Sequential
# Concurrent

- purple and yellow end later
- there is additional overhead (not illustrated)

---

# Sequential
# Concurrent and parallel

- purple ends later
- there is additional overhead (not illustrated), uses more hardware resources

---

# What are coroutines?

---

# Function vs. coroutine
---

# Coroutines

---

# Stack

---

# Stackful

---

# Stackless

---

# Duality

---

# Use cases for concurrency

---

# Compilers (maybe)
source → lexical analysis → syntactical analysis → data analysis → instruction generator → code
- Melvin Conway: Design of a separable transition-diagram compiler
  - of Conway's law fame: product design mirrors organisation structure
- motivation: a single-pass compiler in memory-constrained environments; each "coroutine" can output zero, one, or more results each time it is invoked
- in the meantime: better ways of designing a compiler

---

# Simulations

- Bjarne Stroustrup: [A Set of C++ Classes for Co-routine Style Programming.](https://www.softwarepreservation.org/projects/c_plus_plus/cfront/release_e/doc/ClassesForCoroutines.pdf) Bell Laboratories Computer Science Technical Report CSTR-90. November 1980.
- motivation: event-driven simulations (e.g. Bjarne's Ph.D. thesis work in Cambridge UK)
- `task` base class used to represent an independent activity which:
  - suspends voluntarily
  - can be resumed, canceled
  - can wait, sleep
  - provides an `int` result and communicates with other `task`s (e.g. producer and consumer, server reading from a queue)
  - work is done in the derived class constructor
- one of the earliest libraries in C++ (think `complex` and `string` without templates), yet it did not make it into the standard
- some of the problems it tries to solve turn out to be recurring: cancellation, time management, forking and joining, queues, debugging, stack size, memory overhead etc.

---

# While down memory lane (1)

[A History of C++: 1979-1991, Bjarne Stroustrup (1995)](https://www.stroustrup.com/hopl2.pdf)

"was used to write the library that supported the desired styles of concurrency. Please note that “styles” is plural. I considered it crucial – as I still do – that more than one notion of concurrency should be expressible in the language. This decision has been reconfirmed repeatedly by me and my colleagues, by other C++ users, and by the C++ standards committee. There are many applications for which support for concurrency is essential, but there is no one dominant model for concurrency support; thus when support is needed it should be provided through a library or a special purpose extension so that a particular form of concurrency support does not preclude other forms."

- this is another recurring theme: there are all sorts of ways to do concurrency

---

# While down memory lane (2)

[A History of C++: 1979-1991, Bjarne Stroustrup (1995)](https://www.stroustrup.com/hopl2.pdf)

"C with Classes could not provide benefits at the expense of removing “dangerous” or “ugly” features of C. This observation/principle had to be repeated often to people (rarely C with Classes users) who wanted C with Classes made safer by increasing static type checking along the lines of early Pascal"

- this is another recurring theme: e.g. the drive to use Rust "for safety" is not a grass-roots movement coming from C++ users/developers

---

# GUI

```cpp
BOOL bRet;
while ( (bRet = ::GetMessage(&msg, hWnd, 0, 0)) != 0)
{
  if (bRet == -1)
  {
    // handle the error and possibly exit
  }
  else
  {
    ::TranslateMessage(&msg);
    ::DispatchMessage(&msg);
  }
}
```

---

# GUI
(diagram: a UI thread blocked in `GetMessage` on its queue; messages are added via `PostMessage` and `PostThreadMessage`)
---

# GUI

- usually single threaded: a single dedicated UI thread
- the thread has a queue
- the thread is blocked waiting for a message from the queue
- when a message is returned, the thread processes it via `switch` statements
- messages are added to the queue via `PostMessage` (from the same thread) or `PostThreadMessage` (from a different thread)
- there is also a special "quit" message
- GUIs show the use of a queue of work to manage concurrency and of a blocking function to get the next piece of work; they do not necessarily need coroutines (long background activities are often scheduled on additional threads)

---

# Async IO

- motivation: C10K (1999)
- networking is often IO bound (waiting for data to be sent or received)
- what's wrong with `WaitForMultipleObjects` (more later): `O(N)` time complexity with the number of handles provided
  - similar: `select` for Linux
- IO completion ports: it's a queue again (more later)
  - similar: `kevent` for Linux
- (boost) ASIO
- file IO
- timers/events/registry monitoring
- one problem: thread(s) are blocked in an API call to get data from one queue; a thread can't check for work in multiple incompatible queue types
- Windows thread pools unify work covered by IO completion ports (e.g. networking) with work not covered via IO completion ports
- similar evolution on other OSes

---

# WaitForMultipleObjects

- when a thread calls `WaitForMultipleObjects`, the caller provides a set of objects it waits for
  - for each object in the set, the OS has to atomically (thread safe) add this thread ID to a list associated with the object
  - the thread is suspended
- when an object in the set is signaled, the OS finds the thread ID in the object's list
  - the thread is scheduled to resume
- when the thread resumes
  - for each object in the set, the OS has to atomically remove this thread ID from the list associated with the object
  - `WaitForMultipleObjects` returns
- the steps above lead to `O(N)` time complexity with the number of objects provided, which is why `WaitForMultipleObjects` limits it to 64

---

# Completion port
(diagram: two threads blocked in `GetQueuedCompletionStatus` on the same completion port)
---

# Completion port

- when a thread calls `GetQueuedCompletionStatus`
  - it checks if work is queued (requires a lock)
  - if not, it suspends; only a single object needs to have this thread ID stored
- when an object associated with the completion port is signaled
  - it's added to the queue
  - the thread is scheduled to resume (if needed)
- when the thread resumes
  - it picks work from the queue
  - `GetQueuedCompletionStatus` returns
- it solves the `O(N)` problem: the work to queue/dequeue one item does not depend on the number of items in the queue
- the disadvantage is that most such queueing systems are incompatible: a thread is blocked reading from a particular queue system

---

# How about C++11 mechanisms?

- `std::thread`: too many resources (e.g. stack) just to serve a single operation or a few operations
- `std::async`: similar, also poorly defined behaviour
- `std::future`/`std::promise`: require allocation and reference counting of shared state, synchronization even for sequential activities AND detached operation, i.e. the operation continues even if the `std::future` does not wait for the result (potential for dangling references)

---

# Coroutines: Async IO

- C++20 coroutines allow code like the one below
- network echo work with coroutine support
- motivation for this series of articles/presentations

```cpp
task echo(socket s) {
  buffer buff;
  while (true) {
    std::size_t length = co_await async_read_some(s, buff);
    co_await async_write(s, buff, length);
  }
}
```

- easy to understand behaviour e.g.:
  - scope for the buffer
  - exceptions (but slow, so use sparingly)
  - cancellation
- disadvantage: additional allocation

---

# Coroutines: Async IO

- without coroutines it's a lot of callbacks and manual memory management

```cpp
class session : public std::enable_shared_from_this<session>
{
  // ...
  void do_read() {
    auto self(shared_from_this());
    socket_.async_read_some(buff_/*...*/,
      [this, self](boost::system::error_code ec, std::size_t length) {
        do_write(length);
      });
  }
  void do_write(std::size_t length) {
    auto self(shared_from_this());
    boost::asio::async_write(socket_, buff_/*...*/,
      [this, self](boost::system::error_code ec, std::size_t /*length*/) {
        do_read();
      });
  }
```

---

# Nanocoroutines

- motivation: handle memory latencies for database JOINs
- CppCon 2018: G. Nishanov ["Nano-coroutines to the Rescue! (Using Coroutines TS, of Course)"](https://www.youtube.com/watch?v=j9tlJAqMV7U)
- the "slow" operation is fetching data from memory to the CPU cache
- the allocation overheads of coroutines will have to be mitigated somehow

---

# Generator

```cpp
struct fib_state {
  int x_1 = 1;
  int x_2 = 0;
};

int next_fib(fib_state & state) {
  int val = state.x_1 + state.x_2;
  state.x_2 = state.x_1;
  state.x_1 = val;
  return val;
}

void foo() {
  fib_state state;
  while (true) {
    int x = next_fib(state);
    // use x, exit loop at some point
  }
}
```

---

# Generator

```cpp
generator fib() {
  int x_1 = 1;
  int x_2 = 0;
  while (true) {
    int val = x_1 + x_2;
    yield val;
    x_2 = x_1;
    x_1 = val;
  }
}

void foo() {
  for (int x: fib()) {
    // use x, exit loop at some point
  }
}
```

- `val` is `yield`ed before `x_2` and `x_1` are updated

---

# Generator

- using C++23

```cpp
std::generator<int> fib() {
  int x_1 = 1;
  int x_2 = 0;
  while (true) {
    int val = x_1 + x_2;
    co_yield val;
    x_2 = x_1;
    x_1 = val;
  }
}

void foo() {
  for (int x: fib()) {
    // use x, exit loop at some point
  }
}
```

---

# GPU work

- there is a lot of effort (e.g. by Nvidia) to make computational work written in C++ (algorithms, coroutines, sender/receiver etc.) compatible with "normal" CPU usage
- beware that choices made for one use case (e.g. GPU) might not be the right ones for another use case (e.g. async networking IO)

---

# Embedded work

- don't throw exceptions
- don't allocate

---

# Kernel mode

- don't throw exceptions
- allocations: don't throw `std::bad_alloc` either

---

# Sender/receiver

- concepts for concurrency included in C++26
- can work with coroutines, CPU, GPU, embedded
- can deliver fast code, no allocations
- but with development complexity costs for simple sequential code

---

# Choices

---

# Interaction with threads

- a single thread simplifies a lot of things, but limits how much computation is available
- explicit multithreading
- can a coroutine resume in a different thread?
  - `scoped_lock`/`lock_guard`: potential for misuse
- detached operations - work is not waited to complete: usually a bad pattern
  - but in embedded work it makes sense to continue work, triggered by interrupts, long after `main` exited

---

# Exception-less mode

- e.g. for use in kernel code without C++ exceptions
- this was important for Microsoft (which significantly drove C++ coroutine adoption)

---

# Memory

- how do we control allocations?
- even better: how do we avoid allocations?
- in some cases it's important to avoid allocations:
  - hot/performance sensitive code
  - kernel/embedded environments
- also stack usage: it's easy to stack overflow

---

# Timing matters/choices

- read operations might get more than what was requested (e.g. when the request is smaller than the data in the IP packet). Subsequent reads:
  - could always queue the continuation (to allow fairness, at the cost of additional work)
  - could avoid queueing if data is already available (maximise speed by avoiding work in this case)
- waiting with `when_all(op1, op2, op3)` can mean different things in different environments:
  - initiate all operations, progress is interleaved, return when the last one (by time) completes (e.g. op2)
  - initiate sequentially, progress is not interleaved, return when the last one (in sequence) completes (i.e. op3)

---

# Cancellation

- it's easy to cancel before starting work
- but in some cases the work can be pending for long periods of time: e.g. waiting for a registry entry to change (i.e. maybe never) => a cancellation mechanism IS required

---

# Programming ergonomics

- a library that can do anything might sacrifice ease of use
  - and might not do specific things well
  - boost::asio and CreateFileA
  - networking TS and HTTPS
- how easy is it to shoot yourself in the foot?
  - sometimes safety is sacrificed to allow some usage scenarios

---

# Questions?