1# Combiner Explanation
2## Talk by ctiller, notes by vjpai
3
4Typical way of doing critical section
5
6```
7mu.lock()
8do_stuff()
9mu.unlock()
10```
11
12An alternative way of doing it is
13
14```
15class combiner {
16  run(f) {
17    mu.lock()
18    f()
19    mu.unlock()
20  }
21  mutex mu;
22}
23
24combiner.run(do_stuff)
25```
26
27If you have two threads calling combiner, there will be some kind of
28queuing in place. It's called `combiner` because you can pass in more
29than one do_stuff at once and they will run under a common `mu`.
30
31The implementation described above has the issue that you're blocking a thread
32for a period of time, and this is considered harmful because it's an application thread that you're blocking.
33
34Instead, get a new property:
35* Keep things running in serial execution
36* Don't ever sleep the thread
37* But maybe allow things to end up running on a different thread from where they were started
38* This means that `do_stuff` doesn't necessarily run to completion when `combiner.run` is invoked
39
40```
41class combiner {
42  mpscq q; // multi-producer single-consumer queue can be made non-blocking
43  state s; // is it empty or executing
44
45  run(f) {
46    if (q.push(f)) {
47      // q.push returns true if it's the first thing
48      while (q.pop(&f)) { // modulo some extra work to avoid races
49        f();
50      }
51    }
52  }
53}
54```
55
56The basic idea is that the first one to push onto the combiner
57executes the work and then keeps executing functions from the queue
58until the combiner is drained.
59
60Our combiner does some additional work, with the motivation of write-batching.
61
62We have a second tier of `run` called `run_finally`. Anything queued
63onto `run_finally` runs after we have drained the queue. That means
64that there is essentially a finally-queue. This is not guaranteed to
65be final, but it's best-effort. In the process of running the finally
66item, we might put something onto the main combiner queue and so we'll
67need to re-enter.
68
69`chttp2` runs all ops in the run state except if it sees a write it puts that into a finally. That way anything else that gets put into the combiner can add to that write.
70
71```
72class combiner {
73  mpscq q; // multi-producer single-consumer queue can be made non-blocking
74  state s; // is it empty or executing
75  queue finally; // you can only do run_finally when you are already running something from the combiner
76
77  run(f) {
78    if (q.push(f)) {
79      // q.push returns true if it's the first thing
80      loop:
81      while (q.pop(&f)) { // modulo some extra work to avoid races
82        f();
83      }
84      while (finally.pop(&f)) {
85        f();
86      }
87      goto loop;
88    }
89  }
90}
91```
92
93So that explains how combiners work in general. In gRPC, there is
94`start_batch(..., tag)` and then work only gets activated by somebody
95calling `cq::next` which returns a tag. This gives an API-level
96guarantee that there will be a thread doing polling to actually make
97work happen. However, some operations are not covered by a poller
98thread, such as cancellation that doesn't have a completion. Other
99callbacks that don't have a completion are the internal work that gets
100done before the batch gets completed. We need a condition called
101`covered_by_poller` that means that the item will definitely need some
102thread at some point to call `cq::next` . This includes those
103callbacks that directly cause a completion but also those that are
104indirectly required before getting a completion. If we can't tell for
105sure for a specific path, we have to assumed it is not covered by
106poller.
107
108The above combiner has the problem that it keeps draining for a
109potentially infinite amount of time and that can lead to a huge tail
110latency for some operations. So we can tweak it by returning to the application
111if we know that it is valid to do so:
112
113```
114while (q.pop(&f)) {
115  f();
116  if (control_can_be_returned && some_still_queued_thing_is_covered_by_poller) {
117    offload_combiner_work_to_some_other_thread();
118  }
119}
120```
121
122`offload` is more than `break`; it does `break` but also causes some
123other thread that is currently waiting on a poll to break out of its
124poll. This is done by setting up a per-polling-island work-queue
125(distributor) wakeup FD. The work-queue is the converse of the combiner; it
126tries to spray events onto as many threads as possible to get as much concurrency as possible.
127
128So `offload` really does:
129
130```
131  workqueue.run(continue_from_while_loop);
132  break;
133```
134
135This needs us to add another class variable for a `workqueue`
136(which is really conceptually a distributor).
137
138```
139workqueue::run(f) {
140  q.push(f)
141  eventfd.wakeup()
142}
143
144workqueue::readable() {
145  eventfd.consume();
146  q.pop(&f);
147  f();
148  if (!q.empty()) {
149    eventfd.wakeup(); // spray across as many threads as are waiting on this workqueue
150  }
151}
152```
153
154In principle, `run_finally` could get starved, but this hasn't
155happened in practice. If we were concerned about this, we could put a
156limit on how many things come off the regular `q` before the `finally`
157queue gets processed.
158
159