To buy some performance on the common, uninstrumented fast path, we replace repeated checks for
both allocation instrumentation and allocator changes with a single function table dispatch, plus
templatized allocation code that can generate either instrumented or uninstrumented versions of
the allocation routines.

When we call an allocation routine, we always indirect through a thread-local function table that
points to either the instrumented or the uninstrumented allocation routines. The instrumented code
has a `kInstrumented` = true template argument (or `kIsInstrumented` in some places); the
uninstrumented code has `kInstrumented` = false.
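
The dispatch described above can be sketched roughly as follows. This is a minimal illustration,
not the runtime's actual code: `AllocStats`, `AllocObject`, `AllocEntryPoints`, `tls_alloc_table`,
and `Alloc` are hypothetical names, and the table is reduced to a single entry point.

```cpp
#include <cstddef>

// Hypothetical stand-ins; none of these names come from the runtime itself.
struct AllocStats {
  std::size_t objects_allocated = 0;
  std::size_t bytes_allocated = 0;
};

// One templatized routine yields both variants. Because kInstrumented is a
// compile-time constant, the bookkeeping branch is dead code when it is false.
template <bool kInstrumented>
std::size_t AllocObject(AllocStats* stats, std::size_t byte_count) {
  if (kInstrumented) {
    stats->objects_allocated += 1;
    stats->bytes_allocated += byte_count;
  }
  return byte_count;  // Placeholder for the real allocation result.
}

using AllocFn = std::size_t (*)(AllocStats*, std::size_t);

// Simplified function table with a single entry point.
struct AllocEntryPoints {
  AllocFn alloc_object;
};

const AllocEntryPoints kUninstrumentedTable{&AllocObject<false>};
const AllocEntryPoints kInstrumentedTable{&AllocObject<true>};

// Each thread indirects through its own table pointer on every allocation,
// so the fast path never tests an "is instrumentation on?" flag.
thread_local const AllocEntryPoints* tls_alloc_table = &kUninstrumentedTable;

inline std::size_t Alloc(AllocStats* stats, std::size_t byte_count) {
  return tls_alloc_table->alloc_object(stats, byte_count);
}
```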

The function table is thread-local. There appears to be no logical necessity for that; it just
makes it easier to access from compiled Java code.

- The function table is switched out by `InstrumentQuickAllocEntryPoints[Locked]`, and a
  corresponding `UninstrumentQuickAlloc`... function.

- These in turn are called by `SetStatsEnabled()`, `SetAllocationListener()`, et al., which
  require that the mutator lock not be held.

- With a started runtime, `SetEntrypointsInstrumented()` calls `ScopedSuspendAll()` before updating
  the function table.

Mutual exclusion in the dispatch table is thus ensured by the fact that it is only updated while
all other threads are suspended, and is only accessed with the mutator lock logically held,
which inhibits suspension.
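
One way to picture this exclusion argument is with a reader-writer lock standing in for the mutator
lock. The sketch below is a simplified model under that assumption, not how the runtime implements
suspension. All names in it are hypothetical, and because thread-locality is not logically
required, it uses a single shared flag in place of the per-thread table.

```cpp
#include <mutex>
#include <shared_mutex>

// Hypothetical model, not runtime code. Mutators hold the lock shared, and
// suspending all threads is modeled as taking the same lock exclusively.
// std::shared_timed_mutex keeps the sketch buildable as C++14.
std::shared_timed_mutex g_mutator_lock;

// Stands in for the dispatch table; true means the instrumented table is installed.
bool g_instrumented_table_installed = false;

// RAII stand-in for suspending all other threads: constructing it waits until
// no mutator holds the lock, and destroying it lets mutators run again.
class SuspendAllModel {
 public:
  SuspendAllModel() : lock_(g_mutator_lock) {}

 private:
  std::unique_lock<std::shared_timed_mutex> lock_;
};

// Writer side: the table only changes while every mutator is excluded.
void SetEntrypointsInstrumentedModel(bool instrumented) {
  SuspendAllModel suspend_all;
  g_instrumented_table_installed = instrumented;
}

// Reader side: the table is only read while the mutator lock is held (shared),
// so a reader can never observe a table that is in the middle of being swapped.
bool AllocUsesInstrumentedTable() {
  std::shared_lock<std::shared_timed_mutex> mutator(g_mutator_lock);
  return g_instrumented_table_installed;
}
```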

To ensure correctness, we must therefore:

1. Suspend all threads when swapping out the dispatch table.
2. Hold the mutator lock whenever we access the dispatch table.
3. Not trust `kInstrumented` once we have given up the mutator lock, since the instrumentation
   state could have changed in the interim.
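
The third rule is the easiest to overlook, so here is a small single-threaded sketch of it. It
assumes a slow path that releases and reacquires the mutator lock. The names
(`g_table_is_instrumented`, `FinishAlloc`, `ReleaseAndReacquireMutatorLock`, `AllocSlowPath`) are
hypothetical, and the table swap by another thread is simulated with a flag.

```cpp
#include <cstddef>

// Hypothetical model, not runtime code.
bool g_table_is_instrumented = false;    // Current state of the dispatch table.
std::size_t g_instrumented_allocs = 0;   // Counts allocations that were instrumented.

template <bool kInstrumented>
void FinishAlloc() {
  if (kInstrumented) {
    ++g_instrumented_allocs;
  }
}

// Stands in for any operation that gives up the mutator lock. While the lock is
// released, another thread may suspend all threads and swap the table; that is
// simulated here by toggling the flag on request.
void ReleaseAndReacquireMutatorLock(bool table_swapped_meanwhile) {
  if (table_swapped_meanwhile) {
    g_table_is_instrumented = !g_table_is_instrumented;
  }
}

template <bool kInstrumented>
void AllocSlowPath(bool table_swapped_meanwhile) {
  ReleaseAndReacquireMutatorLock(table_swapped_meanwhile);
  // kInstrumented was fixed before the lock was released and may now be stale,
  // so consult the current table state instead of trusting the template argument.
  if (g_table_is_instrumented != kInstrumented) {
    if (g_table_is_instrumented) {
      FinishAlloc<true>();
    } else {
      FinishAlloc<false>();
    }
    return;
  }
  FinishAlloc<kInstrumented>();
}
```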