1.. _module-pw_metric:
2
3=========
4pw_metric
5=========
6
7.. attention::
8
  This module is **not yet production ready**; ask us if you are interested in
  trying it out or have ideas about how to improve it.
11
12--------
13Overview
14--------
15Pigweed's metric module is a **lightweight manual instrumentation system** for
16tracking system health metrics like counts or set values. For example,
17``pw_metric`` could help with tracking the number of I2C bus writes, or the
18number of times a buffer was filled before it could drain in time, or safely
19incrementing counters from ISRs.
20
21Key features of ``pw_metric``:
22
- **Tokenized names** - Names are tokenized using ``pw_tokenizer``, enabling
  long metric names that don't bloat your binary.
25
26- **Tree structure** - Metrics can form a tree, enabling grouping of related
27  metrics for clearer organization.
28
29- **Per object collection** - Metrics and groups can live on object instances
30  and be flexibly combined with metrics from other instances.
31
32- **Global registration** - For legacy code bases or just because it's easier,
33  ``pw_metric`` supports automatic aggregation of metrics. This is optional but
34  convenient in many cases.
35
- **Simple design** - There are only two core data structures: ``Metric`` and
  ``Group``, which are both simple to understand and use. The only metric
  types supported are ``uint32_t`` and ``float``. This module does not support
  complicated aggregations like running average or min/max.
40
41Example: Instrumenting a single object
42--------------------------------------
43The below example illustrates what instrumenting a class with a metric group
44and metrics might look like. In this case, the object's
45``MySubsystem::metrics()`` member is not globally registered; the user is on
46their own for combining this subsystem's metrics with others.
47
48.. code::
49
50  #include "pw_metric/metric.h"
51
52  class MySubsystem {
53   public:
54    void DoSomething() {
55      attempts_.Increment();
56      if (ActionSucceeds()) {
57        successes_.Increment();
58      }
59    }
    pw::metric::Group& metrics() { return metrics_; }
61
62   private:
63    PW_METRIC_GROUP(metrics_, "my_subsystem");
64    PW_METRIC(metrics_, attempts_, "attempts", 0u);
65    PW_METRIC(metrics_, successes_, "successes", 0u);
66  };
67
68The metrics subsystem has no canonical output format at this time, but a JSON
69dump might look something like this:
70
71.. code:: none
72
73  {
74    "my_subsystem" : {
75      "successes" : 1000,
76      "attempts" : 1200,
77    }
78  }
79
80In this case, every instance of ``MySubsystem`` will have unique counters.
81
82Example: Instrumenting a legacy codebase
83----------------------------------------
A common situation in embedded development is **debugging legacy code** or code
that is hard to change, where it may be impossible to plumb metrics objects
around with dependency injection. The alternative to plumbing metrics is to
register the metrics through a global mechanism. ``pw_metric`` supports this
use case. For example:
89
90**Before instrumenting:**
91
92.. code::
93
94  // This code was passed down from generations of developers before; no one
95  // knows what it does or how it works. But it needs to be fixed!
96  void OldCodeThatDoesntWorkButWeDontKnowWhy() {
97    if (some_variable) {
98      DoSomething();
99    } else {
100      DoSomethingElse();
101    }
102  }
103
104**After instrumenting:**
105
106.. code::
107
108  #include "pw_metric/global.h"
109  #include "pw_metric/metric.h"
110
  PW_METRIC_GLOBAL(legacy_do_something, "legacy_do_something", 0u);
  PW_METRIC_GLOBAL(legacy_do_something_else, "legacy_do_something_else", 0u);
113
114  // This code was passed down from generations of developers before; no one
115  // knows what it does or how it works. But it needs to be fixed!
116  void OldCodeThatDoesntWorkButWeDontKnowWhy() {
117    if (some_variable) {
118      legacy_do_something.Increment();
119      DoSomething();
120    } else {
121      legacy_do_something_else.Increment();
122      DoSomethingElse();
123    }
124  }
125
126In this case, the developer merely had to add the metrics header, define some
127metrics, and then start incrementing them. These metrics will be available
128globally through the ``pw::metric::global_metrics`` object defined in
129``pw_metric/global.h``.
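
One way this kind of automatic registration can work is for each metric's
constructor to link itself into a global intrusive list. The following is a
minimal standalone sketch of that idea, not the actual ``pw_metric``
implementation; ``SketchMetric``, ``g_sketch_metrics``, and
``CountRegistered`` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a metric; not the real pw::metric::Metric.
struct SketchMetric;

// Head of the global list. A plain pointer is constant-initialized to nullptr
// before any constructor runs, so pushing from static constructors is safe.
SketchMetric* g_sketch_metrics = nullptr;

struct SketchMetric {
  SketchMetric(const char* metric_name, uint32_t initial)
      : name(metric_name), value(initial), next(g_sketch_metrics) {
    assert(metric_name != nullptr);
    g_sketch_metrics = this;  // Push this metric onto the global list.
  }

  void Increment(uint32_t amount = 1) { value += amount; }

  const char* name;
  uint32_t value;
  SketchMetric* next;
};

// Defined at namespace scope; each one registers itself, so no central table
// of metrics is needed.
SketchMetric legacy_do_something("legacy_do_something", 0u);
SketchMetric legacy_do_something_else("legacy_do_something_else", 0u);

// Walks the global list, as an exporter might.
int CountRegistered() {
  int count = 0;
  for (SketchMetric* m = g_sketch_metrics; m != nullptr; m = m->next) {
    ++count;
  }
  return count;
}
```

Because every node is pushed to the front, the most recently constructed
metric ends up at the head of the list.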
130
131Why not just use simple counter variables?
132------------------------------------------
133One might wonder what the point of leveraging a metric library is when it is
134trivial to make some global variables and print them out. There are a few
135reasons:
136
137- **Metrics offload** - To make it easy to get metrics off-device by sharing
138  the infrastructure for offloading.
139
- **Consistent format** - To get the metrics in a consistent format (e.g.
  protobuf or JSON) for analysis.
142
143- **Uncoordinated collection** - To provide a simple and reliable way for
144  developers on a team to all collect metrics for their subsystems, without
145  having to coordinate to offload. This could extend to code in libraries
146  written by other teams.
147
- **Pre-boot or interrupt visibility** - Some of the most challenging bugs come
  from early system boot when not all system facilities are up (e.g. logging or
  UART). In those cases, metrics provide a low-overhead approach to understand
  what is happening. During early boot, metrics can be incremented; after
  boot, dumping the metrics provides insight into what happened. While basic
  counter variables can work in these contexts too, one still has to deal with
  the offloading problem, which the library handles.
155
156---------------------
157Metrics API reference
158---------------------
159
160The metrics API consists of just a few components:
161
162- The core data structures ``pw::metric::Metric`` and ``pw::metric::Group``
163- The macros for scoped metrics and groups ``PW_METRIC`` and
164  ``PW_METRIC_GROUP``
165- The macros for globally registered metrics and groups
166  ``PW_METRIC_GLOBAL`` and ``PW_METRIC_GROUP_GLOBAL``
167- The global groups and metrics list: ``pw::metric::global_groups`` and
168  ``pw::metric::global_metrics``.
169
170Metric
171------
172The ``pw::metric::Metric`` provides:
173
174- A 31-bit tokenized name
175- A 1-bit discriminator for int or float
176- A 32-bit payload (int or float)
177- A 32-bit next pointer (intrusive list)
178
179The metric object is 12 bytes on 32-bit platforms.
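
To make the size arithmetic concrete, the fields listed above can be sketched
as a plain struct. This is only an illustration of the layout, not the real
``pw::metric::Metric`` class; ``MetricLayoutSketch``, ``SetFloat``, and
``AsFloat`` are hypothetical names, and bit-field packing is
implementation-defined, so the size check assumes a common ABI.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout sketch; field names are illustrative only.
struct MetricLayoutSketch {
  uint32_t name_token : 31;  // Tokenized name.
  uint32_t is_float : 1;     // Discriminator: 0 = int, 1 = float.
  uint32_t payload;          // Raw bits of either a uint32_t or a float.
  MetricLayoutSketch* next;  // Intrusive singly linked list pointer.
};

// Two 32-bit words plus one pointer: 12 bytes when pointers are 32 bits wide.
static_assert(sizeof(MetricLayoutSketch) == 8 + sizeof(void*),
              "unexpected padding in MetricLayoutSketch");
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

// Stores a float in the shared 32-bit payload and marks the metric as float.
inline void SetFloat(MetricLayoutSketch& metric, float value) {
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  metric.payload = bits;
  metric.is_float = 1;
}

// Reads the payload back as a float; only valid for float metrics.
inline float AsFloat(const MetricLayoutSketch& metric) {
  assert(metric.is_float == 1);
  float value;
  std::memcpy(&value, &metric.payload, sizeof(value));
  return value;
}
```

Sharing one 32-bit payload between the two types is why calling an int
operation on a float metric (or vice versa) cannot be meaningful.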
180
181.. cpp:class:: pw::metric::Metric
182
  .. cpp:function:: Increment(uint32_t amount = 1)
184
185    Increment the metric by the given amount. Results in undefined behaviour if
186    the metric is not of type int.
187
188  .. cpp:function:: Set(uint32_t value)
189
190    Set the metric to the given value. Results in undefined behaviour if the
191    metric is not of type int.
192
193  .. cpp:function:: Set(float value)
194
195    Set the metric to the given value. Results in undefined behaviour if the
196    metric is not of type float.
197
198Group
199-----
200The ``pw::metric::Group`` object is simply:
201
- A name for the group
- A list of child groups
- A list of leaf metrics
205- A 32-bit next pointer (intrusive list)
206
207The group object is 16 bytes on 32-bit platforms.
208
209.. cpp:class:: pw::metric::Group
210
211  .. cpp:function:: Dump(int indent_level = 0)
212
213    Recursively dump a metrics group to ``pw_log``. Produces output like:
214
215    .. code:: none
216
217      "$6doqFw==": {
218        "$05OCZw==": {
219          "$VpPfzg==": 1,
220          "$LGPMBQ==": 1.000000,
221          "$+iJvUg==": 5,
222        }
223        "$9hPNxw==": 65,
224        "$oK7HmA==": 13,
225        "$FCM4qQ==": 0,
226      }
227
228    Note the metric names are tokenized with base64. Decoding requires using
229    the Pigweed detokenizer. With a detokenizing-enabled logger, you could get
230    something like:
231
232    .. code:: none
233
234      "i2c_1": {
235        "gyro": {
          "num_samples": 1,
237          "init_time_us": 1.000000,
238          "initialized": 5,
239        }
240        "bus_errors": 65,
241        "transactions": 13,
242        "bytes_sent": 0,
243      }
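
The recursive shape of such a dump can be sketched in standalone C++. This is
an assumption-laden illustration rather than the real ``Group::Dump``: it
uses plain strings instead of tokens, writes to a ``std::string`` instead of
``pw_log``, and ``DumpGroup``, ``DumpMetric``, and ``DumpTo`` are hypothetical
names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct DumpMetric {
  std::string name;
  uint32_t value;
};

struct DumpGroup {
  std::string name;
  std::vector<DumpGroup> children;
  std::vector<DumpMetric> metrics;
};

// Appends `group` to `out`, indenting two spaces per nesting level. Child
// groups are emitted before the group's own leaf metrics.
inline void DumpTo(const DumpGroup& group, int indent_level, std::string& out) {
  assert(indent_level >= 0);
  const std::string indent(static_cast<std::size_t>(indent_level) * 2, ' ');
  out += indent + "\"" + group.name + "\": {\n";
  for (const DumpGroup& child : group.children) {
    DumpTo(child, indent_level + 1, out);
  }
  for (const DumpMetric& metric : group.metrics) {
    out += indent + "  \"" + metric.name +
           "\": " + std::to_string(metric.value) + ",\n";
  }
  out += indent + "}\n";
}
```

Calling ``DumpTo(root, 0, out)`` on a two-level tree produces the same nested
brace structure as the output shown above.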
244
245Macros
246------
247The **macros are the primary mechanism for creating metrics**, and should be
248used instead of directly constructing metrics or groups. The macros handle
249tokenizing the metric and group names.
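
To illustrate why a macro is convenient here, the sketch below shows a macro
that declares a variable and hashes its name at compile time. Everything in
it is hypothetical: ``SKETCH_METRIC``, ``SketchToken``, and
``TokenizedMetric`` are illustration-only names, and the 32-bit FNV-1a hash is
a stand-in chosen for brevity, not necessarily the hash ``pw_tokenizer`` uses.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Stand-in compile-time hash (32-bit FNV-1a); an assumption for illustration.
constexpr uint32_t SketchToken(const char* str) {
  uint32_t hash = 2166136261u;
  for (; *str != '\0'; ++str) {
    hash = (hash ^ static_cast<uint8_t>(*str)) * 16777619u;
  }
  return hash;
}
static_assert(SketchToken("") == 2166136261u, "empty string maps to the basis");

struct TokenizedMetric {
  uint32_t token;  // Only the token is stored; the string is not.
  uint32_t value;

  void Increment(uint32_t amount = 1) {
    assert(value <= UINT32_MAX - amount);  // Guard against wraparound.
    value += amount;
  }
};

// Hypothetical macro: declares `identifier` and forces `name` to be hashed at
// compile time. The token is masked to 31 bits to mirror the layout above.
#define SKETCH_METRIC(identifier, name, initial)                           \
  TokenizedMetric identifier = {                                           \
      std::integral_constant<uint32_t,                                     \
                             SketchToken(name) & 0x7fffffffu>::value,      \
      initial}

SKETCH_METRIC(i2c_writes, "i2c_bus_write_transactions_total", 0u);
```

However long the name is, the declared object stays two 32-bit words in this
sketch, which is the property that keeps long metric names cheap.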
250
251.. cpp:function:: PW_METRIC(identifier, name, value)
252.. cpp:function:: PW_METRIC(group, identifier, name, value)
253.. cpp:function:: PW_METRIC_STATIC(identifier, name, value)
254.. cpp:function:: PW_METRIC_STATIC(group, identifier, name, value)
255
256  Declare a metric, optionally adding it to a group.
257
  - **identifier** - An identifier name for the created variable or member.
    For example, ``i2c_transactions`` might be used for a local or global
    metric; inside a class, the identifier could follow the member naming
    convention (``i2c_transactions_`` for Google's C++ style).
262  - **name** - The string name for the metric. This will be tokenized. There
263    are no restrictions on the contents of the name; however, consider
264    restricting these to be valid C++ identifiers to ease integration with
265    other systems.
266  - **value** - The initial value for the metric. Must be either a floating
267    point value (e.g. ``3.2f``) or unsigned int (e.g. ``21u``).
268  - **group** - A ``pw::metric::Group`` instance. If provided, the metric is
269    added to the given group.
270
  The macro declares a variable or member named ``identifier`` with type
  ``pw::metric::Metric``, and works in three contexts: global, local, and
  member.

  If the ``_STATIC`` variant is used, the macro declares a variable with static
  storage. These can be used in function scopes, but not in classes.
277
278  1. At global scope:
279
280    .. code::
281
      PW_METRIC(foo, "foo", 15u);
283
284      void MyFunc() {
285        foo.Increment();
286      }
287
288  2. At local function or member function scope:
289
290    .. code::
291
292      void MyFunc() {
        PW_METRIC(foo, "foo", 15u);
294        foo.Increment();
295        // foo goes out of scope here; be careful!
296      }
297
298  3. At member level inside a class or struct:
299
300    .. code::
301
302      struct MyStructy {
303        void DoSomething() {
304          somethings.Increment();
305        }
306        // Every instance of MyStructy will have a separate somethings counter.
307        PW_METRIC(somethings, "somethings", 0u);
      };
309
  You can also put a metric into a group with the macro. A metric can belong to
  at most one group; otherwise an assertion will fail. Example:
312
313  .. code::
314
315    PW_METRIC_GROUP(my_group, "my_group");
316    PW_METRIC(my_group, foo, "foo", 0.2f);
317    PW_METRIC(my_group, bar, "bar", 44000u);
318    PW_METRIC(my_group, zap, "zap", 3.14f);
319
320  .. tip::
321
    If you want a globally registered metric, see ``pw_metric/global.h``; in
    that context, metrics are globally registered without the need to
    centrally register them in a single place.
325
326.. cpp:function:: PW_METRIC_GROUP(identifier, name)
327.. cpp:function:: PW_METRIC_GROUP(parent_group, identifier, name)
328.. cpp:function:: PW_METRIC_GROUP_STATIC(identifier, name)
329.. cpp:function:: PW_METRIC_GROUP_STATIC(parent_group, identifier, name)
330
  Declares a ``pw::metric::Group`` named ``identifier``; ``name`` is tokenized.
  Works similarly to ``PW_METRIC`` and can be used in the same contexts (global,
  local, and member). Optionally, the group can be added to a parent group.

  If the ``_STATIC`` variant is used, the macro declares a variable with static
  storage. These can be used in function scopes, but not in classes.
337
338  Example:
339
340  .. code::
341
342    PW_METRIC_GROUP(my_group, "my_group");
343    PW_METRIC(my_group, foo, "foo", 0.2f);
344    PW_METRIC(my_group, bar, "bar", 44000u);
345    PW_METRIC(my_group, zap, "zap", 3.14f);
346
347.. cpp:function:: PW_METRIC_GLOBAL(identifier, name, value)
348
  Declare a ``pw::metric::Metric`` with the name ``name``, and register it in
  the global metrics list ``pw::metric::global_metrics``.
351
352  Example:
353
354  .. code::
355
356    #include "pw_metric/metric.h"
357    #include "pw_metric/global.h"
358
359    // No need to coordinate collection of foo and bar; they're autoregistered.
360    PW_METRIC_GLOBAL(foo, "foo", 0.2f);
361    PW_METRIC_GLOBAL(bar, "bar", 44000u);
362
363  Note that metrics defined with ``PW_METRIC_GLOBAL`` should never be added to
364  groups defined with ``PW_METRIC_GROUP_GLOBAL``. Each metric can only belong
365  to one group, and metrics defined with ``PW_METRIC_GLOBAL`` are
366  pre-registered with the global metrics list.
367
368  .. attention::
369
370    Do not create ``PW_METRIC_GLOBAL`` instances anywhere other than global
371    scope. Putting these on an instance (member context) would lead to dangling
372    pointers and misery. Metrics are never deleted or unregistered!
373
.. cpp:function:: PW_METRIC_GROUP_GLOBAL(identifier, name)

  Declare a ``pw::metric::Group`` with the name ``name``, and register it in
  the global metric groups list ``pw::metric::global_groups``.
378
379  Note that metrics created with ``PW_METRIC_GLOBAL`` should never be added to
380  groups! Instead, just create a freestanding metric and register it into the
381  global group (like in the example below).
382
383  Example:
384
385  .. code::
386
387    #include "pw_metric/metric.h"
388    #include "pw_metric/global.h"
389
390    // No need to coordinate collection of this group; it's globally registered.
    PW_METRIC_GROUP_GLOBAL(legacy_system, "legacy_system");
    PW_METRIC(legacy_system, foo, "foo", 0.2f);
    PW_METRIC(legacy_system, bar, "bar", 44000u);
394
395  .. attention::
396
397    Do not create ``PW_METRIC_GROUP_GLOBAL`` instances anywhere other than
398    global scope. Putting these on an instance (member context) would lead to
399    dangling pointers and misery. Metrics are never deleted or unregistered!
400
401----------------------
402Usage & Best Practices
403----------------------
404This library makes several tradeoffs to enable low memory use per-metric, and
405one of those tradeoffs results in requiring care in constructing the metric
406trees.
407
408Use the Init() pattern for static objects with metrics
409------------------------------------------------------
410A common pattern in embedded systems is to allocate many objects globally, and
411reduce reliance on dynamic allocation (or eschew malloc entirely). This leads
412to a pattern where rich/large objects are statically constructed at global
413scope, then interacted with via tasks or threads. For example, consider a
414hypothetical global ``Uart`` object:
415
416.. code::
417
  #include <array>
  #include <cstddef>
  #include <span>

  class Uart {
   public:
    Uart(std::span<std::byte> rx_buffer, std::span<std::byte> tx_buffer)
        : rx_buffer_(rx_buffer), tx_buffer_(tx_buffer) {}

    // Send/receive here...

   private:
    std::span<std::byte> rx_buffer_;
    std::span<std::byte> tx_buffer_;
  };

  std::array<std::byte, 512> uart_rx_buffer;
  std::array<std::byte, 512> uart_tx_buffer;
  Uart uart1(uart_rx_buffer, uart_tx_buffer);
433
Through the course of building a product, the team may want to add metrics to
the UART, for example to gain insight into which operations are triggering
lots of data transfer. When adding metrics to the above imaginary UART object,
one might consider the following approach:
438
439.. code::
440
  #include "pw_metric/metric.h"

  class Uart {
   public:
    Uart(std::span<std::byte> rx_buffer,
         std::span<std::byte> tx_buffer,
         pw::metric::Group& parent_metrics)
        : rx_buffer_(rx_buffer),
          tx_buffer_(tx_buffer) {
      // PROBLEM! parent_metrics may not be constructed if it's a reference
      // to a static global.
      parent_metrics.Add(tx_bytes_);
      parent_metrics.Add(rx_bytes_);
    }

    // Send/receive here which increment tx/rx_bytes.

   private:
    std::span<std::byte> rx_buffer_;
    std::span<std::byte> tx_buffer_;

    PW_METRIC(tx_bytes_, "tx_bytes", 0u);
    PW_METRIC(rx_bytes_, "rx_bytes", 0u);
  };

  PW_METRIC_GROUP(root_metrics, "/");
  PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

  std::array<std::byte, 512> uart_rx_buffer;
  std::array<std::byte, 512> uart_tx_buffer;
  Uart uart1(uart_rx_buffer,
             uart_tx_buffer,
             uart1_metrics);
472
However, this **is incorrect**, since ``parent_metrics`` (referring to
``uart1_metrics`` in this case) may not be constructed at the point where
``uart1`` is constructed. Thankfully, in the case of ``pw_metric`` this
will result in an assertion failure (or it will work correctly if the
constructors are called in a favorable order), so the problem will not go
unnoticed. Instead, consider using the ``Init()`` pattern for static objects,
where references to dependencies may only be stored during construction, but no
methods on the dependencies are called.

The ``Init()`` approach separates global object construction into two phases:
the constructor, where references are stored, and an ``Init()`` function,
which is called after all static constructors have run. This approach works
correctly, even when the objects are allocated globally:
486
487.. code::
488
  #include "pw_metric/metric.h"

  class Uart {
   public:
    // Note that metrics is not passed in here at all.
    Uart(std::span<std::byte> rx_buffer,
         std::span<std::byte> tx_buffer)
        : rx_buffer_(rx_buffer),
          tx_buffer_(tx_buffer) {}

    // Precondition: parent_metrics is already constructed.
    void Init(pw::metric::Group& parent_metrics) {
      parent_metrics.Add(tx_bytes_);
      parent_metrics.Add(rx_bytes_);
    }

    // Send/receive here which increment tx/rx_bytes.

   private:
    std::span<std::byte> rx_buffer_;
    std::span<std::byte> tx_buffer_;

    PW_METRIC(tx_bytes_, "tx_bytes", 0u);
    PW_METRIC(rx_bytes_, "rx_bytes", 0u);
  };

  PW_METRIC_GROUP(root_metrics, "/");
  PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

  std::array<std::byte, 512> uart_rx_buffer;
  std::array<std::byte, 512> uart_tx_buffer;
  Uart uart1(uart_rx_buffer,
             uart_tx_buffer);

  int main() {
    // uart1_metrics is guaranteed to be initialized by this point, so it is
    // safe to pass it to Init().
    uart1.Init(uart1_metrics);
  }
526
527.. attention::
528
529  Be extra careful about **static global metric registration**. Consider using
530  the ``Init()`` pattern.
531
532Metric member order matters in objects
533--------------------------------------
534The order of declaring in-class groups and metrics matters if the metrics are
535within a group declared inside the class. For example, the following class will
536work fine:
537
538.. code::
539
540  #include "pw_metric/metric.h"
541
542  class PowerSubsystem {
543   public:
544     Group& metrics() { return metrics_; }
545     const Group& metrics() const { return metrics_; }
546
547   private:
548    PW_METRIC_GROUP(metrics_, "power");  // Note metrics_ declared first.
549    PW_METRIC(metrics_, foo, "foo", 0.2f);
550    PW_METRIC(metrics_, bar, "bar", 44000u);
551  };
552
553but the following one will not since the group is constructed after the metrics
554(and will result in a compile error):
555
556.. code::
557
558  #include "pw_metric/metric.h"
559
560  class PowerSubsystem {
561   public:
562     Group& metrics() { return metrics_; }
563     const Group& metrics() const { return metrics_; }
564
565   private:
566    PW_METRIC(metrics_, foo, "foo", 0.2f);
567    PW_METRIC(metrics_, bar, "bar", 44000u);
568    PW_METRIC_GROUP(metrics_, "power");  // Error: metrics_ must be first.
569  };
570
571.. attention::
572
573  Put **groups before metrics** when declaring metrics members inside classes.
574
575Thread safety
576-------------
``pw_metric`` has **no built-in synchronization for manipulating the tree**
structure. Users are expected to either rely on a shared global mutex when
constructing the metric tree, or do the metric construction in a single thread
(e.g. a boot/init thread). The same applies to destruction, though we do not
advise destructing metrics or groups.
582
583Individual metrics have atomic ``Increment()``, ``Set()``, and the value
584accessors ``as_float()`` and ``as_int()`` which don't require separate
585synchronization, and can be used from ISRs.
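
The idea behind a 32-bit atomic payload can be sketched with
``std::atomic<uint32_t>``. This is a hedged illustration rather than the real
``pw::metric::Metric``: ``AtomicMetricSketch`` is a hypothetical class, the
relaxed memory ordering is an assumption, and whether the atomic is lock-free
(and therefore ISR-safe) depends on the target.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

class AtomicMetricSketch {
 public:
  explicit AtomicMetricSketch(uint32_t initial)
      : bits_(initial), is_float_(false) {}
  explicit AtomicMetricSketch(float initial)
      : bits_(ToBits(initial)), is_float_(true) {}

  // A single atomic read-modify-write; no lock is taken.
  void Increment(uint32_t amount = 1) {
    assert(!is_float_);
    bits_.fetch_add(amount, std::memory_order_relaxed);
  }
  void Set(uint32_t value) {
    assert(!is_float_);
    bits_.store(value, std::memory_order_relaxed);
  }
  void Set(float value) {
    assert(is_float_);
    bits_.store(ToBits(value), std::memory_order_relaxed);
  }

  uint32_t as_int() const {
    assert(!is_float_);
    return bits_.load(std::memory_order_relaxed);
  }
  float as_float() const {
    assert(is_float_);
    const uint32_t bits = bits_.load(std::memory_order_relaxed);
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
  }

 private:
  // Floats are stored as raw bits so both types share one atomic word.
  static uint32_t ToBits(float value) {
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
  }

  std::atomic<uint32_t> bits_;
  const bool is_float_;
};
```

Note that only the value updates are atomic in this sketch; nothing here
protects the linkage between metrics and groups.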
586
587.. attention::
588
  **You must synchronize access to the metric tree**. ``pw_metric`` does not
  internally synchronize access during construction. Metric Set/Increment are
  safe.
592
593Lifecycle
594---------
595Metric objects are not designed to be destructed, and are expected to live for
596the lifetime of the program or application. If you need dynamic
597creation/destruction of metrics, ``pw_metric`` does not attempt to cover that
598use case. Instead, ``pw_metric`` covers the case of products with two execution
599phases:
600
6011. A boot phase where the metric tree is created.
6022. A run phase where metrics are collected. The tree structure is fixed.
603
604Technically, it is possible to destruct metrics provided care is taken to
605remove the given metric (or group) from the list it's contained in. However,
606there are no helper functions for this, so be careful.
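
For readers who do need removal, the unlinking step on a singly linked
intrusive list looks roughly like the sketch below. ``ListNode`` and
``Unlink`` are hypothetical names, not part of ``pw_metric``, and the sketch
assumes the caller already holds whatever lock guards the tree.

```cpp
#include <cassert>

// Hypothetical intrusive node; a real metric or group would embed the link.
struct ListNode {
  ListNode* next = nullptr;
};

// Unlinks `node` from the list starting at `head`. Returns true if the node
// was found and removed, false if it was not in the list.
inline bool Unlink(ListNode*& head, ListNode* node) {
  assert(node != nullptr);
  // `link` points at the pointer that refers to the current node, so removing
  // the head and removing an interior node are handled the same way.
  for (ListNode** link = &head; *link != nullptr; link = &(*link)->next) {
    if (*link == node) {
      *link = node->next;
      node->next = nullptr;
      return true;
    }
  }
  return false;
}
```

The removal is a linear search because a singly linked node has no pointer
back to its predecessor.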
607
608Below is an example that **is incorrect**. Don't do what follows!
609
610.. code::
611
612  #include "pw_metric/metric.h"
613
  int main() {
615    PW_METRIC_GROUP(root, "/");
616    {
617      // BAD! The metrics have a different lifetime than the group.
618      PW_METRIC(root, temperature, "temperature_f", 72.3f);
619      PW_METRIC(root, humidity, "humidity_relative_percent", 33.2f);
620    }
621    // OOPS! root now has a linked list that points to the destructed
622    // "humidity" object.
623  }
624
625.. attention::
626
627  **Don't destruct metrics**. Metrics are designed to be registered /
628  structured upfront, then manipulated during a device's active phase. They do
629  not support destruction.
630
631-----------------
632Exporting metrics
633-----------------
634Collecting metrics on a device is not useful without a mechanism to export
635those metrics for analysis and debugging. ``pw_metric`` offers an optional RPC
636service library (``:metric_service_nanopb``) that enables exporting a
637user-supplied set of on-device metrics via RPC. This facility is intended to
638function from the early stages of device bringup through production in the
639field.
640
641The metrics are fetched by calling the ``MetricService.Get`` RPC method, which
642streams all registered metrics to the caller in batches (server streaming RPC).
643Batching the returned metrics avoids requiring a large buffer or large RPC MTU.
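
The batching idea can be sketched independently of ``pw_rpc``. In the
hypothetical code below, ``ExportedMetric`` and ``SendInBatches`` are
illustration-only names, and the ``send`` callback stands in for writing one
streaming response; the real service's batch size and message types are not
shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

struct ExportedMetric {
  uint32_t token;
  uint32_t value;
};

// Invokes `send` once per batch of at most `batch_size` metrics, so the
// sender never needs a buffer larger than one batch. Returns the batch count.
inline std::size_t SendInBatches(
    const std::vector<ExportedMetric>& metrics,
    std::size_t batch_size,
    const std::function<void(const ExportedMetric*, std::size_t)>& send) {
  assert(batch_size > 0);
  std::size_t batches = 0;
  for (std::size_t offset = 0; offset < metrics.size(); offset += batch_size) {
    const std::size_t count = std::min(batch_size, metrics.size() - offset);
    send(metrics.data() + offset, count);
    ++batches;
  }
  return batches;
}
```

With five metrics and a batch size of two, ``send`` is called three times
with two, two, and one metric respectively.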
644
645The returned metric objects have flattened paths to the root. For example, the
646returned metrics (post detokenization and jsonified) might look something like:
647
648.. code:: none
649
650  {
651    "/i2c1/failed_txns": 17,
652    "/i2c1/total_txns": 2013,
653    "/i2c1/gyro/resets": 24,
654    "/i2c1/gyro/hangs": 1,
655    "/spi1/thermocouple/reads": 242,
    "/spi1/thermocouple/temp_celsius": 34.52,
657  }
658
659Note that there is no nesting of the groups; the nesting is implied from the
660path.
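
Flattening a tree into such paths is a short recursive walk. The sketch below
uses plain strings and hypothetical types (``PathGroup``, ``PathMetric``,
``Flatten``); the real service works with tokens and its own response
messages.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

struct PathMetric {
  std::string name;
  uint32_t value;
};

struct PathGroup {
  std::string name;
  std::vector<PathGroup> children;
  std::vector<PathMetric> metrics;
};

// Appends one (path, value) pair per leaf metric under `group`. `prefix` is
// the path of the enclosing groups, or "" for a top-level group.
inline void Flatten(const PathGroup& group,
                    const std::string& prefix,
                    std::vector<std::pair<std::string, uint32_t>>& out) {
  assert(!group.name.empty());
  const std::string path = prefix + "/" + group.name;
  for (const PathMetric& metric : group.metrics) {
    out.emplace_back(path + "/" + metric.name, metric.value);
  }
  for (const PathGroup& child : group.children) {
    Flatten(child, path, out);
  }
}
```

The group nesting survives only in the path strings, which is what lets the
receiver rebuild the tree if it wants to.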
661
662RPC service setup
663-----------------
664To expose a ``MetricService`` in your application, do the following:
665
1. Define metrics around the system, and put them in a group or list of
   metrics. Easy choices include the ``global_groups`` and ``global_metrics``
   variables, or you can create your own.
6692. Create an instance of ``pw::metric::MetricService``.
6703. Register the service with your RPC server.
671
672For example:
673
674.. code::
675
676   #include "pw_rpc/server.h"
677   #include "pw_metric/metric.h"
678   #include "pw_metric/global.h"
679   #include "pw_metric/metric_service_nanopb.h"
680
681   // Note: You must customize the RPC server setup; see pw_rpc.
682   Channel channels[] = {
683    Channel::Create<1>(&uart_output),
684   };
685   Server server(channels);
686
687   // Metric service instance, pointing to the global metric objects.
688   // This could also point to custom per-product or application objects.
689   pw::metric::MetricService metric_service(
690       pw::metric::global_metrics,
691       pw::metric::global_groups);
692
693   void RegisterServices() {
694     server.RegisterService(metric_service);
695     // Register other services here.
696   }
697
   int main() {
     // ... system initialization ...

     RegisterServices();

     // ... start your application ...
   }
705
706.. attention::
707
708  Take care when exporting metrics. Ensure **appropriate access control** is in
709  place. In some cases it may make sense to entirely disable metrics export for
710  production builds. Although reading metrics via RPC won't influence the
711  device, in some cases the metrics could expose sensitive information if
712  product owners are not careful.
713
714.. attention::
715
716  **MetricService::Get is a synchronous RPC method**
717
  Calls to ``MetricService::Get`` are blocking and will send all metrics
  immediately, even though it is a server-streaming RPC. This will work fine if
  the device doesn't have too many metrics, or doesn't have concurrent RPCs like
  logging, but could be a problem in some cases.
722
723  We plan to offer an async version where the application is responsible for
724  pumping the metrics into the streaming response. This gives flow control to
725  the application.
726
727-----------
728Size report
729-----------
730The below size report shows the cost in code and memory for a few examples of
731metrics. This does not include the RPC service.
732
733.. include:: metric_size_report
734
735.. attention::
736
737  At time of writing, **the above sizes show an unexpectedly large flash
738  impact**. We are investigating why GCC is inserting large global static
739  constructors per group, when all the logic should be reused across objects.
740
741----------------
742Design tradeoffs
743----------------
744There are many possible approaches to metrics collection and aggregation. We've
745chosen some points on the tradeoff curve:
746
- **Atomic-sized metrics** - Using simple metric objects with just uint32/float
  enables atomic operations. While it might be nice to support larger types, it
  is more useful to have safe metric increments from interrupt service
  routines.
750
751- **No aggregate metrics (yet)** - Aggregate metrics (e.g. average, max, min,
752  histograms) are not supported, and must be built on top of the simple base
753  metrics. By taking this route, we can considerably simplify the core metrics
754  system and have aggregation logic in separate modules. Those modules can then
755  feed into the metrics system - for example by creating multiple metrics for a
756  single underlying metric. For example: "foo", "foo_max", "foo_min" and so on.
757
758  The other problem with automatic aggregation is that what period the
759  aggregation happens over is often important, and it can be hard to design
760  this cleanly into the API. Instead, this responsibility is pushed to the user
761  who must take more care.
762
763  Note that we will add helpers for aggregated metrics.
764
765- **No virtual metrics** - An alternate approach to the concrete Metric class
766  in the current module is to have a virtual interface for metrics, and then
767  allow those metrics to have their own storage. This is attractive but can
768  lead to many vtables and excess memory use in simple one-metric use cases.
769
770- **Linked list registration** - Using linked lists for registration is a
771  tradeoff, accepting some memory overhead in exchange for flexibility. Other
772  alternatives include a global table of metrics, which has the disadvantage of
773  requiring centralizing the metrics -- an impossibility for middleware like
774  Pigweed.
775
- **Synchronization** - The only synchronization guarantee provided by
  pw_metric is that increment and set are atomic. Other than that, users are on
  their own to synchronize metric collection and updating.
779
780- **No fast metric lookup** - The current design does not make it fast to
781  lookup a metric at runtime; instead, one must run a linear search of the tree
782  to find the matching metric. In most non-dynamic use cases, this is fine in
783  practice, and saves having a more involved hash table. Metric updates will be
784  through direct member or variable accesses.
785
- **Relying on C++ static initialization** - In short, the convenience
  outweighs the cost and risk. Without static initializers, it would be
  impossible to automatically collect the metrics without post-processing the
  C++ code to find the metrics, a huge undertaking of debatable value. We
  have carefully analyzed the static initializer behaviour of Pigweed's
  IntrusiveList and are confident it is correct.
792
793- **Both local & global support** - Potentially just one approach (the local or
794  global one) could be offered, making the module less complex. However, we
  feel the additional complexity is worthwhile since there are legitimate use
  cases for both e.g. ``PW_METRIC`` and ``PW_METRIC_GLOBAL``. We'd prefer to
797  have a well-tested upstream solution for these use cases rather than have
798  customers re-implement one of these.
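
As a sketch of the "no aggregate metrics" tradeoff described in the list above, an
aggregation helper can be layered on top of simple 32-bit values by
maintaining several of them, along the lines of "foo", "foo_min", and
"foo_max". ``SimpleValue`` and ``MinMaxSketch`` are hypothetical names, and
the explicit ``Reset()`` reflects the point that the caller must choose the
aggregation period.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a simple uint32_t metric; not the real pw::metric::Metric.
struct SimpleValue {
  uint32_t value = 0;
  void Set(uint32_t new_value) { value = new_value; }
};

// Hypothetical helper that derives three simple values from one sample
// stream: the latest sample, the minimum, and the maximum.
class MinMaxSketch {
 public:
  void Record(uint32_t sample) {
    latest_.Set(sample);
    if (!has_sample_ || sample < min_.value) {
      min_.Set(sample);
    }
    if (!has_sample_ || sample > max_.value) {
      max_.Set(sample);
    }
    has_sample_ = true;
  }

  // Starts a new aggregation period; the caller decides when that happens.
  void Reset() { has_sample_ = false; }

  uint32_t latest() const { assert(has_sample_); return latest_.value; }
  uint32_t min() const { assert(has_sample_); return min_.value; }
  uint32_t max() const { assert(has_sample_); return max_.value; }

 private:
  SimpleValue latest_;  // Would be exported as "foo".
  SimpleValue min_;     // Would be exported as "foo_min".
  SimpleValue max_;     // Would be exported as "foo_max".
  bool has_sample_ = false;
};
```

Keeping this logic outside the core metric type means the exported values
remain plain 32-bit metrics.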
799
800----------------
801Roadmap & Status
802----------------
- **String metric names** - ``pw_metric`` stores metric names as tokens. On one
  hand, this is great for production where having a compact binary is often a
  requirement to fit the application in the given part. However, in early
  development before flash is a constraint, string names are more convenient to
  work with since there is no need for host-side detokenization. We plan to add
  optional support for string names.
809
- **Aggregate metrics** - We plan to add support for aggregate metrics on top
  of the simple metric mechanism, either as another module or as additional
  functionality inside this one. Likely examples include min/max.
813
- **Selectively enable or disable metrics** - Currently the metrics are always
  enabled once included. In practice this is not ideal since many times only a
  few metrics are wanted in production, but having to strip all the metrics
  code is error prone. Instead, we will add support for controlling what
  metrics are enabled or disabled at compile time. This may rely on C++20's
  support for zero-sized members to fully remove the cost.
820
- **Async RPC** - The current RPC service exports the metrics by streaming
  them to the client in batches. However, the current solution streams all the
  metrics to completion; this may block the RPC thread. In the future we will
  have an async solution where the user is in control of flow priority.
825
826- **Timer integration** - We would like to add a stopwatch type mechanism to
827  time multiple in-flight events.
828
829- **C support** - In practice it's often useful or necessary to instrument
830  C-only code. While it will be impossible to support the global registration
831  system that the C++ version supports, we will figure out a solution to make
832  instrumenting C code relatively smooth.
833
834- **Global counter** - We may add a global metric counter to help detect cases
835  where post-initialization metrics manipulations are done.
836
837- **Proto structure** - It may be possible to directly map metrics to a custom
838  proto structure, where instead of a name or token field, a tag field is
839  provided. This could result in elegant export to an easily machine parsable
840  and compact representation on the host. We may investigate this in the
841  future.
842
843- **Safer data structures** - At a cost of 4B per metric and 4B per group, it
844  may be possible to make metric structure instantiation safe even in static
845  constructors, and also make it safe to remove metrics dynamically. We will
846  consider whether this tradeoff is the right one, since a 4B cost per metric
847  is substantial on projects with many metrics.
848