.. _module-pw_metric:

=========
pw_metric
=========

.. attention::

  This module is **not yet production ready**; ask us if you are interested in
  trying it out or have ideas about how to improve it.

--------
Overview
--------
Pigweed's metric module is a **lightweight manual instrumentation system** for
tracking system health metrics like counts or set values. For example,
``pw_metric`` could help with tracking the number of I2C bus writes, or the
number of times a buffer was filled before it could drain in time, or safely
incrementing counters from ISRs.

Key features of ``pw_metric``:

- **Tokenized names** - Names are tokenized using ``pw_tokenizer``, enabling
  long metric names that don't bloat your binary.

- **Tree structure** - Metrics can form a tree, enabling grouping of related
  metrics for clearer organization.

- **Per object collection** - Metrics and groups can live on object instances
  and be flexibly combined with metrics from other instances.

- **Global registration** - For legacy code bases or just because it's easier,
  ``pw_metric`` supports automatic aggregation of metrics. This is optional but
  convenient in many cases.

- **Simple design** - There are only two core data structures: ``Metric`` and
  ``Group``, which are both simple to understand and use. The only metric
  types supported are ``uint32_t`` and ``float``. This module does not support
  complicated aggregations like running average or min/max.

Example: Instrumenting a single object
--------------------------------------
The example below illustrates what instrumenting a class with a metric group
and metrics might look like. In this case, the object's
``MySubsystem::metrics()`` member is not globally registered; the user is on
their own for combining this subsystem's metrics with others.

.. code::

  #include "pw_metric/metric.h"

  class MySubsystem {
   public:
    void DoSomething() {
      attempts_.Increment();
      if (ActionSucceeds()) {
        successes_.Increment();
      }
    }
    pw::metric::Group& metrics() { return metrics_; }

   private:
    // The group must be declared before the metrics that are added to it.
    PW_METRIC_GROUP(metrics_, "my_subsystem");
    PW_METRIC(metrics_, attempts_, "attempts", 0u);
    PW_METRIC(metrics_, successes_, "successes", 0u);
  };

The metrics subsystem has no canonical output format at this time, but a JSON
dump might look something like this:

.. code:: none

  {
    "my_subsystem" : {
      "successes" : 1000,
      "attempts" : 1200
    }
  }

In this case, every instance of ``MySubsystem`` will have unique counters.

Example: Instrumenting a legacy codebase
----------------------------------------
A common situation in embedded development is **debugging legacy code** or code
that is hard to change, where it may be impossible to plumb metrics objects
around with dependency injection. The alternative to plumbing metrics is to
register the metrics through a global mechanism. ``pw_metric`` supports this
use case. For example:

**Before instrumenting:**

.. code::

  // This code was passed down from generations of developers before; no one
  // knows what it does or how it works. But it needs to be fixed!
  void OldCodeThatDoesntWorkButWeDontKnowWhy() {
    if (some_variable) {
      DoSomething();
    } else {
      DoSomethingElse();
    }
  }

**After instrumenting:**

.. code::

  #include "pw_metric/global.h"
  #include "pw_metric/metric.h"

  // Each global metric takes an identifier, a name, and an initial value.
  PW_METRIC_GLOBAL(legacy_do_something, "legacy_do_something", 0u);
  PW_METRIC_GLOBAL(legacy_do_something_else, "legacy_do_something_else", 0u);

  // This code was passed down from generations of developers before; no one
  // knows what it does or how it works. But it needs to be fixed!
  void OldCodeThatDoesntWorkButWeDontKnowWhy() {
    if (some_variable) {
      legacy_do_something.Increment();
      DoSomething();
    } else {
      legacy_do_something_else.Increment();
      DoSomethingElse();
    }
  }

In this case, the developer merely had to add the metrics headers, define some
metrics, and then start incrementing them. These metrics will be available
globally through the ``pw::metric::global_metrics`` object defined in
``pw_metric/global.h``.

Why not just use simple counter variables?
------------------------------------------
One might wonder what the point of leveraging a metric library is when it is
trivial to make some global variables and print them out. There are a few
reasons:

- **Metrics offload** - To make it easy to get metrics off-device by sharing
  the infrastructure for offloading.

- **Consistent format** - To get the metrics in a consistent format (e.g.
  protobuf or JSON) for analysis.

- **Uncoordinated collection** - To provide a simple and reliable way for
  developers on a team to all collect metrics for their subsystems, without
  having to coordinate to offload. This could extend to code in libraries
  written by other teams.

- **Pre-boot or interrupt visibility** - Some of the most challenging bugs come
  from early system boot when not all system facilities are up (e.g. logging or
  UART). In those cases, metrics provide a low-overhead approach to understand
  what is happening. During early boot, metrics can be incremented; dumping the
  metrics after boot then provides insight into what happened. While basic
  counter variables can work in these contexts too, one still has to deal with
  the offloading problem, which the library handles.
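
To make the "uncoordinated collection" point concrete, here is a minimal,
self-contained sketch of counters that register themselves on a global
intrusive list as they are constructed. It is **not** the ``pw_metric``
implementation: ``SketchCounter``, ``DumpAllCounters``, and the two example
counters are hypothetical names invented for illustration, and the sketch
assumes C++17 for the ``static inline`` member.

```cpp
#include <atomic>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for a globally registered metric; NOT pw_metric's API.
class SketchCounter {
 public:
  explicit SketchCounter(const char* name) : name_(name), next_(head_) {
    head_ = this;  // Push this counter onto the global intrusive list.
  }

  // Relaxed atomic add: safe to call from several contexts without a lock.
  void Increment(uint32_t amount = 1) {
    value_.fetch_add(amount, std::memory_order_relaxed);
  }

  uint32_t value() const { return value_.load(std::memory_order_relaxed); }
  const char* name() const { return name_; }
  const SketchCounter* next() const { return next_; }
  static const SketchCounter* head() { return head_; }

 private:
  const char* name_;
  std::atomic<uint32_t> value_{0};
  SketchCounter* next_;
  // Constant-initialized to nullptr, so it is valid before any static
  // constructor runs, regardless of translation-unit ordering.
  static inline SketchCounter* head_ = nullptr;
};

// Two subsystems define counters independently; neither knows of the other.
SketchCounter uart_overruns("uart_overruns");
SketchCounter i2c_writes("i2c_writes");

// Walks every registered counter, prints it, and returns how many were found.
int DumpAllCounters() {
  int count = 0;
  for (const SketchCounter* c = SketchCounter::head(); c != nullptr;
       c = c->next()) {
    std::printf("%s: %u\n", c->name(), static_cast<unsigned>(c->value()));
    ++count;
  }
  return count;
}
```

A single call to ``DumpAllCounters()`` from an offload path visits every
counter that any subsystem defined, which is the property that plain global
variables lack.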

---------------------
Metrics API reference
---------------------

The metrics API consists of just a few components:

- The core data structures ``pw::metric::Metric`` and ``pw::metric::Group``
- The macros for scoped metrics and groups ``PW_METRIC`` and
  ``PW_METRIC_GROUP``
- The macros for globally registered metrics and groups
  ``PW_METRIC_GLOBAL`` and ``PW_METRIC_GROUP_GLOBAL``
- The global groups and metrics lists: ``pw::metric::global_groups`` and
  ``pw::metric::global_metrics``.

Metric
------
The ``pw::metric::Metric`` provides:

- A 31-bit tokenized name
- A 1-bit discriminator for int or float
- A 32-bit payload (int or float)
- A 32-bit next pointer (intrusive list)

The metric object is 12 bytes on 32-bit platforms.

.. cpp:class:: pw::metric::Metric

  .. cpp:function:: Increment(uint32_t amount = 1)

    Increment the metric by the given amount. Results in undefined behaviour if
    the metric is not of type int.

  .. cpp:function:: Set(uint32_t value)

    Set the metric to the given value. Results in undefined behaviour if the
    metric is not of type int.

  .. cpp:function:: Set(float value)

    Set the metric to the given value. Results in undefined behaviour if the
    metric is not of type float.

Group
-----
The ``pw::metric::Group`` object is simply:

- A name for the group
- A list of children groups
- A list of leaf metrics
- A 32-bit next pointer (intrusive list)

The group object is 16 bytes on 32-bit platforms.

.. cpp:class:: pw::metric::Group

  .. cpp:function:: Dump(int indent_level = 0)

    Recursively dump a metrics group to ``pw_log``. Produces output like:

    .. code:: none

      "$6doqFw==": {
        "$05OCZw==": {
          "$VpPfzg==": 1,
          "$LGPMBQ==": 1.000000,
          "$+iJvUg==": 5,
        }
        "$9hPNxw==": 65,
        "$oK7HmA==": 13,
        "$FCM4qQ==": 0,
      }

    Note the metric names are tokenized with base64. Decoding requires using
    the Pigweed detokenizer. With a detokenizing-enabled logger, you could get
    something like:

    .. code:: none

      "i2c_1": {
        "gyro": {
          "num_samples": 1,
          "init_time_us": 1.000000,
          "initialized": 5,
        }
        "bus_errors": 65,
        "transactions": 13,
        "bytes_sent": 0,
      }

Macros
------
The **macros are the primary mechanism for creating metrics**, and should be
used instead of directly constructing metrics or groups. The macros handle
tokenizing the metric and group names.

.. cpp:function:: PW_METRIC(identifier, name, value)
.. cpp:function:: PW_METRIC(group, identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(group, identifier, name, value)

  Declare a metric, optionally adding it to a group.

  - **identifier** - An identifier name for the created variable or member.
    For example: ``i2c_transactions`` might be used as a local or global
    metric; inside a class, it could be named according to member conventions
    (``i2c_transactions_`` for Google's C++ style).
  - **name** - The string name for the metric. This will be tokenized. There
    are no restrictions on the contents of the name; however, consider
    restricting these to be valid C++ identifiers to ease integration with
    other systems.
  - **value** - The initial value for the metric. Must be either a floating
    point value (e.g. ``3.2f``) or unsigned int (e.g. ``21u``).
  - **group** - A ``pw::metric::Group`` instance. If provided, the metric is
    added to the given group.

  The macro declares a variable or member named ``identifier`` with type
  ``pw::metric::Metric``, and works in three contexts: global, local, and
  member.

  If the ``_STATIC`` variant is used, the macro declares a variable with static
  storage. These can be used in function scopes, but not in classes.

  1. At global scope:

     .. code::

       PW_METRIC(foo, "foo", 0u);

       void MyFunc() {
         foo.Increment();
       }

  2. At local function or member function scope:

     .. code::

       void MyFunc() {
         PW_METRIC(foo, "foo", 0u);
         foo.Increment();
         // foo goes out of scope here; be careful!
       }

  3. At member level inside a class or struct:

     .. code::

       struct MyStructy {
         void DoSomething() {
           somethings.Increment();
         }
         // Every instance of MyStructy will have a separate somethings counter.
         PW_METRIC(somethings, "somethings", 0u);
       };

  You can also put a metric into a group with the macro. A metric can belong to
  at most one group; adding it to a second group causes an assertion failure.
  Example:

  .. code::

    PW_METRIC_GROUP(my_group, "my_group");
    PW_METRIC(my_group, foo, "foo", 0.2f);
    PW_METRIC(my_group, bar, "bar", 44000u);
    PW_METRIC(my_group, zap, "zap", 3.14f);

  .. tip::

    If you want a globally registered metric, see ``pw_metric/global.h``; in
    that context, metrics are globally registered without the need to
    centrally register in a single place.

.. cpp:function:: PW_METRIC_GROUP(identifier, name)
.. cpp:function:: PW_METRIC_GROUP(parent_group, identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(parent_group, identifier, name)

  Declares a ``pw::metric::Group`` with the given name; the name is tokenized.
  Works similarly to ``PW_METRIC`` and can be used in the same contexts
  (global, local, and member). Optionally, the group can be added to a parent
  group.

  If the ``_STATIC`` variant is used, the macro declares a variable with static
  storage. These can be used in function scopes, but not in classes.

  Example:

  .. code::

    PW_METRIC_GROUP(my_group, "my_group");
    PW_METRIC(my_group, foo, "foo", 0.2f);
    PW_METRIC(my_group, bar, "bar", 44000u);
    PW_METRIC(my_group, zap, "zap", 3.14f);

.. cpp:function:: PW_METRIC_GLOBAL(identifier, name, value)

  Declare a ``pw::metric::Metric`` with the given name, and register it in the
  global metrics list ``pw::metric::global_metrics``.

  Example:

  .. code::

    #include "pw_metric/metric.h"
    #include "pw_metric/global.h"

    // No need to coordinate collection of foo and bar; they're autoregistered.
    PW_METRIC_GLOBAL(foo, "foo", 0.2f);
    PW_METRIC_GLOBAL(bar, "bar", 44000u);

  Note that metrics defined with ``PW_METRIC_GLOBAL`` should never be added to
  groups defined with ``PW_METRIC_GROUP_GLOBAL``. Each metric can only belong
  to one group, and metrics defined with ``PW_METRIC_GLOBAL`` are
  pre-registered with the global metrics list.

  .. attention::

    Do not create ``PW_METRIC_GLOBAL`` instances anywhere other than global
    scope. Putting these on an instance (member context) would lead to dangling
    pointers and misery. Metrics are never deleted or unregistered!

.. cpp:function:: PW_METRIC_GROUP_GLOBAL(identifier, name)

  Declare a ``pw::metric::Group`` with the given name, and register it in the
  global metric groups list ``pw::metric::global_groups``.

  Note that metrics created with ``PW_METRIC_GLOBAL`` should never be added to
  groups! Instead, just create a freestanding metric with ``PW_METRIC`` and
  register it into the global group (like in the example below).

  Example:

  .. code::

    #include "pw_metric/metric.h"
    #include "pw_metric/global.h"

    // No need to coordinate collection of this group; it's globally registered.
    PW_METRIC_GROUP_GLOBAL(legacy_system, "legacy_system");
    PW_METRIC(legacy_system, foo, "foo", 0.2f);
    PW_METRIC(legacy_system, bar, "bar", 44000u);

  .. attention::

    Do not create ``PW_METRIC_GROUP_GLOBAL`` instances anywhere other than
    global scope. Putting these on an instance (member context) would lead to
    dangling pointers and misery. Metrics are never deleted or unregistered!

----------------------
Usage & Best Practices
----------------------
This library makes several tradeoffs to enable low memory use per-metric, and
one of those tradeoffs results in requiring care in constructing the metric
trees.

Use the Init() pattern for static objects with metrics
------------------------------------------------------
A common pattern in embedded systems is to allocate many objects globally, and
reduce reliance on dynamic allocation (or eschew malloc entirely). This leads
to a pattern where rich/large objects are statically constructed at global
scope, then interacted with via tasks or threads. For example, consider a
hypothetical global ``Uart`` object:

.. code::

  #include <array>
  #include <cstddef>
  #include <span>

  class Uart {
   public:
    Uart(std::span<std::byte> rx_buffer, std::span<std::byte> tx_buffer)
        : rx_buffer_(rx_buffer), tx_buffer_(tx_buffer) {}

    // Send/receive here...

   private:
    std::span<std::byte> rx_buffer_;
    std::span<std::byte> tx_buffer_;
  };

  std::array<std::byte, 512> uart_rx_buffer;
  std::array<std::byte, 512> uart_tx_buffer;
  Uart uart1(uart_rx_buffer, uart_tx_buffer);

Through the course of building a product, the team may want to add metrics to
the UART, for example to gain insight into which operations are triggering lots
of data transfer. When adding metrics to the above imaginary UART object, one
might consider the following approach:

.. code::

  #include <array>
  #include <cstddef>
  #include <span>

  #include "pw_metric/metric.h"

  class Uart {
   public:
    Uart(std::span<std::byte> rx_buffer,
         std::span<std::byte> tx_buffer,
         pw::metric::Group& parent_metrics)
        : rx_buffer_(rx_buffer),
          tx_buffer_(tx_buffer) {
      // PROBLEM! parent_metrics may not be constructed if it's a reference
      // to a static global.
      parent_metrics.Add(tx_bytes_);
      parent_metrics.Add(rx_bytes_);
    }

    // Send/receive here which increment tx/rx_bytes.

   private:
    std::span<std::byte> rx_buffer_;
    std::span<std::byte> tx_buffer_;

    PW_METRIC(tx_bytes_, "tx_bytes", 0u);
    PW_METRIC(rx_bytes_, "rx_bytes", 0u);
  };

  PW_METRIC_GROUP(root_metrics, "/");
  PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

  std::array<std::byte, 512> uart_rx_buffer;
  std::array<std::byte, 512> uart_tx_buffer;
  Uart uart1(uart_rx_buffer,
             uart_tx_buffer,
             uart1_metrics);

However, this **is incorrect**, since the ``parent_metrics`` (pointing to
``uart1_metrics`` in this case) may not be constructed at the point of
``uart1`` getting constructed. Thankfully in the case of ``pw_metric`` this
will result in an assertion failure (or it will work correctly if the
constructors are called in a favorable order), so the problem will not go
unnoticed. Instead, consider using the ``Init()`` pattern for static objects,
where references to dependencies may only be stored during construction, but no
methods on the dependencies are called.

The ``Init()`` approach separates global object construction into two phases:
the constructor, where references are stored, and an ``Init()`` function, which
is called after all static constructors have run. This approach works
correctly, even when the objects are allocated globally:

.. code::

  #include <array>
  #include <cstddef>
  #include <span>

  #include "pw_metric/metric.h"

  class Uart {
   public:
    // Note that metrics is not passed in here at all.
    Uart(std::span<std::byte> rx_buffer,
         std::span<std::byte> tx_buffer)
        : rx_buffer_(rx_buffer),
          tx_buffer_(tx_buffer) {}

    // Precondition: parent_metrics is already constructed.
    void Init(pw::metric::Group& parent_metrics) {
      parent_metrics.Add(tx_bytes_);
      parent_metrics.Add(rx_bytes_);
    }

    // Send/receive here which increment tx/rx_bytes.

   private:
    std::span<std::byte> rx_buffer_;
    std::span<std::byte> tx_buffer_;

    PW_METRIC(tx_bytes_, "tx_bytes", 0u);
    PW_METRIC(rx_bytes_, "rx_bytes", 0u);
  };

  PW_METRIC_GROUP(root_metrics, "/");
  PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

  std::array<std::byte, 512> uart_rx_buffer;
  std::array<std::byte, 512> uart_tx_buffer;
  Uart uart1(uart_rx_buffer,
             uart_tx_buffer);

  int main() {
    // uart1_metrics is guaranteed to be initialized by this point, so it is
    // safe to pass it to Init().
    uart1.Init(uart1_metrics);
  }

.. attention::

  Be extra careful about **static global metric registration**. Consider using
  the ``Init()`` pattern.

Metric member order matters in objects
--------------------------------------
The order of declaring in-class groups and metrics matters if the metrics are
within a group declared inside the class. For example, the following class will
work fine:

.. code::

  #include "pw_metric/metric.h"

  class PowerSubsystem {
   public:
    pw::metric::Group& metrics() { return metrics_; }
    const pw::metric::Group& metrics() const { return metrics_; }

   private:
    PW_METRIC_GROUP(metrics_, "power");  // Note metrics_ declared first.
    PW_METRIC(metrics_, foo, "foo", 0.2f);
    PW_METRIC(metrics_, bar, "bar", 44000u);
  };

but the following one will not, since the group is declared after the metrics
that refer to it (and will result in a compile error):

.. code::

  #include "pw_metric/metric.h"

  class PowerSubsystem {
   public:
    pw::metric::Group& metrics() { return metrics_; }
    const pw::metric::Group& metrics() const { return metrics_; }

   private:
    PW_METRIC(metrics_, foo, "foo", 0.2f);
    PW_METRIC(metrics_, bar, "bar", 44000u);
    PW_METRIC_GROUP(metrics_, "power");  // Error: metrics_ must be first.
  };

.. attention::

  Put **groups before metrics** when declaring metrics members inside classes.

Thread safety
-------------
``pw_metric`` has **no built-in synchronization for manipulating the tree**
structure. Users are expected to either rely on a shared global mutex when
constructing the metric tree, or do the metric construction in a single thread
(e.g. a boot/init thread). The same applies for destruction, though we do not
advise destructing metrics or groups.

Individual metrics have atomic ``Increment()`` and ``Set()`` operations, as
well as the value accessors ``as_float()`` and ``as_int()``. None of these
require separate synchronization, and all can be used from ISRs.

.. attention::

  **You must synchronize construction of the metric tree**. ``pw_metric`` does
  not internally synchronize access during construction. ``Set()`` and
  ``Increment()`` on individual metrics are safe.

Lifecycle
---------
Metric objects are not designed to be destructed, and are expected to live for
the lifetime of the program or application. If you need dynamic
creation/destruction of metrics, ``pw_metric`` does not attempt to cover that
use case. Instead, ``pw_metric`` covers the case of products with two execution
phases:

1. A boot phase where the metric tree is created.
2. A run phase where metrics are collected. The tree structure is fixed.

Technically, it is possible to destruct metrics provided care is taken to
remove the given metric (or group) from the list it's contained in. However,
there are no helper functions for this, so be careful.

Below is an example that **is incorrect**. Don't do what follows!

.. code::

  #include "pw_metric/metric.h"

  int main() {
    PW_METRIC_GROUP(root, "/");
    {
      // BAD! The metrics have a different lifetime than the group.
      PW_METRIC(root, temperature, "temperature_f", 72.3f);
      PW_METRIC(root, humidity, "humidity_relative_percent", 33.2f);
    }
    // OOPS! root now has a linked list that points to the destructed
    // "humidity" object.
  }

.. attention::

  **Don't destruct metrics**. Metrics are designed to be registered /
  structured upfront, then manipulated during a device's active phase. They do
  not support destruction.

-----------------
Exporting metrics
-----------------
Collecting metrics on a device is not useful without a mechanism to export
those metrics for analysis and debugging. ``pw_metric`` offers an optional RPC
service library (``:metric_service_nanopb``) that enables exporting a
user-supplied set of on-device metrics via RPC. This facility is intended to
function from the early stages of device bringup through production in the
field.

The metrics are fetched by calling the ``MetricService.Get`` RPC method, which
streams all registered metrics to the caller in batches (server streaming RPC).
Batching the returned metrics avoids requiring a large buffer or large RPC MTU.

The returned metric objects have flattened paths to the root. For example, the
returned metrics (post detokenization and jsonified) might look something like:

.. code:: none

  {
    "/i2c1/failed_txns": 17,
    "/i2c1/total_txns": 2013,
    "/i2c1/gyro/resets": 24,
    "/i2c1/gyro/hangs": 1,
    "/spi1/thermocouple/reads": 242,
    "/spi1/thermocouple/temp_celsius": 34.52,
  }

Note that there is no nesting of the groups; the nesting is implied from the
path.
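
As a rough illustration of how a nested tree maps onto flattened paths, the
following self-contained sketch walks a small tree depth-first and emits one
path per leaf metric. It does not use ``pw_metric`` types or reflect how the
RPC service is implemented; ``SketchMetric``, ``SketchGroup``, ``Flatten``, and
``FlattenExample`` are hypothetical names, and names are plain strings rather
than tokens to keep the sketch short.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory tree; NOT pw_metric's Group/Metric types.
struct SketchMetric {
  std::string name;
  double value;
};

struct SketchGroup {
  std::string name;
  std::vector<SketchMetric> metrics;   // Leaf metrics of this group.
  std::vector<SketchGroup> children;   // Nested groups.
};

using FlatMetric = std::pair<std::string, double>;

// Depth-first walk: a group's own metrics are emitted first, then each child
// group is visited with the extended path prefix.
void Flatten(const SketchGroup& group,
             const std::string& prefix,
             std::vector<FlatMetric>& out) {
  const std::string path = prefix + "/" + group.name;
  for (const SketchMetric& metric : group.metrics) {
    out.emplace_back(path + "/" + metric.name, metric.value);
  }
  for (const SketchGroup& child : group.children) {
    Flatten(child, path, out);
  }
}

// Builds the i2c1 portion of the sample output above and flattens it.
std::vector<FlatMetric> FlattenExample() {
  SketchGroup gyro{"gyro", {{"resets", 24.0}, {"hangs", 1.0}}, {}};
  SketchGroup i2c1{
      "i2c1", {{"failed_txns", 17.0}, {"total_txns", 2013.0}}, {gyro}};
  std::vector<FlatMetric> out;
  Flatten(i2c1, "", out);
  return out;
}
```

Because every entry carries its full path, a host-side tool can rebuild the
nesting by splitting on ``/`` without the device ever sending the tree
structure itself.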

RPC service setup
-----------------
To expose a ``MetricService`` in your application, do the following:

1. Define metrics around the system, and put them in a group or list of
   metrics. Easy choices include the ``global_groups`` and ``global_metrics``
   variables; or create your own.
2. Create an instance of ``pw::metric::MetricService``.
3. Register the service with your RPC server.

For example:

.. code::

  #include "pw_rpc/server.h"
  #include "pw_metric/metric.h"
  #include "pw_metric/global.h"
  #include "pw_metric/metric_service_nanopb.h"

  // Note: You must customize the RPC server setup; see pw_rpc.
  pw::rpc::Channel channels[] = {
    pw::rpc::Channel::Create<1>(&uart_output),
  };
  pw::rpc::Server server(channels);

  // Metric service instance, pointing to the global metric objects.
  // This could also point to custom per-product or application objects.
  pw::metric::MetricService metric_service(
      pw::metric::global_metrics,
      pw::metric::global_groups);

  void RegisterServices() {
    server.RegisterService(metric_service);
    // Register other services here.
  }

  int main() {
    // ... system initialization ...

    RegisterServices();

    // ... start your application ...
  }

.. attention::

  Take care when exporting metrics. Ensure **appropriate access control** is in
  place. In some cases it may make sense to entirely disable metrics export for
  production builds. Although reading metrics via RPC won't influence the
  device, in some cases the metrics could expose sensitive information if
  product owners are not careful.

.. attention::

  **MetricService::Get is a synchronous RPC method**

  Calls to ``MetricService::Get`` are blocking and will send all metrics
  immediately, even though it is a server-streaming RPC. This will work fine if
  the device doesn't have too many metrics, or doesn't have concurrent RPCs
  like logging, but could be a problem in some cases.

  We plan to offer an async version where the application is responsible for
  pumping the metrics into the streaming response. This gives flow control to
  the application.

-----------
Size report
-----------
The size report below shows the cost in code and memory for a few examples of
metrics. This does not include the RPC service.

.. include:: metric_size_report

.. attention::

  At time of writing, **the above sizes show an unexpectedly large flash
  impact**. We are investigating why GCC is inserting large global static
  constructors per group, when all the logic should be reused across objects.

----------------
Design tradeoffs
----------------
There are many possible approaches to metrics collection and aggregation. We've
chosen some points on the tradeoff curve:

- **Atomic-sized metrics** - Using simple metric objects with just uint32/float
  enables atomic operations. While it might be nice to support larger types, it
  is more useful to have safe metric increments from interrupt service
  routines.

- **No aggregate metrics (yet)** - Aggregate metrics (e.g. average, max, min,
  histograms) are not supported, and must be built on top of the simple base
  metrics. By taking this route, we can considerably simplify the core metrics
  system and have aggregation logic in separate modules. Those modules can then
  feed into the metrics system, for example by creating multiple metrics for a
  single underlying metric: "foo", "foo_max", "foo_min" and so on.

  The other problem with automatic aggregation is that the period over which
  the aggregation happens is often important, and it can be hard to design
  this cleanly into the API. Instead, this responsibility is pushed to the user
  who must take more care.

  Note that we will add helpers for aggregated metrics.

- **No virtual metrics** - An alternate approach to the concrete Metric class
  in the current module is to have a virtual interface for metrics, and then
  allow those metrics to have their own storage. This is attractive but can
  lead to many vtables and excess memory use in simple one-metric use cases.

- **Linked list registration** - Using linked lists for registration is a
  tradeoff, accepting some memory overhead in exchange for flexibility. Other
  alternatives include a global table of metrics, which has the disadvantage of
  requiring centralizing the metrics, an impossibility for middleware like
  Pigweed.

- **Synchronization** - The only synchronization guarantee provided by
  ``pw_metric`` is that increment and set are atomic. Other than that, users
  are on their own to synchronize metric collection and updating.

- **No fast metric lookup** - The current design does not make it fast to
  look up a metric at runtime; instead, one must run a linear search of the
  tree to find the matching metric. In most non-dynamic use cases, this is fine
  in practice, and saves having a more involved hash table. Metric updates will
  be through direct member or variable accesses.

- **Relying on C++ static initialization** - In short, the convenience
  outweighs the cost and risk. Without static initializers, it would be
  impossible to automatically collect the metrics without post-processing the
  C++ code to find the metrics; a huge and debatably worthwhile approach. We
  have carefully analyzed the static initializer behaviour of Pigweed's
  IntrusiveList and are confident it is correct.

- **Both local & global support** - Potentially just one approach (the local or
  global one) could be offered, making the module less complex. However, we
  feel the additional complexity is worthwhile since there are legitimate use
  cases for both ``PW_METRIC`` and ``PW_METRIC_GLOBAL``. We'd prefer to
  have a well-tested upstream solution for these use cases rather than have
  customers re-implement one of these.

----------------
Roadmap & Status
----------------
- **String metric names** - ``pw_metric`` stores metric names as tokens. On one
  hand, this is great for production where having a compact binary is often a
  requirement to fit the application in the given part. However, in early
  development before flash is a constraint, string names are more convenient to
  work with since there is no need for host-side detokenization. We plan to add
  optional support for string names.

- **Aggregate metrics** - We plan to add support for aggregate metrics on top
  of the simple metric mechanism, either as another module or as additional
  functionality inside this one. Likely examples include min/max.

- **Selectively enable or disable metrics** - Currently the metrics are always
  enabled once included. In practice this is not ideal since many times only a
  few metrics are wanted in production, but having to strip all the metrics
  code is error prone. Instead, we will add support for controlling what
  metrics are enabled or disabled at compile time. This may rely on C++20's
  support for zero-sized members to fully remove the cost.

- **Async RPC** - The current RPC service exports the metrics by streaming
  them to the client in batches. However, the current solution streams all the
  metrics to completion; this may block the RPC thread. In the future we will
  have an async solution where the user is in control of flow priority.

- **Timer integration** - We would like to add a stopwatch type mechanism to
  time multiple in-flight events.

- **C support** - In practice it's often useful or necessary to instrument
  C-only code. While it will be impossible to support the global registration
  system that the C++ version supports, we will figure out a solution to make
  instrumenting C code relatively smooth.

- **Global counter** - We may add a global metric counter to help detect cases
  where post-initialization metrics manipulations are done.

- **Proto structure** - It may be possible to directly map metrics to a custom
  proto structure, where instead of a name or token field, a tag field is
  provided. This could result in elegant export to an easily machine parsable
  and compact representation on the host. We may investigate this in the
  future.

- **Safer data structures** - At a cost of 4B per metric and 4B per group, it
  may be possible to make metric structure instantiation safe even in static
  constructors, and also make it safe to remove metrics dynamically. We will
  consider whether this tradeoff is the right one, since a 4B cost per metric
  is substantial on projects with many metrics.