# Structured Data Plugins

This document describes an infrastructural feature called Structured Data
plugins.  See the DarwinLog documentation for a description of one such plugin
that makes use of this feature.

StructuredDataPlugin instances have the following characteristics:

* Each plugin instance is bound to a single Process instance.

* Each StructuredData feature has a type name that identifies the
  feature. For instance, the type name for the DarwinLog feature is
  "DarwinLog". This feature type name is used in various places.

* The process monitor advertises the list of StructuredData features
  that it supports. Process goes through the list of supported feature
  type names and asks each known StructuredDataPlugin whether it can
  handle the feature. The first plugin that supports the feature is
  mapped to that Process instance for that feature.  Plugins are only
  mapped when the process monitor advertises that a feature is
  supported.

* The feature may send asynchronous messages in StructuredData format
  to the Process instance. Process instances route the asynchronous
  structured data messages to the plugin mapped to that feature type,
  if one exists.

* Plugins can request that the Process instance forward configuration
  data on to the process monitor when the plugin needs to configure
  the feature. Plugins may call the new Process method

  ```C++
  virtual Error
  ConfigureStructuredData(ConstString type_name,
                          const StructuredData::ObjectSP &config_sp)
  ```

  where `type_name` is the feature name and `config_sp` points to the
  configuration structured data, which may be nullptr.

* Plugins for features present in a process are notified when modules
  are loaded into the Process instance via this StructuredDataPlugin
  method:

  ```C++
  virtual void
  ModulesDidLoad(Process &process, ModuleList &module_list);
  ```

* Plugins may optionally broadcast their received structured data as
  an LLDB process-level event via the following new Process call:

  ```C++
  void
  BroadcastStructuredData(const StructuredData::ObjectSP &object_sp,
                          const lldb::StructuredDataPluginSP &plugin_sp);
  ```

  IDE clients might use this feature to receive information about the
  process as it is running to monitor memory usage, CPU usage, and
  logging.

  Internally, the event type created is an instance of
  EventDataStructuredData.

* In the case where a plugin chooses to broadcast a received
  StructuredData event, the command-line LLDB Debugger instance
  listens for such events. The Debugger instance then gives the plugin
  an opportunity to display info to either the debugger output or
  error stream at a time when it is safe to write to them. The plugin
  can choose to display something appropriate regarding the structured
  data at that time.

* Plugins can provide a ProcessLaunchInfo filter method when the
  plugin is registered.  If such a filter method is provided, then
  when a process is about to be launched for debugging, the filter
  callback is invoked, given both the launch info and the target.  The
  plugin may then alter the launch info if needed to better support
  the feature of the plugin.

* The plugin is entirely independent of the type of Process-derived
  class that it is working with. The only requirements from the
  process monitor are the following feature-agnostic elements:

  * Provide a way to discover features supported by the process
    monitor for the current process.

  * Specify the list of supported feature type names to Process.
    The process monitor does this by calling the following new
    method on Process:

    ```C++
    void
    MapSupportedStructuredDataPlugins(const StructuredData::Array
                                      &supported_type_names)
    ```

    The `supported_type_names` argument specifies an array of string
    entries, where each entry is the name of a StructuredData feature.

  * Provide a way to forward configuration data for a feature type
    on to the process monitor.  This is the manner by which LLDB can
    configure a feature, perhaps based on settings or commands from
    the user.  The following virtual method on Process (described
    earlier) does the job:

    ```C++
    virtual Error
    ConfigureStructuredData(ConstString type_name,
                            const StructuredData::ObjectSP &config_sp)
    ```

  * Listen for asynchronous structured data packets from the process
    monitor, and forward them on to Process via this new Process
    member method:

    ```C++
    bool
    RouteAsyncStructuredData(const StructuredData::ObjectSP object_sp)
    ```

* StructuredData producers must send their top-level data as a
  Dictionary type, with a key called 'type' specifying a string value,
  where the value is equal to the StructuredData feature/type name
  previously advertised. Everything else about the content of the
  dictionary is entirely up to the feature.

* StructuredDataPlugin commands show up under `plugin structured-data
  plugin-name`.

* StructuredDataPlugin settings show up under
  `plugin.structured-data.{plugin-name}`.

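The mapping and routing contract described above can be sketched as a small, self-contained mock. The `Packet` alias and the class bodies below are simplified stand-ins invented for this example (the real LLDB types are far richer); only the method names mirror the API described in this document:

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a StructuredData Dictionary: string keys and
// values, with the mandatory "type" key naming the feature.
using Packet = std::map<std::string, std::string>;

// Minimal plugin interface: which feature it handles, and what it does
// with packets for that feature.
struct StructuredDataPlugin {
  virtual ~StructuredDataPlugin() = default;
  virtual bool SupportsStructuredDataType(const std::string &type_name) = 0;
  virtual void HandleArrivalOfStructuredData(const Packet &packet) = 0;
};

// Example plugin that claims the "DarwinLog" feature type.
struct DarwinLogPlugin : StructuredDataPlugin {
  std::vector<std::string> messages;
  bool SupportsStructuredDataType(const std::string &type_name) override {
    return type_name == "DarwinLog";
  }
  void HandleArrivalOfStructuredData(const Packet &packet) override {
    messages.push_back(packet.at("message"));
  }
};

class Process {
public:
  explicit Process(std::vector<std::shared_ptr<StructuredDataPlugin>> known)
      : known_plugins_(std::move(known)) {}

  // Called with the feature type names advertised by the process monitor.
  // The first known plugin that supports a feature is mapped to it.
  void MapSupportedStructuredDataPlugins(
      const std::vector<std::string> &supported_type_names) {
    for (const auto &type_name : supported_type_names)
      for (const auto &plugin : known_plugins_)
        if (plugin->SupportsStructuredDataType(type_name)) {
          type_to_plugin_[type_name] = plugin;
          break;
        }
  }

  // Routes an asynchronous packet to the plugin mapped to its "type" key.
  // Returns false for malformed packets or unmapped feature types.
  bool RouteAsyncStructuredData(const Packet &packet) {
    auto type_it = packet.find("type");
    if (type_it == packet.end())
      return false; // producers must send a top-level "type" key
    auto plugin_it = type_to_plugin_.find(type_it->second);
    if (plugin_it == type_to_plugin_.end())
      return false; // feature was never advertised, so never mapped
    plugin_it->second->HandleArrivalOfStructuredData(packet);
    return true;
  }

private:
  std::vector<std::shared_ptr<StructuredDataPlugin>> known_plugins_;
  std::map<std::string, std::shared_ptr<StructuredDataPlugin>> type_to_plugin_;
};
```

Note that a feature advertised by the monitor but unsupported by every known plugin simply stays unmapped, and packets for it are dropped, matching the mapping rule above.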
## StructuredDataDarwinLog feature

The DarwinLog feature supports logging `os_log`*() and `NSLog`() messages
to the command-line LLDB console, as well as making those messages
available to LLDB clients via the event system.  Starting with the
fall 2016 OSes, Apple platforms introduced a new fire-hose, stream-style
logging system where the bulk of the log processing happens on the log
consumer side.  This reduces the logging impact on the system when there
are no consumers, making it cheaper to include logging at all times.
However, it also increases the work needed on the consumer end when
log messages are desired.

The debugserver binary has been modified to support collection of
`os_log`*()/`NSLog`() messages, selection of which messages appear in the
stream, and fine-grained filtering of what gets passed on to the LLDB
client.  DarwinLog also tracks the activity chain (i.e. the
`os_activity`() hierarchy) in effect at the time the log messages were
issued.  The user is able to configure a number of aspects related to
the formatting of the log message header fields.

The DarwinLog support is written in a way that should allow the LLDB
client side to work on non-Apple hosts talking to an Apple device or
macOS system; hence, the plugin support is built into all LLDB
clients, not just those built on an Apple platform.

StructuredDataDarwinLog implements the 'DarwinLog' feature type, and
the plugin name for it shows up as `darwin-log`.

The user interface to the darwin-log support is via the following:

* `plugin structured-data darwin-log enable` command

  This is the main entry point for enabling the feature.  It can be
  run before launching a process or while the process is running.
  If the user wants to squelch info-level and debug-level messages,
  which is the default behavior, then the enable command must be run
  prior to launching the process; otherwise, the info-level and
  debug-level messages will always show up.  Also, there is a similar
  "echo os_log()/NSLog() messages to target process stderr" mechanism
  which is properly disabled when enabling the DarwinLog support prior
  to launch.  This cannot be squelched if DarwinLog is enabled after
  launch.

  See the help for this command.  There are a number of options
  to shrink or expand the number of messages that are processed
  on the remote side and sent over to the client, and other
  options to control the formatting of displayed messages.

  This command is sticky.  Once enabled, it will stay enabled for
  future process launches.

* `plugin structured-data darwin-log disable` command

  Executing this command disables `os_log`() capture in the currently
  running process and signals LLDB to stop attempting to launch
  new processes with DarwinLog support enabled.

* `settings set
  plugin.structured-data.darwin-log.enable-on-startup true`

  and

  `settings set
  plugin.structured-data.darwin-log.auto-enable-options -- {options}`

  When `enable-on-startup` is set to `true`, LLDB will automatically
  enable DarwinLog on startup of relevant processes.  It will use the
  content provided in the `auto-enable-options` setting as the
  options to pass to the enable command.

  Note the `--` required after `auto-enable-options`.  That is
  necessary for raw commands like `settings set`.  The `--` will not
  become part of the options for the enable command.
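Putting the pieces above together, a hypothetical session might look like the following. The subsystem name is invented for the example; the commands, settings, and `--filter` option are the ones described in this document:

```
(lldb) plugin structured-data darwin-log enable --filter "accept subsystem match com.example.myapp"
(lldb) settings set plugin.structured-data.darwin-log.enable-on-startup true
(lldb) settings set plugin.structured-data.darwin-log.auto-enable-options -- --filter "accept subsystem match com.example.myapp"
```

The first command enables capture for the current session, while the two settings make LLDB re-enable DarwinLog with the same filter for future launches.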

### Message flow and related performance considerations

`os_log`()-style collection is not free.  The more data that must be
processed, the slower it will be.  There are several knobs available
to the developer to limit how much data goes through the pipe, and how
much data ultimately goes over the wire to the LLDB client.  The
user's goal should be to collect only as many log messages as are
needed, but no more.

The flow of data looks like the following:

1. Data comes into debugserver from the low-level OS facility that
   receives log messages.  The data that comes through this pipe can
   be limited or expanded by the `--debug`, `--info` and
   `--all-processes` options of the `plugin structured-data darwin-log
   enable` command.  Exclude as many categories of data as possible
   here (which is also the default).  The knobs here are very coarse:
   for example, whether to include `os_log_info()`-level or
   `os_log_debug()`-level info, or whether to include callstacks in
   the log message event data.

2. The debugserver process filters the messages that arrive through a
   message log filter that may be fully customized by the user.  It
   works similarly to a rules-based packet filter: a set of rules is
   matched against the log message, with each rule tried in sequential
   order.  The first rule that matches then either accepts or rejects
   the message.  If the log message does not match any rule, then the
   message gets the no-match (i.e. fall-through) action.  The no-match
   action defaults to accepting but may be set to reject.

   Filters can be added via the enable command's `--filter
   {filter-spec}` option.  Filters are added in order, and multiple
   `--filter` entries can be provided to the enable command.

   Filters take the following form:

   ```
   {action} {attribute} {op}

   {action} :=
       accept |
       reject

   {attribute} :=
       category       |   // The log message category
       subsystem      |   // The log message subsystem
       activity       |   // The child-most activity in force
                          // at the time the message was logged.
       activity-chain |   // The complete activity chain, specified
                          // as {parent-activity}:{child-activity}:
                          // {grandchild-activity}
       message        |   // The fully expanded message contents.
                          // Note this one is expensive because it
                          // requires expanding the message.  Avoid
                          // this if possible, or add it further
                          // down the filter chain.

   {op} :=
       match {exact-match-text} |
       regex {search-regex}        // uses C++ std::regex
                                   // ECMAScript variant.
   ```

   e.g.

   `--filter "accept subsystem match com.example.mycompany.myproduct"`

   `--filter "accept subsystem regex com.example.+"`

   `--filter "reject category regex spammy-system-[[:digit:]]+"`

3. Messages that are accepted by the log message filter get sent to
   the LLDB client, where they are mapped to the
   StructuredDataDarwinLog plugin.  By default, command-line LLDB will
   issue a Process-level event containing the log message content, and
   will request the plugin to print the message if the plugin is
   enabled to do so.

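The first-match-wins semantics of the filter chain in step 2 can be sketched in a few lines of self-contained C++. The `FilterRule` and `LogMessage` types below are invented for illustration and only cover a subset of the attributes (`category`, `subsystem`, `message`); debugserver's actual implementation differs:

```cpp
#include <regex>
#include <string>
#include <vector>

// Accept or reject, as in the {action} part of a filter spec.
enum class Action { Accept, Reject };

struct LogMessage {
  std::string category;
  std::string subsystem;
  std::string message;
};

struct FilterRule {
  Action action;
  std::string attribute; // "category", "subsystem", or "message"
  bool is_regex;         // false => exact match, true => regex search
  std::string text;

  bool Matches(const LogMessage &msg) const {
    const std::string &value = attribute == "category"    ? msg.category
                               : attribute == "subsystem" ? msg.subsystem
                                                          : msg.message;
    if (is_regex) // std::regex defaults to the ECMAScript grammar
      return std::regex_search(value, std::regex(text));
    return value == text;
  }
};

// Returns true if the message should be forwarded to the client.
// Rules are tried in order; the first matching rule decides, and a
// message matching no rule gets the fall-through (no-match) action.
bool ShouldAccept(const std::vector<FilterRule> &rules, const LogMessage &msg,
                  Action no_match_action = Action::Accept) {
  for (const auto &rule : rules)
    if (rule.Matches(msg))
      return rule.action == Action::Accept;
  return no_match_action == Action::Accept;
}
```

With a chain of `accept subsystem match …` followed by `reject category regex …`, a message matching the first rule is accepted before the reject rule is ever consulted, and everything else falls through to the no-match action.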
### Log message display

Several settings control aspects of displaying log messages in
command-line LLDB.  See the `enable` command's help for a description
of these.