CppBor: A Modern C++ CBOR Parser and Generator
==============================================

CppBor provides a natural and easy-to-use syntax for constructing and
parsing CBOR messages.  It does not (yet) support all features of
CBOR, nor (yet) support validation against CDDL schemata, though both
are planned.  CBOR features that aren't supported include:

* Indefinite length values
* Semantic tagging
* Floating point

CppBor requires C++17.

## CBOR representation

CppBor represents CBOR data items as instances of the `Item` class or,
more precisely, as instances of subclasses of `Item`, since `Item` is a
pure interface.  The subclasses of `Item` correspond almost one-to-one
with CBOR major types, and are named to match the CDDL names to which
they correspond.  They are:

* `Uint` corresponds to major type 0, and can hold unsigned integers
  up through (2^64 - 1).
* `Nint` corresponds to major type 1.  It can only hold values from -1
  to -(2^63 - 1), since its internal representation is an int64_t.
  This can be fixed, but it seems unlikely that applications will need
  the omitted range from -(2^63) to -(2^64), since those values are
  inconvenient to represent in many programming languages.
* `Int` is an abstract base of `Uint` and `Nint` that facilitates
  working with all signed integers representable with int64_t.
* `Bstr` corresponds to major type 2, a byte string.
* `Tstr` corresponds to major type 3, a text string.
* `Array` corresponds to major type 4, an Array.  It holds a
  variable-length array of `Item`s.
* `Map` corresponds to major type 5, a Map.  It holds a
  variable-length array of pairs of `Item`s.
* `Simple` corresponds to major type 7.  It's an abstract class since
  items require a more specific type.
* `Bool` is the only currently-implemented subclass of `Simple`.

Note that major type 6, semantic tag, is not yet implemented.

In practice, users of CppBor will rarely use most of these classes
when generating CBOR encodings.  This is because CppBor provides
straightforward conversions from the obvious normal C++ types.
Specifically, the following conversions are provided in appropriate
contexts:

* Signed and unsigned integers convert to `Uint` or `Nint`, as
  appropriate.
* `std::string`, `std::string_view`, `const char*` and
  `std::pair<char iterator, char iterator>` convert to `Tstr`.
* `std::vector<uint8_t>`, `std::pair<uint8_t iterator, uint8_t
  iterator>` and `std::pair<uint8_t*, size_t>` convert to `Bstr`.
* `bool` converts to `Bool`.

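To make the `Uint`/`Nint` split concrete, here is a minimal sketch of how a
signed value maps to a CBOR major type and header argument under the rules of
RFC 8949.  `IntEncoding`, `classifyInt` and `exampleClassifyInt` are
hypothetical names used only for illustration; this is not CppBor's
conversion code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type; not part of CppBor.
struct IntEncoding {
    uint8_t majorType;  // 0 (unsigned) or 1 (negative)
    uint64_t argument;  // value carried in the CBOR header
};

// Maps a signed value to (major type, argument) as RFC 8949 defines it:
// non-negative n -> (0, n); negative n -> (1, -1 - n).
inline IntEncoding classifyInt(int64_t n) {
    if (n >= 0) return {0, static_cast<uint64_t>(n)};
    // -(n + 1) equals -1 - n and cannot overflow for any negative int64_t.
    return {1, static_cast<uint64_t>(-(n + 1))};
}

// Small self-check showing the mapping for a few values.
inline void exampleClassifyInt() {
    assert(classifyInt(0).majorType == 0);   // zero is unsigned
    assert(classifyInt(99).argument == 99);
    assert(classifyInt(-1).majorType == 1);
    assert(classifyInt(-1).argument == 0);   // -1 is the first negative value
    assert(classifyInt(-100).argument == 99);
}
```

Note that zero lands in major type 0, so the split is between non-negative
and negative values.
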
## CBOR generation

### Complete tree generation

The set of `encode` methods in `Item` provides the interface for
producing encoded CBOR.  The basic process for "complete tree"
generation (as opposed to "incremental" generation, which is discussed
below) is to construct an `Item` which models the data to be encoded,
and then call one of the `encode` methods, whichever is convenient for
the encoding destination.  A trivial example:

```
cppbor::Uint val(0);
std::vector<uint8_t> encoding = val.encode();
```

It's relatively rare that single values are encoded as above.  More
often, the "root" data item will be an `Array` or `Map` which contains
a more complex structure.  For example:

```
using cppbor::Map;
using cppbor::Array;

std::vector<uint8_t> vec = /* ... */;
Map val("key1", Array(Map("key_a", 99, "key_b", vec), "foo"), "key2", true);
std::vector<uint8_t> encoding = val.encode();
```

This creates a map with two entries, with `Tstr` keys "key1" and
"key2", respectively.  The "key1" entry has as its value an
`Array` containing a `Map` and a `Tstr`.  The "key2" entry has a
`Bool` value.

This example demonstrates how automatic conversion of C++ types to
CppBor `Item` subclass instances is done.  Where the caller provides a
C++ or C string, a `Tstr` entry is added.  Where the caller provides
an integer literal or variable, a `Uint` or `Nint` is added, depending
on whether the value is non-negative or negative.

As an alternative, a more fluent-style API is provided for building up
structures.  For example:

```
using cppbor::Map;
using cppbor::Array;

std::vector<uint8_t> vec = /* ... */;
Map val;  // not `Map val();`, which would declare a function
val.add("key1", Array().add(Map().add("key_a", 99).add("key_b", vec)).add("foo")).add("key2", true);
std::vector<uint8_t> encoding = val.encode();
```

An advantage of this interface over the constructor-based creation
approach above is that it need not be done all at once.  The `add`
methods return a reference to the object being added to, which allows
calls to be chained, but chaining is not necessary; calls can be made
sequentially, as the data to add becomes available.

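The difference between chained and sequential use comes down to `add`
returning a reference to its own container.  The following sketch shows that
pattern in isolation; `ToyArray` and `exampleFluentAdd` are hypothetical
stand-ins, not CppBor's `Array` or `Map`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy container; stands in for a CppBor Array or Map.
struct ToyArray {
    std::vector<std::string> entries;

    // Returns a reference to the container being added to, so calls can chain.
    ToyArray& add(std::string entry) {
        entries.push_back(std::move(entry));
        return *this;
    }
};

inline void exampleFluentAdd() {
    // Chained: everything is known up front.
    ToyArray chained;
    chained.add("foo").add("bar");

    // Sequential: entries are added as they become available.
    ToyArray sequential;
    sequential.add("foo");
    sequential.add("bar");

    // Both styles produce the same contents.
    assert(chained.entries == sequential.entries);
}
```
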
#### `encode` methods

There are several variations of `Item::encode`, all of which
accomplish the same task but output the encoded data in different
ways, and with somewhat different performance characteristics.  The
provided options are:

* `bool encode(uint8_t** pos, const uint8_t* end)` encodes into the
  buffer referenced by the range [`*pos`, `end`).  `*pos` is advanced.
  If the encoding runs out of buffer space before finishing, the method
  returns false.  This is the most efficient way to encode, since it
  writes into an already-allocated buffer.
* `void encode(EncodeCallback encodeCallback)` calls `encodeCallback`
  for each encoded byte.  It's the responsibility of the implementor
  of the callback to behave safely in the event that the output buffer
  (if applicable) is exhausted.  This is less efficient than the prior
  method because it imposes an additional function call for each byte.
* `template </*...*/> void encode(OutputIterator i)`
  encodes into the provided iterator.  SFINAE ensures that the
  template doesn't match for non-iterators.  The implementation
  actually uses the callback-based method, plus has whatever overhead
  the iterator adds.
* `std::vector<uint8_t> encode()` creates a new `std::vector`, reserves
  sufficient capacity to hold the encoding, and inserts the encoded
  bytes with a `std::back_insert_iterator` and the previous method.
* `std::string toString()` does the same as the previous method, but
  returns a string instead of a vector.

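How the buffer, callback and iterator variants relate to one another can be
sketched with a standalone toy.  `ToyItem` and its member names are
hypothetical, and the members are deliberately given distinct names instead
of overloading `encode`, which sidesteps the SFINAE needed in the real
interface.  The up-front space check in the buffer variant is an assumption
of this sketch, not a statement about how CppBor behaves on partial writes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical toy type holding a pre-computed encoding; not a CppBor class.
struct ToyItem {
    std::vector<uint8_t> bytes;

    // Buffer style: writes into [*pos, end) and advances *pos.
    // This toy checks the space up front and writes nothing on failure.
    bool encodeToBuffer(uint8_t** pos, const uint8_t* end) const {
        if (static_cast<std::size_t>(end - *pos) < bytes.size()) return false;
        for (uint8_t b : bytes) *(*pos)++ = b;
        return true;
    }

    // Callback style: one call per encoded byte.
    void encodeToCallback(const std::function<void(uint8_t)>& callback) const {
        for (uint8_t b : bytes) callback(b);
    }

    // Iterator style, layered on the callback style.
    template <typename OutputIt>
    void encodeToIterator(OutputIt out) const {
        encodeToCallback([&out](uint8_t b) { *out++ = b; });
    }
};

inline void exampleOutputStyles() {
    const ToyItem item{{0xA1, 0x61, 0x61, 0xF5}};  // encoding of {"a": true}

    uint8_t buf[8];
    uint8_t* pos = buf;
    const bool fits = item.encodeToBuffer(&pos, buf + sizeof(buf));
    assert(fits && pos - buf == 4);  // pos moved past the four written bytes
    (void)fits;

    uint8_t small[2];
    pos = small;
    const bool tooSmall = !item.encodeToBuffer(&pos, small + sizeof(small));
    assert(tooSmall);  // a two-byte buffer cannot hold the encoding
    (void)tooSmall;

    std::vector<uint8_t> viaIterator;
    item.encodeToIterator(std::back_inserter(viaIterator));
    assert(viaIterator == item.bytes);
}
```
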
### Incremental generation

Incremental generation requires deeper understanding of CBOR, because
the library can't do as much to ensure that the output is valid.  The
basic tool for incremental generation is the `encodeHeader`
function.  There are two variations, one which writes into a buffer,
and one which uses a callback.  Both simply write out the bytes of a
header.  To construct the same map as in the above examples,
incrementally, one might write:

```
using namespace cppbor;  // For example brevity

// `vec` is the byte vector from the examples above.
std::vector<uint8_t> encoding;
auto iter = std::back_inserter(encoding);
encodeHeader(MAP, 2 /* # of map entries */, iter);
std::string s = "key1";
encodeHeader(TSTR, s.size(), iter);
std::copy(s.begin(), s.end(), iter);
encodeHeader(ARRAY, 2 /* # of array entries */, iter);
Map().add("key_a", 99).add("key_b", vec).encode(iter);
s = "foo";
encodeHeader(TSTR, s.size(), iter);
std::copy(s.begin(), s.end(), iter);
s = "key2";
encodeHeader(TSTR, s.size(), iter);
std::copy(s.begin(), s.end(), iter);
encodeHeader(SIMPLE, TRUE, iter);
```

As the above example demonstrates, the styles can be mixed: note the
creation and encoding of the inner Map using the fluent style.

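For readers unfamiliar with what "the bytes of a header" are, the following
sketch writes CBOR headers by hand according to RFC 8949.  `appendHeader`,
`exampleIncremental` and the `k...` constants are hypothetical names for this
illustration; the code is not CppBor's `encodeHeader` and makes no claim
about its signature.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Major type numbers from RFC 8949; the constant names are hypothetical.
constexpr uint8_t kTstr = 3;
constexpr uint8_t kMap = 5;
constexpr uint8_t kSimple = 7;
constexpr uint64_t kTrue = 21;  // simple value for `true`

// Hypothetical helper.  Appends a CBOR header: major type in the top 3 bits,
// then the argument either in the low 5 bits (if < 24) or in 1, 2, 4 or 8
// following big-endian bytes.
inline void appendHeader(std::vector<uint8_t>& out, uint8_t majorType, uint64_t arg) {
    assert(majorType < 8);  // only 3 bits are available for the major type
    const uint8_t top = static_cast<uint8_t>(majorType << 5);
    if (arg < 24) {
        out.push_back(static_cast<uint8_t>(top | arg));
        return;
    }
    int extraBytes;
    if (arg <= 0xFF) extraBytes = 1;
    else if (arg <= 0xFFFF) extraBytes = 2;
    else if (arg <= 0xFFFFFFFF) extraBytes = 4;
    else extraBytes = 8;
    // Additional info 24, 25, 26, 27 selects 1, 2, 4, 8 argument bytes.
    const uint8_t info = extraBytes == 1 ? 24 : extraBytes == 2 ? 25 : extraBytes == 4 ? 26 : 27;
    out.push_back(static_cast<uint8_t>(top | info));
    for (int shift = 8 * (extraBytes - 1); shift >= 0; shift -= 8) {
        out.push_back(static_cast<uint8_t>(arg >> shift));
    }
}

// Builds the encoding of the one-entry map {"key2": true} header by header.
inline std::vector<uint8_t> exampleIncremental() {
    std::vector<uint8_t> out;
    appendHeader(out, kMap, 1);            // map with one entry
    const std::string key = "key2";
    appendHeader(out, kTstr, key.size());  // text string header, length 4
    out.insert(out.end(), key.begin(), key.end());
    appendHeader(out, kSimple, kTrue);     // simple value `true`
    return out;
}
```

`exampleIncremental()` yields the seven bytes `A1 64 6B 65 79 32 F5`.  Nothing
in this style checks that the declared entry count matches what is actually
written, which is why incremental generation demands more care.
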
## Parsing

CppBor also supports parsing of encoded CBOR data, with the same
feature set as encoding.  There are two basic approaches to parsing,
"full" and "stream".

### Full parsing

Full parsing means completely parsing a (possibly-compound) data
item from a byte buffer.  The `parse` functions that do not take a
`ParseClient` pointer do this.  They return a `ParseResult` which is a
tuple of three values:

* `std::unique_ptr<Item>` that points to the parsed item, or is nullptr
  if there was a parse error.
* `const uint8_t*` that points to the byte after the end of the decoded
  item, or to the first unparseable byte in the event of an error.
* `std::string` that is empty on success or contains an error message if
  a parse error occurred.

Assuming a successful parse, you can then use `Item::type()` to
discover the type of the parsed item (e.g. MAP), and then use the
appropriate `Item::as*()` method (e.g. `Item::asMap()`) to get a
pointer to an interface which allows you to retrieve specific values.

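The three-value result described above pairs naturally with C++17 structured
bindings.  The sketch below imitates that shape with a toy parser that
understands only unsigned integers; `ToyUint`, `ToyParseResult`, `toyParse`
and `exampleFullParse` are hypothetical stand-ins, and CppBor's real `Item`
and `parse` are richer than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for the parsed item and the result tuple.
struct ToyUint {
    uint64_t value;
};
using ToyParseResult = std::tuple<std::unique_ptr<ToyUint>, const uint8_t*, std::string>;

// Toy parser that understands only major type 0 (unsigned integers).
// On error it returns a null item, the offending position and a message.
inline ToyParseResult toyParse(const uint8_t* begin, const uint8_t* end) {
    if (begin == end) return ToyParseResult(nullptr, begin, "empty input");
    const uint8_t major = static_cast<uint8_t>(*begin >> 5);
    const uint8_t info = *begin & 0x1F;
    if (major != 0) return ToyParseResult(nullptr, begin, "unsupported major type");
    uint64_t value = info;  // arguments below 24 live in the initial byte
    std::size_t argBytes = 0;
    if (info >= 24) {
        if (info > 27) return ToyParseResult(nullptr, begin, "unsupported additional info");
        argBytes = std::size_t{1} << (info - 24);  // 1, 2, 4 or 8 bytes
        if (static_cast<std::size_t>(end - begin) - 1 < argBytes) {
            return ToyParseResult(nullptr, begin, "truncated input");
        }
        value = 0;
        for (std::size_t i = 0; i < argBytes; ++i) value = (value << 8) | begin[1 + i];
    }
    return ToyParseResult(std::make_unique<ToyUint>(ToyUint{value}), begin + 1 + argBytes, "");
}

inline void exampleFullParse() {
    const std::vector<uint8_t> encoded = {0x19, 0x01, 0xF4};  // unsigned 500
    auto [item, next, error] = toyParse(encoded.data(), encoded.data() + encoded.size());
    assert(item && error.empty());                    // success: item set, no message
    assert(item->value == 500);
    assert(next == encoded.data() + encoded.size());  // points just past the item
}
```

Checking the item pointer first and consulting the error string only when it
is null mirrors the contract of the tuple described in the list above.
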
### Stream parsing

Stream parsing is more complex, but more flexible.  To use
stream parsing, you must create your own subclass of `ParseClient` and
call one of the `parse` functions that accepts it.  See the
`ParseClient` method docstrings for details.

One unusual feature of stream parsing is that the `ParseClient`
callback methods not only provide the parsed Item, but also pointers
to the portion of the buffer that encodes that Item.  This is useful
if, for example, you want to find an element inside of a structure,
and then copy the encoding of that sub-structure, without bothering to
parse the rest.

The full parser is implemented with the stream parser.

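The idea of a client callback receiving pointers into the encoded buffer, as
described for stream parsing, can be illustrated with a toy.  Everything
named here (`ToyClient`, `toyStreamParse`, `FirstTstrCopier`,
`exampleStreamCopy`) is hypothetical; the sketch handles only a flat sequence
of short unsigned integers and strings, and its callback passes a major type
number instead of a parsed `Item`.  Consult the `ParseClient` docstrings for
the real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical client interface; not CppBor's ParseClient.
struct ToyClient {
    virtual ~ToyClient() = default;
    // Called once per item with the range [begin, end) of its encoding.
    virtual void item(uint8_t majorType, const uint8_t* begin, const uint8_t* end) = 0;
};

// Walks a flat sequence of uint, bstr and tstr items whose argument fits in
// the initial byte (< 24).  Returns false on anything else or on truncation.
inline bool toyStreamParse(const uint8_t* pos, const uint8_t* end, ToyClient& client) {
    while (pos != end) {
        const uint8_t major = static_cast<uint8_t>(*pos >> 5);
        const uint8_t arg = *pos & 0x1F;
        if (arg >= 24) return false;
        std::size_t len = 1;  // the header byte itself
        if (major == 2 || major == 3) {
            len += arg;       // string payload follows the header
        } else if (major != 0) {
            return false;
        }
        if (static_cast<std::size_t>(end - pos) < len) return false;
        client.item(major, pos, pos + len);
        pos += len;
    }
    return true;
}

// Copies the raw encoding of the first text string and ignores everything else.
struct FirstTstrCopier : ToyClient {
    std::vector<uint8_t> copied;
    void item(uint8_t majorType, const uint8_t* begin, const uint8_t* end) override {
        if (majorType == 3 && copied.empty()) copied.assign(begin, end);
    }
};

inline void exampleStreamCopy() {
    // Sequence: unsigned 5, text string "foo", unsigned 1.
    const std::vector<uint8_t> input = {0x05, 0x63, 'f', 'o', 'o', 0x01};
    FirstTstrCopier client;
    const bool ok = toyStreamParse(input.data(), input.data() + input.size(), client);
    assert(ok);
    (void)ok;
    const std::vector<uint8_t> expected = {0x63, 'f', 'o', 'o'};
    assert(client.copied == expected);  // header byte plus payload, copied verbatim
}
```
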