FlatBuffers white paper {#flatbuffers_white_paper}
=======================

This document tries to shed some light on the "why" of FlatBuffers, a
new serialization library.

## Motivation

Back in the good old days, performance was all about instructions and
cycles. Nowadays, processing units have run so far ahead of the memory
subsystem that making an efficient application should start and finish
with thinking about memory: how much of it you use, how you lay it out
and access it, how you allocate it, and when you copy it.

Serialization is a pervasive activity in a lot of programs, and a common
source of memory inefficiency, with lots of temporary data structures
needed to parse and represent data, and inefficient allocation patterns
and locality.

If it were possible to do serialization with no temporary objects,
no additional allocation, no copying, and good locality, this could be
of great value. The reason serialization systems usually don't manage
this is that it runs counter to forwards/backwards compatibility and
to platform specifics like endianness and alignment.

FlatBuffers is what you get if you try anyway.

In particular, FlatBuffers' focus is on mobile hardware (where memory
size and memory bandwidth are even more constrained than on desktop
hardware), and on the applications that have the highest performance
needs: games.

## FlatBuffers

*This is a summary of FlatBuffers functionality, with some rationale.
A more detailed description can be found in the FlatBuffers
documentation.*

### Summary

A FlatBuffer is a binary buffer containing nested objects (structs,
tables, vectors, ...) organized using offsets so that the data can be
traversed in-place just like any pointer-based data structure. Unlike
most in-memory data structures, however, it uses strict rules of
alignment and endianness (always little) to ensure these buffers are
cross platform.
Additionally, for objects that are tables, FlatBuffers
provides forwards/backwards compatibility and general optionality of
fields, to support most forms of format evolution.

You define your object types in a schema, which can then be compiled to
C++ or Java for low to zero overhead reading & writing.
Optionally, JSON data can be dynamically parsed into buffers.

### Tables

Tables are the cornerstone of FlatBuffers, since format evolution is
essential for most applications of serialization. Most serialization
solutions deal with format changes transparently during their parsing
step. But a FlatBuffer isn't parsed before it is accessed.

Tables get around this by using an extra indirection to access fields,
through a *vtable*. Each table comes with a vtable (which may be shared
between multiple tables with the same layout) that records where the
fields for this particular kind of table instance are stored.
The vtable may also indicate that a field is not present (because this
FlatBuffer was written with an older version of the software, or simply
because the information was not necessary for this instance, or deemed
deprecated), in which case a default value is returned.

Tables have a low overhead in memory (since vtables are small and
shared) and in access cost (an extra indirection), but provide great
flexibility. Tables may even cost less memory than the equivalent
struct, since fields do not need to be stored when they are equal to
their default.

FlatBuffers additionally offers "naked" structs, which do not offer
forwards/backwards compatibility, but can be even smaller (useful for
very small objects that are unlikely to change, such as a coordinate
pair or an RGBA color).
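
The vtable indirection described above can be sketched in C++ roughly as
follows. This is a minimal illustration under a simplified, hypothetical
layout, not the actual FlatBuffers wire format or its generated code; the
names `ReadU16`, `ReadU32`, `GetFieldU32` and `MakeDemoBuffer` are
assumptions made for this example only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified layout (NOT the exact FlatBuffers format):
//   vtable: uint16_t field_count, then one uint16_t per field holding the
//           field's offset from the start of the table (0 = not present).
//   table:  uint32_t position of its vtable in the buffer, then field data.
// All scalars are stored little-endian.

// Assemble little-endian values byte by byte, so the result does not
// depend on the endianness of the host.
static uint16_t ReadU16(const uint8_t *p) {
  return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

static uint32_t ReadU32(const uint8_t *p) {
  return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}

// Returns field `index` of the table at `table_pos`, or `def` when the
// vtable has no slot for that field or marks it as absent.
uint32_t GetFieldU32(const std::vector<uint8_t> &buf, std::size_t table_pos,
                     uint16_t index, uint32_t def) {
  const uint8_t *table = buf.data() + table_pos;
  const uint8_t *vtable = buf.data() + ReadU32(table);
  const uint16_t field_count = ReadU16(vtable);
  if (index >= field_count) return def;  // writer used an older schema
  const uint16_t offset = ReadU16(vtable + 2 + 2 * index);
  if (offset == 0) return def;           // field omitted for this instance
  return ReadU32(table + offset);        // read in place, no parsing step
}

// A hand-written buffer: one vtable at position 0, one table at position 8.
std::vector<uint8_t> MakeDemoBuffer() {
  return {
      2,   0,        // vtable: field_count = 2
      4,   0,        // field 0 is stored at table + 4
      0,   0,        // field 1 is not present
      0,   0,        // padding so the table starts at position 8
      0,   0, 0, 0,  // table: its vtable is at position 0
      150, 0, 0, 0,  // field 0 = 150
  };
}
```

With this buffer, `GetFieldU32(buf, 8, 0, 100)` reads the stored value 150,
while field 1 (marked absent) and field 2 (unknown to this vtable) both
fall back to the default passed by the caller. The only per-access cost
over a plain struct is the extra lookup through the vtable.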

### Schemas

While schemas reduce some generality (you can't just read any data
without having its schema), they have a lot of upsides:

- Most information about the format can be factored into the generated
  code, reducing the memory needed to store data and the time needed to
  access it.

- The strong typing of the data definitions means less error
  checking/handling at runtime (less can go wrong).

- A schema enables us to access a buffer without parsing.

FlatBuffer schemas are fairly similar to those of the incumbent,
Protocol Buffers, and generally should be readable to those familiar
with the C family of languages. We chose to improve upon the features
offered by .proto files in the following ways:

- Deprecation of fields instead of manual field id assignment.
  Extending an object in a .proto means hunting for a free slot among
  the numbers (preferring lower numbers since they have a more compact
  representation). Besides being inconvenient, it also makes removing
  fields problematic: you either keep them, which does not make it
  obvious that the field shouldn't be read or written anymore and
  still generates accessors, or you remove them, and then risk that
  old data using the field is still around by the time someone reuses
  that field id, with nasty consequences.

- Differentiating between tables and structs (see above). Effectively
  all table fields are `optional`, and all struct fields are
  `required`.

- Having a native vector type instead of `repeated`. This gives you a
  length without having to collect all items, and in the case of
  scalars provides for a more compact representation, and one that
  guarantees adjacency.

- Having a native `union` type instead of using a series of `optional`
  fields, all of which must be checked individually.

- Being able to define defaults for all scalars, instead of having to
  deal with their optionality at each access.

- A parser that can deal with both schemas and data definitions (JSON
  compatible) uniformly.

<br>
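
To make the point about a native vector type more concrete, here is a
small C++ sketch of a length-prefixed vector of scalars that is read in
place. The layout (a 32-bit count followed directly by 16-bit elements)
and the name `U16VectorView` are assumptions chosen for illustration, not
the FlatBuffers API or its exact binary format.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified vector layout: a little-endian uint32_t element
// count, immediately followed by that many little-endian uint16_t values.
struct U16VectorView {
  const uint8_t *data;  // points at the length prefix inside the buffer

  // The length is stored up front, so no items need to be collected first.
  uint32_t size() const {
    return static_cast<uint32_t>(data[0]) |
           (static_cast<uint32_t>(data[1]) << 8) |
           (static_cast<uint32_t>(data[2]) << 16) |
           (static_cast<uint32_t>(data[3]) << 24);
  }

  // Elements are adjacent, so element i sits at a fixed computed position.
  // No bounds checking is done in this sketch.
  uint16_t at(uint32_t i) const {
    const uint8_t *p = data + 4 + 2 * i;
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
  }
};
```

For example, over the bytes `{3,0,0,0, 10,0, 20,0, 0x2C,0x01}` a view
constructed as `U16VectorView v{bytes.data()}` reports a size of 3 and
yields 300 for `v.at(2)`, without copying or allocating anything.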