# Stream

In RapidJSON, `rapidjson::Stream` is a concept for reading and writing JSON. This section first shows how to use the streams provided by the library, and then how to create a custom stream.

[TOC]

# Memory Streams {#MemoryStreams}

Memory streams store JSON in memory.

## StringStream (Input) {#StringStream}

`StringStream` is the most basic input stream. It represents a complete, read-only JSON stored in memory. It is defined in `rapidjson/rapidjson.h`.

~~~~~~~~~~cpp
#include "rapidjson/document.h" // will include "rapidjson/rapidjson.h"

using namespace rapidjson;

// ...
const char json[] = "[1, 2, 3, 4]";
StringStream s(json);

Document d;
d.ParseStream(s);
~~~~~~~~~~

Since this usage is very common, `Document::Parse(const char*)` is provided to do exactly the same as above:

~~~~~~~~~~cpp
// ...
const char json[] = "[1, 2, 3, 4]";
Document d;
d.Parse(json);
~~~~~~~~~~

Note that `StringStream` is a typedef of `GenericStringStream<UTF8<> >`. Users may use other encodings to represent the character set of the stream.

## StringBuffer (Output) {#StringBuffer}

`StringBuffer` is a simple output stream. It allocates a memory buffer for writing the whole JSON. Use `GetString()` to obtain the buffer.

~~~~~~~~~~cpp
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"   // Writer

StringBuffer buffer;
Writer<StringBuffer> writer(buffer);
d.Accept(writer);

const char* output = buffer.GetString();
~~~~~~~~~~

When the buffer is full, it increases its capacity automatically. The default capacity is 256 characters (256 bytes for UTF8, 512 bytes for UTF16, etc.). Users can provide an allocator and an initial capacity.

~~~~~~~~~~cpp
StringBuffer buffer1(0, 1024); // Use its allocator, initial size = 1024
StringBuffer buffer2(allocator, 1024); // Use a caller-provided allocator (passed as a pointer)
~~~~~~~~~~

By default, `StringBuffer` will instantiate an internal allocator.

Similarly, `StringBuffer` is a typedef of `GenericStringBuffer<UTF8<> >`.

# File Streams {#FileStreams}

When parsing JSON from a file, you may read the whole JSON into memory and use `StringStream` as above.

However, if the JSON is big, or memory is limited, you can use `FileReadStream`. It only reads a part of the JSON from the file into a buffer, and then lets that part be parsed. When it runs out of characters in the buffer, it reads the next part from the file.
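The refill behaviour described above can be sketched with the C standard library alone. This is an illustration, not RapidJSON's `FileReadStream`: the class name `ChunkedFileReader` and its members are hypothetical, and it assumes that a `'\0'` return value marks the end of the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical sketch of chunked reading; not RapidJSON's FileReadStream.
class ChunkedFileReader {
public:
    ChunkedFileReader(std::FILE* fp, char* buffer, std::size_t bufferSize)
        : fp_(fp), buffer_(buffer), bufferSize_(bufferSize),
          current_(buffer), last_(buffer), count_(0), eof_(false) {
        assert(fp_ != 0 && buffer_ != 0 && bufferSize_ > 0);
        Refill();  // load the first part of the file
    }

    // Current character without consuming it; '\0' marks the end of the file.
    char Peek() const { return current_ == last_ ? '\0' : *current_; }

    // Consume the current character; fetch the next part once the buffer is used up.
    char Take() {
        if (current_ == last_) return '\0';
        char c = *current_++;
        ++count_;
        if (current_ == last_) Refill();
        return c;
    }

    // Number of characters consumed so far.
    std::size_t Tell() const { return count_; }

private:
    void Refill() {
        if (eof_) return;
        std::size_t n = std::fread(buffer_, 1, bufferSize_, fp_);
        current_ = buffer_;
        last_ = buffer_ + n;
        if (n < bufferSize_) eof_ = true;  // short read: nothing more to fetch
    }

    std::FILE* fp_;
    char* buffer_;
    std::size_t bufferSize_;
    char* current_;
    char* last_;
    std::size_t count_;
    bool eof_;
};
```

Only `bufferSize` bytes of the file are resident at any time, so the memory cost is set by the buffer rather than by the size of the JSON.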

## FileReadStream (Input) {#FileReadStream}

`FileReadStream` reads the file via a `FILE` pointer. The user needs to provide a buffer.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("big.json", "rb"); // non-Windows use "r"

char readBuffer[65536];
FileReadStream is(fp, readBuffer, sizeof(readBuffer));

Document d;
d.ParseStream(is);

fclose(fp);
~~~~~~~~~~

Unlike string streams, `FileReadStream` is a byte stream. It does not handle encodings. If the file is not UTF-8, the byte stream can be wrapped in an `EncodedInputStream`, which is discussed in the Encoded Streams section below.

Apart from reading files, users can also use `FileReadStream` to read `stdin`.

## FileWriteStream (Output) {#FileWriteStream}

`FileWriteStream` is a buffered output stream. Its usage is very similar to `FileReadStream`.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filewritestream.h"
#include "rapidjson/writer.h"
#include <cstdio>

using namespace rapidjson;

Document d;
d.Parse(json);  // json: a null-terminated JSON string defined elsewhere
// ...

FILE* fp = fopen("output.json", "wb"); // non-Windows use "w"

char writeBuffer[65536];
FileWriteStream os(fp, writeBuffer, sizeof(writeBuffer));

Writer<FileWriteStream> writer(os);
d.Accept(writer);

fclose(fp);
~~~~~~~~~~

It can also direct the output to `stdout`.

# Encoded Streams {#EncodedStreams}

Encoded streams do not contain JSON themselves; they wrap byte streams to provide basic encoding and decoding functions.

As mentioned above, UTF-8 byte streams can be read directly. However, UTF-16 and UTF-32 have endianness issues. To handle endianness correctly, bytes must be converted into characters (e.g. `wchar_t` for UTF-16) while reading, and characters into bytes while writing.
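As a rough illustration of that byte-to-character conversion, the sketch below assembles a UTF-16 code unit from two bytes in either byte order and splits one back into bytes. The function names are hypothetical and are not part of RapidJSON.

```cpp
#include <cassert>

// Build one UTF-16 code unit from two bytes stored least-significant byte first.
inline unsigned DecodeUTF16LE(const unsigned char* bytes) {
    return static_cast<unsigned>(bytes[0]) | (static_cast<unsigned>(bytes[1]) << 8);
}

// Build one UTF-16 code unit from two bytes stored most-significant byte first.
inline unsigned DecodeUTF16BE(const unsigned char* bytes) {
    return (static_cast<unsigned>(bytes[0]) << 8) | static_cast<unsigned>(bytes[1]);
}

// Split a UTF-16 code unit into two bytes, least-significant byte first.
inline void EncodeUTF16LE(unsigned codeUnit, unsigned char* bytes) {
    assert(codeUnit <= 0xFFFFu);
    bytes[0] = static_cast<unsigned char>(codeUnit & 0xFFu);
    bytes[1] = static_cast<unsigned char>((codeUnit >> 8) & 0xFFu);
}
```

The same two bytes `5B 00` decode to `[` (U+005B) as little-endian but to U+5B00 as big-endian, which is why the byte order has to be known before parsing.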

Besides, encoded streams also need to handle the [byte order mark (BOM)](http://en.wikipedia.org/wiki/Byte_order_mark). When reading from a byte stream, the BOM must be detected, or simply consumed, if it exists. When writing to a byte stream, the BOM can optionally be written.
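A minimal sketch of BOM detection on the first bytes of a stream follows. The enum `BomType` and the function `DetectBom` are hypothetical names used for illustration; only the BOM byte sequences themselves are standard.

```cpp
#include <cassert>
#include <cstddef>

enum BomType { kNoBom, kBomUTF8, kBomUTF16LE, kBomUTF16BE, kBomUTF32LE, kBomUTF32BE };

// Inspect the first n bytes at p. Returns the BOM found and stores its
// length in *bomLength, so that the caller can skip (consume) it.
inline BomType DetectBom(const unsigned char* p, std::size_t n, std::size_t* bomLength) {
    assert(p != 0 && bomLength != 0);
    // The 4-byte marks are tested first: the UTF-32LE BOM (FF FE 00 00)
    // starts with the UTF-16LE BOM (FF FE).
    if (n >= 4 && p[0] == 0xFF && p[1] == 0xFE && p[2] == 0x00 && p[3] == 0x00) {
        *bomLength = 4; return kBomUTF32LE;
    }
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x00 && p[2] == 0xFE && p[3] == 0xFF) {
        *bomLength = 4; return kBomUTF32BE;
    }
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) {
        *bomLength = 3; return kBomUTF8;
    }
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) {
        *bomLength = 2; return kBomUTF16LE;
    }
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) {
        *bomLength = 2; return kBomUTF16BE;
    }
    *bomLength = 0;
    return kNoBom;
}
```

Treating `FF FE 00 00` as UTF-32LE rather than as a UTF-16LE BOM followed by U+0000 is an assumption that is safe for JSON, because a JSON text cannot begin with a NUL character.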

If the encoding of the stream is known at compile time, you may use `EncodedInputStream` and `EncodedOutputStream`. If the stream can be UTF-8, UTF-16LE, UTF-16BE, UTF-32LE or UTF-32BE JSON, and this is only known at runtime, you may use `AutoUTFInputStream` and `AutoUTFOutputStream`. These streams are defined in `rapidjson/encodedstream.h`.

Note that these encoded streams can be applied to streams other than files. For example, a file held in memory, or a custom byte stream, can be wrapped in encoded streams.

## EncodedInputStream {#EncodedInputStream}

`EncodedInputStream` has two template parameters. The first one is an `Encoding` class, such as `UTF8` or `UTF16LE`, defined in `rapidjson/encodings.h`. The second one is the class of the stream to be wrapped.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"   // FileReadStream
#include "rapidjson/encodedstream.h"    // EncodedInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("utf16le.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

EncodedInputStream<UTF16LE<>, FileReadStream> eis(bis);  // wraps bis into eis

Document d; // Document is GenericDocument<UTF8<> >
d.ParseStream<0, UTF16LE<> >(eis);  // Parses UTF-16LE file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## EncodedOutputStream {#EncodedOutputStream}

`EncodedOutputStream` is similar, but it has a `bool putBOM` parameter in the constructor, controlling whether to write a BOM into the output byte stream.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/writer.h"           // Writer
#include "rapidjson/filewritestream.h"  // FileWriteStream
#include "rapidjson/encodedstream.h"    // EncodedOutputStream
#include <cstdio>

using namespace rapidjson;

Document d;         // Document is GenericDocument<UTF8<> >
// ...

FILE* fp = fopen("output_utf32le.json", "wb"); // non-Windows use "w"

char writeBuffer[256];
FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

typedef EncodedOutputStream<UTF32LE<>, FileWriteStream> OutputStream;
OutputStream eos(bos, true);   // Write BOM

// Template arguments: output stream, source encoding (in memory), target encoding (in file)
Writer<OutputStream, UTF8<>, UTF32LE<> > writer(eos);
d.Accept(writer);   // This generates UTF32-LE file from UTF-8 in memory

fclose(fp);
~~~~~~~~~~

## AutoUTFInputStream {#AutoUTFInputStream}

Sometimes an application may want to handle all supported JSON encodings. `AutoUTFInputStream` first detects the encoding by BOM. If a BOM is unavailable, it uses characteristics of valid JSON to make the detection. If neither method succeeds, it falls back to the UTF type provided in the constructor.
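One such characteristic, described in RFC 4627 section 3, is that the first two characters of a JSON text are ASCII, so the pattern of zero bytes among the first four bytes reveals the encoding. The sketch below applies that rule; `GuessedUTF` and `GuessEncoding` are hypothetical names, and this is not presented as the exact logic inside `AutoUTFInputStream`.

```cpp
#include <cassert>
#include <cstddef>

enum GuessedUTF { kGuessUTF8, kGuessUTF16LE, kGuessUTF16BE, kGuessUTF32LE, kGuessUTF32BE };

// Guess the encoding of a BOM-less JSON text from its first four bytes.
// Returns `fallback` when fewer than four bytes are available or when the
// zero-byte pattern matches none of the known layouts.
inline GuessedUTF GuessEncoding(const unsigned char* p, std::size_t n, GuessedUTF fallback) {
    assert(p != 0 || n == 0);
    if (n < 4) return fallback;
    // Bit i of the pattern is set when byte i is non-zero.
    int pattern = (p[0] ? 1 : 0) | (p[1] ? 2 : 0) | (p[2] ? 4 : 0) | (p[3] ? 8 : 0);
    switch (pattern) {
    case 0x08: return kGuessUTF32BE;  // 00 00 00 xx
    case 0x0A: return kGuessUTF16BE;  // 00 xx 00 xx
    case 0x01: return kGuessUTF32LE;  // xx 00 00 00
    case 0x05: return kGuessUTF16LE;  // xx 00 xx 00
    case 0x0F: return kGuessUTF8;     // xx xx xx xx
    default:   return fallback;       // unknown layout: keep the caller's choice
    }
}
```

The `fallback` argument plays the same role as the UTF type passed to the constructor: it is used only when neither the BOM nor the byte pattern gives an answer.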

Since the characters (code units) may be 8-bit, 16-bit or 32-bit, `AutoUTFInputStream` requires a character type which can hold at least 32 bits. We may use `unsigned` as the template parameter:

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/filereadstream.h"   // FileReadStream
#include "rapidjson/encodedstream.h"    // AutoUTFInputStream
#include <cstdio>

using namespace rapidjson;

FILE* fp = fopen("any.json", "rb"); // non-Windows use "r"

char readBuffer[256];
FileReadStream bis(fp, readBuffer, sizeof(readBuffer));

AutoUTFInputStream<unsigned, FileReadStream> eis(bis);  // wraps bis into eis

Document d;         // Document is GenericDocument<UTF8<> >
d.ParseStream<0, AutoUTF<unsigned> >(eis); // This parses any UTF file into UTF-8 in memory

fclose(fp);
~~~~~~~~~~

When specifying the encoding of the stream, use `AutoUTF<CharType>` as in `ParseStream()` above.

You can obtain the type of UTF via `UTFType GetType()`, and check whether a BOM was found via `HasBOM()`.

## AutoUTFOutputStream {#AutoUTFOutputStream}

Similarly, to choose the encoding for output at runtime, we can use `AutoUTFOutputStream`. This class is not automatic *per se*. You need to specify the UTF type and whether to write a BOM at runtime.

~~~~~~~~~~cpp
#include "rapidjson/document.h"
#include "rapidjson/writer.h"           // Writer
#include "rapidjson/filewritestream.h"  // FileWriteStream
#include "rapidjson/encodedstream.h"    // AutoUTFOutputStream
#include <cstdio>

using namespace rapidjson;

void WriteJSONFile(FILE* fp, UTFType type, bool putBOM, const Document& d) {
    char writeBuffer[256];
    FileWriteStream bos(fp, writeBuffer, sizeof(writeBuffer));

    typedef AutoUTFOutputStream<unsigned, FileWriteStream> OutputStream;
    OutputStream eos(bos, type, putBOM);

    // The target encoding uses the same character type as the output stream.
    Writer<OutputStream, UTF8<>, AutoUTF<unsigned> > writer(eos);
    d.Accept(writer);
}
~~~~~~~~~~

`AutoUTFInputStream` and `AutoUTFOutputStream` are more convenient than `EncodedInputStream` and `EncodedOutputStream`. They just incur a little runtime overhead.

# Custom Stream {#CustomStream}

In addition to memory and file streams, users can create their own stream classes which fit RapidJSON's API. For example, you may create a network stream, a stream from a compressed file, etc.

RapidJSON combines different types using templates. A class containing all the required interfaces can be a stream. The Stream interface is defined in the comments of `rapidjson/rapidjson.h`:

~~~~~~~~~~cpp
concept Stream {
    typename Ch;    //!< Character type of the stream.

    //! Read the current character from stream without moving the read cursor.
    Ch Peek() const;

    //! Read the current character from stream and moving the read cursor to next character.
    Ch Take();

    //! Get the current read cursor.
    //! \return Number of characters read from start.
    size_t Tell();

    //! Begin writing operation at the current read pointer.
    //! \return The begin writer pointer.
    Ch* PutBegin();

    //! Write a character.
    void Put(Ch c);

    //! Flush the buffer.
    void Flush();

    //! End the writing operation.
    //! \param begin The begin write pointer returned by PutBegin().
    //! \return Number of characters written.
    size_t PutEnd(Ch* begin);
}
~~~~~~~~~~

Input streams must implement `Peek()`, `Take()` and `Tell()`.
Output streams must implement `Put()` and `Flush()`.
There are two special interfaces, `PutBegin()` and `PutEnd()`, which are only for *in situ* parsing. Normal streams do not implement them. However, even if an interface is not needed for a particular stream, it still needs a dummy implementation; otherwise a compilation error will be generated.
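To show how templates let any conforming class act as a stream, here is a small sketch with a generic consumer written only against `Peek()` and `Take()`. Both `CStringInputStream` and `SkipSpaces` are hypothetical names for illustration and are not RapidJSON classes or functions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical read-only stream over a null-terminated string.
class CStringInputStream {
public:
    typedef char Ch;

    explicit CStringInputStream(const char* src) : src_(src), head_(src) {
        assert(src != 0);
    }

    // Input interface: the three functions an input stream must implement.
    Ch Peek() const { return *src_; }
    Ch Take() { return *src_ == '\0' ? '\0' : *src_++; }  // never moves past the terminator
    std::size_t Tell() const { return static_cast<std::size_t>(src_ - head_); }

    // Output interface: dummy implementations that exist only to satisfy the concept.
    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch) { assert(false); }
    void Flush() { assert(false); }
    std::size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    const char* src_;   // current read position
    const char* head_;  // start of the string, used by Tell()
};

// Generic consumer: compiles for any type that provides Peek() and Take().
template <typename InputStream>
void SkipSpaces(InputStream& is) {
    for (;;) {
        typename InputStream::Ch c = is.Peek();
        if (c == ' ' || c == '\n' || c == '\r' || c == '\t')
            is.Take();
        else
            break;
    }
}
```

Because `SkipSpaces` is a template, no common base class or virtual call is involved; a stream type is accepted as long as the member functions it uses exist.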

## Example: istream wrapper {#ExampleIStreamWrapper}

The following example is a wrapper of `std::istream`, which only implements 3 functions.

~~~~~~~~~~cpp
#include <cassert>
#include <istream>

class IStreamWrapper {
public:
    typedef char Ch;

    IStreamWrapper(std::istream& is) : is_(is) {
    }

    Ch Peek() const { // 1
        int c = is_.peek();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c;
    }

    Ch Take() { // 2
        int c = is_.get();
        return c == std::char_traits<char>::eof() ? '\0' : (Ch)c;
    }

    size_t Tell() const { return (size_t)is_.tellg(); } // 3

    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch) { assert(false); }
    void Flush() { assert(false); }
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    IStreamWrapper(const IStreamWrapper&);
    IStreamWrapper& operator=(const IStreamWrapper&);

    std::istream& is_;
};
~~~~~~~~~~

Users can use it to wrap instances of `std::stringstream` and `std::ifstream`.

~~~~~~~~~~cpp
const char* json = "[1,2,3,4]";
std::stringstream ss(json);
IStreamWrapper is(ss);

Document d;
d.ParseStream(is);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to the internal overhead of the standard library.

## Example: ostream wrapper {#ExampleOStreamWrapper}

The following example is a wrapper of `std::ostream`, which only implements 2 functions.

~~~~~~~~~~cpp
#include <cassert>
#include <ostream>

class OStreamWrapper {
public:
    typedef char Ch;

    OStreamWrapper(std::ostream& os) : os_(os) {
    }

    Ch Peek() const { assert(false); return '\0'; }
    Ch Take() { assert(false); return '\0'; }
    size_t Tell() const { assert(false); return 0; }

    Ch* PutBegin() { assert(false); return 0; }
    void Put(Ch c) { os_.put(c); }                  // 1
    void Flush() { os_.flush(); }                   // 2
    size_t PutEnd(Ch*) { assert(false); return 0; }

private:
    OStreamWrapper(const OStreamWrapper&);
    OStreamWrapper& operator=(const OStreamWrapper&);

    std::ostream& os_;
};
~~~~~~~~~~

Users can use it to wrap instances of `std::stringstream` and `std::ofstream`.

~~~~~~~~~~cpp
Document d;
// ...

std::stringstream ss;
OStreamWrapper os(ss);

Writer<OStreamWrapper> writer(os);
d.Accept(writer);
~~~~~~~~~~

Note that this implementation may not be as efficient as RapidJSON's memory or file streams, due to the internal overhead of the standard library.

# Summary {#Summary}

This section describes the stream classes available in RapidJSON. Memory streams are simple. File streams can reduce the memory required during JSON parsing and generation when the JSON is stored in a file system. Encoded streams convert between byte streams and character streams. Finally, users may create custom streams using a simple interface.