1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
2<html>
3    <head>
4        <link rel="stylesheet" type="text/css" href="opus_in_isobmff.css"/>
5        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
6        <title>Encapsulation of Opus in ISO Base Media File Format</title>
7    </head>
8    <body bgcolor="0x333333" text="#60B0C0">
9        <b><u>Encapsulation of Opus in ISO Base Media File Format</u></b><br>
10        <font size="2">last updated: August 28, 2018</font><br>
11        <br>
12        <div class="normal_link pre frame_box">
13
14                                Encapsulation of Opus in ISO Base Media File Format
15                                        Version 0.8.1 (incomplete)
16
17
18Table of Contents
19<a href="#1">1</a> Scope
20<a href="#2">2</a> Normative References
21<a href="#3">3</a> Terms and Definitions
22<a href="#4">4</a> Design Rules of Encapsulation
23    <a href="#4.1">4.1</a> File Type Identification
24    <a href="#4.2">4.2</a> Overview of Track Structure
25    <a href="#4.3">4.3</a> Definitions of Opus sample
26        <a href="#4.3.1">4.3.1</a> Sample entry format
27        <a href="#4.3.2">4.3.2</a> Opus Specific Box
28        <a href="#4.3.3">4.3.3</a> Sample format
29        <a href="#4.3.4">4.3.4</a> Duration of Opus sample
30        <a href="#4.3.5">4.3.5</a> Sub-sample
31        <a href="#4.3.6">4.3.6</a> Random Access
32            <a href="#4.3.6.1">4.3.6.1</a> Random Access Point
33            <a href="#4.3.6.2">4.3.6.2</a> Pre-roll
34    <a href="#4.4">4.4</a> Trimming of Actual Duration
35    <a href="#4.5">4.5</a> Channel Mapping
36        <a href="#4.5.1">4.5.1</a> ISO Base Media native Channel Mapping
37        <a href="#4.5.2">4.5.2</a> Composition on all active tracks (informative)
38    <a href="#4.6">4.6</a> Basic Structure (informative)
39        <a href="#4.6.1">4.6.2</a> Initial Movie
40        <a href="#4.6.2">4.6.3</a> Movie Fragments
41    <a href="#4.7">4.7</a> Example of Encapsulation (informative)
42<a href="#5">5</a> Author's Address
43
44<a name="1"></a>
451 Scope
46    This specification specifies the fundamental way of the encapsulation of Opus coded bitstreams in ISO Base Media
47    file format and its derivatives. The encapsulation of Opus coded bitstreams in QuickTime file format is outside
48    the scope of this specification.
49
50<a name="2"></a>
512 Normative References
52    [1] ISO/IEC 14496-12:2015 Corrected version
53        Information technology — Coding of audio-visual objects — Part 12: ISO base media file format
54
55    [2] RFC 6716
56        Definition of the Opus Audio Codec
57
58    [3] RFC 7845
59        Ogg Encapsulation for the Opus Audio Codec
60
61<a name="3"></a>
623 Terms and Definitions
63    3.1 active track
64        enabled track from the non-alternate group or selected track from alternate group
65
66    3.2 actual duration
67        duration constructed from valid samples
68
69    3.3 edit
70        entry in the Edit List Box
71
72    3.4 padded samples
73        PCM samples after decoding Opus sample(s) which are not valid samples
74        An Opus bitstream always contains them partially at the beginning and may contain them in part at the end, as
75        long as not physically removed yet at the beginning and/or the end.
76
77    3.5 priming samples
78        padded samples at the beginning of the Opus bitstream
79
80    3.6 sample-accurate
81        for any PCM sample, a timestamp exactly matching its sampling timestamp is present in the media timeline.
82
83    3.7 valid samples
84        PCM samples after decoding Opus sample(s) corresponding to input PCM samples
85
86<a name="4"></a>
874 Design Rules of Encapsulation
88    4.1 File Type Identification<a name="4.1"></a>
89        This specification defines the brand 'Opus' to declare files are conformant to this specification. Additionally,
90        files conformant to this specification shall contain at least one brand, which supports the requirements and the
91        requirements described in this clause without contradiction, in the compatible brands list of the File Type Box.
92        As an example, the minimal support of the encapsulation of Opus bitstreams in ISO Base Media file format requires
93        the 'iso2' brand in the compatible brands list since support of roll groups is required.
94<a name="4.2"></a>
95    4.2 Overview of Track Structure
96        This clause summarizes requirements of the encapsulation of Opus coded bitstream as media data in audio tracks
97        in file formats compliant with the ISO Base Media File Format. The details are described in clauses after this
98        clause.
99            + The handler_type field in the Handler Reference Box shall be set to 'soun'.
100            + The Media Information Box shall contain the Sound Media Header Box.
101            + The codingname of the sample entry is 'Opus'.
102                This specification does not define any encapsulation using MP4AudioSampleEntry with objectTypeIndication
103                specified by the MPEG-4 Registration Authority (http://www.mp4ra.org/).
104                See 4.3.1 Sample entry format to get the details about the sample entry.
105            + The 'dOps' box is added to the sample entry to convey initializing information for the decoder.
106                See 4.3.2 Opus Specific Box to get the details.
107            + An Opus sample is exactly one Opus packet for each of different Opus bitstreams.
108                See 4.3.3 Sample format to get the details.
109            + Every Opus sample is a sync sample but requires pre-roll for every random access to get correct output.
110                See 4.3.6 Random Access to get the details.
111<a name="4.3"></a>
112    4.3 Definitions of Opus sample
113        4.3.1 Sample entry format<a name="4.3.1"></a>
114            For any track containing Opus bitstreams, at least one sample entry describing corresponding Opus bitstream
115            shall be present inside the Sample Table Box. This version of the specification defines only one sample
116            entry format named OpusSampleEntry whose codingname is 'Opus'. This sample entry includes exactly one Opus
117            Specific Box defined in 4.3.2 as a mandatory box and indicates that Opus samples described by this sample
118            entry are stored by the sample format described in 4.3.3.
119
120            The syntax and semantics of the OpusSampleEntry is shown as follows.
121
122            class OpusSampleEntry() extends AudioSampleEntry ('Opus') {
123                OpusSpecificBox();
124            }
125
126            + channelcount:
127                The channelcount field indicates the number of output channels and shall be set to the same value of
128                the OutputChannelCount in the OpusDecoderConfigurationRecord. The value of this field may be used in
129                the ChannelLayout if any as described in 4.5.1.
130            + samplesize:
131                The samplesize field shall be set to 16.
132            + samplerate:
133                The samplerate field shall be set to 48000&lt&lt16.
134            + OpusSpecificBox
135                This box contains initializing information for the decoder as defined in 4.3.2.
136
137        4.3.2 Opus Specific Box<a name="4.3.2"></a>
138            Exactly one Opus Specific Box shall be present in each OpusSampleEntry.
139            The Opus Specific Box contains an OpusDecoderConfigurationRecord which contains the Version field and
140            this specification defines version 0 of this record. If incompatible changes occured in the fields after
141            the Version field within the OpusDecoderConfigurationRecord in the future versions of this specification,
142            another version will be defined.
143            This box refers to Ogg Opus [3] at many parts but all the data are stored as big-endian format.
144
145            The syntax and semantics of the Opus Specific Box is shown as follows.
146
147            class ChannelMappingTable (unsigned int(8) OutputChannelCount) {
148                unsigned int(8) StreamCount;
149                unsigned int(8) CoupledCount;
150                unsigned int(8 * OutputChannelCount) ChannelMapping;
151            }
152
153            aligned(8) class OpusDecoderConfigurationRecord {
154                unsigned int(8) Version;
155                unsigned int(8) OutputChannelCount;
156                unsigned int(16) PreSkip;
157                unsigned int(32) InputSampleRate;
158                signed int(16) OutputGain;
159                unsigned int(8) ChannelMappingFamily;
160                if (ChannelMappingFamily != 0) {
161                    ChannelMappingTable(OutputChannelCount);
162                }
163            }
164
165            class OpusSpecificBox extends Box('dOps') {
166                OpusDecoderConfigurationRecord() OpusConfig;
167            }
168
169            + Version:
170                The Version field shall be set to 0.
171                In the future versions of this specification, this field may be set to other values. And without support
172                of those values, the reader shall not read the fields after this within the OpusSpecificBox.
173            + OutputChannelCount:
174                The OutputChannelCount field shall be set to the same value as the *Output Channel Count* field in the
175                identification header defined in Ogg Opus [3].
176            + PreSkip:
177                The PreSkip field indicates the number of the priming samples, that is, the number of samples at 48000 Hz
178                to discard from the decoder output when starting playback. The value of the PreSkip field shall be at least
179                80 milliseconds' worth of PCM samples even when removing any number of Opus samples which may or may not
180                contain the priming samples. The PreSkip field is not used for discarding the priming samples at the whole
181                playback at all since it is informative only, and that task falls on the Edit List Box.
182            + InputSampleRate:
183                The InputSampleRate field shall be set to the same value as the *Input Sample Rate* field in the
184                identification header defined in Ogg Opus [3].
185            + OutputGain:
186                The OutputGain field shall be set to the same value as the *Output Gain* field in the identification
187                header define in Ogg Opus [3]. Note that the value is stored as 8.8 fixed-point.
188            + ChannelMappingFamily:
189                The ChannelMappingFamily field shall be set to the same value as the *Channel Mapping Family* field in
190                the identification header defined in Ogg Opus [3]. Note that the value 255 may be used for an alternative
191                to map channels by ISO Base Media native mapping. The details are described in 4.5.1.
192            + StreamCount:
193                The StreamCount field shall be set to the same value as the *Stream Count* field in the identification
194                header defined in Ogg Opus [3].
195            + CoupledCount:
196                The CoupledCount field shall be set to the same value as the *Coupled Count* field in the identification
197                header defined in Ogg Opus [3].
198            + ChannelMapping:
199                The ChannelMapping field shall be set to the same octet string as *Channel Mapping* field in the identi-
200                fication header defined in Ogg Opus [3].
201
202        4.3.3 Sample format<a name="4.3.3"></a>
203            An Opus sample is exactly one Opus packet for each of different Opus bitstreams. Due to support more than
204            two channels, an Opus sample can contain frames from multiple Opus bitstreams but all Opus packets shall
205            share with the total of frame sizes in a single Opus sample. The way of how to pack an Opus packet from
206            each of Opus bitstreams into a single Opus sample follows Appendix B. in RFC 6716 [2].
207            The endianness has nothing to do with any Opus sample since every Opus packet is processed byte-by-byte.
208            In this specification, 'sample' means 'Opus sample' except for 'padded samples', 'priming samples', 'valid
209            sample' and 'sample-accurate', i.e. 'sample' is 'sample' in the term defined in ISO/IEC 14496-12 [1].
210
211                +-----------------------------------------+-------------------------------------+
212                | Opus packet 0 (self-delimiting framing) | Opus packet 1 (undelimited framing) |
213                +-----------------------------------------+-------------------------------------+
214                |<---------------------------- the size of Opus sample ------------------------>|
215
216                    Figure 1 - Example structure of an Opus sample containing two Opus bitstreams
217
218        4.3.4 Duration of Opus sample<a name="4.3.4"></a>
219            The duration of Opus sample is given by multiplying the total of frame sizes for a single Opus bitstream
220            expressed in seconds by the value of the timescale field in the Media Header Box.
221            Let's say an Opus sample consists of two Opus bitstreams, where the frame size of one bitstream is 40 milli-
222            seconds and the frame size of another is 60 milliseconds, and the timescale field in the Media Header Box
223            is set to 48000, then the duration of that Opus sample shall be 120 milliseconds since three 40 millisecond
224            frame and two 60 millisecond frames shall be contained because of the maximum duration of Opus packet, 120
225            milliseconds, and 5760 in the timescale indicated in the Media Header Box.
226
227            To indicate the valid samples excluding the padded samples at the end of Opus bitstream, the duration of
228            the last Opus sample of an Opus bitstream is given by multiplying the number of the valid samples by the
229            value produced by dividing the value of the timescale field in the Media Header Box by 48000.
230
231        4.3.5 Sub-sample<a name="4.3.5"></a>
232            The structure of the last Opus packet in an Opus sample is different from the others in the same Opus sample,
233            and the others are invalid Opus packets as an Opus sample because of self-delimiting framing. To avoid
234            complexities, sub-sample is not defined for Opus sample in this specification.
235
236        4.3.6 Random Access<a name="4.3.6"></a>
237            This subclause describes the nature of the random access of Opus sample.
238
239            4.3.6.1 Random Access Point<a name="4.3.6.1"></a>
240                All Opus samples can be independently decoded i.e. every Opus sample is a sync sample. Therefore, the
241                Sync Sample Box shall not be present as long as there are no samples other than Opus samples in the same
242                track. And the sample_is_non_sync_sample field for Opus samples shall be set to 0.
243
244            4.3.6.2 Pre-roll<a name="4.3.6.2"></a>
245                Opus bitstream requires at least 80 millisecond pre-roll after each random access to get correct output.
246                Pre-roll is indicated by the roll_distance field in AudioRollRecoveryEntry. AudioPreRollEntry shall not
247                be used since every Opus sample is a sync sample in Opus bitstream. Note that roll_distance is expressed
248                in sample units in a term of ISO Base Media File Format, and always takes negative values.
249
250                For any track containing Opus bitstreams, at least one Sample Group Description Box and at least one
251                Sample to Group Box within the Sample Table Box shall be present and these have the grouping_type field
252                set to 'roll'. If any Opus sample is contained in a track fragment, the Sample to Group Box with the
253                grouping_type field set to 'roll' shall be present for that track fragment.
254
255                For the requirement of AudioRollRecoveryEntry, the compatible_brands field in the File Type Box shall
256                contain at least one brand which requires support for roll groups.
257<a name="4.4"></a>
258    4.4 Trimming of Actual Duration
259        Due to the priming samples (or the padding at the beginning) derived from the pre-roll for the startup and the
260        padded samples at the end, we need trim from media to get the actual duration. An edit in the Edit List Box can
261        achieve this demand, and the Edit Box and the Edit List Box shall be present.
262
263        For sample-accurate trimming, proper timescale should be set to the timescale field in the Movie Header Box
264        and the Media Header Box inside Track Box(es) for Opus bitstream. The timescale field in the Media Header Box is
265        typically set to 48000. It is recommended that the timescale field in the Movie Header Box be set to the same
266        value of the timescale field in the Media Header Box in order to avoid the rounding problem when specifying
267        duration of edit if the timescales in all of the Media Header Boxes are set to the same value.
268
269        For example, to indicate the actual duration of an Opus bitstream in a track with the timescale fields of both
270        the Movie Header Box and the Media Header Box set to 48000, we would use the following edit:
271            segment_duration = the number of the valid samples
272            media_time = the number of the priming samples
273            media_rate = 1 &lt&lt 16
274
275        The Edit List Box is applied to whole movie including all movie fragments. Therefore, it is impossible to tell
276        the actual duration in the case producing movie fragments on the fly such as live-streaming. In such cases,
277        the duration of the last Opus sample may be helpful by setting zero to the segment_duration field since the
278        value 0 represents implicit duration equal to the sum of the duration of all samples.
279<a name="4.5"></a>
280    4.5 Channel Mapping
281        4.5.1 ISO Base Media native Channel Mapping<a name="4.5.1"></a>
282            ISO Base Media File Format, that is ISO/IEC 14496-12 [1], defines an extension ChannelLayout to the
283            AudioSampleEntry, which conveys information of mapping channels to loudspeaker positions. The ChannelLayout
284            enables to specify the channel layout more flexibly than the predefined layouts of the ChannelMappingFamily.
285
286            To utilize the ChannelLayout for OpusSampleEntry, the ChannelMappingFamily field should be set to 255.
287            Even when the ChannelMappingFamily field is set to another value, the assignment of each output channel to
288            loudspeaker position specified by the ChannelMappingFamily would be changed as specified by the ChannelLayout.
289            The procedure of the assignment is the following.
290
291                1. Decoded channels are mapped to output channels according to the ChannelMappingTable.
292                2. Output channels are mapped to loudspeaker positions according to the ChannelLayout.
293
294            In this way, the parameters of the Opus Specific Box are processed before the ChannelLayout, and the
295            ChannelLayout shall follow the Opus Specific Box.
296
297        4.5.2 Composition on all active tracks (informative)<a name="4.5.2"></a>
298            By the application of alternate_group in the Track Header Box, whole audio channels in all active tracks from
299            non-alternate group and/or different alternate group from each other are composited into the presentation. If
300            an Opus sample consists of multiple Opus bitstreams, it can be splitted into individual Opus bitstreams and
301            reconstructed into new Opus samples as long as every Opus bitstream has the same total duration in each Opus
302            sample. This nature can be utilized to encapsulate a single Opus bitstream in each track without breaking the
303            original channel layout.
304
305            As an example, let's say there is a following track:
306                OutputChannelCount = 6;
307                StreamCount        = 4;
308                CoupledCount       = 2;
309                ChannelMapping     = {0, 4, 1, 2, 3, 5}; // front left, front center, front right,
310                                                         // rear left, rear right, LFE
311            Here, to couple front left to front right channels into the first stream, and couple rear left to rear right
312            channels into the second stream, reordering is needed since coupled streams must precede any non-coupled
313            stream. You extract the four Opus bitstreams from this track and you encapsulate two of the four into a track
314            and the others into another track. The former track is as follows.
315                OutputChannelCount = 6;
316                StreamCount        = 2;
317                CoupledCount       = 2;
318                ChannelMapping     = {0, 255, 1, 2, 3, 255}; // front left, front center, front right,
319                                                             // rear left, rear right, LFE
320            And the latter track is as follows.
321                OutputChannelCount = 6;
322                StreamCount        = 2;
323                CoupledCount       = 0;
324                ChannelMapping     = {255, 0, 255, 255, 255, 1}; // front left, front center, front right,
325                                                                 // rear left, rear right, LFE
326            In addition, the value of the alternate_group field in the both tracks is set to 0. As the result, the player
327            may play as if channels with 255 are not present, and play the presentation constructed from the both tracks
328            in the same channel layout as the one of the original track. Keep in mind that the way of the composition, i.e.
329            the mixing for playback, is not defined here, and maybe different results could occur except for the channel
330            layout of the original, depending on an implementation or the definition of a derived file format.
331
332            Note that some derived file formats may specify the restriction to ignore alternate grouping. In the context
333            of such file formats, this application is not available. This unavailability does not mean incompatibilities
334            among file formats unless the restriction to the value of the alternate_group field is specified and brings
335            about any conflict among their definitions.
336<a name="4.6"></a>
337    4.6 Basic Structure (informative)
338        4.6.1 Initial Movie<a name="4.6.1"></a>
339            This subclause shows a basic structure of the Movie Box as follows:
340
341            +----+----+----+----+----+----+----+----+------------------------------+
342            |moov|    |    |    |    |    |    |    | Movie Box                    |
343            +----+----+----+----+----+----+----+----+------------------------------+
344            |    |mvhd|    |    |    |    |    |    | Movie Header Box             |
345            +----+----+----+----+----+----+----+----+------------------------------+
346            |    |trak|    |    |    |    |    |    | Track Box                    |
347            +----+----+----+----+----+----+----+----+------------------------------+
348            |    |    |tkhd|    |    |    |    |    | Track Header Box             |
349            +----+----+----+----+----+----+----+----+------------------------------+
350            |    |    |edts|    |    |    |    |    | Edit Box                     |
351            +----+----+----+----+----+----+----+----+------------------------------+
352            |    |    |    |elst|    |    |    |    | Edit List Box                |
353            +----+----+----+----+----+----+----+----+------------------------------+
354            |    |    |mdia|    |    |    |    |    | Media Box                    |
355            +----+----+----+----+----+----+----+----+------------------------------+
356            |    |    |    |mdhd|    |    |    |    | Media Header Box             |
357            +----+----+----+----+----+----+----+----+------------------------------+
358            |    |    |    |hdlr|    |    |    |    | Handler Reference Box        |
359            +----+----+----+----+----+----+----+----+------------------------------+
360            |    |    |    |minf|    |    |    |    | Media Information Box        |
361            +----+----+----+----+----+----+----+----+------------------------------+
362            |    |    |    |    |smhd|    |    |    | Sound Media Header Box       |
363            +----+----+----+----+----+----+----+----+------------------------------+
364            |    |    |    |    |dinf|    |    |    | Data Information Box         |
365            +----+----+----+----+----+----+----+----+------------------------------+
366            |    |    |    |    |    |dref|    |    | Data Reference Box           |
367            +----+----+----+----+----+----+----+----+------------------------------+
368            |    |    |    |    |    |    |url |    | DataEntryUrlBox              |
369            +----+----+----+----+----+----+ or +----+------------------------------+
370            |    |    |    |    |    |    |urn |    | DataEntryUrnBox              |
371            +----+----+----+----+----+----+----+----+------------------------------+
372            |    |    |    |    |stbl|    |    |    | Sample Table                 |
373            +----+----+----+----+----+----+----+----+------------------------------+
374            |    |    |    |    |    |stsd|    |    | Sample Description Box       |
375            +----+----+----+----+----+----+----+----+------------------------------+
376            |    |    |    |    |    |    |Opus|    | OpusSampleEntry              |
377            +----+----+----+----+----+----+----+----+------------------------------+
378            |    |    |    |    |    |    |    |dOps| Opus Specific Box            |
379            +----+----+----+----+----+----+----+----+------------------------------+
380            |    |    |    |    |    |stts|    |    | Decoding Time to Sample Box  |
381            +----+----+----+----+----+----+----+----+------------------------------+
382            |    |    |    |    |    |stsc|    |    | Sample To Chunk Box          |
383            +----+----+----+----+----+----+----+----+------------------------------+
384            |    |    |    |    |    |stsz|    |    | Sample Size Box              |
385            +----+----+----+----+----+ or +----+----+------------------------------+
386            |    |    |    |    |    |stz2|    |    | Compact Sample Size Box      |
387            +----+----+----+----+----+----+----+----+------------------------------+
388            |    |    |    |    |    |stco|    |    | Chunk Offset Box             |
389            +----+----+----+----+----+ or +----+----+------------------------------+
390            |    |    |    |    |    |co64|    |    | Chunk Large Offset Box       |
391            +----+----+----+----+----+----+----+----+------------------------------+
392            |    |    |    |    |    |sgpd|    |    | Sample Group Description Box |
393            +----+----+----+----+----+----+----+----+------------------------------+
394            |    |    |    |    |    |sbgp|    |    | Sample to Group Box          |
395            +----+----+----+----+----+----+----+----+------------------------------+
396            |    |mvex|*   |    |    |    |    |    | Movie Extends Box            |
397            +----+----+----+----+----+----+----+----+------------------------------+
398            |    |    |trex|*   |    |    |    |    | Track Extends Box            |
399            +----+----+----+----+----+----+----+----+------------------------------+
400
401                    Figure 2 - Basic structure of Movie Box
402
403            It is strongly recommended that the order of boxes should follow the above structure.
404            Boxes marked with an asterisk (*) may be present.
405            For most boxes listed above, the definition is as is defined in ISO/IEC 14496-12 [1]. The additional boxes
406            and the additional requirements, restrictions and recommendations to the other boxes are described in this
407            specification.
408
409        4.6.2 Movie Fragments<a name="4.6.2"></a>
410            This subclause shows a basic structure of the Movie Fragment Box as follows:
411
412            +----+----+----+----+----+----+----+----+------------------------------+
413            |moof|    |    |    |    |    |    |    | Movie Fragment Box           |
414            +----+----+----+----+----+----+----+----+------------------------------+
415            |    |mfhd|    |    |    |    |    |    | Movie Fragment Header Box    |
416            +----+----+----+----+----+----+----+----+------------------------------+
417            |    |traf|    |    |    |    |    |    | Track Fragment Box           |
418            +----+----+----+----+----+----+----+----+------------------------------+
419            |    |    |tfhd|    |    |    |    |    | Track Fragment Header Box    |
420            +----+----+----+----+----+----+----+----+------------------------------+
421            |    |    |trun|    |    |    |    |    | Track Fragment Run Box       |
422            +----+----+----+----+----+----+----+----+------------------------------+
423            |    |    |sgpd|*   |    |    |    |    | Sample Group Description Box |
424            +----+----+----+----+----+----+----+----+------------------------------+
425            |    |    |sbgp|    |    |    |    |    | Sample to Group Box          |
426            +----+----+----+----+----+----+----+----+------------------------------+
427
428                    Figure 3 - Basic structure of Movie Fragment Box
429
430            It is strongly recommended that the Movie Fragment Header Box and the Track Fragment Header Box be
431            placed first in their container.
432            Boxes marked with an asterisk (*) may be present.
433            For the boxes listed above, the definition is as is defined in ISO/IEC 14496-12 [1].
434<a name="4.7"></a>
435    4.7 Example of Encapsulation (informative)
436        [File]
437            size = 17757
438            [ftyp: File Type Box]
439                position = 0
440                size = 24
441                major_brand = Opus : Opus audio coding
442                minor_version = 0
443                compatible_brands
444                    brand[0] = Opus : Opus audio coding
445                    brand[1] = iso2 : ISO Base Media file format version 2
446            [moov: Movie Box]
447                position = 24
448                size = 757
449                [mvhd: Movie Header Box]
450                    position = 32
451                    size = 108
452                    version = 0
453                    flags = 0x000000
454                    creation_time = UTC 2014/12/12, 18:41:19
455                    modification_time = UTC 2014/12/12, 18:41:19
456                    timescale = 48000
457                    duration = 33600 (00:00:00.700)
458                    rate = 1.000000
459                    volume = 1.000000
460                    reserved = 0x0000
461                    reserved = 0x00000000
462                    reserved = 0x00000000
463                    transformation matrix
464                        | a, b, u |   | 1.000000, 0.000000, 0.000000 |
465                        | c, d, v | = | 0.000000, 1.000000, 0.000000 |
466                        | x, y, w |   | 0.000000, 0.000000, 1.000000 |
467                    pre_defined = 0x00000000
468                    pre_defined = 0x00000000
469                    pre_defined = 0x00000000
470                    pre_defined = 0x00000000
471                    pre_defined = 0x00000000
472                    pre_defined = 0x00000000
473                    next_track_ID = 2
474                [trak: Track Box]
475                    position = 140
476                    size = 608
477                    [tkhd: Track Header Box]
478                        position = 148
479                        size = 92
480                        version = 0
481                        flags = 0x000007
482                            Track enabled
483                            Track in movie
484                            Track in preview
485                        creation_time = UTC 2014/12/12, 18:41:19
486                        modification_time = UTC 2014/12/12, 18:41:19
487                        track_ID = 1
488                        reserved = 0x00000000
489                        duration = 33600 (00:00:00.700)
490                        reserved = 0x00000000
491                        reserved = 0x00000000
492                        layer = 0
493                        alternate_group = 0
494                        volume = 1.000000
495                        reserved = 0x0000
496                        transformation matrix
497                            | a, b, u |   | 1.000000, 0.000000, 0.000000 |
498                            | c, d, v | = | 0.000000, 1.000000, 0.000000 |
499                            | x, y, w |   | 0.000000, 0.000000, 1.000000 |
500                        width = 0.000000
501                        height = 0.000000
502                    [edts: Edit Box]
503                        position = 240
504                        size = 36
505                        [elst: Edit List Box]
506                            position = 281
507                            size = 28
508                            version = 0
509                            flags = 0x000000
510                            entry_count = 1
511                            entry[0]
512                                segment_duration = 33600
513                                media_time = 312
514                                media_rate = 1.000000
515                    [mdia: Media Box]
516                        position = 276
517                        size = 472
518                        [mdhd: Media Header Box]
519                            position = 284
520                            size = 32
521                            version = 0
522                            flags = 0x000000
523                            creation_time = UTC 2014/12/12, 18:41:19
524                            modification_time = UTC 2014/12/12, 18:41:19
525                            timescale = 48000
526                            duration = 34560 (00:00:00.720)
527                            language = und
528                            pre_defined = 0x0000
529                        [hdlr: Handler Reference Box]
530                            position = 316
531                            size = 51
532                            version = 0
533                            flags = 0x000000
534                            pre_defined = 0x00000000
535                            handler_type = soun
536                            reserved = 0x00000000
537                            reserved = 0x00000000
538                            reserved = 0x00000000
539                            name = Xiph Audio Handler
540                        [minf: Media Information Box]
541                            position = 367
542                            size = 381
543                            [smhd: Sound Media Header Box]
544                                position = 375
545                                size = 16
546                                version = 0
547                                flags = 0x000000
548                                balance = 0.000000
549                                reserved = 0x0000
550                            [dinf: Data Information Box]
551                                position = 391
552                                size = 36
553                                [dref: Data Reference Box]
554                                    position = 399
555                                    size = 28
556                                    version = 0
557                                    flags = 0x000000
558                                    entry_count = 1
559                                    [url : Data Entry Url Box]
560                                        position = 415
561                                        size = 12
562                                        version = 0
563                                        flags = 0x000001
564                                        location = in the same file
565                            [stbl: Sample Table Box]
566                                position = 427
567                                size = 321
568                                [stsd: Sample Description Box]
569                                    position = 435
570                                    size = 79
571                                    version = 0
572                                    flags = 0x000000
573                                    entry_count = 1
574                                    [Opus: Audio Description]
575                                        position = 451
576                                        size = 63
577                                        reserved = 0x000000000000
578                                        data_reference_index = 1
579                                        reserved = 0x0000
580                                        reserved = 0x0000
581                                        reserved = 0x00000000
582                                        channelcount = 6
583                                        samplesize = 16
584                                        pre_defined = 0
585                                        reserved = 0
586                                        samplerate = 48000.000000
587                                        [dOps: Opus Specific Box]
588                                            position = 487
589                                            size = 27
590                                            Version = 0
591                                            OutputChannelCount = 6
592                                            PreSkip = 312
593                                            InputSampleRate = 48000
594                                            OutputGain = 0
595                                            ChannelMappingFamily = 1
596                                            StreamCount = 4
597                                            CoupledCount = 2
598                                            ChannelMapping
599                                                0 -> 0: front left
600                                                1 -> 4: fron center
601                                                2 -> 1: front right
602                                                3 -> 2: side left
603                                                4 -> 3: side right
604                                                5 -> 5: rear center
605                                [stts: Decoding Time to Sample Box]
606                                    position = 514
607                                    size = 24
608                                    version = 0
609                                    flags = 0x000000
610                                    entry_count = 1
611                                    entry[0]
612                                        sample_count = 18
613                                        sample_delta = 1920
614                                [stsc: Sample To Chunk Box]
615                                    position = 538
616                                    size = 40
617                                    version = 0
618                                    flags = 0x000000
619                                    entry_count = 2
620                                    entry[0]
621                                        first_chunk = 1
622                                        samples_per_chunk = 13
623                                        sample_description_index = 1
624                                    entry[1]
625                                        first_chunk = 2
626                                        samples_per_chunk = 5
627                                        sample_description_index = 1
628                                [stsz: Sample Size Box]
629                                    position = 578
630                                    size = 92
631                                    version = 0
632                                    flags = 0x000000
633                                    sample_size = 0 (variable)
634                                    sample_count = 18
635                                    entry_size[0] = 977
636                                    entry_size[1] = 938
637                                    entry_size[2] = 939
638                                    entry_size[3] = 938
639                                    entry_size[4] = 934
640                                    entry_size[5] = 945
641                                    entry_size[6] = 948
642                                    entry_size[7] = 956
643                                    entry_size[8] = 955
644                                    entry_size[9] = 930
645                                    entry_size[10] = 933
646                                    entry_size[11] = 934
647                                    entry_size[12] = 972
648                                    entry_size[13] = 977
649                                    entry_size[14] = 958
650                                    entry_size[15] = 949
651                                    entry_size[16] = 962
652                                    entry_size[17] = 848
653                                [stco: Chunk Offset Box]
654                                    position = 670
655                                    size = 24
656                                    version = 0
657                                    flags = 0x000000
658                                    entry_count = 2
659                                    chunk_offset[0] = 797
660                                    chunk_offset[1] = 13096
661                                [sgpd: Sample Group Description Box]
662                                    position = 694
663                                    size = 26
664                                    version = 1
665                                    flags = 0x000000
666                                    grouping_type = roll
667                                    default_length = 2 (constant)
668                                    entry_count = 1
669                                    roll_distance[0] = -2
670                                [sbgp: Sample to Group Box]
671                                    position = 720
672                                    size = 28
673                                    version = 0
674                                    flags = 0x000000
675                                    grouping_type = roll
676                                    entry_count = 1
677                                    entry[0]
678                                        sample_count = 18
679                                        group_description_index = 1
680            [free: Free Space Box]
681                position = 748
682                size = 8
683            [mdat: Media Data Box]
684                position = 756
685                size = 17001
686<a name="5"></a>
6875 Authors' Address
688    Yusuke Nakamura
689        Email: muken.the.vfrmaniac |at| gmail.com
690        </div>
691    </body>
692</html>
693