.. highlightlang:: c

.. index::
   single: buffer protocol
   single: buffer interface; (see buffer protocol)
   single: buffer object; (see buffer protocol)

.. _bufferobjects:

Buffer Protocol
---------------

.. sectionauthor:: Greg Stein <gstein@lyra.org>
.. sectionauthor:: Benjamin Peterson
.. sectionauthor:: Stefan Krah


Certain objects available in Python wrap access to an underlying memory
array or *buffer*.  Such objects include the built-in :class:`bytes` and
:class:`bytearray`, and some extension types like :class:`array.array`.
Third-party libraries may define their own types for special purposes, such
as image processing or numeric analysis.

While each of these types has its own semantics, they share the common
characteristic of being backed by a possibly large memory buffer.  It is
then desirable, in some situations, to access that buffer directly and
without intermediate copying.

Python provides such a facility at the C level in the form of the :ref:`buffer
protocol <bufferobjects>`.  This protocol has two sides:

.. index:: single: PyBufferProcs

- on the producer side, a type can export a "buffer interface" which allows
  objects of that type to expose information about their underlying buffer.
  This interface is described in the section :ref:`buffer-structs`;

- on the consumer side, several means are available to obtain a pointer to
  the raw underlying data of an object (for example a method parameter).

Simple objects such as :class:`bytes` and :class:`bytearray` expose their
underlying buffer in byte-oriented form.  Other forms are possible; for example,
the elements exposed by an :class:`array.array` can be multi-byte values.

An example consumer of the buffer interface is the :meth:`~io.BufferedIOBase.write`
method of file objects: any object that can export a series of bytes through
the buffer interface can be written to a file.  While :meth:`write` only
needs read-only access to the internal contents of the object passed to it,
other methods such as :meth:`~io.BufferedIOBase.readinto` need write access
to the contents of their argument.  The buffer interface allows objects to
selectively allow or reject exporting of read-write and read-only buffers.
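
The difference between the two access modes can be observed from Python.
The following is a minimal sketch using an in-memory :class:`io.BytesIO`
stream in place of a real file; the variable names are illustrative only.

.. code-block:: python

   import array
   import io

   # write() only needs read-only access, so any bytes-like exporter works.
   stream = io.BytesIO()
   stream.write(b"ab")                          # bytes (read-only exporter)
   stream.write(bytearray(b"cd"))               # bytearray (writable exporter)
   stream.write(array.array("B", [101, 102]))   # array of unsigned bytes
   written = stream.getvalue()

   # readinto() needs a writable buffer: a bytearray qualifies.
   target = bytearray(4)
   count = io.BytesIO(b"wxyz").readinto(target)

   # An immutable bytes object cannot provide a writable buffer.
   try:
       io.BytesIO(b"wxyz").readinto(bytes(4))
       rejected = False
   except (TypeError, BufferError):
       rejected = True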

There are two ways for a consumer of the buffer interface to acquire a buffer
over a target object:

* call :c:func:`PyObject_GetBuffer` with the right parameters;

* call :c:func:`PyArg_ParseTuple` (or one of its siblings) with one of the
  ``y*``, ``w*`` or ``s*`` :ref:`format codes <arg-parsing>`.

In both cases, :c:func:`PyBuffer_Release` must be called when the buffer
isn't needed anymore.  Failure to do so could lead to various issues such as
resource leaks.
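
The acquire/release pairing is also visible from Python, where a
:class:`memoryview` acts as the consumer.  The sketch below relies on the
CPython behaviour that a :class:`bytearray` refuses to change size while a
buffer over it is still exported; other exporters may react differently to an
unreleased buffer.

.. code-block:: python

   data = bytearray(b"abc")
   view = memoryview(data)        # acquires a buffer over ``data``

   try:
       data.extend(b"def")        # resizing could invalidate the exported memory
       resized_while_exported = True
   except BufferError:
       resized_while_exported = False

   view.release()                 # Python-level counterpart of PyBuffer_Release()
   data.extend(b"def")            # allowed again once no exports remain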


.. _buffer-structure:

Buffer structure
================

Buffer structures (or simply "buffers") are useful as a way to expose the
binary data from another object to the Python programmer.  They can also be
used as a zero-copy slicing mechanism.  Using their ability to reference a
block of memory, it is possible to expose any data to the Python programmer
quite easily.  The memory could be a large, constant array in a C extension,
it could be a raw block of memory for manipulation before passing to an
operating system library, or it could be used to pass around structured data
in its native, in-memory format.

Contrary to most data types exposed by the Python interpreter, buffers
are not :c:type:`PyObject` pointers but rather simple C structures.  This
allows them to be created and copied very simply.  When a generic wrapper
around a buffer is needed, a :ref:`memoryview <memoryview-objects>` object
can be created.

For short instructions on how to write an exporting object, see
:ref:`Buffer Object Structures <buffer-structs>`. For obtaining
a buffer, see :c:func:`PyObject_GetBuffer`.

.. c:type:: Py_buffer

   .. c:member:: void \*buf

      A pointer to the start of the logical structure described by the buffer
      fields. This can be any location within the underlying physical memory
      block of the exporter. For example, with negative :c:member:`~Py_buffer.strides`
      the value may point to the end of the memory block.

      For :term:`contiguous` arrays, the value points to the beginning of
      the memory block.

   .. c:member:: void \*obj

      A new reference to the exporting object. The reference is owned by
      the consumer and automatically decremented and set to *NULL* by
      :c:func:`PyBuffer_Release`. The field is the equivalent of the return
      value of any standard C-API function.

      As a special case, for *temporary* buffers that are wrapped by
      :c:func:`PyMemoryView_FromBuffer` or :c:func:`PyBuffer_FillInfo`
      this field is *NULL*. In general, exporting objects MUST NOT
      use this scheme.

   .. c:member:: Py_ssize_t len

      ``product(shape) * itemsize``. For contiguous arrays, this is the length
      of the underlying memory block. For non-contiguous arrays, it is the length
      that the logical structure would have if it were copied to a contiguous
      representation.

      Accessing ``((char *)buf)[0]`` up to ``((char *)buf)[len-1]`` is only valid
      if the buffer has been obtained by a request that guarantees contiguity. In
      most cases such a request will be :c:macro:`PyBUF_SIMPLE` or :c:macro:`PyBUF_WRITABLE`.

   .. c:member:: int readonly

      An indicator of whether the buffer is read-only. This field is controlled
      by the :c:macro:`PyBUF_WRITABLE` flag.

   .. c:member:: Py_ssize_t itemsize

      Item size in bytes of a single element. Same as the value of :func:`struct.calcsize`
      called on non-NULL :c:member:`~Py_buffer.format` values.

      Important exception: If a consumer requests a buffer without the
      :c:macro:`PyBUF_FORMAT` flag, :c:member:`~Py_buffer.format` will
      be set to *NULL*, but :c:member:`~Py_buffer.itemsize` still has
      the value for the original format.

      If :c:member:`~Py_buffer.shape` is present, the equality
      ``product(shape) * itemsize == len`` still holds and the consumer
      can use :c:member:`~Py_buffer.itemsize` to navigate the buffer.

      If :c:member:`~Py_buffer.shape` is *NULL* as a result of a :c:macro:`PyBUF_SIMPLE`
      or a :c:macro:`PyBUF_WRITABLE` request, the consumer must disregard
      :c:member:`~Py_buffer.itemsize` and assume ``itemsize == 1``.

   .. c:member:: const char \*format

      A *NUL*-terminated string in :mod:`struct` module style syntax describing
      the contents of a single item. If this is *NULL*, ``"B"`` (unsigned bytes)
      is assumed.

      This field is controlled by the :c:macro:`PyBUF_FORMAT` flag.

   .. c:member:: int ndim

      The number of dimensions the memory represents as an n-dimensional array.
      If it is ``0``, :c:member:`~Py_buffer.buf` points to a single item representing
      a scalar. In this case, :c:member:`~Py_buffer.shape`, :c:member:`~Py_buffer.strides`
      and :c:member:`~Py_buffer.suboffsets` MUST be *NULL*.

      The macro :c:macro:`PyBUF_MAX_NDIM` limits the maximum number of dimensions
      to 64. Exporters MUST respect this limit; consumers of multi-dimensional
      buffers SHOULD be able to handle up to :c:macro:`PyBUF_MAX_NDIM` dimensions.

   .. c:member:: Py_ssize_t \*shape

      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`
      indicating the shape of the memory as an n-dimensional array. Note that
      ``shape[0] * ... * shape[ndim-1] * itemsize`` MUST be equal to
      :c:member:`~Py_buffer.len`.

      Shape values are restricted to ``shape[n] >= 0``. The case
      ``shape[n] == 0`` requires special attention. See `complex arrays`_
      for further information.

      The shape array is read-only for the consumer.

   .. c:member:: Py_ssize_t \*strides

      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`
      giving the number of bytes to skip to get to a new element in each
      dimension.

      Stride values can be any integer. For regular arrays, strides are
      usually positive, but a consumer MUST be able to handle the case
      ``strides[n] <= 0``. See `complex arrays`_ for further information.

      The strides array is read-only for the consumer.

   .. c:member:: Py_ssize_t \*suboffsets

      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`.
      If ``suboffsets[n] >= 0``, the values stored along the nth dimension are
      pointers and the suboffset value dictates how many bytes to add to each
      pointer after de-referencing. A suboffset value that is negative
      indicates that no de-referencing should occur (striding in a contiguous
      memory block).

      If all suboffsets are negative (i.e. no de-referencing is needed), then
      this field must be *NULL* (the default value).

      This type of array representation is used by the Python Imaging Library
      (PIL). See `complex arrays`_ for further information on how to access
      elements of such an array.

      The suboffsets array is read-only for the consumer.

   .. c:member:: void \*internal

      This is for use internally by the exporting object. For example, this
      might be re-cast as an integer by the exporter and used to store flags
      about whether or not the shape, strides, and suboffsets arrays must be
      freed when the buffer is released. The consumer MUST NOT alter this
      value.
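
Most of these fields have read-only counterparts on Python's
:class:`memoryview`, which makes them easy to inspect.  The following sketch
collects them for an :class:`array.array` of C doubles; the ``fields``
dictionary and its keys are illustrative, and ``nbytes`` is used here as the
counterpart of :c:member:`~Py_buffer.len`.

.. code-block:: python

   import array
   import struct

   values = array.array("d", [1.0, 2.0, 3.0])

   with memoryview(values) as view:
       fields = {
           "format": view.format,          # struct-style item description
           "itemsize": view.itemsize,      # bytes per item
           "ndim": view.ndim,
           "shape": view.shape,
           "strides": view.strides,
           "suboffsets": view.suboffsets,  # empty tuple when not used
           "len": view.nbytes,             # product(shape) * itemsize
           "readonly": view.readonly,
       }

   # itemsize agrees with struct.calcsize() applied to the format string.
   expected_itemsize = struct.calcsize(fields["format"])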

.. _buffer-request-types:

Buffer request types
====================

Buffers are usually obtained by sending a buffer request to an exporting
object via :c:func:`PyObject_GetBuffer`. Since the complexity of the logical
structure of the memory can vary drastically, the consumer uses the *flags*
argument to specify the exact buffer type it can handle.

All :c:type:`Py_buffer` fields are unambiguously defined by the request
type.

request-independent fields
~~~~~~~~~~~~~~~~~~~~~~~~~~

The following fields are not influenced by *flags* and must always be filled in
with the correct values: :c:member:`~Py_buffer.obj`, :c:member:`~Py_buffer.buf`,
:c:member:`~Py_buffer.len`, :c:member:`~Py_buffer.itemsize`, :c:member:`~Py_buffer.ndim`.


readonly, format
~~~~~~~~~~~~~~~~

   .. c:macro:: PyBUF_WRITABLE

      Controls the :c:member:`~Py_buffer.readonly` field. If set, the exporter
      MUST provide a writable buffer or else report failure. Otherwise, the
      exporter MAY provide either a read-only or writable buffer, but the choice
      MUST be consistent for all consumers.

   .. c:macro:: PyBUF_FORMAT

      Controls the :c:member:`~Py_buffer.format` field. If set, this field MUST
      be filled in correctly. Otherwise, this field MUST be *NULL*.


:c:macro:`PyBUF_WRITABLE` can be combined, using bitwise OR (``|``), with any
of the flags in the next section.
Since :c:macro:`PyBUF_SIMPLE` is defined as 0, :c:macro:`PyBUF_WRITABLE`
can be used as a stand-alone flag to request a simple writable buffer.

:c:macro:`PyBUF_FORMAT` can be combined, using bitwise OR (``|``), with any of
the flags except :c:macro:`PyBUF_SIMPLE`.
The latter already implies format ``B`` (unsigned bytes).


shape, strides, suboffsets
~~~~~~~~~~~~~~~~~~~~~~~~~~

The flags that control the logical structure of the memory are listed
in decreasing order of complexity. Note that each flag contains all bits
of the flags below it.

.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|

+-----------------------------+-------+---------+------------+
|  Request                    | shape | strides | suboffsets |
+=============================+=======+=========+============+
| .. c:macro:: PyBUF_INDIRECT |  yes  |   yes   | if needed  |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_STRIDES  |  yes  |   yes   |    NULL    |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_ND       |  yes  |   NULL  |    NULL    |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_SIMPLE   |  NULL |   NULL  |    NULL    |
+-----------------------------+-------+---------+------------+


.. index:: contiguous, C-contiguous, Fortran contiguous

contiguity requests
~~~~~~~~~~~~~~~~~~~

C or Fortran :term:`contiguity <contiguous>` can be explicitly requested,
with and without stride information. Without stride information, the buffer
must be C-contiguous.

.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|

+-----------------------------------+-------+---------+------------+--------+
|  Request                          | shape | strides | suboffsets | contig |
+===================================+=======+=========+============+========+
| .. c:macro:: PyBUF_C_CONTIGUOUS   |  yes  |   yes   |    NULL    |   C    |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_F_CONTIGUOUS   |  yes  |   yes   |    NULL    |   F    |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_ANY_CONTIGUOUS |  yes  |   yes   |    NULL    | C or F |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_ND             |  yes  |   NULL  |    NULL    |   C    |
+-----------------------------------+-------+---------+------------+--------+
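
Whether a given buffer would satisfy such a request can be explored from
Python through the ``c_contiguous`` and ``f_contiguous`` attributes of
:class:`memoryview`.  This is only an illustrative sketch; the three views and
the ``contiguity`` dictionary are made up for the example.

.. code-block:: python

   flat = memoryview(bytes(range(6)))       # 1-D, six unsigned bytes
   grid = flat.cast("B", shape=[2, 3])      # 2 rows x 3 columns, C order
   every_other = flat[::2]                  # 1-D with a stride of 2 bytes

   # (C-contiguous, Fortran-contiguous) for each view
   contiguity = {
       "flat": (flat.c_contiguous, flat.f_contiguous),
       "grid": (grid.c_contiguous, grid.f_contiguous),
       "every_other": (every_other.c_contiguous, every_other.f_contiguous),
   }

A one-dimensional contiguous buffer counts as both C and Fortran contiguous,
the two-dimensional view is C-contiguous only, and the strided slice is
neither.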


compound requests
~~~~~~~~~~~~~~~~~

All possible requests are fully defined by some combination of the flags in
the previous section. For convenience, the buffer protocol provides frequently
used combinations as single flags.

In the following table *U* stands for undefined contiguity. The consumer would
have to call :c:func:`PyBuffer_IsContiguous` to determine contiguity.

.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|l|l|

+-------------------------------+-------+---------+------------+--------+----------+--------+
|  Request                      | shape | strides | suboffsets | contig | readonly | format |
+===============================+=======+=========+============+========+==========+========+
| .. c:macro:: PyBUF_FULL       |  yes  |   yes   | if needed  |   U    |     0    |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_FULL_RO    |  yes  |   yes   | if needed  |   U    |  1 or 0  |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_RECORDS    |  yes  |   yes   |    NULL    |   U    |     0    |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_RECORDS_RO |  yes  |   yes   |    NULL    |   U    |  1 or 0  |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_STRIDED    |  yes  |   yes   |    NULL    |   U    |     0    |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_STRIDED_RO |  yes  |   yes   |    NULL    |   U    |  1 or 0  |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_CONTIG     |  yes  |   NULL  |    NULL    |   C    |     0    |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_CONTIG_RO  |  yes  |   NULL  |    NULL    |   C    |  1 or 0  |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
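
How the compound requests decompose into the basic flags can be sketched with
plain integers.  The numeric values below are an assumption: they are intended
to mirror the definitions in CPython's C headers and should be checked against
the headers of the Python version actually in use.  The ``includes`` helper is
hypothetical.

.. code-block:: python

   # Assumed values, mirroring CPython's C header definitions.
   PyBUF_SIMPLE = 0
   PyBUF_WRITABLE = 0x0001
   PyBUF_FORMAT = 0x0004
   PyBUF_ND = 0x0008
   PyBUF_STRIDES = 0x0010 | PyBUF_ND
   PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES

   # Compound requests as combinations of the flags above.
   PyBUF_CONTIG_RO = PyBUF_ND
   PyBUF_CONTIG = PyBUF_ND | PyBUF_WRITABLE
   PyBUF_STRIDED_RO = PyBUF_STRIDES
   PyBUF_STRIDED = PyBUF_STRIDES | PyBUF_WRITABLE
   PyBUF_RECORDS_RO = PyBUF_STRIDES | PyBUF_FORMAT
   PyBUF_RECORDS = PyBUF_STRIDES | PyBUF_WRITABLE | PyBUF_FORMAT
   PyBUF_FULL_RO = PyBUF_INDIRECT | PyBUF_FORMAT
   PyBUF_FULL = PyBUF_INDIRECT | PyBUF_WRITABLE | PyBUF_FORMAT


   def includes(request, flag):
       """Return True if every bit of *flag* is set in *request*."""
       return (request & flag) == flag


   # Each structural flag contains all bits of the simpler ones.
   nested = includes(PyBUF_INDIRECT, PyBUF_STRIDES) and includes(PyBUF_STRIDES, PyBUF_ND)

   # The *_RO variants differ from their counterparts only in PyBUF_WRITABLE.
   ro_differs_by_writable = (PyBUF_FULL ^ PyBUF_FULL_RO) == PyBUF_WRITABLE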


Complex arrays
==============

NumPy-style: shape and strides
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The logical structure of NumPy-style arrays is defined by :c:member:`~Py_buffer.itemsize`,
:c:member:`~Py_buffer.ndim`, :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides`.

If ``ndim == 0``, the memory location pointed to by :c:member:`~Py_buffer.buf` is
interpreted as a scalar of size :c:member:`~Py_buffer.itemsize`. In that case,
both :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides` are *NULL*.

If :c:member:`~Py_buffer.strides` is *NULL*, the array is interpreted as
a standard n-dimensional C-array. Otherwise, the consumer must access an
n-dimensional array as follows:

   ``ptr = (char *)buf + indices[0] * strides[0] + ... + indices[n-1] * strides[n-1]``
   ``item = *((typeof(item) *)ptr);``
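
The same address computation can be modelled in Python with byte offsets into
a :class:`bytes` object standing in for the memory block.  This is a minimal
sketch: ``item_offset`` is a hypothetical helper, and the little-endian
``"<i"`` format is chosen only so that the item size is fixed at four bytes.

.. code-block:: python

   import struct


   def item_offset(indices, strides, start=0):
       """Byte offset of an item: start + sum(indices[k] * strides[k])."""
       return start + sum(i * s for i, s in zip(indices, strides))


   # A 2 x 3 array of 32-bit integers stored in C order.
   itemsize = struct.calcsize("<i")
   memory = struct.pack("<6i", 0, 1, 2, 3, 4, 5)
   strides = (3 * itemsize, itemsize)

   offset = item_offset((1, 2), strides)
   (item,) = struct.unpack_from("<i", memory, offset)

   # Reversed 1-D view of the same memory: "buf" points at the last item
   # and the single stride is negative.
   last = 5 * itemsize
   reversed_items = [
       struct.unpack_from("<i", memory, item_offset((k,), (-itemsize,), start=last))[0]
       for k in range(6)
   ]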


As noted above, :c:member:`~Py_buffer.buf` can point to any location within
the actual memory block. An exporter can check the validity of a buffer with
this function:

.. code-block:: python

   def verify_structure(memlen, itemsize, ndim, shape, strides, offset):
       """Verify that the parameters represent a valid array within
          the bounds of the allocated memory:
              char *mem: start of the physical memory block
              memlen: length of the physical memory block
              offset: (char *)buf - mem
       """
       if offset % itemsize:
           return False
       if offset < 0 or offset+itemsize > memlen:
           return False
       if any(v % itemsize for v in strides):
           return False

       if ndim <= 0:
           return ndim == 0 and not shape and not strides
       if 0 in shape:
           return True

       imin = sum(strides[j]*(shape[j]-1) for j in range(ndim)
                  if strides[j] <= 0)
       imax = sum(strides[j]*(shape[j]-1) for j in range(ndim)
                  if strides[j] > 0)

       return 0 <= offset+imin and offset+imax+itemsize <= memlen


PIL-style: shape, strides and suboffsets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In addition to the regular items, PIL-style arrays can contain pointers
that must be followed in order to get to the next element in a dimension.
For example, the regular three-dimensional C-array ``char v[2][2][3]`` can
also be viewed as an array of 2 pointers to 2 two-dimensional arrays:
``char (*v[2])[2][3]``. In suboffsets representation, those two pointers
can be embedded at the start of :c:member:`~Py_buffer.buf`, pointing
to two ``char x[2][3]`` arrays that can be located anywhere in memory.


Here is a function that returns a pointer to the element in an N-D array
pointed to by an N-dimensional index when there are both non-NULL strides
and suboffsets::

   void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                          Py_ssize_t *suboffsets, Py_ssize_t *indices) {
       char *pointer = (char*)buf;
       int i;
       for (i = 0; i < ndim; i++) {
           pointer += strides[i] * indices[i];
           if (suboffsets[i] >= 0) {
               pointer = *((char**)pointer) + suboffsets[i];
           }
       }
       return (void*)pointer;
   }
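
The pointer-following step can be tried out from Python with :mod:`ctypes`.
The sketch below is an illustration only, not how any particular exporter
lays out its data: it builds a two-row structure in which ``buf`` is an array
of two pointers to separately allocated rows, and ``get_item_address`` is a
hypothetical Python transcription of the C function above that works on
integer addresses.

.. code-block:: python

   import ctypes

   # Two separately allocated rows; the list keeps them alive.
   rows = [ctypes.create_string_buffer(b"abc"),
           ctypes.create_string_buffer(b"xyz")]

   # "buf" is an array of two pointers, one per row.
   PointerArray = ctypes.c_void_p * len(rows)
   buf = PointerArray(*(ctypes.addressof(row) for row in rows))


   def get_item_address(buf_address, strides, suboffsets, indices):
       pointer = buf_address
       for stride, suboffset, index in zip(strides, suboffsets, indices):
           pointer += stride * index
           if suboffset >= 0:
               # Follow the pointer stored here, then apply the suboffset.
               pointer = ctypes.c_void_p.from_address(pointer).value + suboffset
       return pointer


   # First dimension: step over pointers and dereference (suboffset 0).
   # Second dimension: plain striding inside a row (negative suboffset).
   strides = (ctypes.sizeof(ctypes.c_void_p), 1)
   suboffsets = (0, -1)

   address = get_item_address(ctypes.addressof(buf), strides, suboffsets, (1, 2))
   item = ctypes.c_char.from_address(address).value   # element [1][2]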


Buffer-related functions
========================

.. c:function:: int PyObject_CheckBuffer(PyObject *obj)

   Return ``1`` if *obj* supports the buffer interface, otherwise ``0``.  When ``1`` is
   returned, it doesn't guarantee that :c:func:`PyObject_GetBuffer` will
   succeed.  This function always succeeds.


.. c:function:: int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags)

   Send a request to *exporter* to fill in *view* as specified by *flags*.
   If the exporter cannot provide a buffer of the exact type, it MUST raise
   :c:data:`PyExc_BufferError`, set :c:member:`view->obj` to *NULL* and
   return ``-1``.

   On success, fill in *view*, set :c:member:`view->obj` to a new reference
   to *exporter* and return ``0``. In the case of chained buffer providers
   that redirect requests to a single object, :c:member:`view->obj` MAY
   refer to this object instead of *exporter* (see :ref:`Buffer Object Structures <buffer-structs>`).

   Successful calls to :c:func:`PyObject_GetBuffer` must be paired with calls
   to :c:func:`PyBuffer_Release`, similar to :c:func:`malloc` and :c:func:`free`.
   Thus, after the consumer is done with the buffer, :c:func:`PyBuffer_Release`
   must be called exactly once.


.. c:function:: void PyBuffer_Release(Py_buffer *view)

   Release the buffer *view* and decrement the reference count for
   :c:member:`view->obj`. This function MUST be called when the buffer
   is no longer being used, otherwise reference leaks may occur.

   It is an error to call this function on a buffer that was not obtained via
   :c:func:`PyObject_GetBuffer`.


.. c:function:: Py_ssize_t PyBuffer_SizeFromFormat(const char *)

   Return the implied :c:member:`~Py_buffer.itemsize` from :c:member:`~Py_buffer.format`.
   This function is not yet implemented.


.. c:function:: int PyBuffer_IsContiguous(Py_buffer *view, char order)

   Return ``1`` if the memory defined by the *view* is C-style (*order* is
   ``'C'``) or Fortran-style (*order* is ``'F'``) :term:`contiguous` or either one
   (*order* is ``'A'``).  Return ``0`` otherwise.  This function always succeeds.


.. c:function:: int PyBuffer_ToContiguous(void *buf, Py_buffer *src, Py_ssize_t len, char order)

   Copy *len* bytes from *src* to its contiguous representation in *buf*.
   *order* can be ``'C'`` or ``'F'`` (for C-style or Fortran-style ordering).
   ``0`` is returned on success, ``-1`` on error.

   This function fails if *len* != *src->len*.
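
   A comparable copy can be observed from Python, assuming a version in which
   :meth:`memoryview.tobytes` accepts an *order* argument (3.8 or later).
   The sketch below copies a 2 x 3 byte array in both orderings; the variable
   names are illustrative.

   .. code-block:: python

      # Logical rows: "abc" and "def".
      grid = memoryview(b"abcdef").cast("B", shape=[2, 3])

      c_order = grid.tobytes(order="C")   # row by row
      f_order = grid.tobytes(order="F")   # column by column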


.. c:function:: void PyBuffer_FillContiguousStrides(int ndims, Py_ssize_t *shape, Py_ssize_t *strides, int itemsize, char order)

   Fill the *strides* array with byte-strides of a :term:`contiguous` (C-style if
   *order* is ``'C'`` or Fortran-style if *order* is ``'F'``) array of the
   given shape with the given number of bytes per element.
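
   The stride computation itself is short.  The following Python model, with
   the hypothetical name ``contiguous_strides``, sketches the arithmetic and
   is not the C implementation.

   .. code-block:: python

      def contiguous_strides(shape, itemsize, order="C"):
          """Byte strides of a contiguous array with the given shape."""
          strides = [0] * len(shape)
          step = itemsize
          # C order: the last index varies fastest.
          # Fortran order: the first index varies fastest.
          if order == "C":
              dims = range(len(shape) - 1, -1, -1)
          else:
              dims = range(len(shape))
          for k in dims:
              strides[k] = step
              step *= shape[k]
          return tuple(strides)


      c_strides = contiguous_strides((2, 3, 4), 8, "C")
      f_strides = contiguous_strides((2, 3, 4), 8, "F")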


.. c:function:: int PyBuffer_FillInfo(Py_buffer *view, PyObject *exporter, void *buf, Py_ssize_t len, int readonly, int flags)

   Handle buffer requests for an exporter that wants to expose *buf* of size *len*
   with writability set according to *readonly*. *buf* is interpreted as a sequence
   of unsigned bytes.

   The *flags* argument indicates the request type. This function always fills in
   *view* as specified by flags, unless *buf* has been designated as read-only
   and :c:macro:`PyBUF_WRITABLE` is set in *flags*.

   On success, set :c:member:`view->obj` to a new reference to *exporter* and
   return ``0``. Otherwise, raise :c:data:`PyExc_BufferError`, set
   :c:member:`view->obj` to *NULL* and return ``-1``.

   If this function is used as part of a :ref:`getbufferproc <buffer-structs>`,
   *exporter* MUST be set to the exporting object and *flags* must be passed
   unmodified. Otherwise, *exporter* MUST be *NULL*.