.. highlightlang:: c

.. index::
   single: buffer protocol
   single: buffer interface; (see buffer protocol)
   single: buffer object; (see buffer protocol)

.. _bufferobjects:

Buffer Protocol
---------------

.. sectionauthor:: Greg Stein <gstein@lyra.org>
.. sectionauthor:: Benjamin Peterson
.. sectionauthor:: Stefan Krah


Certain objects available in Python wrap access to an underlying memory
array or *buffer*. Such objects include the built-in :class:`bytes` and
:class:`bytearray`, and some extension types like :class:`array.array`.
Third-party libraries may define their own types for special purposes, such
as image processing or numeric analysis.

While each of these types has its own semantics, they share the common
characteristic of being backed by a possibly large memory buffer. It is
then desirable, in some situations, to access that buffer directly and
without intermediate copying.

Python provides such a facility at the C level in the form of the :ref:`buffer
protocol <bufferobjects>`. This protocol has two sides:

.. index:: single: PyBufferProcs

- on the producer side, a type can export a "buffer interface" which allows
  objects of that type to expose information about their underlying buffer.
  This interface is described in the section :ref:`buffer-structs`;

- on the consumer side, several means are available to obtain a pointer to
  the raw underlying data of an object (for example a method parameter).

Simple objects such as :class:`bytes` and :class:`bytearray` expose their
underlying buffer in byte-oriented form. Other forms are possible; for example,
the elements exposed by an :class:`array.array` can be multi-byte values.

An example consumer of the buffer interface is the :meth:`~io.BufferedIOBase.write`
method of file objects: any object that can export a series of bytes through
the buffer interface can be written to a file.
While :meth:`write` only
needs read-only access to the internal contents of the object passed to it,
other methods such as :meth:`~io.BufferedIOBase.readinto` need write access
to the contents of their argument. The buffer interface allows objects to
selectively allow or reject exporting of read-write and read-only buffers.

There are two ways for a consumer of the buffer interface to acquire a buffer
over a target object:

* call :c:func:`PyObject_GetBuffer` with the right parameters;

* call :c:func:`PyArg_ParseTuple` (or one of its siblings) with one of the
  ``y*``, ``w*`` or ``s*`` :ref:`format codes <arg-parsing>`.

In both cases, :c:func:`PyBuffer_Release` must be called when the buffer
is no longer needed. Failure to do so could lead to various issues such as
resource leaks.


.. _buffer-structure:

Buffer structure
================

Buffer structures (or simply "buffers") are useful as a way to expose the
binary data from another object to the Python programmer. They can also be
used as a zero-copy slicing mechanism. Because a buffer can reference any
block of memory, it is possible to expose any data to the Python programmer
quite easily. The memory could be a large, constant array in a C extension,
it could be a raw block of memory for manipulation before passing to an
operating system library, or it could be used to pass around structured data
in its native, in-memory format.

Contrary to most data types exposed by the Python interpreter, buffers
are not :c:type:`PyObject` pointers but rather simple C structures. This
allows them to be created and copied very simply. When a generic wrapper
around a buffer is needed, a :ref:`memoryview <memoryview-objects>` object
can be created.

For brief instructions on how to write an exporting object, see
:ref:`Buffer Object Structures <buffer-structs>`. For obtaining
a buffer, see :c:func:`PyObject_GetBuffer`.

.. c:type:: Py_buffer

   .. c:member:: void \*buf

      A pointer to the start of the logical structure described by the buffer
      fields. This can be any location within the underlying physical memory
      block of the exporter. For example, with negative :c:member:`~Py_buffer.strides`
      the value may point to the end of the memory block.

      For :term:`contiguous` arrays, the value points to the beginning of
      the memory block.

   .. c:member:: void \*obj

      A new reference to the exporting object. The reference is owned by
      the consumer and automatically decremented and set to *NULL* by
      :c:func:`PyBuffer_Release`. The field is the equivalent of the return
      value of any standard C-API function.

      As a special case, for *temporary* buffers that are wrapped by
      :c:func:`PyMemoryView_FromBuffer` or :c:func:`PyBuffer_FillInfo`
      this field is *NULL*. In general, exporting objects MUST NOT
      use this scheme.

   .. c:member:: Py_ssize_t len

      ``product(shape) * itemsize``. For contiguous arrays, this is the length
      of the underlying memory block. For non-contiguous arrays, it is the length
      that the logical structure would have if it were copied to a contiguous
      representation.

      Accessing ``((char *)buf)[0]`` up to ``((char *)buf)[len-1]`` is only valid
      if the buffer has been obtained by a request that guarantees contiguity. In
      most cases such a request will be :c:macro:`PyBUF_SIMPLE` or :c:macro:`PyBUF_WRITABLE`.

   .. c:member:: int readonly

      An indicator of whether the buffer is read-only. This field is controlled
      by the :c:macro:`PyBUF_WRITABLE` flag.

   .. c:member:: Py_ssize_t itemsize

      Item size in bytes of a single element. Same as the value of :func:`struct.calcsize`
      called on non-NULL :c:member:`~Py_buffer.format` values.

      Important exception: If a consumer requests a buffer without the
      :c:macro:`PyBUF_FORMAT` flag, :c:member:`~Py_buffer.format` will
      be set to *NULL*, but :c:member:`~Py_buffer.itemsize` still has
      the value for the original format.

      If :c:member:`~Py_buffer.shape` is present, the equality
      ``product(shape) * itemsize == len`` still holds and the consumer
      can use :c:member:`~Py_buffer.itemsize` to navigate the buffer.

      If :c:member:`~Py_buffer.shape` is *NULL* as a result of a :c:macro:`PyBUF_SIMPLE`
      or a :c:macro:`PyBUF_WRITABLE` request, the consumer must disregard
      :c:member:`~Py_buffer.itemsize` and assume ``itemsize == 1``.

   .. c:member:: const char \*format

      A *NUL* terminated string in :mod:`struct` module style syntax describing
      the contents of a single item. If this is *NULL*, ``"B"`` (unsigned bytes)
      is assumed.

      This field is controlled by the :c:macro:`PyBUF_FORMAT` flag.

   .. c:member:: int ndim

      The number of dimensions the memory represents as an n-dimensional array.
      If it is ``0``, :c:member:`~Py_buffer.buf` points to a single item representing
      a scalar. In this case, :c:member:`~Py_buffer.shape`, :c:member:`~Py_buffer.strides`
      and :c:member:`~Py_buffer.suboffsets` MUST be *NULL*.

      The macro :c:macro:`PyBUF_MAX_NDIM` limits the maximum number of dimensions
      to 64. Exporters MUST respect this limit; consumers of multi-dimensional
      buffers SHOULD be able to handle up to :c:macro:`PyBUF_MAX_NDIM` dimensions.

   .. c:member:: Py_ssize_t \*shape

      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`
      indicating the shape of the memory as an n-dimensional array. Note that
      ``shape[0] * ... * shape[ndim-1] * itemsize`` MUST be equal to
      :c:member:`~Py_buffer.len`.

      Shape values are restricted to ``shape[n] >= 0``. The case
      ``shape[n] == 0`` requires special attention.
      See `complex arrays`_
      for further information.

      The shape array is read-only for the consumer.

   .. c:member:: Py_ssize_t \*strides

      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`
      giving the number of bytes to skip to get to a new element in each
      dimension.

      Stride values can be any integer. For regular arrays, strides are
      usually positive, but a consumer MUST be able to handle the case
      ``strides[n] <= 0``. See `complex arrays`_ for further information.

      The strides array is read-only for the consumer.

   .. c:member:: Py_ssize_t \*suboffsets

      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`.
      If ``suboffsets[n] >= 0``, the values stored along the nth dimension are
      pointers and the suboffset value dictates how many bytes to add to each
      pointer after de-referencing. A suboffset value that is negative
      indicates that no de-referencing should occur (striding in a contiguous
      memory block).

      If all suboffsets are negative (i.e. no de-referencing is needed), then
      this field must be *NULL* (the default value).

      This type of array representation is used by the Python Imaging Library
      (PIL). See `complex arrays`_ for further information on how to access
      elements of such an array.

      The suboffsets array is read-only for the consumer.

   .. c:member:: void \*internal

      This is for use internally by the exporting object. For example, this
      might be re-cast as an integer by the exporter and used to store flags
      about whether or not the shape, strides, and suboffsets arrays must be
      freed when the buffer is released. The consumer MUST NOT alter this
      value.

.. _buffer-request-types:

Buffer request types
====================

Buffers are usually obtained by sending a buffer request to an exporting
object via :c:func:`PyObject_GetBuffer`.
Since the complexity of the logical
structure of the memory can vary drastically, the consumer uses the *flags*
argument to specify the exact buffer type it can handle.

All :c:type:`Py_buffer` fields are unambiguously defined by the request
type.

request-independent fields
~~~~~~~~~~~~~~~~~~~~~~~~~~
The following fields are not influenced by *flags* and must always be filled in
with the correct values: :c:member:`~Py_buffer.obj`, :c:member:`~Py_buffer.buf`,
:c:member:`~Py_buffer.len`, :c:member:`~Py_buffer.itemsize`, :c:member:`~Py_buffer.ndim`.


readonly, format
~~~~~~~~~~~~~~~~

   .. c:macro:: PyBUF_WRITABLE

      Controls the :c:member:`~Py_buffer.readonly` field. If set, the exporter
      MUST provide a writable buffer or else report failure. Otherwise, the
      exporter MAY provide either a read-only or writable buffer, but the choice
      MUST be consistent for all consumers.

   .. c:macro:: PyBUF_FORMAT

      Controls the :c:member:`~Py_buffer.format` field. If set, this field MUST
      be filled in correctly. Otherwise, this field MUST be *NULL*.


:c:macro:`PyBUF_WRITABLE` can be combined (using ``|``) with any of the flags
in the next section. Since :c:macro:`PyBUF_SIMPLE` is defined as 0,
:c:macro:`PyBUF_WRITABLE` can be used as a stand-alone flag to request a
simple writable buffer.

:c:macro:`PyBUF_FORMAT` can be combined (using ``|``) with any of the flags
except :c:macro:`PyBUF_SIMPLE`. The latter already implies format ``B``
(unsigned bytes).


shape, strides, suboffsets
~~~~~~~~~~~~~~~~~~~~~~~~~~

The flags that control the logical structure of the memory are listed
in decreasing order of complexity. Note that each flag contains all bits
of the flags below it.

.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|

+-----------------------------+-------+---------+------------+
| Request                     | shape | strides | suboffsets |
+=============================+=======+=========+============+
| .. c:macro:: PyBUF_INDIRECT | yes   | yes     | if needed  |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_STRIDES  | yes   | yes     | NULL       |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_ND       | yes   | NULL    | NULL       |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_SIMPLE   | NULL  | NULL    | NULL       |
+-----------------------------+-------+---------+------------+


.. index:: contiguous, C-contiguous, Fortran contiguous

contiguity requests
~~~~~~~~~~~~~~~~~~~

C or Fortran :term:`contiguity <contiguous>` can be explicitly requested,
with and without stride information. Without stride information, the buffer
must be C-contiguous.

.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|

+-----------------------------------+-------+---------+------------+--------+
| Request                           | shape | strides | suboffsets | contig |
+===================================+=======+=========+============+========+
| .. c:macro:: PyBUF_C_CONTIGUOUS   | yes   | yes     | NULL       | C      |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_F_CONTIGUOUS   | yes   | yes     | NULL       | F      |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_ANY_CONTIGUOUS | yes   | yes     | NULL       | C or F |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_ND             | yes   | NULL    | NULL       | C      |
+-----------------------------------+-------+---------+------------+--------+


compound requests
~~~~~~~~~~~~~~~~~

All possible requests are fully defined by some combination of the flags in
the previous section. For convenience, the buffer protocol provides frequently
used combinations as single flags.

In the following table *U* stands for undefined contiguity. The consumer would
have to call :c:func:`PyBuffer_IsContiguous` to determine contiguity.

.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|l|l|

+-------------------------------+-------+---------+------------+--------+----------+--------+
| Request                       | shape | strides | suboffsets | contig | readonly | format |
+===============================+=======+=========+============+========+==========+========+
| .. c:macro:: PyBUF_FULL       | yes   | yes     | if needed  | U      | 0        | yes    |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_FULL_RO    | yes   | yes     | if needed  | U      | 1 or 0   | yes    |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_RECORDS    | yes   | yes     | NULL       | U      | 0        | yes    |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_RECORDS_RO | yes   | yes     | NULL       | U      | 1 or 0   | yes    |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_STRIDED    | yes   | yes     | NULL       | U      | 0        | NULL   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_STRIDED_RO | yes   | yes     | NULL       | U      | 1 or 0   | NULL   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_CONTIG     | yes   | NULL    | NULL       | C      | 0        | NULL   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_CONTIG_RO  | yes   | NULL    | NULL       | C      | 1 or 0   | NULL   |
+-------------------------------+-------+---------+------------+--------+----------+--------+


Complex arrays
==============

NumPy-style: shape and strides
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The logical structure of NumPy-style arrays is defined by :c:member:`~Py_buffer.itemsize`,
:c:member:`~Py_buffer.ndim`, :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides`.

If ``ndim == 0``, the memory location pointed to by :c:member:`~Py_buffer.buf` is
interpreted as a scalar of size :c:member:`~Py_buffer.itemsize`. In that case,
both :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides` are *NULL*.

If :c:member:`~Py_buffer.strides` is *NULL*, the array is interpreted as
a standard n-dimensional C-array. Otherwise, the consumer must access an
n-dimensional array as follows:

   ``ptr = (char *)buf + indices[0] * strides[0] + ... + indices[n-1] * strides[n-1]``
   ``item = *((typeof(item) *)ptr);``


As noted above, :c:member:`~Py_buffer.buf` can point to any location within
the actual memory block. An exporter can check the validity of a buffer with
this function:

.. code-block:: python

   def verify_structure(memlen, itemsize, ndim, shape, strides, offset):
       """Verify that the parameters represent a valid array within
          the bounds of the allocated memory:
              char *mem: start of the physical memory block
              memlen: length of the physical memory block
              offset: (char *)buf - mem
       """
       if offset % itemsize:
           return False
       if offset < 0 or offset+itemsize > memlen:
           return False
       if any(v % itemsize for v in strides):
           return False

       if ndim <= 0:
           return ndim == 0 and not shape and not strides
       if 0 in shape:
           return True

       imin = sum(strides[j]*(shape[j]-1) for j in range(ndim)
                  if strides[j] <= 0)
       imax = sum(strides[j]*(shape[j]-1) for j in range(ndim)
                  if strides[j] > 0)

       return 0 <= offset+imin and offset+imax+itemsize <= memlen


PIL-style: shape, strides and suboffsets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In addition to the regular items, PIL-style arrays can contain pointers
that must be followed in order to get to the next element in a dimension.
For example, the regular three-dimensional C-array ``char v[2][2][3]`` can
also be viewed as an array of 2 pointers to 2 two-dimensional arrays:
``char (*v[2])[2][3]``. In suboffsets representation, those two pointers
can be embedded at the start of :c:member:`~Py_buffer.buf`, pointing
to two ``char x[2][3]`` arrays that can be located anywhere in memory.
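
As a rough illustration of this layout, the sketch below builds the
``char (*v[2])[2][3]`` arrangement by hand in plain C. It is only an example
under stated assumptions, not code taken from CPython or PIL: ``ptrdiff_t``
stands in for :c:type:`Py_ssize_t`, and the names ``pil_buf``, ``pil_shape``,
``pil_strides``, ``pil_suboffsets`` and ``pil_item`` are made up for this
example.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Two char[2][3] planes that need not be adjacent in memory. */
   static char plane0[2][3] = {{'a', 'b', 'c'}, {'d', 'e', 'f'}};
   static char plane1[2][3] = {{'g', 'h', 'i'}, {'j', 'k', 'l'}};

   /* The exported block holds only the two embedded pointers. */
   static char *pil_buf[2] = {&plane0[0][0], &plane1[0][0]};

   /* Assumed description of the array; ptrdiff_t stands in for Py_ssize_t. */
   static const ptrdiff_t pil_shape[3] = {2, 2, 3};
   static const ptrdiff_t pil_strides[3] = {(ptrdiff_t)sizeof(char *), 3, 1};
   static const ptrdiff_t pil_suboffsets[3] = {0, -1, -1};

   /* Hypothetical helper: read the element at logical index v[i][j][k]. */
   char pil_item(ptrdiff_t i, ptrdiff_t j, ptrdiff_t k)
   {
       char *p = (char *)pil_buf;

       assert(i >= 0 && i < pil_shape[0]);
       assert(j >= 0 && j < pil_shape[1]);
       assert(k >= 0 && k < pil_shape[2]);

       /* Dimension 0: step to the i-th embedded pointer, then follow it,
          because suboffsets[0] >= 0. */
       p += pil_strides[0] * i;
       p = *(char **)p + pil_suboffsets[0];

       /* Dimensions 1 and 2: negative suboffsets, so plain striding. */
       p += pil_strides[1] * j;
       p += pil_strides[2] * k;
       return *p;
   }

With this data, ``pil_item(1, 0, 2)`` follows the second embedded pointer and
reads ``'i'`` from the second plane.
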


Here is a function that returns a pointer to the element in an N-D array
pointed to by an N-dimensional index when there are both non-NULL strides
and suboffsets::

   void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                          Py_ssize_t *suboffsets, Py_ssize_t *indices) {
       char *pointer = (char*)buf;
       int i;
       for (i = 0; i < ndim; i++) {
           pointer += strides[i] * indices[i];
           if (suboffsets[i] >= 0) {
               pointer = *((char**)pointer) + suboffsets[i];
           }
       }
       return (void*)pointer;
   }


Buffer-related functions
========================

.. c:function:: int PyObject_CheckBuffer(PyObject *obj)

   Return ``1`` if *obj* supports the buffer interface, otherwise ``0``. A
   return value of ``1`` does not guarantee that :c:func:`PyObject_GetBuffer`
   will succeed. This function always succeeds.


.. c:function:: int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags)

   Send a request to *exporter* to fill in *view* as specified by *flags*.
   If the exporter cannot provide a buffer of the exact type, it MUST raise
   :c:data:`PyExc_BufferError`, set :c:member:`view->obj` to *NULL* and
   return ``-1``.

   On success, fill in *view*, set :c:member:`view->obj` to a new reference
   to *exporter* and return 0. In the case of chained buffer providers
   that redirect requests to a single object, :c:member:`view->obj` MAY
   refer to this object instead of *exporter* (see :ref:`Buffer Object Structures <buffer-structs>`).

   Successful calls to :c:func:`PyObject_GetBuffer` must be paired with calls
   to :c:func:`PyBuffer_Release`, similar to :c:func:`malloc` and :c:func:`free`.
   Thus, after the consumer is done with the buffer, :c:func:`PyBuffer_Release`
   must be called exactly once.


.. c:function:: void PyBuffer_Release(Py_buffer *view)

   Release the buffer *view* and decrement the reference count for
   :c:member:`view->obj`. This function MUST be called when the buffer
   is no longer being used, otherwise reference leaks may occur.

   It is an error to call this function on a buffer that was not obtained via
   :c:func:`PyObject_GetBuffer`.


.. c:function:: Py_ssize_t PyBuffer_SizeFromFormat(const char *)

   Return the implied :c:data:`~Py_buffer.itemsize` from :c:data:`~Py_buffer.format`.
   This function is not yet implemented.


.. c:function:: int PyBuffer_IsContiguous(Py_buffer *view, char order)

   Return ``1`` if the memory defined by the *view* is C-style (*order* is
   ``'C'``) or Fortran-style (*order* is ``'F'``) :term:`contiguous` or either one
   (*order* is ``'A'``). Return ``0`` otherwise. This function always succeeds.


.. c:function:: int PyBuffer_ToContiguous(void *buf, Py_buffer *src, Py_ssize_t len, char order)

   Copy *len* bytes from *src* to its contiguous representation in *buf*.
   *order* can be ``'C'`` or ``'F'`` (for C-style or Fortran-style ordering).
   ``0`` is returned on success, ``-1`` on error.

   This function fails if *len* != *src->len*.


.. c:function:: void PyBuffer_FillContiguousStrides(int ndims, Py_ssize_t *shape, Py_ssize_t *strides, int itemsize, char order)

   Fill the *strides* array with byte-strides of a :term:`contiguous` (C-style if
   *order* is ``'C'`` or Fortran-style if *order* is ``'F'``) array of the
   given shape with the given number of bytes per element.


.. c:function:: int PyBuffer_FillInfo(Py_buffer *view, PyObject *exporter, void *buf, Py_ssize_t len, int readonly, int flags)

   Handle buffer requests for an exporter that wants to expose *buf* of size *len*
   with writability set according to *readonly*. *buf* is interpreted as a sequence
   of unsigned bytes.

   The *flags* argument indicates the request type. This function always fills in
   *view* as specified by *flags*, unless *buf* has been designated as read-only
   and :c:macro:`PyBUF_WRITABLE` is set in *flags*.

   On success, set :c:member:`view->obj` to a new reference to *exporter* and
   return 0. Otherwise, raise :c:data:`PyExc_BufferError`, set
   :c:member:`view->obj` to *NULL* and return ``-1``.

   If this function is used as part of a :ref:`getbufferproc <buffer-structs>`,
   *exporter* MUST be set to the exporting object and *flags* must be passed
   unmodified. Otherwise, *exporter* MUST be NULL.
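
The byte-strides of a contiguous array, as described for
:c:func:`PyBuffer_FillContiguousStrides` above, can be sketched in plain C.
This is an illustrative re-statement under stated assumptions, not CPython's
implementation: ``fill_contiguous_strides`` is a hypothetical helper,
``ptrdiff_t`` stands in for :c:type:`Py_ssize_t`, and any *order* other than
``'F'`` is treated as C-style.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical helper (not part of the C-API): compute the byte-strides
      of a contiguous array with the given shape and item size. */
   void fill_contiguous_strides(int ndim, const ptrdiff_t *shape,
                                ptrdiff_t *strides, ptrdiff_t itemsize,
                                char order)
   {
       ptrdiff_t step = itemsize;
       int i;

       if (order == 'F') {
           /* Fortran-style: the first index varies fastest. */
           for (i = 0; i < ndim; i++) {
               strides[i] = step;
               step *= shape[i];
           }
       }
       else {
           /* C-style: the last index varies fastest. */
           for (i = ndim - 1; i >= 0; i--) {
               strides[i] = step;
               step *= shape[i];
           }
       }
   }

For a shape of ``{2, 3, 4}`` with 8-byte items, this sketch yields strides of
``{96, 32, 8}`` in C order and ``{8, 16, 48}`` in Fortran order.
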