.. highlight:: c

.. _defining-new-types:

**********************************
Defining Extension Types: Tutorial
**********************************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


Python allows the writer of a C extension module to define new types that
can be manipulated from Python code, much like the built-in :class:`str`
and :class:`list` types. The code for all extension types follows a
pattern, but there are some details that you need to understand before you
can get started. This document is a gentle introduction to the topic.


.. _dnt-basics:

The Basics
==========

The :term:`CPython` runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
The :c:type:`PyObject` structure itself only contains the object's
:term:`reference count` and a pointer to the object's "type object".
This is where the action is; the type object determines which (C) functions
get called by the interpreter when, for instance, an attribute gets looked up
on an object, a method is called, or the object is multiplied by another
object. These C functions are called "type methods".

So, if you want to define a new extension type, you need to create a new type
object.

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type named :class:`Custom` inside a C
extension module :mod:`custom`:

.. note::
   What we're showing here is the traditional way of defining *static*
   extension types. It should be adequate for most uses. The C API also
   allows defining heap-allocated extension types using the
   :c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.

.. literalinclude:: ../includes/custom.c

Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the previous chapter. This file defines three things:

#. What a :class:`Custom` **object** contains: this is the ``CustomObject``
   struct, which is allocated once for each :class:`Custom` instance.
#. How the :class:`Custom` **type** behaves: this is the ``CustomType`` struct,
   which defines a set of flags and function pointers that the interpreter
   inspects when specific operations are requested.
#. How to initialize the :mod:`custom` module: this is the ``PyInit_custom``
   function and the associated ``custommodule`` struct.

The first bit is::

   typedef struct {
       PyObject_HEAD
   } CustomObject;

This is what a Custom object will contain. ``PyObject_HEAD`` is mandatory
at the start of each object struct and defines a field called ``ob_base``
of type :c:type:`PyObject`, containing a pointer to a type object and a
reference count (these can be accessed using the macros :c:macro:`Py_TYPE`
and :c:macro:`Py_REFCNT` respectively). The reason for the macro is to
abstract away the layout and to enable additional fields in debug builds.

.. note::
   There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
   Be wary of adding one by accident: some compilers will complain.

Of course, objects generally store additional data besides the standard
``PyObject_HEAD`` boilerplate; for example, here is the definition for
standard Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

The second bit is the definition of the type object. ::

   static PyTypeObject CustomType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       .tp_name = "custom.Custom",
       .tp_doc = "Custom objects",
       .tp_basicsize = sizeof(CustomObject),
       .tp_itemsize = 0,
       .tp_flags = Py_TPFLAGS_DEFAULT,
       .tp_new = PyType_GenericNew,
   };

.. note::
   We recommend using C99-style designated initializers as above, to
   avoid listing all the :c:type:`PyTypeObject` fields that you don't care
   about and also to avoid caring about the fields' declaration order.

The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
many more :ref:`fields <type-structs>` than the definition above. The
remaining fields will be filled with zeros by the C compiler, and it's
common practice to not specify them explicitly unless you need them.

We're going to pick it apart, one field at a time::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is mandatory boilerplate to initialize the ``ob_base``
field mentioned above. ::

   .tp_name = "custom.Custom",

The name of our type. This will appear in the default textual representation of
our objects and in some error messages, for example:

.. code-block:: pycon

   >>> "" + custom.Custom()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`custom` and
the type is :class:`Custom`, so we set the type name to :class:`custom.Custom`.
Using the real dotted import path is important to make your type compatible
with the :mod:`pydoc` and :mod:`pickle` modules. ::

   .tp_basicsize = sizeof(CustomObject),
   .tp_itemsize = 0,

This is so that Python knows how much memory to allocate when creating
new :class:`Custom` instances. :c:member:`~PyTypeObject.tp_itemsize` is
only used for variable-sized objects and should otherwise be zero.

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance. A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error. You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does. Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

We set the class flags to :const:`Py_TPFLAGS_DEFAULT`. ::

   .tp_flags = Py_TPFLAGS_DEFAULT,

All types should include this constant in their flags. It enables all of the
members defined until at least Python 3.3. If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   .tp_doc = "Custom objects",

To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
handler. This is the equivalent of the Python method :meth:`__new__`, but
has to be specified explicitly. In this case, we can just use the default
implementation provided by the API function :c:func:`PyType_GenericNew`. ::

   .tp_new = PyType_GenericNew,

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_custom`::

   if (PyType_Ready(&CustomType) < 0)
       return NULL;

This initializes the :class:`Custom` type, filling in a number of members
to the appropriate default values, including :attr:`ob_type` that we initially
set to ``NULL``. ::

   Py_INCREF(&CustomType);
   if (PyModule_AddObject(m, "Custom", (PyObject *) &CustomType) < 0) {
       Py_DECREF(&CustomType);
       Py_DECREF(m);
       return NULL;
   }

This adds the type to the module dictionary. This allows us to create
:class:`Custom` instances by calling the :class:`Custom` class:

.. code-block:: pycon

   >>> import custom
   >>> mycustom = custom.Custom()

That's it! All that remains is to build it; put the above code in a file called
:file:`custom.c` and:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[Extension("custom", ["custom.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`custom.so` in a subdirectory; move to
that directory and fire up Python. You should then be able to ``import custom``
and play around with Custom objects.

That wasn't so hard, was it?

Of course, the current Custom type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.

.. note::
   While this documentation showcases the standard :mod:`distutils` module
   for building C extensions, it is recommended in real-world use cases to
   use the newer and better-maintained ``setuptools`` library. Documentation
   on how to do this is out of scope for this document and can be found in
   the `Python Packaging User's Guide <https://packaging.python.org/tutorials/distributing-packages/>`_.


Adding data and methods to the Basic example
============================================

Let's extend the basic example to add some data and methods. Let's also make
the type usable as a base class. We'll create a new module, :mod:`custom2`, that
adds these capabilities:

.. literalinclude:: ../includes/custom2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The :class:`Custom` type now has three data attributes in its C struct,
*first*, *last*, and *number*. The *first* and *last* variables are Python
strings containing first and last names. The *number* attribute is a C integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first; /* first name */
       PyObject *last;  /* last name */
       int number;
   } CustomObject;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation. At a minimum, we need a deallocation method::

   static void
   Custom_dealloc(CustomObject *self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   .tp_dealloc = (destructor) Custom_dealloc,

This method first decrements the reference counts of the two Python attributes.
:c:func:`Py_XDECREF` correctly handles the case where its argument is
``NULL`` (which might happen here if ``tp_new`` failed midway). It then
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
(computed by ``Py_TYPE(self)``) to free the object's memory. Note that
the object's type might not be :class:`CustomType`, because the object may
be an instance of a subclass.

.. note::
   The explicit cast to ``destructor`` above is needed because we defined
   ``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
   function pointer expects to receive a ``PyObject *`` argument. Otherwise,
   the compiler will emit a warning. This is object-oriented polymorphism,
   in C!

We want to make sure that the first and last names are initialized to empty
strings, so we provide a ``tp_new`` implementation::

   static PyObject *
   Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       CustomObject *self;
       self = (CustomObject *) type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->number = 0;
       }
       return (PyObject *) self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   .tp_new = Custom_new,

The ``tp_new`` handler is responsible for creating (as opposed to initializing)
objects of the type. It is exposed in Python as the :meth:`__new__` method.
It is not required to define a ``tp_new`` member, and indeed many extension
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
version of the ``Custom`` type above. In this case, we use the ``tp_new``
handler to initialize the ``first`` and ``last`` attributes to non-``NULL``
default values.

``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
if a subclass is instantiated) and any arguments passed when the type was
called, and is expected to return the instance created. ``tp_new`` handlers
always accept positional and keyword arguments, but they often ignore the
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
in C or ``__init__`` in Python) methods.

.. note::
   ``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
   will do it itself.
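
The split between creation and initialization can be observed from Python,
where the two slots surface as :meth:`__new__` and :meth:`__init__`. The
following is only a rough pure-Python analogue of the C handlers, not the
extension module itself; the class names ``Sketch`` and ``SubSketch`` are
hypothetical and exist purely for illustration:

.. code-block:: python

   class Sketch:
       def __new__(cls, *args, **kwds):
           # Like Custom_new: create the object, set defaults, and ignore
           # the call arguments (they are left for __init__ to handle).
           self = super().__new__(cls)
           self.first = ""
           self.last = ""
           self.number = 0
           return self

       def __init__(self, first="", last="", number=0):
           # Like tp_init: run by the interpreter after __new__ whenever
           # the type is called; __new__ never calls it itself.
           self.first = first
           self.last = last
           self.number = number


   class SubSketch(Sketch):
       pass


   # Calling the type runs __new__ and then __init__; __new__ receives
   # the subclass as its first argument, not Sketch.
   obj = SubSketch("Ada", "Lovelace", 7)

   # Calling __new__ directly skips __init__, so only the defaults are set.
   bare = Sketch.__new__(SubSketch)

Here ``obj`` carries the values passed to the call, while ``bare`` keeps the
defaults assigned in :meth:`__new__`, which is why the C version sets
non-``NULL`` defaults in ``tp_new`` rather than relying on ``tp_init``.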

The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
slot to allocate memory::

   self = (CustomObject *) type->tp_alloc(type, 0);

Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
result against ``NULL`` before proceeding.

.. note::
   We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
   :c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
   which is :class:`object` by default. Most types use the default allocation
   strategy.

.. note::
   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
   that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`),
   you must *not* try to determine what method to call using method resolution
   order at runtime. Always statically determine what type you are going to
   call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``. If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We also define an initialization function which accepts arguments to provide
initial values for our instance::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }
       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   .tp_init = (initproc) Custom_init,

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
:meth:`__init__` method. It is used to initialize an object after it's
created. Initializers always accept positional and keyword arguments,
and they should return either ``0`` on success or ``-1`` on error.

Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
is called at all (for example, the :mod:`pickle` module by default
doesn't call :meth:`__init__` on unpickled instances). It can also be
called multiple times. Anyone can call the :meth:`__init__` method on
our objects. For this reason, we have to be extra careful when assigning
the new attribute values. We might be tempted, for example, to assign the
``first`` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky. Our type doesn't restrict the type of the
``first`` member, so it could be any kind of object. It could have a
destructor that causes code to be executed that tries to access the
``first`` member; or that destructor could release the
:term:`global interpreter lock <GIL>` and let arbitrary code run in other
threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost
always reassign members before decrementing their reference counts. When
don't we have to do this?

* when we absolutely know that the reference count is greater than 1;

* when we know that deallocation of the object [#]_ will neither release
  the :term:`GIL` nor cause any calls back into our type's code;

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
  handler on a type which doesn't support cyclic garbage collection [#]_.

We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::

   static PyMemberDef Custom_members[] = {
       {"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
        "last name"},
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   .tp_members = Custom_members,

Each member definition has a member name, type, offset, access flags and
documentation string. See the :ref:`Generic-Attribute-Management` section
below for details.

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes. We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to ``NULL``. Even
though we can make sure the members are initialized to non-``NULL`` values, the
members can be set to ``NULL`` if the attributes are deleted.

We define a single method, :meth:`Custom.name()`, that outputs the object's name
as the concatenation of the first and last names. ::

   static PyObject *
   Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }
       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }
       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Custom` (or
:class:`Custom` subclass) instance as the first argument. Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method:

.. code-block:: python

   def name(self):
       return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are ``NULL``. This is because they can be deleted, in which
case they are set to ``NULL``. It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings. We'll see how to
do that in the next section.
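
The consequences of unrestricted members can be illustrated without building
the extension. The sketch below is a pure-Python stand-in for the ``custom2``
type, assuming it behaves as described above; the class name ``CustomModel``
is hypothetical and is not part of the module:

.. code-block:: python

   class CustomModel:
       """Pure-Python stand-in for the custom2 Custom type (hypothetical)."""

       def __init__(self, first="", last="", number=0):
           self.first = first
           self.last = last
           self.number = number

       def name(self):
           return "%s %s" % (self.first, self.last)


   c = CustomModel("Ada", "Lovelace")
   full_name = c.name()

   # Nothing restricts the attribute type, so a non-string is accepted.
   c.last = 42
   mixed_name = c.name()

   # Deleting an attribute makes name() fail; in the C version this is the
   # case that the explicit NULL checks turn into an AttributeError.
   del c.first
   try:
       c.name()
       deleted_error = None
   except AttributeError as exc:
       deleted_error = exc

In Python the interpreter raises :exc:`AttributeError` on its own; in C the
pointer is simply ``NULL``, so ``Custom_name`` has to detect that and raise
the exception itself.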

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Custom_methods[] = {
       {"name", (PyCFunction) Custom_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

(note that we used the :const:`METH_NOARGS` flag to indicate that the method
is expecting no arguments other than *self*)

and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::

   .tp_methods = Custom_methods,

Finally, we'll make our type usable as a base class for subclassing. We've
written our methods carefully so far so that they don't make any assumptions
about the type of the object being created or used, so all we need to do is
to add the :const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,

We rename :c:func:`PyInit_custom` to :c:func:`PyInit_custom2`, update the
module name in the :c:type:`PyModuleDef` struct, and update the full class
name in the :c:type:`PyTypeObject` struct.

Finally, we update our :file:`setup.py` file to build the new module:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[
            Extension("custom", ["custom.c"]),
            Extension("custom2", ["custom2.c"]),
            ])


Providing finer control over data attributes
============================================

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Custom` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/custom3.c


To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions. Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Custom_getfirst(CustomObject *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
   {
       PyObject *tmp;
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }
       if (!PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a string");
           return -1;
       }
       tmp = self->first;
       Py_INCREF(value);
       self->first = value;
       Py_DECREF(tmp);
       return 0;
   }

The getter function is passed a :class:`Custom` object and a "closure", which is
a void pointer. In this case, the closure is ignored. (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Custom` object, the new value, and the
closure. The new value may be ``NULL``, in which case the attribute is being
deleted. In our setter, we raise an error if the attribute is deleted or if its
new value is not a string.
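
For readers more familiar with Python than with the C API, the getter and
setter pair plays roughly the role of a :class:`property`. The sketch below is
a pure-Python approximation of the behaviour described above, not the
``custom3`` module; ``CheckedCustom`` and ``raises_type_error`` are
hypothetical names introduced only for this illustration:

.. code-block:: python

   class CheckedCustom:
       """Hypothetical pure-Python stand-in for the C getter/setter pair."""

       def __init__(self, first=""):
           self.first = first  # routed through the setter below

       @property
       def first(self):
           # Counterpart of Custom_getfirst.
           return self._first

       @first.setter
       def first(self, value):
           # Counterpart of Custom_setfirst with a non-NULL value.
           if not isinstance(value, str):
               raise TypeError("The first attribute value must be a string")
           self._first = value

       @first.deleter
       def first(self):
           # Counterpart of Custom_setfirst being called with value == NULL.
           raise TypeError("Cannot delete the first attribute")


   def raises_type_error(action):
       """Return True if calling *action* raises TypeError."""
       try:
           action()
       except TypeError:
           return True
       return False


   c = CheckedCustom("Ada")
   c.first = "Grace"
   rejects_int = raises_type_error(lambda: setattr(c, "first", 42))
   rejects_delete = raises_type_error(lambda: delattr(c, "first"))

One difference worth noting: Python separates the setter from the deleter,
whereas the C setter handles both cases and distinguishes deletion by the
``NULL`` value.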

We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Custom_getsetters[] = {
       {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
        "first name", NULL},
       {"last", (getter) Custom_getlast, (setter) Custom_setlast,
        "last name", NULL},
       {NULL}  /* Sentinel */
   };

and register it in the :c:member:`~PyTypeObject.tp_getset` slot::

   .tp_getset = Custom_getsetters,

The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
above. In this case, we aren't using a closure, so we just pass ``NULL``.

We also remove the member definitions for these attributes::

   static PyMemberDef Custom_members[] = {
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
allow strings [#]_ to be passed::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }
       return 0;
   }

With these changes, we can ensure that the ``first`` and ``last`` members are
never ``NULL``, so we can remove checks for ``NULL`` values in almost all cases.
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls. The only place we can't change these calls is in
the ``tp_dealloc`` implementation, where there is the possibility that the
initialization of these members failed in ``tp_new``.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
====================================

Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
can identify unneeded objects even when their reference counts are not zero.
This can happen when objects are involved in cycles. For example, consider:

.. code-block:: pycon

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.

In the second version of the :class:`Custom` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes [#]_.
Besides, in the second and third versions, we allowed subclassing
:class:`Custom`, and subclasses may add arbitrary attributes. For either of
those two reasons, :class:`Custom` objects can participate in cycles:

.. code-block:: pycon

   >>> import custom3
   >>> class Derived(custom3.Custom): pass
   ...
   >>> n = Derived()
   >>> n.some_attribute = n

To allow a :class:`Custom` instance participating in a reference cycle to
be properly detected and collected by the cyclic GC, our :class:`Custom` type
needs to fill two additional slots and to set a flag that enables these slots:

.. literalinclude:: ../includes/custom4.c


First, the traversal method lets the cyclic GC know about subobjects that could
participate in cycles::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       int vret;
       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }
       return 0;
   }

For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method. The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method. It returns an integer value that must be
returned if it is non-zero.

Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions. With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
in ``Custom_traverse``::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }

.. note::
   The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
   arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.

Second, we need to provide a method for clearing any subobjects that can
participate in cycles::

   static int
   Custom_clear(CustomObject *self)
   {
       Py_CLEAR(self->first);
       Py_CLEAR(self->last);
       return 0;
   }

Notice the use of the :c:func:`Py_CLEAR` macro. It is the recommended and safe
way to clear data attributes of arbitrary types while decrementing
their reference counts. If you were to call :c:func:`Py_XDECREF` instead
on the attribute before setting it to ``NULL``, there is a possibility
that the attribute's destructor would call back into code that reads the
attribute again (*especially* if there is a reference cycle).

.. note::
   You could emulate :c:func:`Py_CLEAR` by writing::

      PyObject *tmp;
      tmp = self->first;
      self->first = NULL;
      Py_XDECREF(tmp);

   Nevertheless, it is much easier and less error-prone to always
   use :c:func:`Py_CLEAR` when deleting an attribute. Don't
   try to micro-optimize at the expense of robustness!

The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
attributes. This means the cyclic GC can be triggered inside the function.
Since the GC assumes the reference count is not zero, we need to untrack the
object from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing
members. Here is our reimplemented deallocator using
:c:func:`PyObject_GC_UnTrack` and ``Custom_clear``::

   static void
   Custom_dealloc(CustomObject *self)
   {
       PyObject_GC_UnTrack(self);
       Custom_clear(self);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,

That's pretty much it. If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
garbage collection. Most extensions will use the versions automatically provided.


Subclassing other types
=======================

It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built-in types, since an extension can
easily use the :c:type:`PyTypeObject` it needs. It can be difficult to share
these :c:type:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`SubList` type that inherits from the
built-in :class:`list` type. The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter:

.. code-block:: pycon

   >>> import sublist
   >>> s = sublist.SubList(range(3))
   >>> s.extend(s)
   >>> print(len(s))
   6
   >>> print(s.increment())
   1
   >>> print(s.increment())
   2

.. literalinclude:: ../includes/sublist.c


As you can see, the source code closely resembles the :class:`Custom` examples in
previous sections. We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } SubListObject;

The primary difference for derived type objects is that the base type's
object structure must be the first member. The base type will already include
the :c:func:`PyObject_HEAD` at the beginning of its structure.

When a Python object is a :class:`SubList` instance, its ``PyObject *`` pointer
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::

   static int
   SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }

We see above how to call through to the :attr:`__init__` method of the base
type.

This pattern is important when writing a type with custom
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
members. The :c:member:`~PyTypeObject.tp_new` handler should not actually
create the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
but let the base class handle it by calling its own :c:member:`~PyTypeObject.tp_new`.
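
As a point of comparison, the behaviour shown in the session at the start of
this section can be approximated in pure Python. This is only a sketch of what
the C type does, assuming ``increment`` bumps the counter and returns its new
value as the session output suggests; it is not how :mod:`sublist` is
implemented:

.. code-block:: python

   class SubList(list):
       def __init__(self, *args, **kwds):
           # Mirror SubList_init: call the base type's initializer
           # explicitly, then set up the extra state.
           list.__init__(self, *args, **kwds)
           self.state = 0

       def increment(self):
           # Assumed behaviour: bump the counter and return the new value.
           self.state += 1
           return self.state


   s = SubList(range(3))
   s.extend(s)
   length = len(s)
   first_call = s.increment()
   second_call = s.increment()

Calling ``list.__init__`` by name, rather than going through :func:`super`,
matches the C code, which calls ``PyList_Type.tp_init`` directly.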

The :c:type:`PyTypeObject` struct supports a :c:member:`~PyTypeObject.tp_base`
specifying the type's concrete base class. Due to cross-platform compiler
issues, you can't fill that field directly with a reference to
:c:type:`PyList_Type`; it should be done later in the module initialization
function::

   PyMODINIT_FUNC
   PyInit_sublist(void)
   {
       PyObject* m;
       SubListType.tp_base = &PyList_Type;
       if (PyType_Ready(&SubListType) < 0)
           return NULL;

       m = PyModule_Create(&sublistmodule);
       if (m == NULL)
           return NULL;

       Py_INCREF(&SubListType);
       if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
           Py_DECREF(&SubListType);
           Py_DECREF(m);
           return NULL;
       }

       return m;
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in. When we are deriving from an
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_alloc`
slot with :c:func:`PyType_GenericAlloc`; the allocation function from the base
type will be inherited.

After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Custom` examples.


.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
   in this example, because our type doesn't support garbage collection.

.. [#] We now know that the first and last members are strings, so perhaps we
   could be less careful about decrementing their reference counts. However,
   we accept instances of string subclasses. Even though deallocating normal
   strings won't call back into our objects, we can't guarantee that deallocating
   an instance of a string subclass won't call back into our objects.

.. [#] Also, even with our attributes restricted to string instances, the user
   could pass arbitrary :class:`str` subclasses and therefore still create
   reference cycles.