.. highlight:: c

.. _defining-new-types:

**********************************
Defining Extension Types: Tutorial
**********************************

.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>


Python allows the writer of a C extension module to define new types that
can be manipulated from Python code, much like the built-in :class:`str`
and :class:`list` types.  The code for all extension types follows a
pattern, but there are some details that you need to understand before you
can get started.  This document is a gentle introduction to the topic.


.. _dnt-basics:

The Basics
==========

The :term:`CPython` runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
The :c:type:`PyObject` structure itself only contains the object's
:term:`reference count` and a pointer to the object's "type object".
This is where the action is; the type object determines which (C) functions
get called by the interpreter when, for instance, an attribute gets looked up
on an object, a method called, or it is multiplied by another object.  These
C functions are called "type methods".

So, if you want to define a new extension type, you need to create a new type
object.
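
As a rough mental model, the relationship between an object and its type
object can be sketched in plain C.  The following is a toy illustration only:
``ToyObject``, ``ToyType`` and ``toy_repr`` are made-up names, not part of the
C API, and the real structures carry many more fields.

.. code-block:: c

   #include <stddef.h>

   typedef struct ToyObject ToyObject;

   /* Hypothetical stand-in for a type object: a name plus one "type method". */
   typedef struct {
       const char *name;
       const char *(*repr)(ToyObject *self);
   } ToyType;

   /* Hypothetical stand-in for PyObject: a reference count and a type pointer. */
   struct ToyObject {
       long refcnt;
       const ToyType *type;
   };

   static const char *
   int_repr(ToyObject *self)
   {
       (void)self;
       return "<toy int>";
   }

   static const char *
   str_repr(ToyObject *self)
   {
       (void)self;
       return "<toy str>";
   }

   static const ToyType ToyInt_Type = {.name = "toy.int", .repr = int_repr};
   static const ToyType ToyStr_Type = {.name = "toy.str", .repr = str_repr};

   /* The caller never needs to know the concrete type: the behaviour is
      looked up through the object's type pointer at run time. */
   static const char *
   toy_repr(ToyObject *obj)
   {
       return obj->type->repr(obj);
   }

Calling ``toy_repr`` on two objects whose ``type`` fields point at different
``ToyType`` structures runs two different C functions, which is the essence of
what the interpreter does with real type objects.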

This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type named :class:`Custom` inside a C
extension module :mod:`custom`:

.. note::
   What we're showing here is the traditional way of defining *static*
   extension types.  It should be adequate for most uses.  The C API also
   allows defining heap-allocated extension types using the
   :c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.

.. literalinclude:: ../includes/custom.c

Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the previous chapter.  This file defines three things:

#. What a :class:`Custom` **object** contains: this is the ``CustomObject``
   struct, which is allocated once for each :class:`Custom` instance.
#. How the :class:`Custom` **type** behaves: this is the ``CustomType`` struct,
   which defines a set of flags and function pointers that the interpreter
   inspects when specific operations are requested.
#. How to initialize the :mod:`custom` module: this is the ``PyInit_custom``
   function and the associated ``custommodule`` struct.

The first bit is::

   typedef struct {
       PyObject_HEAD
   } CustomObject;

This is what a Custom object will contain.  ``PyObject_HEAD`` is mandatory
at the start of each object struct and defines a field called ``ob_base``
of type :c:type:`PyObject`, containing a pointer to a type object and a
reference count (these can be accessed using the macros :c:macro:`Py_TYPE`
and :c:macro:`Py_REFCNT` respectively).  The reason for the macro is to
abstract away the layout and to enable additional fields in debug builds.

.. note::
   There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
   Be wary of adding one by accident: some compilers will complain.

Of course, objects generally store additional data besides the standard
``PyObject_HEAD`` boilerplate; for example, here is the definition for
standard Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;

The second bit is the definition of the type object. ::

   static PyTypeObject CustomType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       .tp_name = "custom.Custom",
       .tp_doc = "Custom objects",
       .tp_basicsize = sizeof(CustomObject),
       .tp_itemsize = 0,
       .tp_flags = Py_TPFLAGS_DEFAULT,
       .tp_new = PyType_GenericNew,
   };

.. note::
   We recommend using C99-style designated initializers as above, to
   avoid listing all the :c:type:`PyTypeObject` fields that you don't care
   about and also to avoid caring about the fields' declaration order.

The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
many more :ref:`fields <type-structs>` than the definition above.  The
remaining fields will be filled with zeros by the C compiler, and it's
common practice to not specify them explicitly unless you need them.
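
The zero-filling is a guarantee of the C language rather than anything
specific to Python.  A small stand-alone sketch, using a made-up ``ToyType``
structure in place of the real :c:type:`PyTypeObject`, shows the effect:

.. code-block:: c

   #include <stddef.h>

   /* Hypothetical, heavily reduced stand-in for a type structure. */
   typedef struct {
       const char *name;
       size_t basicsize;
       size_t itemsize;
       unsigned long flags;
       void (*dealloc)(void *);
       const char *doc;
   } ToyType;

   /* Only three fields are named, and not in declaration order.  C
      initializes every field that is not named as if it had been set to
      zero (a null pointer for the pointer fields). */
   static ToyType Example_Type = {
       .doc = "order does not matter",
       .name = "toy.Example",
       .basicsize = 16,
   };

Here ``Example_Type.itemsize`` and ``Example_Type.flags`` are ``0`` and
``Example_Type.dealloc`` is ``NULL``, even though none of them appears in the
initializer.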

We're going to pick it apart, one field at a time::

   PyVarObject_HEAD_INIT(NULL, 0)

This line is mandatory boilerplate to initialize the ``ob_base``
field mentioned above. ::

   .tp_name = "custom.Custom",

The name of our type.  This will appear in the default textual representation of
our objects and in some error messages, for example:

.. code-block:: pycon

   >>> "" + custom.Custom()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can only concatenate str (not "custom.Custom") to str

Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`custom` and
the type is :class:`Custom`, so we set the type name to :class:`custom.Custom`.
Using the real dotted import path is important to make your type compatible
with the :mod:`pydoc` and :mod:`pickle` modules. ::

   .tp_basicsize = sizeof(CustomObject),
   .tp_itemsize = 0,

This is so that Python knows how much memory to allocate when creating
new :class:`Custom` instances.  :c:member:`~PyTypeObject.tp_itemsize` is
only used for variable-sized objects and should otherwise be zero.
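
To make the role of the two size fields concrete, here is a toy allocator
written in plain C.  It is only a sketch: the ``Toy*`` names are invented, and
the sizing rule (base size plus item count times item size) is a simplifying
assumption, since a real allocator may add padding or extra slots.

.. code-block:: c

   #include <stddef.h>
   #include <stdlib.h>

   typedef struct {
       size_t basicsize;   /* bytes needed by every instance */
       size_t itemsize;    /* extra bytes per item, 0 for fixed-size types */
   } ToyType;

   /* Fixed-size instance: no per-item storage. */
   typedef struct {
       long refcnt;
       double value;
   } ToyFloat;

   /* Variable-sized instance: a header followed by `size` items. */
   typedef struct {
       long refcnt;
       size_t size;
       int items[];        /* flexible array member */
   } ToyIntArray;

   static const ToyType ToyFloat_Type = {sizeof(ToyFloat), 0};
   static const ToyType ToyIntArray_Type = {sizeof(ToyIntArray), sizeof(int)};

   /* Simplified sizing rule assumed for this sketch. */
   static size_t
   toy_instance_size(const ToyType *type, size_t nitems)
   {
       return type->basicsize + nitems * type->itemsize;
   }

   /* Allocate zero-filled memory for one instance holding `nitems` items. */
   static void *
   toy_alloc(const ToyType *type, size_t nitems)
   {
       return calloc(1, toy_instance_size(type, nitems));
   }

A fixed-size type such as ``ToyFloat_Type`` always asks for exactly
``basicsize`` bytes, while ``toy_alloc(&ToyIntArray_Type, 4)`` reserves room
for the header plus four ``int`` items.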

.. note::

   If you want your type to be subclassable from Python, and your type has the same
   :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
   inheritance.  A Python subclass of your type will have to list your type first
   in its :attr:`~class.__bases__`, or else it will not be able to call your type's
   :meth:`__new__` method without getting an error.  You can avoid this problem by
   ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
   base type does.  Most of the time, this will be true anyway, because either your
   base type will be :class:`object`, or else you will be adding data members to
   your base type, and therefore increasing its size.

We set the class flags to :const:`Py_TPFLAGS_DEFAULT`. ::

   .tp_flags = Py_TPFLAGS_DEFAULT,

All types should include this constant in their flags.  It enables all of the
members defined until at least Python 3.3.  If you need further members,
you will need to OR the corresponding flags.

We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::

   .tp_doc = "Custom objects",

To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
handler.  This is the equivalent of the Python method :meth:`__new__`, but
has to be specified explicitly.  In this case, we can just use the default
implementation provided by the API function :c:func:`PyType_GenericNew`. ::

   .tp_new = PyType_GenericNew,

Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_custom`::

   if (PyType_Ready(&CustomType) < 0)
       return NULL;

This initializes the :class:`Custom` type, filling in a number of members
to the appropriate default values, including :attr:`ob_type` that we initially
set to ``NULL``. ::

   Py_INCREF(&CustomType);
   if (PyModule_AddObject(m, "Custom", (PyObject *) &CustomType) < 0) {
       Py_DECREF(&CustomType);
       Py_DECREF(m);
       return NULL;
   }

This adds the type to the module dictionary.  This allows us to create
:class:`Custom` instances by calling the :class:`Custom` class:

.. code-block:: pycon

   >>> import custom
   >>> mycustom = custom.Custom()

That's it!  All that remains is to build it; put the above code in a file called
:file:`custom.c` and:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[Extension("custom", ["custom.c"])])

in a file called :file:`setup.py`; then typing

.. code-block:: shell-session

   $ python setup.py build

at a shell should produce a file :file:`custom.so` in a subdirectory; move to
that directory and fire up Python --- you should be able to ``import custom`` and
play around with Custom objects.

That wasn't so hard, was it?

Of course, the current Custom type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.

.. note::
   While this documentation showcases the standard :mod:`distutils` module
   for building C extensions, it is recommended in real-world use cases to
   use the newer and better-maintained ``setuptools`` library.  Documentation
   on how to do this is out of scope for this document and can be found in
   the `Python Packaging User's Guide <https://packaging.python.org/tutorials/distributing-packages/>`_.


Adding data and methods to the Basic example
============================================

Let's extend the basic example to add some data and methods.  Let's also make
the type usable as a base class. We'll create a new module, :mod:`custom2` that
adds these capabilities:

.. literalinclude:: ../includes/custom2.c


This version of the module has a number of changes.

We've added an extra include::

   #include <structmember.h>

This include provides declarations that we use to handle attributes, as
described a bit later.

The :class:`Custom` type now has three data attributes in its C struct,
*first*, *last*, and *number*.  The *first* and *last* variables are Python
strings containing first and last names.  The *number* attribute is a C integer.

The object structure is updated accordingly::

   typedef struct {
       PyObject_HEAD
       PyObject *first; /* first name */
       PyObject *last;  /* last name */
       int number;
   } CustomObject;

Because we now have data to manage, we have to be more careful about object
allocation and deallocation.  At a minimum, we need a deallocation method::

   static void
   Custom_dealloc(CustomObject *self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::

   .tp_dealloc = (destructor) Custom_dealloc,

This method first decrements the reference counts of the two Python attributes.
:c:func:`Py_XDECREF` correctly handles the case where its argument is
``NULL`` (which might happen here if ``tp_new`` failed midway).  It then
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
(computed by ``Py_TYPE(self)``) to free the object's memory.  Note that
the object's type might not be :class:`CustomType`, because the object may
be an instance of a subclass.

.. note::
   The explicit cast to ``destructor`` above is needed because we defined
   ``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
   function pointer expects to receive a ``PyObject *`` argument.  Otherwise,
   the compiler will emit a warning.  This is object-oriented polymorphism,
   in C!

We want to make sure that the first and last names are initialized to empty
strings, so we provide a ``tp_new`` implementation::

   static PyObject *
   Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       CustomObject *self;
       self = (CustomObject *) type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }
           self->number = 0;
       }
       return (PyObject *) self;
   }

and install it in the :c:member:`~PyTypeObject.tp_new` member::

   .tp_new = Custom_new,

The ``tp_new`` handler is responsible for creating (as opposed to initializing)
objects of the type.  It is exposed in Python as the :meth:`__new__` method.
It is not required to define a ``tp_new`` member, and indeed many extension
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
version of the ``Custom`` type above.  In this case, we use the ``tp_new``
handler to initialize the ``first`` and ``last`` attributes to non-``NULL``
default values.

``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
if a subclass is instantiated) and any arguments passed when the type was
called, and is expected to return the instance created.  ``tp_new`` handlers
always accept positional and keyword arguments, but they often ignore the
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
in C or ``__init__`` in Python) methods.

.. note::
   ``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
   will do it itself.
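
The division of labour between creation and initialization can be modelled
in a few lines of plain C.  This is a deliberately simplified sketch with
invented names (``ToyType``, ``toy_call`` and so on); it assumes a single
integer argument and ignores reference counting entirely.

.. code-block:: c

   #include <stdlib.h>

   typedef struct ToyType ToyType;

   typedef struct {
       const ToyType *type;
       int number;
       int initialized;
   } ToyObject;

   struct ToyType {
       ToyObject *(*tp_new)(const ToyType *type);     /* create */
       int (*tp_init)(ToyObject *self, int number);   /* initialize */
   };

   static ToyObject *
   toy_new(const ToyType *type)
   {
       ToyObject *self = calloc(1, sizeof(ToyObject));
       if (self != NULL) {
           self->type = type;   /* safe defaults only; no argument handling */
       }
       return self;
   }

   static int
   toy_init(ToyObject *self, int number)
   {
       if (number < 0) {
           return -1;           /* report failure, as an initializer would */
       }
       self->number = number;
       self->initialized = 1;
       return 0;
   }

   static const ToyType Toy_Type = {toy_new, toy_init};

   /* What "calling the type" does in this toy model: create first, then
      initialize, and discard the object if initialization fails. */
   static ToyObject *
   toy_call(const ToyType *type, int number)
   {
       ToyObject *obj = type->tp_new(type);
       if (obj == NULL) {
           return NULL;
       }
       if (type->tp_init(obj, number) < 0) {
           free(obj);
           return NULL;
       }
       return obj;
   }

Note that ``toy_new`` never calls ``toy_init`` itself; the caller
(``toy_call`` here, the interpreter in real code) is responsible for running
the second phase.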

The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
slot to allocate memory::

   self = (CustomObject *) type->tp_alloc(type, 0);

Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
result against ``NULL`` before proceeding.

.. note::
   We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
   :c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
   which is :class:`object` by default.  Most types use the default allocation
   strategy.

.. note::
   If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
   that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`),
   you must *not* try to determine what method to call using method resolution
   order at runtime.  Always statically determine what type you are going to
   call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
   ``type->tp_base->tp_new``.  If you do not do this, Python subclasses of your
   type that also inherit from other Python-defined classes may not work correctly.
   (Specifically, you may not be able to create instances of such subclasses
   without getting a :exc:`TypeError`.)

We also define an initialization function which accepts arguments to provide
initial values for our instance::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }
       return 0;
   }

by filling the :c:member:`~PyTypeObject.tp_init` slot. ::

   .tp_init = (initproc) Custom_init,

The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
:meth:`__init__` method.  It is used to initialize an object after it's
created.  Initializers always accept positional and keyword arguments,
and they should return either ``0`` on success or ``-1`` on error.

Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
is called at all (for example, the :mod:`pickle` module by default
doesn't call :meth:`__init__` on unpickled instances).  It can also be
called multiple times.  Anyone can call the :meth:`__init__` method on
our objects.  For this reason, we have to be extra careful when assigning
the new attribute values.  We might be tempted, for example, to assign the
``first`` member like this::

   if (first) {
       Py_XDECREF(self->first);
       Py_INCREF(first);
       self->first = first;
   }

But this would be risky.  Our type doesn't restrict the type of the
``first`` member, so it could be any kind of object.  It could have a
destructor that causes code to be executed that tries to access the
``first`` member; or that destructor could release the
:term:`global interpreter lock <GIL>` and let arbitrary code run in other
threads that accesses and modifies our object.

To be paranoid and protect ourselves against this possibility, we almost
always reassign members before decrementing their reference counts.  When
don't we have to do this?

* when we absolutely know that the reference count is greater than 1;

* when we know that deallocation of the object [#]_ will neither release
  the :term:`GIL` nor cause any calls back into our type's code;

* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
  handler on a type which doesn't support cyclic garbage collection [#]_.
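
The benefit of reassigning before decrementing can be demonstrated with a toy
reference-counting model in plain C.  Everything here is invented for the
illustration (``ToyObject``, ``ToyHolder``, the ``on_dealloc`` hook); it is
not how CPython implements reference counting, only a sketch of the ordering
issue.

.. code-block:: c

   #include <stddef.h>

   typedef struct ToyObject ToyObject;

   struct ToyObject {
       long refcnt;
       /* Optional hook run when the count reaches zero (a toy destructor). */
       void (*on_dealloc)(ToyObject *self);
   };

   typedef struct {
       ToyObject *first;
   } ToyHolder;

   static void
   toy_incref(ToyObject *o)
   {
       o->refcnt++;
   }

   static void
   toy_decref(ToyObject *o)
   {
       if (--o->refcnt == 0 && o->on_dealloc != NULL) {
           o->on_dealloc(o);
           /* a real implementation would release the memory here */
       }
   }

   /* State shared with the destructor below, for demonstration only. */
   static ToyHolder *observed_holder;
   static ToyObject *seen_by_destructor;

   /* A destructor that "calls back" and reads the holder's member. */
   static void
   nosy_dealloc(ToyObject *self)
   {
       (void)self;
       seen_by_destructor = observed_holder->first;
   }

   /* Store the new value first, then drop the reference to the old one. */
   static void
   toy_set_first(ToyHolder *holder, ToyObject *value)
   {
       ToyObject *tmp = holder->first;
       toy_incref(value);
       holder->first = value;
       if (tmp != NULL) {
           toy_decref(tmp);
       }
   }

When the old value's destructor runs inside ``toy_set_first``, the holder
already points at the new value, so the callback never observes an object
that is in the middle of being destroyed.  Had the decrement come first, the
callback would have read a pointer to the dying object.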

We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::

   static PyMemberDef Custom_members[] = {
       {"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
        "first name"},
       {"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
        "last name"},
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::

   .tp_members = Custom_members,

Each member definition has a member name, type, offset, access flags and
documentation string.  See the :ref:`Generic-Attribute-Management` section
below for details.
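
To see why a name, a type code and an offset are enough to expose a field,
consider the following self-contained sketch.  The ``ToyMemberDef`` table and
its helpers are hypothetical simplifications written for this example and do
not mirror the real :c:type:`PyMemberDef` machinery in detail.

.. code-block:: c

   #include <stddef.h>
   #include <string.h>

   typedef struct {
       long refcnt;
       const char *first;
       int number;
   } ToyCustom;

   typedef enum { TOY_T_STRING, TOY_T_INT } ToyMemberType;

   /* Hypothetical, simplified analogue of a member definition. */
   typedef struct {
       const char *name;
       ToyMemberType type;
       size_t offset;
   } ToyMemberDef;

   static const ToyMemberDef toy_members[] = {
       {"first", TOY_T_STRING, offsetof(ToyCustom, first)},
       {"number", TOY_T_INT, offsetof(ToyCustom, number)},
       {NULL, TOY_T_INT, 0}  /* Sentinel */
   };

   /* Locate a member by name; returns NULL when it is not in the table. */
   static const ToyMemberDef *
   toy_find_member(const char *name)
   {
       for (const ToyMemberDef *m = toy_members; m->name != NULL; m++) {
           if (strcmp(m->name, name) == 0) {
               return m;
           }
       }
       return NULL;
   }

   /* Generic reader: the table supplies the offset, so one function can
      serve every int member of every instance. */
   static int
   toy_get_int(const ToyCustom *obj, const ToyMemberDef *m)
   {
       return *(const int *)((const char *)obj + m->offset);
   }

The generic reader knows nothing about ``ToyCustom`` beyond what the table
tells it, which is also why such a table cannot enforce anything more
specific than the declared storage type.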

A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes.  We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to ``NULL``.  Even
though we can make sure the members are initialized to non-``NULL`` values, the
members can be set to ``NULL`` if the attributes are deleted.

We define a single method, :meth:`Custom.name()`, that outputs the object's name as the
concatenation of the first and last names. ::

   static PyObject *
   Custom_name(CustomObject *self, PyObject *Py_UNUSED(ignored))
   {
       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }
       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }
       return PyUnicode_FromFormat("%S %S", self->first, self->last);
   }

The method is implemented as a C function that takes a :class:`Custom` (or
:class:`Custom` subclass) instance as the first argument.  Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method:

.. code-block:: python

   def name(self):
       return "%s %s" % (self.first, self.last)

Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are ``NULL``.  This is because they can be deleted, in which
case they are set to ``NULL``.  It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings.  We'll see how to
do that in the next section.

Now that we've defined the method, we need to create an array of method
definitions::

   static PyMethodDef Custom_methods[] = {
       {"name", (PyCFunction) Custom_name, METH_NOARGS,
        "Return the name, combining the first and last name"
       },
       {NULL}  /* Sentinel */
   };

(note that we used the :const:`METH_NOARGS` flag to indicate that the method
is expecting no arguments other than *self*)

and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::

   .tp_methods = Custom_methods,

Finally, we'll make our type usable as a base class for subclassing.  We've
written our methods carefully so far so that they don't make any assumptions
about the type of the object being created or used, so all we need to do is
to add the :const:`Py_TPFLAGS_BASETYPE` to our class flag definition::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,

We rename :c:func:`PyInit_custom` to :c:func:`PyInit_custom2`, update the
module name in the :c:type:`PyModuleDef` struct, and update the full class
name in the :c:type:`PyTypeObject` struct.

Finally, we update our :file:`setup.py` file to build the new module:

.. code-block:: python

   from distutils.core import setup, Extension
   setup(name="custom", version="1.0",
         ext_modules=[
            Extension("custom", ["custom.c"]),
            Extension("custom2", ["custom2.c"]),
            ])


Providing finer control over data attributes
============================================

In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Custom` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.

.. literalinclude:: ../includes/custom3.c


To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions.  Here are the functions for
getting and setting the :attr:`first` attribute::

   static PyObject *
   Custom_getfirst(CustomObject *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
   {
       PyObject *tmp;
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }
       if (!PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a string");
           return -1;
       }
       tmp = self->first;
       Py_INCREF(value);
       self->first = value;
       Py_DECREF(tmp);
       return 0;
   }

The getter function is passed a :class:`Custom` object and a "closure", which is
a void pointer.  In this case, the closure is ignored.  (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)

The setter function is passed the :class:`Custom` object, the new value, and the
closure.  The new value may be ``NULL``, in which case the attribute is being
deleted.  In our setter, we raise an error if the attribute is deleted or if its
new value is not a string.

We create an array of :c:type:`PyGetSetDef` structures::

   static PyGetSetDef Custom_getsetters[] = {
       {"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
        "first name", NULL},
       {"last", (getter) Custom_getlast, (setter) Custom_setlast,
        "last name", NULL},
       {NULL}  /* Sentinel */
   };

and register it in the :c:member:`~PyTypeObject.tp_getset` slot::

   .tp_getset = Custom_getsetters,

The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
above.  In this case, we aren't using a closure, so we just pass ``NULL``.
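
For readers curious about the advanced usage mentioned earlier, the idea of
one getter/setter pair serving several attributes through the closure can be
sketched in plain C.  The ``Toy*`` types below are invented stand-ins that
store C strings instead of Python objects, so this shows only the dispatch
technique, not real extension code.

.. code-block:: c

   #include <stddef.h>

   typedef struct {
       long refcnt;
       const char *first;
       const char *last;
   } ToyCustom;

   /* Toy analogues of getter/setter signatures with a closure pointer. */
   typedef const char *(*toy_getter)(ToyCustom *self, void *closure);
   typedef int (*toy_setter)(ToyCustom *self, const char *value, void *closure);

   typedef struct {
       const char *name;
       toy_getter get;
       toy_setter set;
       void *closure;
   } ToyGetSetDef;

   /* Per-attribute data handed to the shared functions via the closure. */
   static size_t first_offset = offsetof(ToyCustom, first);
   static size_t last_offset = offsetof(ToyCustom, last);

   /* Resolve the closure to the address of the selected field. */
   static const char **
   toy_slot(ToyCustom *self, void *closure)
   {
       size_t offset = *(size_t *)closure;
       return (const char **)((char *)self + offset);
   }

   static const char *
   toy_get_string(ToyCustom *self, void *closure)
   {
       return *toy_slot(self, closure);
   }

   static int
   toy_set_string(ToyCustom *self, const char *value, void *closure)
   {
       if (value == NULL) {
           return -1;      /* refuse deletion, like the setter above */
       }
       *toy_slot(self, closure) = value;
       return 0;
   }

   static ToyGetSetDef toy_getsetters[] = {
       {"first", toy_get_string, toy_set_string, &first_offset},
       {"last", toy_get_string, toy_set_string, &last_offset},
       {NULL, NULL, NULL, NULL}  /* Sentinel */
   };

Both table entries share the same two functions; only the closure differs.
Whether this is worth the indirection depends on how many similar attributes
a type has.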

We also remove the member definitions for these attributes::

   static PyMemberDef Custom_members[] = {
       {"number", T_INT, offsetof(CustomObject, number), 0,
        "custom number"},
       {NULL}  /* Sentinel */
   };

We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
allow strings [#]_ to be passed::

   static int
   Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
   {
       static char *kwlist[] = {"first", "last", "number", NULL};
       PyObject *first = NULL, *last = NULL, *tmp;

       if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                        &first, &last,
                                        &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }
       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }
       return 0;
   }

With these changes, we can ensure that the ``first`` and ``last`` members are
never ``NULL``, so we can remove checks for ``NULL`` values in almost all cases.
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls.  The only place we can't change these calls is in
the ``tp_dealloc`` implementation, where there is the possibility that the
initialization of these members failed in ``tp_new``.

We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.


Supporting cyclic garbage collection
====================================

Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
can identify unneeded objects even when their reference counts are not zero.
This can happen when objects are involved in cycles.  For example, consider:

.. code-block:: pycon

   >>> l = []
   >>> l.append(l)
   >>> del l

In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.
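
The reason plain reference counting cannot reclaim such objects can be
reproduced with a toy model in C.  The ``ToyNode`` structure and its helpers
are invented for this sketch; a ``freed`` flag stands in for actually
releasing memory so that the outcome can be inspected.

.. code-block:: c

   #include <stddef.h>

   typedef struct ToyNode ToyNode;

   struct ToyNode {
       long refcnt;
       ToyNode *other;   /* strong reference to another node, or NULL */
       int freed;        /* set instead of releasing memory, for inspection */
   };

   /* Drop one reference; on reaching zero, release what the node holds. */
   static void
   toy_decref(ToyNode *n)
   {
       if (--n->refcnt == 0) {
           ToyNode *other = n->other;
           n->other = NULL;
           n->freed = 1;
           if (other != NULL) {
               toy_decref(other);
           }
       }
   }

   /* Make `from` hold a strong reference to `to`. */
   static void
   toy_link(ToyNode *from, ToyNode *to)
   {
       to->refcnt++;
       from->other = to;
   }

If node ``a`` merely points at node ``b``, dropping the outside references
frees both.  If the two nodes point at each other, each keeps the other's
count at one after the outside references are gone, and neither is ever
marked as freed.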

In the second version of the :class:`Custom` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes [#]_.
Besides, in the second and third versions, we allowed subclassing
:class:`Custom`, and subclasses may add arbitrary attributes.  For either of
those two reasons, :class:`Custom` objects can participate in cycles:

.. code-block:: pycon

   >>> import custom3
   >>> class Derived(custom3.Custom): pass
   ...
   >>> n = Derived()
   >>> n.some_attribute = n

To allow a :class:`Custom` instance participating in a reference cycle to
be properly detected and collected by the cyclic GC, our :class:`Custom` type
needs to fill two additional slots and to enable a flag that enables these slots:

.. literalinclude:: ../includes/custom4.c


First, the traversal method lets the cyclic GC know about subobjects that could
participate in cycles::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       int vret;
       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }
       return 0;
   }

For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method. The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method.  It returns an integer value that must be
returned if it is non-zero.
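
The traversal method does not know or care what the visitor does with each
subobject.  The following stand-alone sketch, with invented ``Toy*`` types
mirroring the shape of the code above, pairs one traversal function with two
different hypothetical visitors to illustrate that separation.

.. code-block:: c

   #include <stddef.h>

   typedef struct {
       long refcnt;
   } ToyObject;

   typedef int (*toy_visitproc)(ToyObject *child, void *arg);

   typedef struct {
       ToyObject base;
       ToyObject *first;
       ToyObject *last;
   } ToyCustom;

   /* Report each non-NULL subobject; stop early on a non-zero result. */
   static int
   toy_traverse(ToyCustom *self, toy_visitproc visit, void *arg)
   {
       int vret;
       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }
       return 0;
   }

   /* One possible visitor: count the subobjects that were reported. */
   static int
   count_visit(ToyObject *child, void *arg)
   {
       (void)child;
       ++*(int *)arg;
       return 0;
   }

   /* Another visitor: abort the traversal at the first subobject. */
   static int
   abort_visit(ToyObject *child, void *arg)
   {
       (void)child;
       (void)arg;
       return 7;
   }

With ``count_visit`` the traversal runs to completion and returns ``0``; with
``abort_visit`` it stops at the first subobject and passes the visitor's
non-zero value straight back to the caller.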

Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions.  With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
in ``Custom_traverse``::

   static int
   Custom_traverse(CustomObject *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }

.. note::
   The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
   arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.

Second, we need to provide a method for clearing any subobjects that can
participate in cycles::

   static int
   Custom_clear(CustomObject *self)
   {
       Py_CLEAR(self->first);
       Py_CLEAR(self->last);
       return 0;
   }

Notice the use of the :c:func:`Py_CLEAR` macro.  It is the recommended and safe
way to clear data attributes of arbitrary types while decrementing
their reference counts.  If you were to call :c:func:`Py_XDECREF` instead
on the attribute before setting it to ``NULL``, there is a possibility
that the attribute's destructor would call back into code that reads the
attribute again (*especially* if there is a reference cycle).

.. note::
   You could emulate :c:func:`Py_CLEAR` by writing::

      PyObject *tmp;
      tmp = self->first;
      self->first = NULL;
      Py_XDECREF(tmp);

   Nevertheless, it is much easier and less error-prone to always
   use :c:func:`Py_CLEAR` when deleting an attribute.  Don't
   try to micro-optimize at the expense of robustness!

The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
attributes.  This means the cyclic GC can be triggered inside the function.
Since the GC assumes the reference count is not zero, we need to untrack the
object from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing members.
Here is our reimplemented deallocator using :c:func:`PyObject_GC_UnTrack`
and ``Custom_clear``::

   static void
   Custom_dealloc(CustomObject *self)
   {
       PyObject_GC_UnTrack(self);
       Custom_clear(self);
       Py_TYPE(self)->tp_free((PyObject *) self);
   }

Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::

   .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,

That's pretty much it.  If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
garbage collection.  Most extensions will use the versions automatically provided.


Subclassing other types
=======================

It is possible to create new extension types that are derived from existing
types.  It is easiest to inherit from the built-in types, since an extension can
easily use the :c:type:`PyTypeObject` it needs.  It can be difficult to share
these :c:type:`PyTypeObject` structures between extension modules.

In this example we will create a :class:`SubList` type that inherits from the
built-in :class:`list` type.  The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter:

.. code-block:: pycon

   >>> import sublist
   >>> s = sublist.SubList(range(3))
   >>> s.extend(s)
   >>> print(len(s))
   6
   >>> print(s.increment())
   1
   >>> print(s.increment())
   2

.. literalinclude:: ../includes/sublist.c


As you can see, the source code closely resembles the :class:`Custom` examples in
previous sections.  We will break down the main differences between them. ::

   typedef struct {
       PyListObject list;
       int state;
   } SubListObject;

The primary difference for derived type objects is that the base type's
object structure must be the first member of the struct.  The base type will
already include the :c:func:`PyObject_HEAD` at the beginning of its structure.

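Why the base structure has to come first can be seen in plain C, without the
Python headers.  In this minimal sketch, ``BaseObject`` and ``DerivedObject``
are hypothetical stand-ins for :c:type:`PyListObject` and ``SubListObject``,
and the two helper functions are invented for the illustration.  Because the
base is the first member, a pointer to the derived object and a pointer to its
embedded base refer to the same address:

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical stand-in for the base type's object structure. */
   typedef struct {
       int refcnt;
       int length;
   } BaseObject;

   /* Hypothetical derived structure: the base must be the first member. */
   typedef struct {
       BaseObject base;
       int state;
   } DerivedObject;

   /* Code that only knows about the base layout. */
   static int
   base_length(const BaseObject *obj)
   {
       assert(obj != NULL);
       return obj->length;
   }

   /* Code that knows the object is really a DerivedObject. */
   static int
   derived_increment(BaseObject *obj)
   {
       assert(obj != NULL);
       DerivedObject *self = (DerivedObject *) obj;
       return ++self->state;
   }

Code written against the base layout keeps working on derived objects, while
code that knows the real type can cast back to reach the extra ``state``
field.
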
When a Python object is a :class:`SubList` instance, its ``PyObject *`` pointer
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::

   static int
   SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }

We see above how to call through to the :meth:`__init__` method of the base
type.

This pattern is important when writing a type with custom
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
members.  The :c:member:`~PyTypeObject.tp_new` handler should not actually
allocate the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
but let the base class handle it by calling the base class's own
:c:member:`~PyTypeObject.tp_new`.

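The delegation described above can be modelled in plain C.  This is only a
sketch under stated assumptions: ``MiniType``, its ``basicsize``, ``alloc`` and
``create`` fields, and all the functions below are hypothetical names that
loosely mirror :c:member:`~PyTypeObject.tp_basicsize`,
:c:member:`~PyTypeObject.tp_alloc` and :c:member:`~PyTypeObject.tp_new`; none
of them belong to the C API.  The derived constructor never allocates; it
hands its own type to the base constructor and only initializes what it adds:

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdlib.h>

   /* Hypothetical miniature of a type object. */
   typedef struct MiniType MiniType;
   struct MiniType {
       size_t basicsize;                        /* like tp_basicsize */
       void *(*alloc)(const MiniType *type);    /* like tp_alloc */
       void *(*create)(const MiniType *type);   /* like tp_new */
   };

   typedef struct {
       const MiniType *type;
       int length;
   } BaseObject;

   typedef struct {
       BaseObject base;   /* base structure first */
       int state;
   } DerivedObject;

   /* Allocate zeroed memory sized for whichever type was requested. */
   static void *
   generic_alloc(const MiniType *type)
   {
       return calloc(1, type->basicsize);
   }

   /* Base constructor: allocates through the type it is handed. */
   static void *
   base_new(const MiniType *type)
   {
       assert(type->basicsize >= sizeof(BaseObject));
       BaseObject *self = type->alloc(type);
       if (self == NULL)
           return NULL;
       self->type = type;
       self->length = 0;
       return self;
   }

   static const MiniType BaseType = {
       sizeof(BaseObject), generic_alloc, base_new
   };

   /* Derived constructor: delegates allocation to the base constructor. */
   static void *
   derived_new(const MiniType *type)
   {
       DerivedObject *self = BaseType.create(type);
       if (self == NULL)
           return NULL;
       self->state = 0;
       return self;
   }

   static const MiniType DerivedType = {
       sizeof(DerivedObject), generic_alloc, derived_new
   };

Because ``base_new`` receives ``&DerivedType``, it allocates
``sizeof(DerivedObject)`` bytes, so ``derived_new`` only has to set up
``state``.
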
The :c:type:`PyTypeObject` struct supports a :c:member:`~PyTypeObject.tp_base`
field specifying the type's concrete base class.  Due to cross-platform compiler
issues, you can't fill that field directly with a reference to
:c:data:`PyList_Type`; it should be done later in the module initialization
function::

   PyMODINIT_FUNC
   PyInit_sublist(void)
   {
       PyObject* m;
       SubListType.tp_base = &PyList_Type;
       if (PyType_Ready(&SubListType) < 0)
           return NULL;

       m = PyModule_Create(&sublistmodule);
       if (m == NULL)
           return NULL;

       Py_INCREF(&SubListType);
       if (PyModule_AddObject(m, "SubList", (PyObject *) &SubListType) < 0) {
           Py_DECREF(&SubListType);
           Py_DECREF(m);
           return NULL;
       }

       return m;
   }

Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in.  When we are deriving an
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_alloc`
slot with :c:func:`PyType_GenericAlloc`; the allocation function from the base
type will be inherited.

After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Custom` examples.


.. rubric:: Footnotes

.. [#] This is true when we know that the object is a basic type, like a string or a
   float.

.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
   in this example, because our type doesn't support garbage collection.

.. [#] We now know that the first and last members are strings, so perhaps we
   could be less careful about decrementing their reference counts.  However,
   we accept instances of string subclasses.  Even though deallocating normal
   strings won't call back into our objects, we can't guarantee that deallocating
   an instance of a string subclass won't call back into our objects.

.. [#] Also, even with our attributes restricted to string instances, the user
   could pass arbitrary :class:`str` subclasses and therefore still create
   reference cycles.
