:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.

.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/sax/__init__.py`

--------------

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python.  The package itself provides the
SAX exceptions and the convenience functions that users of the SAX API will
call most often.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data.  If you need to parse untrusted or unauthenticated data, see
   :ref:`xml-vulnerabilities`.

.. versionchanged:: 3.7.1

   The SAX parser no longer processes general external entities by default
   to increase security. Previously, the parser created network connections
   to fetch remote files or loaded local files from the file
   system for DTD and entities. The feature can be enabled again by calling
   :meth:`~xml.sax.xmlreader.XMLReader.setFeature` on the parser object
   with :data:`~xml.sax.handler.feature_external_ges` as the feature name and
   ``True`` as the state.
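
As a minimal sketch of re-enabling the feature, assuming the default
expat-based reader and input that comes from a trusted source, the flag can be
inspected and switched on as shown here.  The variable names are illustrative
only.

.. code-block:: python

   import xml.sax
   from xml.sax.handler import feature_external_ges

   parser = xml.sax.make_parser()

   # Since 3.7.1 the feature is expected to be off on a fresh parser.
   default_state = parser.getFeature(feature_external_ges)

   # Opt back in; only reasonable for documents from a trusted source.
   parser.setFeature(feature_external_ges, True)
   enabled_state = parser.getFeature(feature_external_ges)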

The convenience functions are:


.. function:: make_parser(parser_list=[])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object.  The
   first parser found will be used.  If *parser_list* is provided, it must be a
   list of strings which name modules that have a function named
   :func:`create_parser`.  Modules listed in *parser_list* will be used before
   modules in the default list of parsers.
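
   The following sketch shows both call styles.  It assumes that the standard
   library module ``xml.sax.expatreader``, which defines ``create_parser``, is
   importable, as it is in a typical CPython installation.

   .. code-block:: python

      import xml.sax
      from xml.sax.xmlreader import XMLReader

      # With no arguments, only the default list of parsers is searched.
      default_parser = xml.sax.make_parser()

      # Modules named in parser_list are tried first; the default list
      # remains the fallback if none of them can be imported.
      preferred_parser = xml.sax.make_parser(["xml.sax.expatreader"])

      both_are_readers = (isinstance(default_parser, XMLReader)
                          and isinstance(preferred_parser, XMLReader))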


.. function:: parse(filename_or_stream, handler, error_handler=handler.ErrorHandler())

   Create a SAX parser and use it to parse a document.  The document, passed in as
   *filename_or_stream*, can be a filename or a file object.  The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance.  If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.
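
   A minimal sketch of a call to :func:`parse`, using an in-memory file object
   in place of a file on disk.  ``ElementCounter`` is a hypothetical handler
   written for this example; it is not part of the package.

   .. code-block:: python

      import io
      import xml.sax

      class ElementCounter(xml.sax.ContentHandler):
          """Hypothetical handler that counts start tags by element name."""

          def __init__(self):
              super().__init__()
              self.counts = {}

          def startElement(self, name, attrs):
              self.counts[name] = self.counts.get(name, 0) + 1

      document = io.BytesIO(b"<root><item/><item/></root>")
      counter = ElementCounter()

      # parse() returns None; the results live on the handler.
      result = xml.sax.parse(document, counter)

   After the call, ``counter.counts`` holds one entry for ``root`` and a count
   of two for ``item``.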


.. function:: parseString(string, handler, error_handler=handler.ErrorHandler())

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter.  *string* must be a :class:`str` instance or a
   :term:`bytes-like object`.

   .. versionchanged:: 3.5
      Added support for :class:`str` instances.
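
   As an illustration, the same document can be passed as either :class:`str`
   or :class:`bytes`.  ``TextCollector`` is a hypothetical handler for this
   sketch; character data may arrive in several chunks, so the pieces are
   joined afterwards.

   .. code-block:: python

      import xml.sax

      class TextCollector(xml.sax.ContentHandler):
          """Hypothetical handler that gathers character data."""

          def __init__(self):
              super().__init__()
              self.chunks = []

          def characters(self, content):
              self.chunks.append(content)

      from_str = TextCollector()
      xml.sax.parseString("<greeting>hello</greeting>", from_str)

      from_bytes = TextCollector()
      xml.sax.parseString(b"<greeting>hello</greeting>", from_bytes)

      text_from_str = "".join(from_str.chunks)
      text_from_bytes = "".join(from_bytes.chunks)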

A typical SAX application uses three kinds of objects: readers, handlers and
input sources.  "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source and
produces a sequence of events. The events then get distributed to the handler
objects, i.e. the reader invokes a method on the handler.  A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect all of these objects together.  As the final step of
preparation, the reader is called to parse the input. During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.
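
The steps described above can be sketched as follows.  ``EventLogger`` is a
hypothetical handler, and the in-memory byte stream stands in for whatever
input source an application would really open.

.. code-block:: python

   import io
   import xml.sax

   class EventLogger(xml.sax.ContentHandler):
       """Hypothetical handler that records element events in order."""

       def __init__(self):
           super().__init__()
           self.events = []

       def startElement(self, name, attrs):
           self.events.append(("start", name))

       def endElement(self, name):
           self.events.append(("end", name))

   # 1. Obtain a reader.
   reader = xml.sax.make_parser()

   # 2. Create the input source.
   source = xml.sax.InputSource()
   source.setByteStream(io.BytesIO(b"<a><b/></a>"))

   # 3. Create the handler and connect it to the reader.
   handler = EventLogger()
   reader.setContentHandler(handler)

   # 4. Ask the reader to parse; it calls back into the handler.
   reader.parse(source)

Doing the wiring by hand, rather than calling :func:`parse`, is mainly useful
when the reader needs to be configured before parsing starts.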

For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself.  Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes.  The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`.  The handler interfaces are defined in
:mod:`xml.sax.handler`.  For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often
instantiated directly) and the handler classes are also available from
:mod:`xml.sax`.  These interfaces are described below.
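
The convenience names can be checked with a short sketch.  It only compares
class objects and assumes the handler classes in question are
``ContentHandler`` and ``ErrorHandler``.

.. code-block:: python

   import xml.sax
   import xml.sax.handler
   import xml.sax.xmlreader

   # Each name in xml.sax refers to the class defined in the submodule.
   same_input_source = xml.sax.InputSource is xml.sax.xmlreader.InputSource
   same_content_handler = xml.sax.ContentHandler is xml.sax.handler.ContentHandler
   same_error_handler = xml.sax.ErrorHandler is xml.sax.handler.ErrorHandler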

In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg, exception=None)

   Encapsulate an XML error or warning.  This class can contain basic error or
   warning information from either the XML parser or the application: it can be
   subclassed to provide additional functionality or to add localization.  Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception --- it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.
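
   A sketch of subclassing to carry extra information.  ``ConfigError`` and
   its ``setting`` attribute are hypothetical names invented for this
   example.

   .. code-block:: python

      from xml.sax import SAXException

      class ConfigError(SAXException):
          """Hypothetical subclass that records which setting was invalid."""

          def __init__(self, msg, setting, exception=None):
              super().__init__(msg, exception)
              self.setting = setting

      try:
          int("not a number")
      except ValueError as caught:
          # Pass the caught exception along as information.
          error = ConfigError("invalid value for 'timeout'", "timeout", caught)

   The instance is built here without being raised, which matches its use as
   a container for information.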


.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors. Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error.  This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.
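
   For illustration, the sketch below feeds a malformed document to
   :func:`parseString` with the default error handler and reads the position
   through the locator methods.  The exact message text depends on the
   underlying parser, so it is stored rather than compared.

   .. code-block:: python

      import xml.sax

      malformed = "<root><unclosed></root>"

      try:
          xml.sax.parseString(malformed, xml.sax.ContentHandler())
      except xml.sax.SAXParseException as exc:
          message = exc.getMessage()        # SAXException interface
          line = exc.getLineNumber()        # Locator interface
          column = exc.getColumnNumber()    # Locator interface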


.. exception:: SAXNotRecognizedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property.  SAX applications and
   extensions may use this class for similar purposes.


.. exception:: SAXNotSupportedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support.  SAX applications and extensions may use this
   class for similar purposes.
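
   The difference between the two feature-related exceptions can be sketched
   as follows.  This assumes the default expat-based reader, which knows the
   validation feature but cannot enable it; the feature URI in the first call
   is a made-up name used only to trigger the error.

   .. code-block:: python

      import xml.sax
      from xml.sax.handler import feature_validation

      parser = xml.sax.make_parser()

      try:
          # Hypothetical feature name that no reader is expected to know.
          parser.getFeature("http://example.org/features/made-up")
      except xml.sax.SAXNotRecognizedException as exc:
          unrecognized = exc.getMessage()

      try:
          # A known feature that the expat-based reader cannot turn on.
          parser.setFeature(feature_validation, True)
      except xml.sax.SAXNotSupportedException as exc:
          unsupported = exc.getMessage()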


.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API.  It provides a
      Java implementation and online documentation.  Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.
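
A brief sketch of both accessors, using instances created directly rather than
ones produced by a parser.  The message strings and the wrapped
:exc:`KeyError` are arbitrary examples.

.. code-block:: python

   from xml.sax import SAXException

   cause = KeyError("id")
   wrapped = SAXException("record is missing a required attribute", cause)
   plain = SAXException("something went wrong")

   wrapped_message = wrapped.getMessage()
   wrapped_cause = wrapped.getException()   # the same object as cause
   plain_cause = plain.getException()       # None, nothing was wrapped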