1
2.. _importsystem:
3
4*****************
5The import system
6*****************
7
8.. index:: single: import machinery
9
10Python code in one :term:`module` gains access to the code in another module
11by the process of :term:`importing` it.  The :keyword:`import` statement is
12the most common way of invoking the import machinery, but it is not the only
13way.  Functions such as :func:`importlib.import_module` and built-in
14:func:`__import__` can also be used to invoke the import machinery.
15
16The :keyword:`import` statement combines two operations; it searches for the
17named module, then it binds the results of that search to a name in the local
18scope.  The search operation of the :keyword:`import` statement is defined as
19a call to the :func:`__import__` function, with the appropriate arguments.
20The return value of :func:`__import__` is used to perform the name
21binding operation of the :keyword:`import` statement.  See the
22:keyword:`import` statement for the exact details of that name binding
23operation.
24
25A direct call to :func:`__import__` performs only the module search and, if
26found, the module creation operation.  While certain side-effects may occur,
27such as the importing of parent packages, and the updating of various caches
28(including :data:`sys.modules`), only the :keyword:`import` statement performs
29a name binding operation.
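As a rough illustration of that difference, the sketch below uses the standard
library package ``email``; the variable names ``top`` and ``leaf`` exist only
for this example.  With a dotted name and an empty ``fromlist``, a direct call
to :func:`__import__` returns the top-level package, whereas
:func:`importlib.import_module` returns the module that was named.  In both
cases the only names bound are the ones assigned explicitly.

```python
import importlib
import sys

# Both calls search for and load the module; neither binds a name by itself.
top = __import__("email.mime.text")
leaf = importlib.import_module("email.mime.text")

print(top.__name__)    # name of the top-level package
print(leaf.__name__)   # name of the submodule that was requested
print(leaf is sys.modules["email.mime.text"])
```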
30
When an :keyword:`import` statement is executed, the standard builtin
:func:`__import__` function is called. Other mechanisms for invoking the
import system (such as :func:`importlib.import_module`) may choose to bypass
:func:`__import__` and use their own solutions to implement import semantics.
36
When a module is first imported, Python searches for the module and, if it is
found, creates and initializes a module object [#fnmo]_.  If the named module
cannot be found, a :exc:`ModuleNotFoundError` is raised.  Python implements
various strategies to search for the named module when the import machinery is
invoked.  These strategies can be modified and extended by using various hooks
described in the sections below.
43
44.. versionchanged:: 3.3
45   The import system has been updated to fully implement the second phase
46   of :pep:`302`. There is no longer any implicit import machinery - the full
47   import system is exposed through :data:`sys.meta_path`. In addition,
48   native namespace package support has been implemented (see :pep:`420`).
49
50
51:mod:`importlib`
52================
53
The :mod:`importlib` module provides a rich API for interacting with the
import system.  For example, :func:`importlib.import_module` provides a
recommended, simpler API than the built-in :func:`__import__` for invoking the
import machinery.  Refer to the :mod:`importlib` library documentation for
additional detail.
59
60
61
62Packages
63========
64
65.. index::
66    single: package
67
68Python has only one type of module object, and all modules are of this type,
69regardless of whether the module is implemented in Python, C, or something
70else.  To help organize modules and provide a naming hierarchy, Python has a
71concept of :term:`packages <package>`.
72
73You can think of packages as the directories on a file system and modules as
74files within directories, but don't take this analogy too literally since
75packages and modules need not originate from the file system.  For the
76purposes of this documentation, we'll use this convenient analogy of
77directories and files.  Like file system directories, packages are organized
78hierarchically, and packages may themselves contain subpackages, as well as
79regular modules.
80
81It's important to keep in mind that all packages are modules, but not all
82modules are packages.  Or put another way, packages are just a special kind of
83module.  Specifically, any module that contains a ``__path__`` attribute is
84considered a package.
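A minimal check of this rule, using the standard library package ``json`` and
its submodule ``json.decoder``; the helper name ``is_package`` is an
assumption made for this sketch and is not part of any API:

```python
import json
import json.decoder


def is_package(module):
    """Return True if the module object has a ``__path__`` attribute."""
    return hasattr(module, "__path__")


print(is_package(json))           # a package
print(is_package(json.decoder))   # a plain module inside that package
```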
85
86All modules have a name.  Subpackage names are separated from their parent
87package name by dots, akin to Python's standard attribute access syntax.  Thus
88you might have a module called :mod:`sys` and a package called :mod:`email`,
89which in turn has a subpackage called :mod:`email.mime` and a module within
90that subpackage called :mod:`email.mime.text`.
91
92
93Regular packages
94----------------
95
96.. index::
97    pair: package; regular
98
99Python defines two types of packages, :term:`regular packages <regular
100package>` and :term:`namespace packages <namespace package>`.  Regular
101packages are traditional packages as they existed in Python 3.2 and earlier.
102A regular package is typically implemented as a directory containing an
103``__init__.py`` file.  When a regular package is imported, this
104``__init__.py`` file is implicitly executed, and the objects it defines are
105bound to names in the package's namespace.  The ``__init__.py`` file can
106contain the same Python code that any other module can contain, and Python
107will add some additional attributes to the module when it is imported.
108
109For example, the following file system layout defines a top level ``parent``
110package with three subpackages::
111
    parent/
        __init__.py
        one/
            __init__.py
        two/
            __init__.py
        three/
            __init__.py
120
121Importing ``parent.one`` will implicitly execute ``parent/__init__.py`` and
122``parent/one/__init__.py``.  Subsequent imports of ``parent.two`` or
123``parent.three`` will execute ``parent/two/__init__.py`` and
124``parent/three/__init__.py`` respectively.
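The behaviour can be observed with a throwaway copy of a similar layout.  This
is only a sketch: the package name ``demo_parent`` and the temporary directory
are assumptions chosen to avoid clashing with real packages, and only two
subpackages are created.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the package tree in a temporary directory.
root = Path(tempfile.mkdtemp())
for package in ("demo_parent", "demo_parent/one", "demo_parent/two"):
    package_dir = root / package
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
try:
    # Importing the subpackage also imports (and executes) its parent,
    # but not its sibling.
    importlib.import_module("demo_parent.one")
    loaded = sorted(n for n in sys.modules if n.startswith("demo_parent"))
finally:
    sys.path.remove(str(root))

print(loaded)
```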
125
126
127Namespace packages
128------------------
129
.. index::
    pair: package; namespace
    pair: package; portion
133
134A namespace package is a composite of various :term:`portions <portion>`,
135where each portion contributes a subpackage to the parent package.  Portions
136may reside in different locations on the file system.  Portions may also be
137found in zip files, on the network, or anywhere else that Python searches
138during import.  Namespace packages may or may not correspond directly to
139objects on the file system; they may be virtual modules that have no concrete
140representation.
141
142Namespace packages do not use an ordinary list for their ``__path__``
143attribute. They instead use a custom iterable type which will automatically
144perform a new search for package portions on the next import attempt within
145that package if the path of their parent package (or :data:`sys.path` for a
146top level package) changes.
147
148With namespace packages, there is no ``parent/__init__.py`` file.  In fact,
149there may be multiple ``parent`` directories found during import search, where
150each one is provided by a different portion.  Thus ``parent/one`` may not be
151physically located next to ``parent/two``.  In this case, Python will create a
152namespace package for the top-level ``parent`` package whenever it or one of
153its subpackages is imported.
154
155See also :pep:`420` for the namespace package specification.
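The following sketch builds a small namespace package from two portions.  The
name ``demo_ns`` and the temporary directories are assumptions made for the
example, and for brevity each portion contributes a plain submodule rather
than a subpackage.  Neither directory contains an ``__init__.py`` file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two separate locations, each holding one portion of ``demo_ns``.
portions = []
for submodule in ("one", "two"):
    location = Path(tempfile.mkdtemp())
    (location / "demo_ns").mkdir()
    source = f"NAME = {submodule!r}\n"
    (location / "demo_ns" / f"{submodule}.py").write_text(source)
    portions.append(str(location))

sys.path[:0] = portions
importlib.invalidate_caches()
try:
    one = importlib.import_module("demo_ns.one")
    two = importlib.import_module("demo_ns.two")
    namespace = sys.modules["demo_ns"]
    # ``__path__`` is a custom iterable, so copy it into a list to inspect it.
    search_locations = list(namespace.__path__)
finally:
    for entry in portions:
        sys.path.remove(entry)

print(one.NAME, two.NAME)
print(len(search_locations))
```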
156
157
158Searching
159=========
160
161To begin the search, Python needs the :term:`fully qualified <qualified name>`
162name of the module (or package, but for the purposes of this discussion, the
163difference is immaterial) being imported.  This name may come from various
164arguments to the :keyword:`import` statement, or from the parameters to the
165:func:`importlib.import_module` or :func:`__import__` functions.
166
This name will be used in various phases of the import search, and it may be
the dotted path to a submodule, e.g. ``foo.bar.baz``.  In this case, Python
first tries to import ``foo``, then ``foo.bar``, and finally ``foo.bar.baz``.
If any of the intermediate imports fail, a :exc:`ModuleNotFoundError` is
raised.
171
172
173The module cache
174----------------
175
176.. index::
177    single: sys.modules
178
179The first place checked during import search is :data:`sys.modules`.  This
180mapping serves as a cache of all modules that have been previously imported,
181including the intermediate paths.  So if ``foo.bar.baz`` was previously
182imported, :data:`sys.modules` will contain entries for ``foo``, ``foo.bar``,
183and ``foo.bar.baz``.  Each key will have as its value the corresponding module
184object.
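For instance, after importing the standard library module
``xml.etree.ElementTree``, each intermediate name has its own cache entry.
This is a short sketch; any dotted module name would behave the same way.

```python
import importlib
import sys

importlib.import_module("xml.etree.ElementTree")

names = ("xml", "xml.etree", "xml.etree.ElementTree")
for name in names:
    # Every component of the dotted path is cached under its full name.
    print(name, name in sys.modules)
```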
185
During import, the module name is looked up in :data:`sys.modules` and if
present, the associated value is the module satisfying the import, and the
process completes.  However, if the value is ``None``, then a
:exc:`ModuleNotFoundError` is raised.  If the module name is missing, Python
will continue searching for the module.
191
:data:`sys.modules` is writable.  Deleting a key may not destroy the
associated module (as other modules may hold references to it),
but it will invalidate the cache entry for the named module, causing
Python to search anew for the named module upon its next
import. The key can also be assigned to ``None``, forcing the next import
of the module to result in a :exc:`ModuleNotFoundError`.
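A sketch of the ``None`` entry in action, using the standard library module
``colorsys`` purely as a convenient example; the previous cache state is
restored afterwards:

```python
import sys

name = "colorsys"  # any importable module would do
saved = sys.modules.pop(name, None)
sys.modules[name] = None  # block the next import of this name
try:
    try:
        __import__(name)
        blocked = False
    except ModuleNotFoundError:
        blocked = True
finally:
    # Put the cache back the way it was.
    del sys.modules[name]
    if saved is not None:
        sys.modules[name] = saved

print(blocked)
```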
198
Beware, though: if you keep a reference to the module object,
invalidate its cache entry in :data:`sys.modules`, and then re-import the
named module, the two module objects will *not* be the same. By contrast,
:func:`importlib.reload` will reuse the *same* module object, and simply
reinitialise the module contents by rerunning the module's code.
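The contrast can be sketched as follows, again with ``colorsys`` standing in
for any module; the names ``first``, ``second`` and ``reloaded`` are only for
illustration:

```python
import importlib
import sys

import colorsys as first

# Dropping the cache entry makes the next import build a new module object.
del sys.modules["colorsys"]
second = importlib.import_module("colorsys")

# Reloading re-executes the code inside the existing module object.
reloaded = importlib.reload(second)

print(first is second)
print(reloaded is second)
```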
204
205
206Finders and loaders
207-------------------
208
209.. index::
210    single: finder
211    single: loader
212    single: module spec
213
214If the named module is not found in :data:`sys.modules`, then Python's import
215protocol is invoked to find and load the module.  This protocol consists of
216two conceptual objects, :term:`finders <finder>` and :term:`loaders <loader>`.
217A finder's job is to determine whether it can find the named module using
218whatever strategy it knows about. Objects that implement both of these
219interfaces are referred to as :term:`importers <importer>` - they return
220themselves when they find that they can load the requested module.
221
222Python includes a number of default finders and importers.  The first one
223knows how to locate built-in modules, and the second knows how to locate
224frozen modules.  A third default finder searches an :term:`import path`
225for modules.  The :term:`import path` is a list of locations that may
226name file system paths or zip files.  It can also be extended to search
227for any locatable resource, such as those identified by URLs.
228
229The import machinery is extensible, so new finders can be added to extend the
230range and scope of module searching.
231
232Finders do not actually load modules.  If they can find the named module, they
233return a :dfn:`module spec`, an encapsulation of the module's import-related
234information, which the import machinery then uses when loading the module.
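A module spec can be inspected without going through the
:keyword:`import` statement by calling :func:`importlib.util.find_spec`.  The
sketch below looks at the spec for the standard library package ``json``; the
name ``demo_missing_module_xyz`` is a deliberately nonexistent module chosen
for the example.

```python
import importlib.util

spec = importlib.util.find_spec("json")
print(spec.name)                        # fully qualified module name
print(spec.loader)                      # loader that would execute the module
print(spec.origin)                      # where the module comes from
print(spec.submodule_search_locations)  # set because ``json`` is a package

# A name that no finder can locate yields None rather than a spec.
missing = importlib.util.find_spec("demo_missing_module_xyz")
print(missing)
```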
235
236The following sections describe the protocol for finders and loaders in more
237detail, including how you can create and register new ones to extend the
238import machinery.
239
240.. versionchanged:: 3.4
241   In previous versions of Python, finders returned :term:`loaders <loader>`
242   directly, whereas now they return module specs which *contain* loaders.
243   Loaders are still used during import but have fewer responsibilities.
244
245Import hooks
246------------
247
248.. index::
249   single: import hooks
250   single: meta hooks
251   single: path hooks
252   pair: hooks; import
253   pair: hooks; meta
254   pair: hooks; path
255
The import machinery is designed to be extensible; the primary mechanism for
this is the *import hooks*.  There are two types of import hooks: *meta
hooks* and *import path hooks*.
259
260Meta hooks are called at the start of import processing, before any other
261import processing has occurred, other than :data:`sys.modules` cache look up.
262This allows meta hooks to override :data:`sys.path` processing, frozen
263modules, or even built-in modules.  Meta hooks are registered by adding new
264finder objects to :data:`sys.meta_path`, as described below.
265
266Import path hooks are called as part of :data:`sys.path` (or
267``package.__path__``) processing, at the point where their associated path
268item is encountered.  Import path hooks are registered by adding new callables
269to :data:`sys.path_hooks` as described below.
270
271
272The meta path
273-------------
274
275.. index::
276    single: sys.meta_path
277    pair: finder; find_spec
278
279When the named module is not found in :data:`sys.modules`, Python next
280searches :data:`sys.meta_path`, which contains a list of meta path finder
281objects.  These finders are queried in order to see if they know how to handle
282the named module.  Meta path finders must implement a method called
283:meth:`~importlib.abc.MetaPathFinder.find_spec()` which takes three arguments:
284a name, an import path, and (optionally) a target module.  The meta path
285finder can use any strategy it wants to determine whether it can handle
286the named module or not.
287
288If the meta path finder knows how to handle the named module, it returns a
289spec object.  If it cannot handle the named module, it returns ``None``.  If
290:data:`sys.meta_path` processing reaches the end of its list without returning
291a spec, then a :exc:`ModuleNotFoundError` is raised.  Any other exceptions
292raised are simply propagated up, aborting the import process.
293
The :meth:`~importlib.abc.MetaPathFinder.find_spec()` method of meta path
finders is called with two or three arguments.  The first is the fully
qualified name of the module being imported, for example ``foo.bar.baz``.
The second argument gives the path entries to use for the module search.  For
top-level modules, the second argument is ``None``, but for submodules or
subpackages, the second argument is the value of the parent package's
``__path__`` attribute. If the appropriate ``__path__`` attribute cannot
be accessed, a :exc:`ModuleNotFoundError` is raised.  The third argument
is an existing module object that will be the target of loading later.
The import system passes in a target module only during reload.
304
305The meta path may be traversed multiple times for a single import request.
306For example, assuming none of the modules involved has already been cached,
307importing ``foo.bar.baz`` will first perform a top level import, calling
308``mpf.find_spec("foo", None, None)`` on each meta path finder (``mpf``). After
309``foo`` has been imported, ``foo.bar`` will be imported by traversing the
310meta path a second time, calling
311``mpf.find_spec("foo.bar", foo.__path__, None)``. Once ``foo.bar`` has been
312imported, the final traversal will call
313``mpf.find_spec("foo.bar.baz", foo.bar.__path__, None)``.
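The repeated traversal can be observed with a meta path finder that records
each query and then declines to handle it.  This is a sketch under stated
assumptions: ``RecordingFinder`` and the package ``mp_demo`` are hypothetical
names created for the example, and the package is written to a temporary
directory so that nothing is cached beforehand.

```python
import importlib
import importlib.abc
import sys
import tempfile
from pathlib import Path


class RecordingFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that records queries and never finds anything."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.calls = []

    def find_spec(self, fullname, path, target=None):
        if fullname.startswith(self.prefix):
            # Record the name and whether this was a top-level query.
            self.calls.append((fullname, path is None))
        return None  # defer to the finders later on the meta path


root = Path(tempfile.mkdtemp())
inner = root / "mp_demo" / "inner"
inner.mkdir(parents=True)
(root / "mp_demo" / "__init__.py").write_text("")
(inner / "__init__.py").write_text("")
(inner / "leaf.py").write_text("")

recorder = RecordingFinder("mp_demo")
sys.meta_path.insert(0, recorder)
sys.path.insert(0, str(root))
importlib.invalidate_caches()
try:
    importlib.import_module("mp_demo.inner.leaf")
finally:
    sys.meta_path.remove(recorder)
    sys.path.remove(str(root))

for fullname, is_top_level in recorder.calls:
    print(fullname, "path=None" if is_top_level else "path=parent.__path__")
```

Because the recorder sits at the front of :data:`sys.meta_path` and always
returns ``None``, the default finders still perform the actual import.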
314
315Some meta path finders only support top level imports. These importers will
316always return ``None`` when anything other than ``None`` is passed as the
317second argument.
318
319Python's default :data:`sys.meta_path` has three meta path finders, one that
320knows how to import built-in modules, one that knows how to import frozen
321modules, and one that knows how to import modules from an :term:`import path`
322(i.e. the :term:`path based finder`).
323
324.. versionchanged:: 3.4
325   The :meth:`~importlib.abc.MetaPathFinder.find_spec` method of meta path
326   finders replaced :meth:`~importlib.abc.MetaPathFinder.find_module`, which
327   is now deprecated.  While it will continue to work without change, the
328   import machinery will try it only if the finder does not implement
329   ``find_spec()``.
330
331
332Loading
333=======
334
335If and when a module spec is found, the import machinery will use it (and
336the loader it contains) when loading the module.  Here is an approximation
337of what happens during the loading portion of import::
338
    module = None
    if spec.loader is not None and hasattr(spec.loader, 'create_module'):
        # It is assumed 'exec_module' will also be defined on the loader.
        module = spec.loader.create_module(spec)
    if module is None:
        module = ModuleType(spec.name)
    # The import-related module attributes get set here:
    _init_module_attrs(spec, module)

    if spec.loader is None:
        if spec.submodule_search_locations is not None:
            # namespace package
            sys.modules[spec.name] = module
        else:
            # unsupported
            raise ImportError
    elif not hasattr(spec.loader, 'exec_module'):
        module = spec.loader.load_module(spec.name)
        # Set __loader__ and __package__ if missing.
    else:
        sys.modules[spec.name] = module
        try:
            spec.loader.exec_module(module)
        except BaseException:
            try:
                del sys.modules[spec.name]
            except KeyError:
                pass
            raise
    return sys.modules[spec.name]
369
370Note the following details:
371
372 * If there is an existing module object with the given name in
373   :data:`sys.modules`, import will have already returned it.
374
375 * The module will exist in :data:`sys.modules` before the loader
376   executes the module code.  This is crucial because the module code may
377   (directly or indirectly) import itself; adding it to :data:`sys.modules`
378   beforehand prevents unbounded recursion in the worst case and multiple
379   loading in the best.
380
381 * If loading fails, the failing module -- and only the failing module --
382   gets removed from :data:`sys.modules`.  Any module already in the
383   :data:`sys.modules` cache, and any module that was successfully loaded
384   as a side-effect, must remain in the cache.  This contrasts with
385   reloading where even the failing module is left in :data:`sys.modules`.
386
 * After the module is created but before execution, the import machinery
   sets the import-related module attributes (``_init_module_attrs`` in
   the pseudo-code example above), as summarized in a
   :ref:`later section <import-mod-attrs>`.
391
392 * Module execution is the key moment of loading in which the module's
393   namespace gets populated.  Execution is entirely delegated to the
394   loader, which gets to decide what gets populated and how.
395
 * The module created during loading and passed to ``exec_module()`` may
   not be the one returned at the end of import [#fnlo]_.
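The same sequence of steps can be followed by hand with public
:mod:`importlib.util` helpers.  The sketch below is not the import machinery's
own code; it loads a hypothetical module ``loading_demo`` from a temporary
file and mirrors the create, register, execute, and clean-up-on-failure order
described above.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

source = Path(tempfile.mkdtemp()) / "loading_demo.py"
source.write_text("ANSWER = 6 * 7\n")

# Find: build a spec that names the file and a suitable loader.
spec = importlib.util.spec_from_file_location("loading_demo", str(source))

# Create: make the module object and set its import-related attributes.
module = importlib.util.module_from_spec(spec)

# Register before executing, so that a self-import would find the module.
sys.modules[spec.name] = module
try:
    spec.loader.exec_module(module)
except BaseException:
    # Only the failing module is removed from the cache.
    sys.modules.pop(spec.name, None)
    raise

print(module.ANSWER)
```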
398
399.. versionchanged:: 3.4
400   The import system has taken over the boilerplate responsibilities of
401   loaders.  These were previously performed by the
402   :meth:`importlib.abc.Loader.load_module` method.
403
404Loaders
405-------
406
407Module loaders provide the critical function of loading: module execution.
408The import machinery calls the :meth:`importlib.abc.Loader.exec_module`
409method with a single argument, the module object to execute.  Any value
410returned from :meth:`~importlib.abc.Loader.exec_module` is ignored.
411
412Loaders must satisfy the following requirements:
413
414 * If the module is a Python module (as opposed to a built-in module or a
415   dynamically loaded extension), the loader should execute the module's code
416   in the module's global name space (``module.__dict__``).
417
418 * If the loader cannot execute the module, it should raise an
419   :exc:`ImportError`, although any other exception raised during
420   :meth:`~importlib.abc.Loader.exec_module` will be propagated.
421
422In many cases, the finder and loader can be the same object; in such cases the
423:meth:`~importlib.abc.MetaPathFinder.find_spec` method would just return a
424spec with the loader set to ``self``.
425
426Module loaders may opt in to creating the module object during loading
427by implementing a :meth:`~importlib.abc.Loader.create_module` method.
428It takes one argument, the module spec, and returns the new module object
429to use during loading.  ``create_module()`` does not need to set any attributes
430on the module object.  If the method returns ``None``, the
431import machinery will create the new module itself.
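A minimal sketch of an importer, that is, one object acting as both finder
and loader.  The class ``InMemoryImporter``, the module name ``memory_demo``
and the idea of serving source code from a dictionary are all assumptions
made for illustration.  ``create_module()`` returns ``None`` so that the
import machinery creates the module object itself.

```python
import importlib.abc
import importlib.util
import sys


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical importer serving modules from a dict of source strings."""

    def __init__(self, sources):
        self._sources = sources

    def find_spec(self, fullname, path, target=None):
        if fullname not in self._sources:
            return None  # not ours; let other finders try
        # The spec's loader is this same object.
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # Run the source in the module's own namespace.
        exec(self._sources[module.__name__], module.__dict__)


importer = InMemoryImporter({"memory_demo": "GREETING = 'hello'\n"})
sys.meta_path.insert(0, importer)
try:
    import memory_demo
finally:
    sys.meta_path.remove(importer)

print(memory_demo.GREETING)
```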
432
433.. versionadded:: 3.4
434   The :meth:`~importlib.abc.Loader.create_module` method of loaders.
435
436.. versionchanged:: 3.4
437   The :meth:`~importlib.abc.Loader.load_module` method was replaced by
438   :meth:`~importlib.abc.Loader.exec_module` and the import
439   machinery assumed all the boilerplate responsibilities of loading.
440
441   For compatibility with existing loaders, the import machinery will use
442   the ``load_module()`` method of loaders if it exists and the loader does
443   not also implement ``exec_module()``.  However, ``load_module()`` has been
444   deprecated and loaders should implement ``exec_module()`` instead.
445
446   The ``load_module()`` method must implement all the boilerplate loading
447   functionality described above in addition to executing the module.  All
448   the same constraints apply, with some additional clarification:
449
450    * If there is an existing module object with the given name in
451      :data:`sys.modules`, the loader must use that existing module.
452      (Otherwise, :func:`importlib.reload` will not work correctly.)  If the
453      named module does not exist in :data:`sys.modules`, the loader
454      must create a new module object and add it to :data:`sys.modules`.
455
456    * The module *must* exist in :data:`sys.modules` before the loader
457      executes the module code, to prevent unbounded recursion or multiple
458      loading.
459
460    * If loading fails, the loader must remove any modules it has inserted
461      into :data:`sys.modules`, but it must remove **only** the failing
462      module(s), and only if the loader itself has loaded the module(s)
463      explicitly.
464
465.. versionchanged:: 3.5
466   A :exc:`DeprecationWarning` is raised when ``exec_module()`` is defined but
467   ``create_module()`` is not.
468
469.. versionchanged:: 3.6
470   An :exc:`ImportError` is raised when ``exec_module()`` is defined but
471   ``create_module()`` is not.
472
473Submodules
474----------
475
476When a submodule is loaded using any mechanism (e.g. ``importlib`` APIs, the
477``import`` or ``import-from`` statements, or built-in ``__import__()``) a
478binding is placed in the parent module's namespace to the submodule object.
479For example, if package ``spam`` has a submodule ``foo``, after importing
480``spam.foo``, ``spam`` will have an attribute ``foo`` which is bound to the
481submodule.  Let's say you have the following directory structure::
482
    spam/
        __init__.py
        foo.py
        bar.py
487
488and ``spam/__init__.py`` has the following lines in it::
489
    from .foo import Foo
    from .bar import Bar
492
then executing the following puts name bindings for ``foo`` and ``bar`` in the
``spam`` module::
495
    >>> import spam
    >>> spam.foo
    <module 'spam.foo' from '/tmp/imports/spam/foo.py'>
    >>> spam.bar
    <module 'spam.bar' from '/tmp/imports/spam/bar.py'>
501
Given Python's familiar name binding rules this might seem surprising, but
it's actually a fundamental feature of the import system.  The invariant
is that if you have ``sys.modules['spam']`` and
``sys.modules['spam.foo']`` (as you would after the above import), the latter
must appear as the ``foo`` attribute of the former.
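The invariant can be checked against a real package.  This sketch uses the
standard library's ``email.mime.text``; the variable names are only for the
example.

```python
import importlib
import sys

importlib.import_module("email.mime.text")

parent = sys.modules["email.mime"]
child = sys.modules["email.mime.text"]

# The submodule is bound as an attribute of its parent package.
print(parent.text is child)
print(sys.modules["email"].mime is parent)
```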
507
508Module spec
509-----------
510
511The import machinery uses a variety of information about each module
512during import, especially before loading.  Most of the information is
513common to all modules.  The purpose of a module's spec is to encapsulate
514this import-related information on a per-module basis.
515
516Using a spec during import allows state to be transferred between import
517system components, e.g. between the finder that creates the module spec
518and the loader that executes it.  Most importantly, it allows the
519import machinery to perform the boilerplate operations of loading,
520whereas without a module spec the loader had that responsibility.
521
522See :class:`~importlib.machinery.ModuleSpec` for more specifics on what
523information a module's spec may hold.
524
525.. versionadded:: 3.4
526
527.. _import-mod-attrs:
528
529Import-related module attributes
530--------------------------------
531
532The import machinery fills in these attributes on each module object
533during loading, based on the module's spec, before the loader executes
534the module.
535
536.. attribute:: __name__
537
538   The ``__name__`` attribute must be set to the fully-qualified name of
539   the module.  This name is used to uniquely identify the module in
540   the import system.
541
542.. attribute:: __loader__
543
544   The ``__loader__`` attribute must be set to the loader object that
545   the import machinery used when loading the module.  This is mostly
546   for introspection, but can be used for additional loader-specific
547   functionality, for example getting data associated with a loader.
548
549.. attribute:: __package__
550
551   The module's ``__package__`` attribute must be set.  Its value must
552   be a string, but it can be the same value as its ``__name__``.  When
553   the module is a package, its ``__package__`` value should be set to
554   its ``__name__``.  When the module is not a package, ``__package__``
555   should be set to the empty string for top-level modules, or for
556   submodules, to the parent package's name.  See :pep:`366` for further
557   details.
558
559   This attribute is used instead of ``__name__`` to calculate explicit
560   relative imports for main modules, as defined in :pep:`366`. It is
561   expected to have the same value as ``__spec__.parent``.
562
563   .. versionchanged:: 3.6
564      The value of ``__package__`` is expected to be the same as
565      ``__spec__.parent``.
566
567.. attribute:: __spec__
568
569   The ``__spec__`` attribute must be set to the module spec that was
570   used when importing the module. Setting ``__spec__``
571   appropriately applies equally to :ref:`modules initialized during
572   interpreter startup <programs>`.  The one exception is ``__main__``,
573   where ``__spec__`` is :ref:`set to None in some cases <main_spec>`.
574
575   When ``__package__`` is not defined, ``__spec__.parent`` is used as
576   a fallback.
577
578   .. versionadded:: 3.4
579
580   .. versionchanged:: 3.6
581      ``__spec__.parent`` is used as a fallback when ``__package__`` is
582      not defined.
583
584.. attribute:: __path__
585
586   If the module is a package (either regular or namespace), the module
587   object's ``__path__`` attribute must be set.  The value must be
588   iterable, but may be empty if ``__path__`` has no further significance.
589   If ``__path__`` is not empty, it must produce strings when iterated
590   over. More details on the semantics of ``__path__`` are given
591   :ref:`below <package-path-rules>`.
592
593   Non-package modules should not have a ``__path__`` attribute.
594
595.. attribute:: __file__
596.. attribute:: __cached__
597
598   ``__file__`` is optional. If set, this attribute's value must be a
599   string.  The import system may opt to leave ``__file__`` unset if it
600   has no semantic meaning (e.g. a module loaded from a database).
601
602   If ``__file__`` is set, it may also be appropriate to set the
603   ``__cached__`` attribute which is the path to any compiled version of
604   the code (e.g. byte-compiled file). The file does not need to exist
605   to set this attribute; the path can simply point to where the
606   compiled file would exist (see :pep:`3147`).
607
608   It is also appropriate to set ``__cached__`` when ``__file__`` is not
609   set.  However, that scenario is quite atypical.  Ultimately, the
610   loader is what makes use of ``__file__`` and/or ``__cached__``.  So
611   if a loader can load from a cached module but otherwise does not load
612   from a file, that atypical scenario may be appropriate.
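As a quick way to see several of these attributes side by side, the sketch
below prints them for the standard library package ``json`` and its submodule
``json.decoder``.  The exact values of ``__file__`` depend on the
installation, so they are not shown.

```python
import json
import json.decoder

for module in (json, json.decoder):
    print(
        module.__name__,
        module.__package__,
        module.__spec__.parent,
        hasattr(module, "__path__"),   # True only for the package
    )
```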
613
614.. _package-path-rules:
615
616module.__path__
617---------------
618
By definition, if a module has a ``__path__`` attribute, it is a package,
regardless of its value.
621
622A package's ``__path__`` attribute is used during imports of its subpackages.
623Within the import machinery, it functions much the same as :data:`sys.path`,
624i.e. providing a list of locations to search for modules during import.
625However, ``__path__`` is typically much more constrained than
626:data:`sys.path`.
627
628``__path__`` must be an iterable of strings, but it may be empty.
629The same rules used for :data:`sys.path` also apply to a package's
630``__path__``, and :data:`sys.path_hooks` (described below) are
631consulted when traversing a package's ``__path__``.
632
633A package's ``__init__.py`` file may set or alter the package's ``__path__``
634attribute, and this was typically the way namespace packages were implemented
635prior to :pep:`420`.  With the adoption of :pep:`420`, namespace packages no
636longer need to supply ``__init__.py`` files containing only ``__path__``
637manipulation code; the import machinery automatically sets ``__path__``
638correctly for the namespace package.
639
640Module reprs
641------------
642
By default, all modules have a usable repr.  However, depending on the
attributes set above, and in the module's spec, you can more explicitly
control the repr of module objects.
646
647If the module has a spec (``__spec__``), the import machinery will try
648to generate a repr from it.  If that fails or there is no spec, the import
649system will craft a default repr using whatever information is available
650on the module.  It will try to use the ``module.__name__``,
651``module.__file__``, and ``module.__loader__`` as input into the repr,
652with defaults for whatever information is missing.
653
654Here are the exact rules used:
655
656 * If the module has a ``__spec__`` attribute, the information in the spec
657   is used to generate the repr.  The "name", "loader", "origin", and
658   "has_location" attributes are consulted.
659
660 * If the module has a ``__file__`` attribute, this is used as part of the
661   module's repr.
662
663 * If the module has no ``__file__`` but does have a ``__loader__`` that is not
664   ``None``, then the loader's repr is used as part of the module's repr.
665
666 * Otherwise, just use the module's ``__name__`` in the repr.
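The rules above can be exercised with hand-built module objects.  This is a
sketch: the module name ``demo``, the path ``/tmp/demo.py`` (which is never
opened) and the origin string ``somewhere`` are placeholders chosen for the
example.

```python
import sys
import types
from importlib.machinery import ModuleSpec

# No spec, no __file__, no loader: only the name is available.
bare = types.ModuleType("demo")
print(repr(bare))

# No spec, but a __file__ attribute is present.
with_file = types.ModuleType("demo")
with_file.__file__ = "/tmp/demo.py"
print(repr(with_file))

# A spec with an origin that is not a loadable location.
with_spec = types.ModuleType("demo")
with_spec.__spec__ = ModuleSpec("demo", None, origin="somewhere")
print(repr(with_spec))

# A built-in module, whose spec records the origin "built-in".
print(repr(sys))
```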
667
668.. versionchanged:: 3.4
669   Use of :meth:`loader.module_repr() <importlib.abc.Loader.module_repr>`
670   has been deprecated and the module spec is now used by the import
671   machinery to generate a module repr.
672
673   For backward compatibility with Python 3.3, the module repr will be
674   generated by calling the loader's
675   :meth:`~importlib.abc.Loader.module_repr` method, if defined, before
676   trying either approach described above.  However, the method is deprecated.
677
678
679The Path Based Finder
680=====================
681
682.. index::
683    single: path based finder
684
685As mentioned previously, Python comes with several default meta path finders.
686One of these, called the :term:`path based finder`
687(:class:`~importlib.machinery.PathFinder`), searches an :term:`import path`,
688which contains a list of :term:`path entries <path entry>`.  Each path
689entry names a location to search for modules.
690
691The path based finder itself doesn't know how to import anything. Instead, it
692traverses the individual path entries, associating each of them with a
693path entry finder that knows how to handle that particular kind of path.
694
695The default set of path entry finders implement all the semantics for finding
696modules on the file system, handling special file types such as Python source
697code (``.py`` files), Python byte code (``.pyc`` files) and
698shared libraries (e.g. ``.so`` files). When supported by the :mod:`zipimport`
699module in the standard library, the default path entry finders also handle
700loading all of these file types (other than shared libraries) from zipfiles.
701
702Path entries need not be limited to file system locations.  They can refer to
703URLs, database queries, or any other location that can be specified as a
704string.
705
706The path based finder provides additional hooks and protocols so that you
707can extend and customize the types of searchable path entries.  For example,
708if you wanted to support path entries as network URLs, you could write a hook
709that implements HTTP semantics to find modules on the web.  This hook (a
710callable) would return a :term:`path entry finder` supporting the protocol
described below, which would then be used to get a loader for the module from the
712web.
713
714A word of warning: this section and the previous both use the term *finder*,
715distinguishing between them by using the terms :term:`meta path finder` and
716:term:`path entry finder`.  These two types of finders are very similar,
717support similar protocols, and function in similar ways during the import
718process, but it's important to keep in mind that they are subtly different.
719In particular, meta path finders operate at the beginning of the import
720process, as keyed off the :data:`sys.meta_path` traversal.
721
722By contrast, path entry finders are in a sense an implementation detail
723of the path based finder, and in fact, if the path based finder were to be
724removed from :data:`sys.meta_path`, none of the path entry finder semantics
725would be invoked.
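
The difference between the two kinds of finder can be seen by inspecting the
interpreter state.  The following sketch only reads existing data; what it
prints depends on the installation and on what has already been imported.

.. code-block:: python

   import sys
   from importlib.machinery import PathFinder

   # Meta path finders live on sys.meta_path; PathFinder is one of them.
   path_finder_installed = PathFinder in sys.meta_path
   for finder in sys.meta_path:
       print(finder)

   # Path entry finders are not on sys.meta_path.  PathFinder keeps them in
   # sys.path_importer_cache, keyed by path entry.
   for entry, entry_finder in list(sys.path_importer_cache.items())[:3]:
       print(entry, "->", entry_finder)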
726
727
728Path entry finders
729------------------
730
731.. index::
732    single: sys.path
733    single: sys.path_hooks
734    single: sys.path_importer_cache
735    single: PYTHONPATH
736
737The :term:`path based finder` is responsible for finding and loading
738Python modules and packages whose location is specified with a string
739:term:`path entry`.  Most path entries name locations in the file system,
740but they need not be limited to this.
741
742As a meta path finder, the :term:`path based finder` implements the
743:meth:`~importlib.abc.MetaPathFinder.find_spec` protocol previously
described; however, it exposes additional hooks that can be used to
745customize how modules are found and loaded from the :term:`import path`.
746
Three variables are used by the :term:`path based finder`: :data:`sys.path`,
748:data:`sys.path_hooks` and :data:`sys.path_importer_cache`.  The ``__path__``
749attributes on package objects are also used.  These provide additional ways
750that the import machinery can be customized.
751
752:data:`sys.path` contains a list of strings providing search locations for
modules and packages.  It is initialized from the :envvar:`PYTHONPATH`
754environment variable and various other installation- and
755implementation-specific defaults.  Entries in :data:`sys.path` can name
756directories on the file system, zip files, and potentially other "locations"
757(see the :mod:`site` module) that should be searched for modules, such as
758URLs, or database queries.  Only strings and bytes should be present on
759:data:`sys.path`; all other data types are ignored.  The encoding of bytes
760entries is determined by the individual :term:`path entry finders <path entry
761finder>`.
762
763The :term:`path based finder` is a :term:`meta path finder`, so the import
764machinery begins the :term:`import path` search by calling the path
765based finder's :meth:`~importlib.machinery.PathFinder.find_spec` method as
766described previously.  When the ``path`` argument to
767:meth:`~importlib.machinery.PathFinder.find_spec` is given, it will be a
768list of string paths to traverse - typically a package's ``__path__``
769attribute for an import within that package.  If the ``path`` argument is
770``None``, this indicates a top level import and :data:`sys.path` is used.
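
As an illustration, the two forms of the call can be made directly.  This is
a sketch that assumes the standard library :mod:`json` package is importable
from :data:`sys.path`; it is used here only as a convenient example package.

.. code-block:: python

   import json
   from importlib.machinery import PathFinder

   # Top level import: no path argument, so sys.path is searched.
   top_spec = PathFinder.find_spec("json")

   # Import within a package: the package's __path__ is searched instead.
   # The name passed in is still the fully qualified module name.
   sub_spec = PathFinder.find_spec("json.decoder", json.__path__)

   print(top_spec.origin)
   print(sub_spec.origin)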
771
772The path based finder iterates over every entry in the search path, and
773for each of these, looks for an appropriate :term:`path entry finder`
774(:class:`~importlib.abc.PathEntryFinder`) for the
775path entry.  Because this can be an expensive operation (e.g. there may be
``stat()`` call overheads for this search), the path based finder maintains
777a cache mapping path entries to path entry finders.  This cache is maintained
778in :data:`sys.path_importer_cache` (despite the name, this cache actually
779stores finder objects rather than being limited to :term:`importer` objects).
780In this way, the expensive search for a particular :term:`path entry`
781location's :term:`path entry finder` need only be done once.  User code is
782free to remove cache entries from :data:`sys.path_importer_cache` forcing
783the path based finder to perform the path entry search again [#fnpic]_.
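
The caching behaviour can be sketched with a temporary directory used as a
path entry.  The module name ``cache_demo`` is hypothetical and exists only
for this example; the finder type that ends up in the cache depends on the
hooks installed in :data:`sys.path_hooks`.

.. code-block:: python

   import os
   import sys
   import tempfile
   from importlib.machinery import PathFinder

   with tempfile.TemporaryDirectory() as entry:
       with open(os.path.join(entry, "cache_demo.py"), "w") as f:
           f.write("VALUE = 1\n")

       was_cached_before = entry in sys.path_importer_cache

       # The first search creates a path entry finder and caches it.
       spec = PathFinder.find_spec("cache_demo", [entry])
       cached_finder = sys.path_importer_cache.get(entry)

       # Removing the cache entry forces the hooks to be consulted again.
       sys.path_importer_cache.pop(entry, None)
       PathFinder.find_spec("cache_demo", [entry])
       new_finder = sys.path_importer_cache.pop(entry, None)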
784
785If the path entry is not present in the cache, the path based finder iterates
786over every callable in :data:`sys.path_hooks`.  Each of the :term:`path entry
787hooks <path entry hook>` in this list is called with a single argument, the
788path entry to be searched.  This callable may either return a :term:`path
789entry finder` that can handle the path entry, or it may raise
790:exc:`ImportError`.  An :exc:`ImportError` is used by the path based finder to
791signal that the hook cannot find a :term:`path entry finder`
792for that :term:`path entry`.  The
793exception is ignored and :term:`import path` iteration continues.  The hook
794should expect either a string or bytes object; the encoding of bytes objects
795is up to the hook (e.g. it may be a file system encoding, UTF-8, or something
796else), and if the hook cannot decode the argument, it should raise
797:exc:`ImportError`.
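
A path entry hook following these rules might look like the sketch below.
The ``demo-mem:`` prefix, ``DemoEntryFinder`` and ``demo_hook`` are all
hypothetical names invented for the example, and the choice of UTF-8 for
bytes entries is an assumption of this hook, not a requirement.

.. code-block:: python

   import importlib.abc
   import sys
   from importlib.machinery import PathFinder

   PREFIX = "demo-mem:"   # hypothetical marker for entries this hook accepts


   class DemoEntryFinder(importlib.abc.PathEntryFinder):
       """Placeholder finder that never locates a module."""

       def __init__(self, path_entry):
           self.path_entry = path_entry

       def find_spec(self, fullname, target=None):
           return None


   def demo_hook(path_entry):
       # Path entries may be str or bytes; decoding is up to the hook.
       if isinstance(path_entry, bytes):
           try:
               path_entry = path_entry.decode("utf-8")
           except UnicodeDecodeError as exc:
               raise ImportError("cannot decode path entry") from exc
       if not path_entry.startswith(PREFIX):
           # Decline, so that the remaining hooks are tried for this entry.
           raise ImportError(f"{path_entry!r} is not handled by demo_hook")
       return DemoEntryFinder(path_entry)


   entry = PREFIX + "example"
   sys.path_hooks.append(demo_hook)
   try:
       PathFinder.find_spec("anything", [entry])
       chosen = sys.path_importer_cache.get(entry)
   finally:
       # Undo the changes so the sketch leaves the interpreter as it was.
       sys.path_hooks.remove(demo_hook)
       sys.path_importer_cache.pop(entry, None)

Because the hook is appended, the default hooks see the entry first and
decline it before ``demo_hook`` is consulted.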
798
799If :data:`sys.path_hooks` iteration ends with no :term:`path entry finder`
800being returned, then the path based finder's
801:meth:`~importlib.machinery.PathFinder.find_spec` method will store ``None``
802in :data:`sys.path_importer_cache` (to indicate that there is no finder for
803this path entry) and return ``None``, indicating that this
804:term:`meta path finder` could not find the module.
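
The negative case can be sketched in the same way.  The entry string below is
hypothetical and is assumed not to name a directory or zip file, so none of
the default hooks accept it.

.. code-block:: python

   import sys
   from importlib.machinery import PathFinder

   entry = "no-hook-handles-this-entry"   # hypothetical, matches no location

   result = PathFinder.find_spec("anything", [entry])

   # "missing" is a sentinel to tell "not cached" apart from "cached as None".
   cached = sys.path_importer_cache.get(entry, "missing")
   sys.path_importer_cache.pop(entry, None)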
805
806If a :term:`path entry finder` *is* returned by one of the :term:`path entry
807hook` callables on :data:`sys.path_hooks`, then the following protocol is used
808to ask the finder for a module spec, which is then used when loading the
809module.
810
811The current working directory -- denoted by an empty string -- is handled
812slightly differently from other entries on :data:`sys.path`. First, if the
813current working directory is found to not exist, no value is stored in
814:data:`sys.path_importer_cache`. Second, the value for the current working
815directory is looked up fresh for each module lookup. Third, the path used for
816:data:`sys.path_importer_cache` and returned by
817:meth:`importlib.machinery.PathFinder.find_spec` will be the actual current
818working directory and not the empty string.
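
The third point can be checked with a short sketch that temporarily changes
the working directory.  The module name ``cwd_demo`` is hypothetical, and the
observed cache keys reflect CPython's implementation.

.. code-block:: python

   import os
   import sys
   import tempfile
   from importlib.machinery import PathFinder

   original_cwd = os.getcwd()
   with tempfile.TemporaryDirectory() as tmp:
       os.chdir(tmp)
       try:
           cwd = os.getcwd()
           with open("cwd_demo.py", "w") as f:
               f.write("VALUE = 1\n")

           # The empty string stands for the current working directory.
           spec = PathFinder.find_spec("cwd_demo", [""])

           cached_under_cwd = cwd in sys.path_importer_cache
           cached_under_empty = "" in sys.path_importer_cache
           sys.path_importer_cache.pop(cwd, None)
       finally:
           # Restore the directory before the temporary one is removed.
           os.chdir(original_cwd)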
819
820Path entry finder protocol
821--------------------------
822
823In order to support imports of modules and initialized packages and also to
824contribute portions to namespace packages, path entry finders must implement
825the :meth:`~importlib.abc.PathEntryFinder.find_spec` method.
826
:meth:`~importlib.abc.PathEntryFinder.find_spec` takes two arguments, the
828fully qualified name of the module being imported, and the (optional) target
829module.  ``find_spec()`` returns a fully populated spec for the module.
830This spec will always have "loader" set (with one exception).
831
832To indicate to the import machinery that the spec represents a namespace
:term:`portion`, the path entry finder sets "loader" on the spec to
834``None`` and "submodule_search_locations" to a list containing the
835portion.
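
The protocol can be sketched with a finder that serves modules from memory.
``StringLoader``, ``MemoryEntryFinder``, the entry name ``mem`` and the
module names are all hypothetical; a real finder would normally be created by
a path entry hook rather than instantiated directly.

.. code-block:: python

   import importlib.abc
   import importlib.util
   from importlib.machinery import ModuleSpec


   class StringLoader(importlib.abc.Loader):
       """Hypothetical loader that executes source held in a string."""

       def __init__(self, source):
           self.source = source

       def exec_module(self, module):
           exec(self.source, module.__dict__)


   class MemoryEntryFinder(importlib.abc.PathEntryFinder):
       """Hypothetical finder for a single in-memory path entry."""

       def __init__(self, entry, sources, namespaces):
           self.entry = entry
           self.sources = sources          # fully qualified name -> source
           self.namespaces = namespaces    # names this entry contributes to

       def find_spec(self, fullname, target=None):
           if fullname in self.sources:
               loader = StringLoader(self.sources[fullname])
               return ModuleSpec(fullname, loader, origin=self.entry)
           if fullname in self.namespaces:
               # Namespace portion: no loader, only a search location.
               spec = ModuleSpec(fullname, None)
               spec.submodule_search_locations = [f"{self.entry}/{fullname}"]
               return spec
           return None


   finder = MemoryEntryFinder("mem", {"hello": "GREETING = 'hi'"}, {"nspkg"})

   # A regular module: the spec carries a loader that can execute it.
   module_spec = finder.find_spec("hello")
   module = importlib.util.module_from_spec(module_spec)
   module_spec.loader.exec_module(module)

   # A namespace portion: the spec has no loader.
   portion_spec = finder.find_spec("nspkg")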
836
837.. versionchanged:: 3.4
838   :meth:`~importlib.abc.PathEntryFinder.find_spec` replaced
839   :meth:`~importlib.abc.PathEntryFinder.find_loader` and
840   :meth:`~importlib.abc.PathEntryFinder.find_module`, both of which
841   are now deprecated, but will be used if ``find_spec()`` is not defined.
842
843   Older path entry finders may implement one of these two deprecated methods
844   instead of ``find_spec()``.  The methods are still respected for the
845   sake of backward compatibility.  However, if ``find_spec()`` is
846   implemented on the path entry finder, the legacy methods are ignored.
847
848   :meth:`~importlib.abc.PathEntryFinder.find_loader` takes one argument, the
849   fully qualified name of the module being imported.  ``find_loader()``
850   returns a 2-tuple where the first item is the loader and the second item
851   is a namespace :term:`portion`.  When the first item (i.e. the loader) is
852   ``None``, this means that while the path entry finder does not have a
853   loader for the named module, it knows that the path entry contributes to
854   a namespace portion for the named module.  This will almost always be the
855   case where Python is asked to import a namespace package that has no
856   physical presence on the file system.  When a path entry finder returns
857   ``None`` for the loader, the second item of the 2-tuple return value must
858   be a sequence, although it can be empty.
859
860   If ``find_loader()`` returns a non-``None`` loader value, the portion is
861   ignored and the loader is returned from the path based finder, terminating
862   the search through the path entries.
863
864   For backwards compatibility with other implementations of the import
865   protocol, many path entry finders also support the same,
866   traditional ``find_module()`` method that meta path finders support.
   However, path entry finder ``find_module()`` methods are never called
868   with a ``path`` argument (they are expected to record the appropriate
869   path information from the initial call to the path hook).
870
871   The ``find_module()`` method on path entry finders is deprecated,
872   as it does not allow the path entry finder to contribute portions to
873   namespace packages.  If both ``find_loader()`` and ``find_module()``
874   exist on a path entry finder, the import system will always call
875   ``find_loader()`` in preference to ``find_module()``.
876
877
878Replacing the standard import system
879====================================
880
881The most reliable mechanism for replacing the entire import system is to
882delete the default contents of :data:`sys.meta_path`, replacing them
883entirely with a custom meta path hook.
884
885If it is acceptable to only alter the behaviour of import statements
886without affecting other APIs that access the import system, then replacing
887the builtin :func:`__import__` function may be sufficient. This technique
888may also be employed at the module level to only alter the behaviour of
889import statements within that module.
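
The limited reach of this technique can be sketched by wrapping
:func:`__import__` and comparing an import statement with a call to
:func:`importlib.import_module`.  The wrapper name ``recording_import`` is
hypothetical, and :mod:`sys` is used only because it is already imported, so
no nested imports are triggered.

.. code-block:: python

   import builtins
   import importlib
   import sys

   seen = []
   original_import = builtins.__import__


   def recording_import(name, globals=None, locals=None, fromlist=(), level=0):
       # Record the request, then defer to the real implementation.
       seen.append(name)
       return original_import(name, globals, locals, fromlist, level)


   builtins.__import__ = recording_import
   try:
       import sys                       # statement: goes through the wrapper
       importlib.import_module("sys")   # importlib API: bypasses the wrapper
   finally:
       builtins.__import__ = original_import

   print(seen)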
890
891To selectively prevent import of some modules from a hook early on the
892meta path (rather than disabling the standard import system entirely),
it is sufficient to raise :exc:`ModuleNotFoundError` directly from
894:meth:`~importlib.abc.MetaPathFinder.find_spec` instead of returning
895``None``. The latter indicates that the meta path search should continue,
896while raising an exception terminates it immediately.
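
A blocking finder of this kind might be sketched as follows.
``BlockingFinder`` and the module name ``blocked_demo_module`` are
hypothetical.  Note that modules already present in :data:`sys.modules` are
returned from the cache without consulting any finder, so they are not
affected.

.. code-block:: python

   import importlib
   import importlib.abc
   import sys


   class BlockingFinder(importlib.abc.MetaPathFinder):
       """Hypothetical finder that vetoes imports of selected names."""

       def __init__(self, blocked):
           self.blocked = frozenset(blocked)

       def find_spec(self, fullname, path=None, target=None):
           if fullname in self.blocked:
               # Raising ends the meta path search for this name.
               raise ModuleNotFoundError(
                   f"import of {fullname!r} is blocked", name=fullname
               )
           return None  # Not blocked: let later finders try.


   blocker = BlockingFinder({"blocked_demo_module"})
   sys.meta_path.insert(0, blocker)
   try:
       try:
           importlib.import_module("blocked_demo_module")
       except ModuleNotFoundError as exc:
           message = str(exc)
       import json  # other names are still handled by the later finders
   finally:
       sys.meta_path.remove(blocker)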
897
898
899Special considerations for __main__
900===================================
901
902The :mod:`__main__` module is a special case relative to Python's import
903system.  As noted :ref:`elsewhere <programs>`, the ``__main__`` module
904is directly initialized at interpreter startup, much like :mod:`sys` and
905:mod:`builtins`.  However, unlike those two, it doesn't strictly
906qualify as a built-in module.  This is because the manner in which
907``__main__`` is initialized depends on the flags and other options with
908which the interpreter is invoked.
909
910.. _main_spec:
911
912__main__.__spec__
913-----------------
914
915Depending on how :mod:`__main__` is initialized, ``__main__.__spec__``
916gets set appropriately or to ``None``.
917
918When Python is started with the :option:`-m` option, ``__spec__`` is set
919to the module spec of the corresponding module or package. ``__spec__`` is
920also populated when the ``__main__`` module is loaded as part of executing a
921directory, zipfile or other :data:`sys.path` entry.
922
923In :ref:`the remaining cases <using-on-interface-options>`
924``__main__.__spec__`` is set to ``None``, as the code used to populate the
925:mod:`__main__` does not correspond directly with an importable module:
926
927- interactive prompt
- :option:`-c` switch
929- running from stdin
930- running directly from a source or bytecode file
931
932Note that ``__main__.__spec__`` is always ``None`` in the last case,
933*even if* the file could technically be imported directly as a module
934instead. Use the :option:`-m` switch if valid module metadata is desired
935in :mod:`__main__`.
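
These cases can be compared by running the same code in child interpreters.
This is a sketch: the module name ``spec_demo`` and the helper ``run`` are
hypothetical, and :envvar:`PYTHONPATH` is set only so that the :option:`-m`
run can locate the temporary module.

.. code-block:: python

   import os
   import subprocess
   import sys
   import tempfile

   # Code that reports the name stored in __main__.__spec__, if any.
   SOURCE = (
       "import __main__\n"
       "spec = __main__.__spec__\n"
       "print(None if spec is None else spec.name)\n"
   )


   def run(args, cwd):
       env = dict(os.environ, PYTHONPATH=cwd)
       result = subprocess.run(
           [sys.executable, *args], cwd=cwd, env=env,
           capture_output=True, text=True, check=True,
       )
       return result.stdout.strip()


   with tempfile.TemporaryDirectory() as tmp:
       with open(os.path.join(tmp, "spec_demo.py"), "w") as f:
           f.write(SOURCE)

       as_module = run(["-m", "spec_demo"], tmp)    # spec is populated
       as_script = run(["spec_demo.py"], tmp)       # spec is None
       as_command = run(["-c", SOURCE], tmp)        # spec is None

   print(as_module, as_script, as_command)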
936
937Note also that even when ``__main__`` corresponds with an importable module
938and ``__main__.__spec__`` is set accordingly, they're still considered
*distinct* modules. This is because blocks guarded by
940``if __name__ == "__main__":`` checks only execute when the module is used
941to populate the ``__main__`` namespace, and not during normal import.
942
943
944Open issues
945===========
946
947XXX It would be really nice to have a diagram.
948
949XXX * (import_machinery.rst) how about a section devoted just to the
950attributes of modules and packages, perhaps expanding upon or supplanting the
951related entries in the data model reference page?
952
953XXX runpy, pkgutil, et al in the library manual should all get "See Also"
954links at the top pointing to the new import system section.
955
956XXX Add more explanation regarding the different ways in which
957``__main__`` is initialized?
958
959XXX Add more info on ``__main__`` quirks/pitfalls (i.e. copy from
960:pep:`395`).
961
962
963References
964==========
965
966The import machinery has evolved considerably since Python's early days.  The
967original `specification for packages
968<http://legacy.python.org/doc/essays/packages.html>`_ is still available to read,
969although some details have changed since the writing of that document.
970
971The original specification for :data:`sys.meta_path` was :pep:`302`, with
972subsequent extension in :pep:`420`.
973
974:pep:`420` introduced :term:`namespace packages <namespace package>` for
975Python 3.3.  :pep:`420` also introduced the :meth:`find_loader` protocol as an
976alternative to :meth:`find_module`.
977
978:pep:`366` describes the addition of the ``__package__`` attribute for
979explicit relative imports in main modules.
980
981:pep:`328` introduced absolute and explicit relative imports and initially
982proposed ``__name__`` for semantics :pep:`366` would eventually specify for
983``__package__``.
984
985:pep:`338` defines executing modules as scripts.
986
987:pep:`451` adds the encapsulation of per-module import state in spec
988objects.  It also off-loads most of the boilerplate responsibilities of
989loaders back onto the import machinery.  These changes allow the
990deprecation of several APIs in the import system and also addition of new
991methods to finders and loaders.
992
993.. rubric:: Footnotes
994
995.. [#fnmo] See :class:`types.ModuleType`.
996
997.. [#fnlo] The importlib implementation avoids using the return value
998   directly. Instead, it gets the module object by looking the module name up
999   in :data:`sys.modules`.  The indirect effect of this is that an imported
1000   module may replace itself in :data:`sys.modules`.  This is
1001   implementation-specific behavior that is not guaranteed to work in other
1002   Python implementations.
1003
1004.. [#fnpic] In legacy code, it is possible to find instances of
1005   :class:`imp.NullImporter` in the :data:`sys.path_importer_cache`.  It
1006   is recommended that code be changed to use ``None`` instead.  See
1007   :ref:`portingpythoncode` for more details.
1008