Fundamentals 24 min read

Unlocking Python's Import System: How Modules, Packages, and Search Paths Work

This article explains Python's import mechanism in depth, covering modules, packages, __init__.py files, absolute and relative imports, the role of sys.path, import hooks, meta_path finders, path entry finders, module specs, and loader objects, with code examples and practical tips.

MaGe Linux Operations

Dec 4, 2021

Unlocking Python's Import System: How Modules, Packages, and Search Paths Work

1. Introduction

Python’s official definition: Python code in one module gains access to the code in another module by the process of importing it.

In everyday use we often write import statements such as from xxx import xxx or import xx. If you have explored Python packages you will notice many packages contain an __init__.py file. Why is that?

This article examines modules/packages and the import loading and searching mechanisms, referencing the latest Python 3.9.1 documentation.

2. Module and Package

2.1 What is a module?

A module is a .py file that logically groups Python code (variables, functions, classes). The file is the physical organization, while the module name is the logical organization.

In fact, any file with a .py suffix is a Python module.

Inside a module you can obtain its name via the global variable __name__. Code in the module is executed once when the module is imported.

2.2 What is a package?

A package is a special kind of module that provides a hierarchical namespace. You can think of a package as a directory and a module as a file within that directory, though packages need not correspond to a real file‑system directory. All packages are modules, but not all modules are packages. A module becomes a package when it has a __path__ attribute.

Python defines two types of packages: regular packages and namespace packages.

Regular package : Implemented as a directory containing an __init__.py file. Importing the package implicitly executes __init__.py, and objects defined there become part of the package namespace.

Namespace package : Consists of multiple parts that may reside in different locations (including zip files or remote URLs). It does not require an __init__.py file. The __path__ attribute is a custom iterable that aggregates the locations of all parts.

2.3 Import system

The import statement is the most common way to trigger the import mechanism, but you can also use importlib.import_module() or the built‑in __import__() function.

The import statement first searches for the specified name, then binds the result to the current namespace. The search is performed by calling __import__() with appropriate arguments.

If a module cannot be found, a ModuleNotFoundError is raised. Python uses several strategies and hooks to customize the search.

3. Module/Package Location

3.1 Absolute and relative imports

Python provides two import mechanisms: relative import and absolute import. Modern Python 3 uses only absolute (fully qualified) imports.

from threading import Thread
from multiprocessing.pool import Pool

Absolute imports may raise ImportError when the target module cannot be found. Adding the local directory to sys.path (e.g., marking it as a Sources Root in PyCharm) resolves the issue.

3.2 Module search path

When importing a module, the interpreter first checks built‑in modules, then searches the directories listed in sys.path for a file named module_name.py.

The initial value of sys.path comes from:

The current working directory of the script

The PYTHONPATH environment variable

Python’s default installation directories

Warning: Do not name your modules the same as standard‑library modules, otherwise the local module will shadow the standard one.

3.3 Extending the search path with .pth files

You can add absolute paths to a .pth file placed in a special location (e.g., site-packages on Windows or the path returned by site.getsitepackages() on Linux). Python reads these files during module loading and adds the listed directories to the search path.

import site
print(site.getsitepackages())
# Output
['/Users/gray/anaconda3/anaconda3/envs/python-develop/lib/python3.7/site-packages']

4. Deep Dive into Import Search

The import search result is stored in sys.modules. Python follows a detailed protocol that allows extensions while maintaining compatibility.

To start a search, Python needs the fully qualified name of the module (e.g., foo.bar.baz). The name can come from an import statement, importlib.import_module(), or __import__().

4.1 Cache

Before searching, Python checks sys.modules, a dictionary that caches already‑imported modules. If a module is present, its cached object is returned; otherwise the search proceeds.

4.2 Finder and Loader

If the cache misses, Python uses the import protocol to locate and load the module. A finder determines whether a module can be found, while a loader actually loads it. Objects that implement both interfaces are called importers .

Python ships with default finders and importers: a built‑in importer for built‑in modules, a frozen importer for frozen modules, and a path‑based finder that searches sys.path.

4.3 Import hook

Import hooks extend the import mechanism. Two types exist:

Meta hook – added to sys.meta_path and invoked before the cache check.

Import‑path hook – added to sys.path_hooks and invoked while processing entries in sys.path or a package’s __path__.

4.4 Meta‑path

If a module is not found in the cache, Python iterates over sys.meta_path. Each meta‑path finder’s find_spec() method is called in order. If none can handle the name, ModuleNotFoundError is raised.

4.5 Import‑related module attributes

During import, Python populates several attributes on the module object: __name__: fully qualified name __loader__: loader object used for introspection __package__: package name for relative imports __spec__: the ModuleSpec instance __path__: present for packages (iterable of directories) __file__: path to the source file (optional for built‑ins) __cached__: path to the compiled byte‑code file

5. Import Loading Mechanism

The following pseudo‑code outlines the loading steps:

module = None
if spec.loader is not None and hasattr(spec.loader, 'create_module'):
    module = spec.loader.create_module(spec)
if module is None:
    module = ModuleType(spec.name)
_init_module_attrs(spec, module)
if spec.loader is None:
    if spec.submodule_search_locations is not None:
        # namespace package
        sys.modules[spec.name] = module
    else:
        raise ImportError
elif not hasattr(spec.loader, 'exec_module'):
    module = spec.loader.load_module(spec.name)
else:
    sys.modules[spec.name] = module
    try:
        spec.loader.exec_module(module)
    except BaseException:
        try:
            del sys.modules[spec.name]
        except KeyError:
            pass
        raise
return sys.modules[spec.name]

Key points:

The module is cached in sys.modules before execution to avoid recursive imports.

If loading fails, the failed entry is removed from the cache, but successfully loaded dependencies remain.

The loader’s exec_module() method executes the module code and populates its namespace.

5.1 Loader object

A loader is an instance of importlib.abc.Loader. It must provide exec_module() to execute the module. If it cannot, it should raise ImportError.

5.2 ModuleSpec object

The ModuleSpec carries information between finders and loaders. It is used to build a module template or to pass state. The spec’s loader attribute points to the loader, and submodule_search_locations is set for packages.

6. Path‑Based Finder

The default PathBasedFinder searches the import path (the list in sys.path) using path entry finders supplied by sys.path_hooks. Results are cached in sys.path_importer_cache to avoid repeated work.

6.1 Path entry finder

A path entry finder is created by calling a callable from sys.path_hooks with a path entry. If none of the hooks can handle the entry, None is cached and the search continues.

6.2 Path entry finder protocol

Path entry finders implement find_spec(fullname, target=None). For namespace package portions, the returned spec has loader=None and submodule_search_locations set to a list containing that portion.

Original article source: https://blog.csdn.net/u011130655/article/details/113018457

Click the "Read Original" button below for the full article.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python loader Modules Packages Import System PEP sys.path Finder

Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.