PEP:534
Title:Distributing a Subset of the Standard Library
Version:$Revision$
Last-Modified:$Date$
Author:Tomáš Orsava <torsava@redhat.com>, Petr Viktorin <encukou@gmail.com>
Status:Draft
Type:Standards Track
Content-Type:text/x-rst
Created:5-Sep-2016
Python-Version:3.7
Post-History:

Contents

Abstract

Python is sometimes being distributed without its full standard library. However, there is as of yet no standardized way of dealing with importing a missing standard library module. This PEP proposes a mechanism for identifying which standard library modules are missing and puts forth a method of how attempts to import a missing standard library module should be handled.

Motivation

There are several use cases for including only a subset of Python's standard library. However, there is so far no formal specification of how to properly implement distribution of a subset of the standard library. Namely, how to safely handle attempts to import a missing stdlib module, and display an informative error message.

CPython

When one of Python standard library modules (such as _sqlite3) cannot be compiled during a Python build because of missing dependencies (e.g. SQLite header files), the module is simply skipped.

If you then install this compiled Python and use it to try to import one of the missing modules, Python will go through the sys.path entries looking for it. It won't find it among the stdlib modules and thus it will continue onto site-packages and fail with a ModuleNotFoundError [2] if it doesn't find it.

This can confuse users who may not understand why a cleanly built Python is missing standard library modules.

Linux and other distributions

Many Linux and other distributions are already separating out parts of the standard library to standalone packages. Among the most commonly excluded modules are the tkinter module, since it draws in a dependency on the graphical environment, and the test package, as it only serves to test Python internally and is about as big as the rest of the standard library put together.

The methods of omission of these modules differ. For example, Debian patches the file Lib/tkinter/__init__.py to envelop the line import _tkinter in a try-except block and upon encountering an ImportError it simply adds the following to the error message: please install the python3-tk package [1]. Fedora and other distributions simply don't include the omitted modules, potentially leaving users baffled as to where to find them.

Specification

When, for any reason, a standard library module is not to be included with the rest, a file with its name and the extension .missing.py shall be created and placed in the directory the module itself would have occupied. This file can contain any Python code, however, it should raise a ModuleNotFoundError [2] with a helpful error message.

Currently, when Python tries to import a module XYZ, the FileFinder path hook goes through the entries in sys.path, and in each location looks for a file whose name is XYZ with one of the valid suffixes (e.g. .so, ..., .py, ..., .pyc). The suffixes are tried in order. If none of them are found, Python goes on to try the next directory in sys.path.

The .missing.py extension will be added to the end of the list, and configured to be handled by SourceFileLoader. Thus, if a module is not found in its proper location, the XYZ.missing.py file is found and executed, and further locations are not searched.

The CPython build system will be modified to generate .missing.py files for optional modules that were not built.

Rationale

The mechanism of handling missing standard library modules through the use of the .missing.py files was chosen due to its advantages both for CPython itself and for Linux and other distributions that are packaging it.

The missing pieces of the standard library can be subsequently installed simply by putting the module files in their appropriate location. They will then take precedence over the corresponding .missing.py files. This makes installation simple for Linux package managers.

This mechanism also solves the minor issue of importing a module from site-packages with the same name as a missing standard library module. Now, Python will import the .missing.py file and won't ever look for a stdlib module in site-packages.

In addition, this method of handling missing stdlib modules can be implemented in a succinct, non-intrusive way in CPython, and thus won't add to the complexity of the existing code base.

The .missing.py file can be customized by the packager to provide any desirable behaviour. While we strongly recommend that these files only raise a ModuleNotFoundError [2] with an appropriate message, there is no reason to limit customization options.

Ideas leading up to this PEP were discussed on the python-dev mailing list [3].

Backwards Compatibility

No problems with backwards compatibility are expected. Distributions that are already patching Python modules to provide custom handling of missing dependencies can continue to do so unhindered.

Reference Implementation

Reference implementation can be found on GitHub [4] and is also accessible in the form of a patch [5].

References

[1]http://bazaar.launchpad.net/~doko/python/pkg3.5-debian/view/head:/patches/tkinter-import.diff
[2](1, 2, 3) https://docs.python.org/3.7/library/exceptions.html#ModuleNotFoundError
[3]https://mail.python.org/pipermail/python-dev/2016-July/145534.html
[4]https://github.com/torsava/cpython/pull/1
[5]https://github.com/torsava/cpython/pull/1.patch