====================
Data Files Support
====================

The distutils have traditionally allowed installation of "data files", which
are placed in a platform-specific location. However, the most common use case
for data files distributed with a package is for use *by* the package, usually
by including the data files **inside the package directory**.

Setuptools offers three ways to specify this most common type of data files to
be included in your packages [#datafiles]_.
First, you can simply use the ``include_package_data`` keyword, e.g.::

    from setuptools import setup, find_packages
    setup(
        ...
        include_package_data=True
    )

This tells setuptools to install any data files it finds in your packages.
The data files must be specified via the |MANIFEST.in|_ file.
(They can also be tracked by a revision control system, using an appropriate
plugin such as :pypi:`setuptools-scm` or :pypi:`setuptools-svn`.
See the section below on :ref:`Adding Support for Revision
Control Systems` for information on how to write such plugins.)

If you want finer-grained control over what files are included (for example,
if you have documentation files in your package directories and want to exclude
them from installation), then you can also use the ``package_data`` keyword,
e.g.::

    from setuptools import setup, find_packages
    setup(
        ...
        package_data={
            # If any package contains *.txt or *.rst files, include them:
            "": ["*.txt", "*.rst"],
            # And include any *.msg files found in the "hello" package, too:
            "hello": ["*.msg"],
        }
    )

The ``package_data`` argument is a dictionary that maps from package names to
lists of glob patterns. The globs may include subdirectory names, if the data
files are contained in a subdirectory of the package. For example, if the
package tree looks like this::

    setup.py
    src/
        mypkg/
            __init__.py
            mypkg.txt
            data/
                somefile.dat
                otherdata.dat

The setuptools setup file might look like this::

    from setuptools import setup, find_packages
    setup(
        ...
        packages=find_packages("src"),  # include all packages under src
        package_dir={"": "src"},   # tell distutils packages are under src

        package_data={
            # If any package contains *.txt files, include them:
            "": ["*.txt"],
            # And include any *.dat files found in the "data" subdirectory
            # of the "mypkg" package, also:
            "mypkg": ["data/*.dat"],
        }
    )

Notice that if you list patterns in ``package_data`` under the empty string,
these patterns are used to find files in every package, even ones that also
have their own patterns listed. Thus, in the above example, the ``mypkg.txt``
file gets included even though it's not listed in the patterns for ``mypkg``.

Also notice that if you use paths, you *must* use a forward slash (``/``) as
the path separator, even if you are on Windows. Setuptools automatically
converts slashes to appropriate platform-specific separators at build time.

If data files are contained in a subdirectory of a package that isn't a package
itself (no ``__init__.py``), then the subdirectory names (or ``*``) are required
in the ``package_data`` argument (as shown above with ``"data/*.dat"``).

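The pattern rules above can be illustrated with a minimal sketch. It is *not*
setuptools' actual implementation: ``match_package_data`` is a hypothetical
helper that only approximates the described behaviour, namely that patterns
under ``""`` are combined with a package's own patterns and that ``/`` in a
pattern is converted to the platform separator before globbing.

.. code-block:: python

    import glob
    import os
    import tempfile


    def match_package_data(package_dir, package, package_data):
        """Return sorted package-relative paths matched for ``package``.

        Illustration only: patterns listed under "" are combined with the
        patterns listed under the package's own name.
        """
        patterns = package_data.get("", []) + package_data.get(package, [])
        matched = set()
        for pattern in patterns:
            # Patterns always use "/"; convert to the platform separator.
            native = os.path.join(package_dir, *pattern.split("/"))
            for path in glob.glob(native):
                if os.path.isfile(path):
                    rel = os.path.relpath(path, package_dir)
                    matched.add(rel.replace(os.sep, "/"))
        return sorted(matched)


    # Recreate the example tree from above in a temporary directory.
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "src", "mypkg")
        os.makedirs(os.path.join(pkg, "data"))
        names = ("__init__.py", "mypkg.txt", "data/somefile.dat", "data/otherdata.dat")
        for name in names:
            with open(os.path.join(pkg, *name.split("/")), "w") as f:
                f.write("")
        package_data = {"": ["*.txt"], "mypkg": ["data/*.dat"]}
        result = match_package_data(pkg, "mypkg", package_data)

    print(result)

In this sketch ``mypkg.txt`` is matched only through the ``""`` entry, while
the two ``.dat`` files are matched through the ``"mypkg"`` entry.
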
When building an ``sdist``, the data files are also drawn from the
``package_name.egg-info/SOURCES.txt`` file, so if you update the
``package_data`` list in ``setup.py``, make sure to remove that file before
calling ``setup.py`` again.

.. note::
    If using the ``include_package_data`` argument, files specified by
    ``package_data`` will *not* be automatically added to the manifest unless
    they are listed in the |MANIFEST.in|_ file or by a plugin like
    :pypi:`setuptools-scm` or :pypi:`setuptools-svn`.

.. https://docs.python.org/3/distutils/setupscript.html#installing-package-data

Sometimes, the ``include_package_data`` or ``package_data`` options alone
aren't sufficient to precisely define what files you want included. For
example, you may want to include package README files in your revision control
system and source distributions, but exclude them from being installed. So,
setuptools offers an ``exclude_package_data`` option as well, that allows you
to do things like this::

    from setuptools import setup, find_packages
    setup(
        ...
        packages=find_packages("src"),  # include all packages under src
        package_dir={"": "src"},   # tell distutils packages are under src

        include_package_data=True,    # include everything in source control

        # ...but exclude README.txt from all packages
        exclude_package_data={"": ["README.txt"]},
    )

The ``exclude_package_data`` option is a dictionary mapping package names to
lists of wildcard patterns, just like the ``package_data`` option. And, just
as with that option, a key of ``""`` will apply the given pattern(s) to all
packages. However, any files that match these patterns will be *excluded*
from installation, even if they were listed in ``package_data`` or were
included as a result of using ``include_package_data``.

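The exclusion step described above can be pictured with the following rough
sketch. ``apply_excludes`` is a hypothetical helper, not a setuptools API, and
it uses :mod:`fnmatch` as a stand-in for setuptools' own matching (an
assumption; for instance, ``fnmatch`` lets ``*`` match ``/``, which real glob
matching does not).

.. code-block:: python

    from fnmatch import fnmatch


    def apply_excludes(package, files, exclude_package_data):
        """Drop files whose package-relative path matches an exclude pattern.

        Illustration only: patterns under "" apply to every package and are
        combined with the patterns listed for ``package`` itself.
        """
        patterns = (
            exclude_package_data.get("", [])
            + exclude_package_data.get(package, [])
        )
        return [f for f in files if not any(fnmatch(f, p) for p in patterns)]


    # Hypothetical files that the inclusion options would otherwise install.
    candidates = ["README.txt", "defaults.cfg", "data/table.dat"]
    kept = apply_excludes("mypkg", candidates, {"": ["README.txt"]})
    print(kept)

Here ``README.txt`` is dropped because the ``""`` key applies to all packages,
regardless of how the file came to be included.
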
In summary, the three options allow you to:

``include_package_data``
    Accept all data files and directories matched by |MANIFEST.in|_ or added by
    a :ref:`plugin <Adding Support for Revision Control Systems>`.

``package_data``
    Specify additional patterns to match files that may or may
    not be matched by |MANIFEST.in|_ or added by
    a :ref:`plugin <Adding Support for Revision Control Systems>`.

``exclude_package_data``
    Specify patterns for data files and directories that should *not* be
    included when a package is installed, even if they would otherwise have
    been included due to the use of the preceding options.

NOTE: Due to the way the distutils build process works, a data file that you
include in your project and then stop including may be "orphaned" in your
project's build directories, requiring you to run ``setup.py clean --all`` to
fully remove it. This may also be important for your users and contributors
if they track intermediate revisions of your project using Subversion; be sure
to let them know when you make changes that remove files from inclusion so they
can run ``setup.py clean --all``.


.. _Accessing Data Files at Runtime:

Accessing Data Files at Runtime
-------------------------------

Typically, existing programs manipulate a package's ``__file__`` attribute in
order to find the location of data files. However, this manipulation isn't
compatible with PEP 302-based import hooks, including importing from zip files
and Python Eggs. If you are using data files, it is strongly recommended that
you use :mod:`importlib.resources` to access them.
:mod:`importlib.resources` was added to Python 3.7 and the latest version of
the library is also available via the :pypi:`importlib-resources` backport.
See :doc:`importlib-resources:using` for detailed instructions [#importlib]_.

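As a minimal sketch of this recommendation, the snippet below builds a
throwaway package in a temporary directory and reads a data file from it
through ``importlib.resources.files`` (available in the standard library from
Python 3.9) instead of through ``__file__``. The package name ``mypkg_demo``
and its contents are made up for the example.

.. code-block:: python

    import importlib
    import sys
    import tempfile
    from importlib.resources import files  # standard library, Python 3.9+
    from pathlib import Path

    with tempfile.TemporaryDirectory() as root:
        # Hypothetical package layout: mypkg_demo/data/somefile.dat
        pkg_dir = Path(root) / "mypkg_demo"
        (pkg_dir / "data").mkdir(parents=True)
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "data" / "somefile.dat").write_text("hello")

        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            # files() returns a traversable object rooted at the package.
            resource = files("mypkg_demo") / "data" / "somefile.dat"
            text = resource.read_text(encoding="utf-8")
        finally:
            # Undo the changes made for the demonstration.
            sys.path.remove(root)
            sys.modules.pop("mypkg_demo", None)

    print(text)

In a real project only the ``files(...)`` lines are needed; the temporary
package exists solely to make the sketch self-contained.
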
.. tip:: Files inside the package directory should be *read-only* to avoid a
   series of common problems (e.g. when multiple users share a common Python
   installation, when the package is loaded from a zip file, or when multiple
   instances of a Python application run in parallel).

If your Python package needs to write to a file for shared data or configuration,
you can use standard platform/OS-specific system directories, such as
``~/.config/$appname`` or ``/usr/share/$appname/$version`` (Linux specific) [#system-dirs]_.
A common approach is to add a read-only template file to the package
directory that is then copied to the correct system directory if no
pre-existing file is found.


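The template approach can be sketched as follows. ``ensure_user_config``, the
file name ``settings.ini`` and the directory layout are all hypothetical; in a
real package the template would typically be located with
:mod:`importlib.resources` and the target directory with a library such as
:pypi:`platformdirs`. Temporary directories stand in for both here.

.. code-block:: python

    import shutil
    import tempfile
    from pathlib import Path


    def ensure_user_config(template: Path, config_dir: Path) -> Path:
        """Copy ``template`` into ``config_dir`` unless a copy already exists."""
        target = config_dir / template.name
        if not target.exists():
            config_dir.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(template, target)
        return target


    with tempfile.TemporaryDirectory() as root:
        # Stand-in for the read-only template shipped inside the package.
        template = Path(root) / "pkg" / "settings.ini"
        template.parent.mkdir()
        template.write_text("[defaults]\ncolor = blue\n")
        # Stand-in for a per-user configuration directory.
        config_dir = Path(root) / "config" / "myapp"

        first = ensure_user_config(template, config_dir)
        first.write_text("[defaults]\ncolor = red\n")  # the user edits the copy
        second = ensure_user_config(template, config_dir)  # does not overwrite
        user_text = second.read_text()
        template_text = template.read_text()

    print(user_text)

The second call leaves the user's edited copy untouched, and the template in
the package directory is never written to.
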
Non-Package Data Files
----------------------

Historically, ``setuptools`` by way of ``easy_install`` would encapsulate data
files from the distribution into the egg (see `the old docs
<https://github.com/pypa/setuptools/blob/52aacd5b276fedd6849c3a648a0014f5da563e93/docs/setuptools.txt#L970-L1001>`_). As eggs are deprecated and pip-based installs
fall back to the platform-specific location for installing data files, there is
no supported facility to reliably retrieve these resources.

Instead, the PyPA recommends that any data files you wish to be accessible at
run time be included **inside the package**.


----

.. [#datafiles] ``setuptools`` considers a *package data file* any non-Python
   file **inside the package directory** (i.e., that co-exists in the same
   location as the regular ``.py`` files being distributed).

.. [#system-dirs] These locations can be discovered with the help of
   third-party libraries such as :pypi:`platformdirs`.

.. [#importlib] Recent versions of :mod:`importlib.resources` available in
   Python's standard library should be API compatible with
   :pypi:`importlib-resources`. However, this might vary depending on which
   version of Python is installed.


.. |MANIFEST.in| replace:: ``MANIFEST.in``
.. _MANIFEST.in: https://packaging.python.org/en/latest/guides/using-manifest-in/