Automatic discovery of packages, py_modules and name (#2894)
It is desirable to have some configurations automatically derived.
This is definitely possible for packages and py_modules, and (based on
these two) also for name.

This change adds a new class `setuptools.discovery.ConfigDiscovery`
implementing the automatic discovery logic for packages, py_modules and
name.
abravalheri committed Mar 5, 2022
2 parents 64386ba + f39edae commit 3f9bd68
Showing 12 changed files with 1,114 additions and 95 deletions.
20 changes: 20 additions & 0 deletions changelog.d/2887.change.1.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
Added automatic discovery for ``py_modules`` and ``packages``
-- by :user:`abravalheri`.

Setuptools will try to find these values assuming that the package uses either
the *src-layout* (a ``src`` directory containing all the packages or modules),
the *flat-layout* (package directories directly under the project root),
or the *single-module* approach (isolated Python files, directly under
the project root).

The automatic discovery will also respect layouts that are explicitly
configured using the ``package_dir`` option.

For backward-compatibility, this behavior will be observed **only if both**
``py_modules`` **and** ``packages`` **are not set**.

If setuptools detects modules or packages that are not supposed to be in the
distribution, please manually set ``py_modules`` and ``packages`` in your
``setup.cfg`` or ``setup.py`` file.
If you are using a *flat-layout*, you can also consider switching to
*src-layout*.
9 changes: 9 additions & 0 deletions changelog.d/2887.change.2.rst
@@ -0,0 +1,9 @@
Added automatic configuration for the ``name`` metadata
-- by :user:`abravalheri`.

Setuptools will adopt the name of the top-level package (or module in the case
of single-module distributions), **only when** ``name`` **is not explicitly
provided**.

Please note that it is not possible to automatically derive a single name when
the distribution consists of multiple top-level packages or modules.
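
As a rough illustration of the rule above, the following sketch derives a name only when exactly one top-level package or module exists. The function ``derive_name`` is hypothetical and is not part of the setuptools API; it only mirrors the behaviour described in this changelog entry.

```python
from typing import Optional, Sequence


def derive_name(packages: Sequence[str], py_modules: Sequence[str]) -> Optional[str]:
    """Return the single top-level name, or None when it is ambiguous.

    Hypothetical helper: it models the changelog description, not the
    actual setuptools implementation.
    """
    # Reduce dotted names to their top-level component: "pkg.sub" -> "pkg".
    top_level = {name.split(".")[0] for name in packages}
    top_level.update(name.split(".")[0] for name in py_modules)
    if len(top_level) == 1:
        return next(iter(top_level))
    # Zero or several top-level names: no single name can be derived.
    return None
```

With ``packages=["mypkg", "mypkg.sub"]`` the sketch yields ``mypkg``; with two unrelated top-level packages it yields ``None``, which matches the limitation noted above.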
10 changes: 10 additions & 0 deletions changelog.d/2894.breaking.rst
@@ -0,0 +1,10 @@
If you purposefully want to create an *"empty distribution"*, please be aware
that some Python files (or general folders) might be automatically detected and
included.

Projects that currently specify neither ``packages`` nor ``py_modules`` in their
configuration and have extra Python files and folders (not meant for distribution)
might see these files being included in the wheel archive.

You can check details about the automatic discovery behaviour (and
how to configure a different one) in :doc:`/userguide/package_discovery`.
148 changes: 144 additions & 4 deletions docs/userguide/package_discovery.rst
@@ -38,8 +38,142 @@ included manually in the following manner:
packages=['mypkg1', 'mypkg2']
)
This can get tiresome really quickly. To speed things up, we introduce two
functions provided by setuptools:
This can get tiresome really quickly. To speed things up, you can rely on
setuptools' automatic discovery, or use the provided tools, as explained in
the following sections.


Automatic discovery
===================

By default, setuptools will consider two popular project layouts, each one with
its own set of advantages and disadvantages [#layout1]_ [#layout2]_.

src-layout:
    The project should contain a ``src`` directory under the project root and
    all modules and packages meant for distribution are placed inside this
    directory::

        project_root_directory
        ├── pyproject.toml
        ├── setup.cfg  # or setup.py
        ├── ...
        └── src/
            └── mypkg/
                ├── __init__.py
                ├── ...
                └── mymodule.py

    This layout is very handy when you wish to use automatic discovery,
    since you don't have to worry about other Python files or folders in your
    project root being distributed by mistake. In some circumstances it can
    also be less error-prone for testing or when using :pep:`420`-style packages.
    On the other hand, you cannot rely on the implicit ``PYTHONPATH=.`` to fire
    up the Python REPL and play with your package (you will need an
    `editable install`_ to be able to do that).

flat-layout (also known as "adhoc"):
    The package folder(s) are placed directly under the project root::

        project_root_directory
        ├── pyproject.toml
        ├── setup.cfg  # or setup.py
        ├── ...
        └── mypkg/
            ├── __init__.py
            ├── ...
            └── mymodule.py

    This layout is very practical for using the REPL, but in some situations
    it can be more error-prone (e.g. during tests or if you have a bunch
    of folders or Python files hanging around your project root).

There is also a handy variation of the *flat-layout* for utilities/libraries
that can be implemented with a single Python file:

single-module approach (or "few top-level modules"):
    Standalone modules are placed directly under the project root, instead of
    inside a package folder::

        project_root_directory
        ├── pyproject.toml
        ├── setup.cfg  # or setup.py
        ├── ...
        └── single_file_lib.py

Setuptools will automatically scan your project directory looking for these
layouts and try to guess the correct values for the :ref:`packages <declarative
config>` and :doc:`py_modules </references/keywords>` configuration.
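
To make the three layouts concrete, here is a deliberately simplified sketch that classifies a project from a list of relative file paths. The function ``guess_layout`` and its precedence order are assumptions made for illustration; the real logic in ``setuptools.discovery`` is more involved (it also handles ``package_dir`` and exclusion lists, for example).

```python
from pathlib import PurePosixPath
from typing import Iterable


def guess_layout(files: Iterable[str]) -> str:
    """Classify a project from its relative, '/'-separated file paths.

    Hypothetical helper for illustration only; not the setuptools algorithm.
    """
    all_parts = [PurePosixPath(f).parts for f in files]
    all_parts = [parts for parts in all_parts if parts]  # skip empty paths

    # src-layout: anything living below a top-level "src" directory.
    if any(parts[0] == "src" and len(parts) > 1 for parts in all_parts):
        return "src-layout"
    # flat-layout: a top-level directory that contains an __init__.py.
    if any(len(parts) == 2 and parts[1] == "__init__.py" for parts in all_parts):
        return "flat-layout"
    # single-module: a top-level .py file other than setup.py.
    if any(
        len(parts) == 1 and parts[0].endswith(".py") and parts[0] != "setup.py"
        for parts in all_parts
    ):
        return "single-module"
    return "unknown"
```

For instance, a file list containing ``src/mypkg/__init__.py`` is classified as *src-layout*, while one containing only ``pyproject.toml`` and ``single_file_lib.py`` is classified as *single-module*.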

To avoid confusion, file and folder names that are used by popular tools (or
that correspond to well-known conventions, such as distributing documentation
alongside the project code) are automatically filtered out in the case of
*flat-layouts*:

.. autoattribute:: setuptools.discovery.FlatLayoutPackageFinder.DEFAULT_EXCLUDE

.. autoattribute:: setuptools.discovery.FlatLayoutModuleFinder.DEFAULT_EXCLUDE
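
The filtering idea can be sketched with ``fnmatch`` patterns. The pattern tuple below is purely illustrative and is **not** the actual value of ``DEFAULT_EXCLUDE``; the helper name ``filter_flat_layout`` is likewise hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Illustrative patterns only; NOT the real DEFAULT_EXCLUDE value.
EXAMPLE_EXCLUDE = ("docs", "docs.*", "tasks", "tasks.*", "example", "example.*")


def filter_flat_layout(
    candidates: Iterable[str], exclude: Sequence[str] = EXAMPLE_EXCLUDE
) -> List[str]:
    """Drop candidate package names that match any exclusion pattern."""
    return [
        name
        for name in candidates
        if not any(fnmatchcase(name, pattern) for pattern in exclude)
    ]
```

Given the candidates ``mypkg``, ``mypkg.sub``, ``docs`` and ``tasks.util``, only ``mypkg`` and ``mypkg.sub`` survive the filter in this sketch.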

Also note that you can customise your project layout by explicitly setting
``package_dir``:

.. tab:: setup.cfg

    .. code-block:: ini

        [options]
        # ...
        package_dir =
            = lib
            # similar to "src-layout" but using the "lib" folder
            # pkg.mod corresponds to lib/pkg/mod.py
        # OR
        package_dir =
            pkg1 = lib1
            # pkg1.mod corresponds to lib1/mod.py
            # pkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
            pkg2 = lib2
            # pkg2.mod corresponds to lib2/mod.py
            pkg2.subpkg = lib3
            # pkg2.subpkg.mod corresponds to lib3/mod.py

.. tab:: setup.py

    .. code-block:: python

        setup(
            # ...
            package_dir={"": "lib"},
            # similar to "src-layout" but using the "lib" folder
            # pkg.mod corresponds to lib/pkg/mod.py
        )

        # OR

        setup(
            # ...
            package_dir={
                "pkg1": "lib1",   # pkg1.mod corresponds to lib1/mod.py
                                  # pkg1.subpkg.mod corresponds to lib1/subpkg/mod.py
                "pkg2": "lib2",   # pkg2.mod corresponds to lib2/mod.py
                "pkg2.subpkg": "lib3",  # pkg2.subpkg.mod corresponds to lib3/mod.py
                # ...
            },
        )
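
The mapping rules spelled out in the comments above amount to a longest-prefix lookup on the dotted package name, with the empty key acting as the root fallback. The sketch below illustrates that reading; ``module_path`` is a hypothetical helper, not a setuptools function, and it assumes ``/``-separated paths.

```python
import posixpath
from typing import Dict, List


def module_path(module: str, package_dir: Dict[str, str]) -> str:
    """Map a dotted module name to a file path following ``package_dir``.

    Hypothetical helper for illustration; it tries the longest package
    prefix first and falls back to the "" (root) entry.
    """
    parts = module.split(".")
    pkg_parts, mod_name = parts[:-1], parts[-1]
    tail: List[str] = []
    while pkg_parts:
        prefix = ".".join(pkg_parts)
        if prefix in package_dir:
            return posixpath.join(package_dir[prefix], *tail, mod_name + ".py")
        # Move the last package component into the path tail and retry.
        tail.insert(0, pkg_parts.pop())
    root = package_dir.get("", "")
    return posixpath.join(root, *tail, mod_name + ".py")
```

Under this sketch, ``module_path("pkg2.subpkg.mod", {"pkg2": "lib2", "pkg2.subpkg": "lib3"})`` resolves to ``lib3/mod.py``, in line with the comments in the configuration examples.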
.. important:: Automatic discovery will **only** be enabled if you provide
   no configuration for either ``packages`` or ``py_modules``.
   If at least one of them is explicitly set, automatic discovery will not take
   place.
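
The condition in the note above can be expressed as a small predicate. This is only a sketch: the function name is hypothetical, and it assumes that "not set" is represented by ``None`` (so an explicitly provided empty list still counts as set).

```python
from typing import Optional, Sequence


def autodiscovery_enabled(
    packages: Optional[Sequence[str]] = None,
    py_modules: Optional[Sequence[str]] = None,
) -> bool:
    """Return True only when neither option was explicitly set.

    Assumption: None means "unset"; any other value, even an empty list,
    counts as an explicit configuration.
    """
    return packages is None and py_modules is None
```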

Custom discovery
================

If the automatic discovery does not work for you
(e.g., you want to *include* in the distribution top-level packages with
reserved names such as ``tasks``, ``example`` or ``docs``, or you want to
*exclude* nested packages that would be otherwise included), you can use
the provided tools for package discovery:

.. tab:: setup.cfg
@@ -61,7 +195,7 @@ functions provided by setuptools:
Using ``find:`` or ``find_packages``
====================================
------------------------------------
Let's start with the first tool. ``find:`` (``find_packages``) takes a source
directory and two lists of package name patterns to exclude and include, and
then returns a list of ``str`` representing the packages it could find. To use
Expand Down Expand Up @@ -113,7 +247,7 @@ in ``src`` that starts with the name ``pkg`` and not ``additional``:
.. _Namespace Packages:
Using ``find_namespace:`` or ``find_namespace_packages``
========================================================
--------------------------------------------------------
``setuptools`` provides ``find_namespace:`` (``find_namespace_packages``),
which behaves similarly to ``find:`` but works with namespace packages. Before
diving in, it is important to have a good understanding of what namespace
Expand Down Expand Up @@ -249,3 +383,9 @@ file contains the following:
__path__ = __import__('pkgutil').extend_path(__path__, __name__)
The project layout remains the same and ``setup.cfg`` remains the same.
.. [#layout1] https://blog.ionelmc.ro/2014/05/25/python-packaging/#the-structure
.. [#layout2] https://blog.ionelmc.ro/2017/09/25/rehashing-the-src-layout/
.. _editable install: https://pip.pypa.io/en/stable/cli/pip_install/#editable-installs
82 changes: 1 addition & 81 deletions setuptools/__init__.py
@@ -1,6 +1,5 @@
"""Extensions to the 'distutils' for large or complex distributions"""

from fnmatch import fnmatchcase
import functools
import os
import re
@@ -9,14 +8,14 @@

import distutils.core
from distutils.errors import DistutilsOptionError
from distutils.util import convert_path

from ._deprecation_warning import SetuptoolsDeprecationWarning

import setuptools.version
from setuptools.extension import Extension
from setuptools.dist import Distribution
from setuptools.depends import Require
from setuptools.discovery import PackageFinder, PEP420PackageFinder
from . import monkey
from . import logging

@@ -37,85 +36,6 @@
bootstrap_install_from = None


class PackageFinder:
    """
    Generate a list of all Python packages found within a directory
    """

    @classmethod
    def find(cls, where='.', exclude=(), include=('*',)):
        """Return a list of all Python packages found within directory 'where'

        'where' is the root directory which will be searched for packages.  It
        should be supplied as a "cross-platform" (i.e. URL-style) path; it will
        be converted to the appropriate local path syntax.

        'exclude' is a sequence of package names to exclude; '*' can be used
        as a wildcard in the names, such that 'foo.*' will exclude all
        subpackages of 'foo' (but not 'foo' itself).

        'include' is a sequence of package names to include.  If it's
        specified, only the named packages will be included.  If it's not
        specified, all found packages will be included.  'include' can contain
        shell style wildcard patterns just like 'exclude'.
        """

        return list(
            cls._find_packages_iter(
                convert_path(where),
                cls._build_filter('ez_setup', '*__pycache__', *exclude),
                cls._build_filter(*include),
            )
        )

    @classmethod
    def _find_packages_iter(cls, where, exclude, include):
        """
        All the packages found in 'where' that pass the 'include' filter, but
        not the 'exclude' filter.
        """
        for root, dirs, files in os.walk(where, followlinks=True):
            # Copy dirs to iterate over it, then empty dirs.
            all_dirs = dirs[:]
            dirs[:] = []

            for dir in all_dirs:
                full_path = os.path.join(root, dir)
                rel_path = os.path.relpath(full_path, where)
                package = rel_path.replace(os.path.sep, '.')

                # Skip directory trees that are not valid packages
                if '.' in dir or not cls._looks_like_package(full_path):
                    continue

                # Should this package be included?
                if include(package) and not exclude(package):
                    yield package

                # Keep searching subdirectories, as there may be more packages
                # down there, even if the parent was excluded.
                dirs.append(dir)

    @staticmethod
    def _looks_like_package(path):
        """Does a directory look like a package?"""
        return os.path.isfile(os.path.join(path, '__init__.py'))

    @staticmethod
    def _build_filter(*patterns):
        """
        Given a list of patterns, return a callable that will be true only if
        the input matches at least one of the patterns.
        """
        return lambda name: any(fnmatchcase(name, pat=pat) for pat in patterns)


class PEP420PackageFinder(PackageFinder):
    @staticmethod
    def _looks_like_package(path):
        return True


find_packages = PackageFinder.find
find_namespace_packages = PEP420PackageFinder.find
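
The wildcard behaviour described in the ``find`` docstring (``'foo.*'`` excludes subpackages of ``foo`` but not ``foo`` itself) can be checked in isolation. The standalone ``build_filter`` below is a sketch that mirrors ``_build_filter`` from the removed code using only the standard library; the package names are made up for illustration.

```python
from fnmatch import fnmatchcase


def build_filter(*patterns):
    """Return a callable that is true if a name matches any of the patterns."""
    return lambda name: any(fnmatchcase(name, pat) for pat in patterns)


exclude = build_filter("foo.*")

# Hypothetical discovery result, used only to exercise the filter.
found = ["foo", "foo.bar", "foo.bar.baz", "other"]
kept = [pkg for pkg in found if not exclude(pkg)]
print(kept)
```

Because ``*`` in ``fnmatch`` patterns also matches dots, ``foo.*`` covers nested subpackages at any depth, while the bare name ``foo`` does not match and is kept.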
