Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builds proceed with zero configuration #2329

Closed
hroncok opened this issue Aug 14, 2020 · 30 comments
Closed

builds proceed with zero configuration #2329

hroncok opened this issue Aug 14, 2020 · 30 comments
Labels
enhancement Needs Discussion Issues where the implementation still needs to be discussed.

Comments

@hroncok
Copy link
Contributor

hroncok commented Aug 14, 2020

Hello, my understanding is that the build_meta:__legacy__ backend should behave like python setup.py ... calls.

However, I've realized that it proceeds fine without any setup.py / setup.cfg files:

$ mkdir empty
$ cd empty/
$ pip list
Package    Version
---------- -------
packaging  20.4
pip        20.2.2
pyparsing  2.4.7
setuptools 49.6.0   # also happens with 41.6.0 and probably anything in between
six        1.15.0
wheel      0.35.0
$ python
>>> from setuptools.build_meta import __legacy__
>>> __legacy__.build_wheel('.')
running bdist_wheel
running build
installing to build/bdist.linux-x86_64/wheel
running install
running install_egg_info
running egg_info
creating UNKNOWN.egg-info
writing UNKNOWN.egg-info/PKG-INFO
writing dependency_links to UNKNOWN.egg-info/dependency_links.txt
writing top-level names to UNKNOWN.egg-info/top_level.txt
writing manifest file 'UNKNOWN.egg-info/SOURCES.txt'
reading manifest file 'UNKNOWN.egg-info/SOURCES.txt'
writing manifest file 'UNKNOWN.egg-info/SOURCES.txt'
Copying UNKNOWN.egg-info to build/bdist.linux-x86_64/wheel/UNKNOWN-0.0.0-py3.8.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/UNKNOWN-0.0.0.dist-info/WHEEL
creating '/home/churchyard/tmp/empty/tmpy86e5sx4/UNKNOWN-0.0.0-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'UNKNOWN-0.0.0.dist-info/METADATA'
adding 'UNKNOWN-0.0.0.dist-info/WHEEL'
adding 'UNKNOWN-0.0.0.dist-info/top_level.txt'
adding 'UNKNOWN-0.0.0.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
'UNKNOWN-0.0.0-py3-none-any.whl'
>>> ^D
[empty]$ ls
build  UNKNOWN-0.0.0-py3-none-any.whl  UNKNOWN.egg-info

This might be a problem in scripted environments. I would have expect an error here. Is this behavior deliberate or accidental?

@pganssle
Copy link
Member

It should indeed work the same as any pyproject.toml. It's probably unfortunate that a project with a bare setup() call and no setup.cfg succeeds, but it's not specific to the legacy backend (which shouldn't be used anyway).

@uranusjr
Copy link
Member

setup() without any arguments has “worked” for a very long time, so it’s sort of an unfortunate accident. Since it is good in basically no situations at all, would it be possible to block a setup() call if it receives zero arguments and setup.cfg does not exist? Preferrably I would want to apply this to both PEP 517 and legacy invocations, buit just changing the PEP 517 behaviour would be very helpful as well.

@pradyunsg
Copy link
Member

pradyunsg commented Oct 11, 2021

Can the legacy backend enforce that a setup.py file must exist?

The language in PEP 517 seems to imply that the expectation is for the file to exist, in projects built with the legacy backend:

If the pyproject.toml file is absent, or the build-backend key is missing, the source tree is not using this specification, and tools should revert to the legacy behaviour of running setup.py (either directly, or by implicitly invoking the setuptools.build_meta:__legacy__ backend).

@pradyunsg
Copy link
Member

pradyunsg commented Oct 15, 2021

Moving further, can the non-legacy backend enforce that setup.py or setup.cfg exist?

A source tree without either can be basically... well, anything. It could be a user's home directory or whatever.

@henryiii
Copy link
Contributor

henryiii commented Oct 15, 2021

What about when PEP 621 support lands? I'd hope eventually you could configure setuptools (for pure Python, at least) entirely from pyproject.toml. For __legacy__, sure. And if no configuration is present, that seems like it should always be an error, rather than producing UNKNOWN.

@pganssle
Copy link
Member

I think the __legacy__ part of this is a red herring. We should probably raise an exception when no configuration information is present, regardless of the backend used. Once that's done, the rest of it won't matter.

Can the legacy backend enforce that a setup.py file must exist?

The language in PEP 517 seems to imply that the expectation is for the file to exist, in projects built with the legacy backend:

I think that language is at best ambiguous, but I read it as saying, "You can invoke setup.py if it's there, or use the legacy backend".

It seems like a pretty niche situation, though, for someone to be relying on PEP 517 behavior but unwilling to create a well-formed pyproject.toml file. Still, I'm inclined to just leave things as they are and simply start raising exceptions when required metadata like name are left unspecified.

@pradyunsg
Copy link
Member

pradyunsg commented Oct 15, 2021

Right now, an empty directory is a valid setuptools package if you use the PEP 517 backends. Surely, we don't want that to be the case?

@pradyunsg
Copy link
Member

Note that the medium-term plan very much is to enforce PEP 517 by default, and this issue is one of the blockers to doing that.

@pganssle
Copy link
Member

Right now, an empty directory is a valid setuptools package if you use the PEP 517 backends. Surely, we don't want that to be the case?

I'm saying this is a problem to be solved regardless of the __legacy__ backend. If we get zero configuration (including in PEP 517 builds), we should raise an exception. I can't think of any situations where doing something else would be useful (except possibly when someone is using setup.py to invoke custom build commands but doesn't actually want to build packages, in which case.. uh... that is not a supported use case).. If we start raising exceptions instead of just populating the data with UNKNOWN, it will no longer be the case that an empty directory is a valid setuptools package.

It's not like we want a directory with just a pyproject.toml file specifying the build backend to be a valid setuptools package.

Note that the medium-term plan very much is to enforce PEP 517 by default, and this issue is one of the blockers to doing that.

Assuming we fix #2799 (which is basically this bug, but not scoped to __legacy__), is this still considered a blocker?

@pradyunsg
Copy link
Member

Nope, that's good enough. :)

@graingert
Copy link
Contributor

Nope, that's good enough. :)

So setup.cfg only projects (without pyproject.toml) are valid?

@henryiii
Copy link
Contributor

No, since you are allowed to call setup.py in the classic backend, then a valid PEP 517 builder could do that. You must have either setup.py or pyproject.toml, and pip/build both artificially enforce that.

@jaraco
Copy link
Member

jaraco commented Oct 20, 2021

You must have either setup.py or pyproject.toml, and pip/build both artificially enforce that.

Assuming this is true, then how is this issue a blocker for PEP 517 by default?

I should point out that the __legacy__ backend has the same behavior as classic setuptools, which will build in an empty directory using default metadata:

draft $ ls
draft $ pip-run -q setuptools wheel -- -c "import setuptools; setuptools.setup()" bdist_wheel
running bdist_wheel
running build
installing to build/bdist.macosx-10.9-universal2/wheel
running install
running install_egg_info
running egg_info
...

Same with distutils

draft $ python -c "import distutils.core; distutils.core.setup()" install --root out
<string>:1: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
running install
running build
running install_egg_info
Creating out/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/
Writing out/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/UNKNOWN-0.0.0-py3.10.egg-info

This might be a problem in scripted environments. I would have expect an error here. Is this behavior deliberate or accidental?

I believe this behavior is somewhat deliberate, to provide reasonable defaults and let the project evolve specific values as needed.

Since this issue is specifically about __legacy__ and since the behavior of __legacy__ matches that of classic Setuptools invocation, I'm inclined to say let's let legacy continue to operate the way it setup() did, unless there's a good reason the behavior needs to change... and then address the issue more generally.

Since this issue has a lot more context than #2799, I'm going to re-scope this issue to the larger issue at hand and close that one as duplicate.

@jaraco jaraco changed the title build_meta:__legacy__ proceeds without setup.py / setup.cfg builds proceed with zero configuration Oct 20, 2021
@jaraco jaraco added enhancement Needs Discussion Issues where the implementation still needs to be discussed. labels Oct 20, 2021
@jaraco
Copy link
Member

jaraco commented Oct 20, 2021

I've been thinking a lot lately how I'd like to build a backend that requires zero configuration. We already see demonstrations of backends that derive metadata and configuration from existing sources:

  • version from git tags
  • manifest from git metadata
  • python files automatically included from packages

I can imagine a world where a backend could derive reasonable defaults for metadata for a basic package from other common sources:

  • author from commit history or repo owner
  • name from directory name or origin URL or sole package discovered
  • URL from repo origin
  • description from repo metadata
  • long description from readme

Such a backend would limit the need for boilerplate and templates and provide a streamlined experience for new and experienced users. In such a world, there is no required metadata.

I suspect that's the principle that Setuptools and distutils were going for when they allowed building an empty directory. That is, if you build a degenerate project, you get a degenerate distribution.

Another situation where this project finds value in accepting empty values is in the test suite. The test suite doesn't need to bother itself with creating a full-fledged distribution, but can create a distribution that exercises only the relevant aspects under test. Making any fields hard-required leads to requiring boilerplate values throughout the test suite, adding distraction and possible confusion ("was the value added for a purpose or to satisfy constraints").

To that end, I'm reluctant to require any metadata. I'm not even sure it's viable to require metadata when plugins are able to supply that metadata (such as setuptools_scm does for version today).

I'd like to explore more the failure modes that led to this discussion. In particular:

Note that the medium-term plan very much is to enforce PEP 517 by default, and this issue is one of the blockers to doing that.

This might be a problem in scripted environments.

Can you explain more about the problems that occur other than the surprise that it succeeds?

@pradyunsg
Copy link
Member

That any arbitrary directory on the file system is a valid setuptools package by this definition. Create an empty directory. Run pip install --use-pep517 on that directory. Notice that setuptools will happily create a package out of that.

This means that passing the wrong path to pip (however you do that — a typo, an environment variable that expanded wrong etc) will result in setuptools zipping everything up. It also blurs the boundary of what a package is. PEP 517 explicitly declares there are two kinds - setup.py based and pyproject.toml based. Currently, setuptools doesn’t enforce either constraint.

While I’m sure there’s certainly space for innovation and experimentation for zero configuration build backends, I highly encourage you to not try taking setuptools down that route. There are significant backwards compatibility constraints here, and it’s much better for that sort of innovation and experimentation to happen in a separate build backend IMO.

@pradyunsg
Copy link
Member

FWIW, pip isn’t enforcing setup.py or pyproject.toml to exist. If you’d prefer that pip enforces that instead of setuptools, then… all good, this issue blocks nothing and there’s nothing actionable here.

@jaraco
Copy link
Member

jaraco commented Oct 20, 2021

According to Henry and my experience, pip does enforce it:

$ pip install --use-pep517 ~/draft
ERROR: Directory '/Users/jaraco/draft' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.

Same with build. I do think enforcing that constraint at pip is the best place.

I'm an enthusiastic +1 to making --use-pep517 the default.

@pganssle
Copy link
Member

Another situation where this project finds value in accepting empty values is in the test suite. The test suite doesn't need to bother itself with creating a full-fledged distribution, but can create a distribution that exercises only the relevant aspects under test. Making any fields hard-required leads to requiring boilerplate values throughout the test suite, adding distraction and possible confusion ("was the value added for a purpose or to satisfy constraints").

I can't really tell from the core metadata spec, but I was under the impression that at least Name and Version were required: https://packaging.python.org/specifications/core-metadata/

I don't know how much more of it is required, but I think it would be fair for us to say that, at the very least, when generating package metadata, the required fields must be specified somehow. I went through and added version to all the tests missing it and it wasn't that many places — many of which were in deprecated commands that we can drop anyway.

It is definitely worth it, IMO, for the setuptools test suite to get a little more verbose or complex in exchange for stopping people from accidentally generating bad packages.

To that end, I'm reluctant to require any metadata. I'm not even sure it's viable to require metadata when plugins are able to supply that metadata (such as setuptools_scm does for version today).

The way I was considering this, and I think the way I've implemented it, was that we should enforce this as late in the process as possible. If we do it that way, any values supplied by a plugin would count as supplied values and the package would build just fine.

@henryiii
Copy link
Contributor

henryiii commented Oct 20, 2021

FWIW, pip isn’t enforcing setup.py or pyproject.toml to exist.

pypa/pip#10031 & pypa/pip#9945

@henryiii
Copy link
Contributor

FWIW, name is required in PEP 621. Also, I think it's fine to not specify name / version, etc, immediately, but by the time you are ready to make the output, if nothing has added it, it doesn't seem to make sense to build without name and versions, since they are part of the output file name.

@jaraco
Copy link
Member

jaraco commented Nov 16, 2021

I don't disagree that name and version are required. I'm just saying that it's plausible, even possibly desirable, that it's not the user's responsibility to supply those, but that the build tool could supply those, perhaps using default values or inferring the values from better data. For example, 0 is a perfectly reasonable default version, especially for a project where a version is never needed. And a name of 'UNKNOWN', while a little awkward, is justifiable for the same reason.

I don't feel strongly about it, and I do think 'UNKNOWN' probably should go, and that the system should error if no suitable name can be determined. On the other hand, a default version of 0 seems right to me and I can imagine lots of scenarios where that behavior would be useful (imagine a private environment where a suite of libraries are always built/installed from source and the version is unimportant to the design).

@pganssle
Copy link
Member

I don't disagree that name and version are required. I'm just saying that it's plausible, even possibly desirable, that it's not the user's responsibility to supply those, but that the build tool could supply those, perhaps using default values or inferring the values from better data. For example, 0 is a perfectly reasonable default version, especially for a project where a version is never needed. And a name of 'UNKNOWN', while a little awkward, is justifiable for the same reason.

My proposal was that we should validate these things as the last thing we do before generating a distribution. I think we can agree that we don't support a scenario where these things are supported after the artifact is generated, so I think our positions are not in conflict.

I don't feel strongly about it, and I do think 'UNKNOWN' probably should go, and that the system should error if no suitable name can be determined. On the other hand, a default version of 0 seems right to me and I can imagine lots of scenarios where that behavior would be useful (imagine a private environment where a suite of libraries are always built/installed from source and the version is unimportant to the design).

I feel like this is a rare enough scenario (and "oops I forgot to specify a version number" a common enough scenario) that we can impose upon people who don't care about version numbers to just specify a hard-coded version or specify some plugin that supplies one, no?

abravalheri added a commit to abravalheri/setuptools that referenced this issue Nov 17, 2021
Following up the discussion in pypa#2887 and pypa#2329, it seems that setuptools
is moving towards more automatic discovery features.

PackageFinder and PEP420PackageFinder are fundamental pieces of this
puzzle and grouping together them togheter with the code implementing these
new discovery features make a lot of sense.
@abravalheri
Copy link
Contributor

abravalheri commented Nov 18, 2021

I feel like this is a rare enough scenario (and "oops I forgot to specify a version number" a common enough scenario) that we can impose upon people who don't care about version numbers to just specify a hard-coded version or specify some plugin that supplies one, no?

How about using a warning here + changing the default version to something more obvious (and/or unique at every run).
A combination of dev and local segments could be added by default (for example 0.dev20201204+UNKNOWN.a23e87f or even 0.0.0a0.dev1637232588+UNKNOWN.f234c342a).

I believe that by adding a local segment, at least PyPI would reject the upload, so the chances of this kind of distribution to be made publicly available by mistake would drop.

@uranusjr
Copy link
Member

The problem is that setuptools is generally and increasingly used as a part of an automated workflow (e.g. pip install), and generating anything without an explicit error makes logical errors difficult to spot, even with a warning emitted in the middle of the process, when you’re inspecting only the final product of that entire workflow.

@abravalheri
Copy link
Contributor

abravalheri commented Nov 18, 2021

That is an interesting point to analyse. What precisely the error/problem that would be important to avoid/spot?

From a pragmatic point of view, I was supposing that the main problem here is having 2 distributions with the same nominal version (0.0.0) but representing 2 different states of the files (i.e., changes between builds not reflected in the version).

For this problem, forcing the user to hard-code a version number will not help a lot, since in a future build the user can forget to bump it. Indeed, in this scenario proposal 2 (see bellow) helps more to avoid errors.


Summary of the current behaviour and proposals (so far) for missing version.

a) Current behaviour: setuptools considers version = 0, when it is not provided
b) Proposal 1: Raise an error when version is not provided
c) Proposal 2: Generate a more specific version, with a "dev" segment monotonic increasing with time and a "local" segment unique per run OR state of the files (e.g. hash of the concatenated hash of the files in the dist, or some UUID4-based token) + Warning

@abravalheri
Copy link
Contributor

Add a PYTHONWARNINGS environment variable to that, and you can make a CI/automated workflow to fail, if that is important to you.

@jaraco
Copy link
Member

jaraco commented Jan 29, 2022

I'm inclined to stick with the current behavior. There's value in not requiring these metadata values to be supplied in cases where the metadata is unimportant (tests, for one). The default values supplied are good defaults (UNKNOWN for name, 0 for version). The pip/build controls already provide good guardrails on ensuring the basic assumptions are met.

I'd similarly like the validation to work in a layered approach, where the basic building blocks are lenient, allowing for degenerate behavior, but where specific workflows (such as building for release) enable more complex behaviors such as validations.

Any solution for this should be holistic and not bolt the behavior on. It should probably coordinate with the config validation and the default metadata values. We can explore these options more readily once the distutils merge is complete.

In the meantime, the issue, as reported is working as intended.

@jaraco jaraco closed this as completed Jan 29, 2022
@layday
Copy link
Member

layday commented Jan 30, 2022 via email

@jaraco
Copy link
Member

jaraco commented Jan 30, 2022

So the validation PR was merged without actually hooking the validation logic onto anything or have I misunderstood?

Yes, that's right. I kept the code because I thought it might prove useful to have at some point, but you make a good point - it's probably not worth keeping that uncalled functionality around.

Meaning that setuptools will continue to build dists out of thin air.

Correct. And after thinking about this more, I'm confident this approach is the right one. After all, there are already protections against building a dist out of an empty directory in the higher level build tools. Setuptools is a lower-level build tool, where direct invocation of setup.py is discouraged, but where it may prove useful to do so for debug/triage/experimentation/testing. There are many constraints one may wish to apply to consider a distribution valid, including having a name and version, but also having modules/packages, entry points, dependencies, etcetera. But as each of those factors are composed to form the distribution, each factor also has a degenerate form (no modules/packages, no entry points, no dependencies). For name and version, the degenerate form is "UNKNOWN" and 0, respectively.

Moreover, there is already some metadata validation that happens as part of the config processing as well as other stages in the development process (invoking build, invoking twine, uploading to pypi).

Since there's value in supporting a degenerate distribution and there are complications with adding that constraint and because it's a long-held expectation, I'd like to keep it for now. The concern about the potential __legacy__ mismatch turned out not to be real. The concern about pip or build installing an empty directory turned out not to be real.

I'd like to explore the possibility of removing the default UNKNOWN values, an exploration which may likely lead to a failure on no name, but I'd like to see that as a larger design and probably after the distutils merge is complete.

abravalheri added a commit to abravalheri/setuptools that referenced this issue Feb 7, 2022
Following up the discussion in pypa#2887 and pypa#2329, it seems that setuptools
is moving towards more automatic discovery features.

PackageFinder and PEP420PackageFinder are fundamental pieces of this
puzzle and grouping together them togheter with the code implementing these
new discovery features make a lot of sense.
abravalheri added a commit to abravalheri/setuptools that referenced this issue Feb 12, 2022
Following up the discussion in pypa#2887 and pypa#2329, it seems that setuptools
is moving towards more automatic discovery features.

PackageFinder and PEP420PackageFinder are fundamental pieces of this
puzzle and grouping together them togheter with the code implementing these
new discovery features make a lot of sense.
@pradyunsg
Copy link
Member

pradyunsg commented Feb 13, 2022

I'm a bit saddened to see this closed as no-action. ☹️

That is, if you build a degenerate project, you get a degenerate distribution.

I'd argue this is a significantly worse experience than: if you build a degenerate project, you get an error telling you where you went wrong.


I still feel that adding a clear failure mode that "you're missing a name and/or version" instead of a default UNKNOWN-0.0.0 distribution would be extremely helpful to users, especially for diagnosing failure modes with setuptools.


Moreover, there is already some metadata validation that happens as part of the config processing as well as other stages in the development process (invoking build, invoking twine, uploading to pypi).

No stage contains protections against 0.0.0 versions. Only PyPI rejects UNKNOWN distribution name.

I don't think that any level of tooling other than setuptools itself should hold protections for this.

abravalheri added a commit to abravalheri/setuptools that referenced this issue Feb 18, 2022
Following up the discussion in pypa#2887 and pypa#2329, it seems that setuptools
is moving towards more automatic discovery features.

PackageFinder and PEP420PackageFinder are fundamental pieces of this
puzzle and grouping together them togheter with the code implementing these
new discovery features make a lot of sense.
abravalheri added a commit to abravalheri/setuptools that referenced this issue Mar 5, 2022
Following up the discussion in pypa#2887 and pypa#2329, it seems that setuptools
is moving towards more automatic discovery features.

PackageFinder and PEP420PackageFinder are fundamental pieces of this
puzzle and grouping together them togheter with the code implementing these
new discovery features make a lot of sense.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Needs Discussion Issues where the implementation still needs to be discussed.
Projects
None yet
Development

No branches or pull requests

9 participants