-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more robust handling of long descriptions #156
Conversation
This commit moves type declarations into a new module, _types.py, which has two benefits: - It helps code organization and readability - It lets us use the types from different modules, when additional modules are added in the future, without having to deal with circular imports.
bfc8911
to
97c9742
Compare
@sjlongland I'm going to want to come back and give this one more review with fresh eyes before actually merging it, but I think it should be ready. Figured there was no need to wait to ping you for review. |
@diazona No problems, well I can review and hold off on actual merging if you think there might be last-minute changes. :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I'll hold off on the merge until you're happy with it all as well. :-)
Ah you know what, it looks like I goofed up the commit organization (checked with |
In upcoming commits I'm going to be changing the code that handles the long description/readme fields in a pretty significant way, and that code is going to get rather complex, so I decided to split it out into a separate module to aid readability. I also converted it from a simple block to a class and some helper functions. The class stores the text, content type, and filesystem path of the readme element, and it provides some validation of those values before they get serialized into the data structure that will become pyproject.toml. This commit doesn't include any change in functionality, it's just refactoring the code so that future functional changes will be easier to understand.
It turns out pyproject.toml does support putting the content of the readme attribute directly inline, so this commit changes the logic to do that when no filename can be determined. Since there are many cases where setup.py reads the content of a file and gives that content to the long_description argument of setup(), the code will look for a file containing the readme text and use that if found.
The command_options attribute of a Distribution contains the CLI options passed to that command, but it also contains project metadata specified in setup.cfg, under the "metadata" key. So a project might have metadata in this value that doesn't have the long_description key, if the long_description argument was passed to setup.py instead. Accordingly, we need to check both whether the "metadata" key exists, and whether the "long_description" key exists within it, to avoid getting an error. This commit implements that change.
pyproject.toml supports giving readme information in any of three forms: - A filename with a content type - A filename without a content type, but it's expected that the type will be inferred from the file's extension - A raw string with a content type setuptools, however, also allows giving the long description as a raw string with no associated content type. In this case it's ambiguous what the corresponding data structure in pyproject.toml should be. This commit adds an option for the user to specify the content type explicitly and resolve the ambiguity. I chose this approach based in part on a Mastodon poll: https://techhub.social/@diazona/112736102428118730 The leading option in the poll was to make the user give an explicit content type, but on the other hand most of the people who replied suggested that it'd be better to guess, falling back to text/plain if no other content type should be determined. We could certainly take either approach, but ultimately I was more swayed by the responses. Plus, the program already tries to guess a readme file (although admittedly that's a more educated guess than for the content type), so we kind of have a precedent for making guesses to fill in unknown values rather than failing and forcing the user to provide them explicitly.
For some tests it will be handy to be able to create a custom project and access the Distribution object generated by running setup.py in that project directory. This commit splits out that functionality from Project.generate().
This commit changes the readme field tests to allow for writing out a readme dict with a `text` element, if the readme content appears not to come from a file, and to exercise the various ways of setting a content type (from setup.cfg or from the command line).
97c9742
to
46a835c
Compare
OK done! I can go ahead and merge this, since you already approved. (Which brings to mind the separate issue of whether approvals should be reset when the branch is rebased; I feel like it's more "technically correct" to do so, but it really doesn't matter in this context.) |
This PR makes some major improvements in how we handle the
long_description
field from setuptools:setup.cfg
orsetup.py
(as far as we can tell) rather than being loaded from a file, it will be kept as an inline string inpyproject.toml
.--readme-content-type
; otherwise the program will guesstext/plain
.Thanks to contributors on a Mastodon thread for some useful discussion.