Add more robust handling of long descriptions #156

diazona · 2024-07-07T10:21:56Z

This PR makes some major improvements in how we handle the long_description field from setuptools:

- Now if the long description is given as an inline string in setup.cfg or setup.py (as far as we can tell) rather than being loaded from a file, it will be kept as an inline string in pyproject.toml.
- If there's no content type provided and one cannot be definitively determined from the file extension, the user can provide it with a new option --readme-content-type; otherwise the program will guess text/plain.
- It fixes the error where the long description would sometimes be missing from the command options, closing KeyError: 'long_description' #132.
- The code is refactored to be more modular and more testable.

Thanks to contributors on a Mastodon thread for some useful discussion.

This commit moves type declarations into a new module, _types.py, which has two benefits:

- It helps code organization and readability
- It lets us use the types from different modules, when additional modules are added in the future, without having to deal with circular imports.

diazona · 2024-07-17T11:12:17Z

@sjlongland I'm going to want to come back and give this one more review with fresh eyes before actually merging it, but I think it should be ready. Figured there was no need to wait to ping you for review.

sjlongland · 2024-07-17T11:38:49Z

@diazona No problems, well I can review and hold off on actual merging if you think there might be last-minute changes. :-)

src/setuptools_pyproject_migration/__init__.py

src/setuptools_pyproject_migration/_long_description.py

sjlongland

Looks good, but I'll hold off on the merge until you're happy with it all as well. :-)

diazona · 2024-07-17T19:09:18Z

Ah you know what, it looks like I goofed up the commit organization (checked with git rebase --exec 'pre-commit run --all-files' main). I'll fix that (when I get time) and re-push - the overall diff will be the same, I just have to move a few patch fragments from one commit to another.

In upcoming commits I'm going to be changing the code that handles the long description/readme fields in a pretty significant way, and that code is going to get rather complex, so I decided to split it out into a separate module to aid readability. I also converted it from a simple block of code to a class and some helper functions. The class stores the text, content type, and filesystem path of the readme element, and it provides some validation of those values before they get serialized into the data structure that will become pyproject.toml. This commit doesn't include any change in functionality; it's just refactoring the code so that future functional changes will be easier to understand.
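To make the shape of that refactoring concrete, here is a minimal sketch of a class that holds the text, content type, and path, validates them, and serializes to one of the readme forms pyproject.toml accepts. The class name `ReadmeSketch`, its method `to_pyproject`, and the specific validation rules are hypothetical illustrations, not the code in `_long_description.py`.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class ReadmeSketch:
    """Hypothetical stand-in for the readme class; not the project's real API."""

    text: Optional[str] = None
    content_type: Optional[str] = None
    path: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed validation rules: there must be something to write out,
        # and inline text must always carry a content type.
        if self.text is None and self.path is None:
            raise ValueError("a readme needs either inline text or a file path")
        if self.path is None and self.content_type is None:
            raise ValueError("inline readme text requires a content type")

    def to_pyproject(self) -> Union[str, Dict[str, Optional[str]]]:
        """Return the value to store under the `readme` key."""
        if self.path is not None:
            if self.content_type is None:
                # Bare filename: the type is inferred from the extension.
                return self.path
            return {"file": self.path, "content-type": self.content_type}
        return {"text": self.text, "content-type": self.content_type}
```

In this sketch a known path takes priority over inline text, which is one possible design choice; the real class may resolve that differently.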

It turns out pyproject.toml does support putting the content of the readme attribute directly inline, so this commit changes the logic to do that when no filename can be determined. Since there are many cases where setup.py reads the content of a file and gives that content to the long_description argument of setup(), the code will look for a file containing the readme text and use that if found.
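The file lookup described here could work roughly like the sketch below: scan likely readme files in the project directory and return the first one whose content matches the long description. The function name `find_readme_file`, the glob patterns, and the whitespace-insensitive comparison are all assumptions made for illustration; the actual search strategy in this PR may differ.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_readme_file(
    long_description: str,
    root: Path,
    patterns: Iterable[str] = ("README*", "*.md", "*.rst", "*.txt"),
) -> Optional[Path]:
    """Return a file under `root` whose content equals the long description.

    Hypothetical helper: leading and trailing whitespace is ignored when
    comparing, and only the top level of `root` is searched.
    """
    wanted = long_description.strip()
    seen = set()
    for pattern in patterns:
        for candidate in sorted(root.glob(pattern)):
            if candidate in seen or not candidate.is_file():
                continue
            seen.add(candidate)
            try:
                content = candidate.read_text(encoding="utf-8")
            except (OSError, UnicodeDecodeError):
                # Unreadable or non-text files cannot be the readme.
                continue
            if content.strip() == wanted:
                return candidate
    # No match: the caller keeps the long description as inline text.
    return None
```

When this returns `None`, the long description would be written inline, as the commit message describes.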

The command_options attribute of a Distribution contains the CLI options passed to that command, but it also contains project metadata specified in setup.cfg, under the "metadata" key. So a project might have metadata in this value that doesn't have the long_description key, if the long_description argument was passed to setup.py instead. Accordingly, we need to check both whether the "metadata" key exists, and whether the "long_description" key exists within it, to avoid getting an error. This commit implements that change.
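The two-level check can be sketched as follows. This assumes `command_options` is a dict of dicts mapping section name to option name to a `(source, value)` pair, which is my understanding of the distutils data structure; the helper name `long_description_from_options` is hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

# Assumed shape: {section: {option: (source, value)}}
CommandOptions = Dict[str, Dict[str, Tuple[str, Any]]]


def long_description_from_options(
    command_options: CommandOptions,
) -> Optional[Tuple[str, Any]]:
    """Return the (source, value) pair for long_description, or None.

    Both lookups use .get(), so neither a missing "metadata" section nor
    a missing "long_description" option raises KeyError.
    """
    metadata = command_options.get("metadata")
    if metadata is None:
        return None
    return metadata.get("long_description")
```

A `None` result then signals that the long description, if any, was passed to `setup()` directly rather than declared in setup.cfg.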

pyproject.toml supports giving readme information in any of three forms:

- A filename with a content type
- A filename without a content type, in which case the type is expected to be inferred from the file's extension
- A raw string with a content type

setuptools, however, also allows giving the long description as a raw string with no associated content type. In this case it's ambiguous what the corresponding data structure in pyproject.toml should be. This commit adds an option for the user to specify the content type explicitly and resolve the ambiguity.

I chose this approach based in part on a Mastodon poll: https://techhub.social/@diazona/112736102428118730

The leading option in the poll was to make the user give an explicit content type, but on the other hand most of the people who replied suggested that it'd be better to guess, falling back to text/plain if no other content type could be determined. We could certainly take either approach, but ultimately I was more swayed by the responses. Plus, the program already tries to guess a readme file (although admittedly that's a more educated guess than for the content type), so we kind of have a precedent for making guesses to fill in unknown values rather than failing and forcing the user to provide them explicitly.
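The resulting fallback chain might look like the sketch below. The function name `resolve_content_type`, its parameters, and the exact precedence (declared type, then file extension, then the value of --readme-content-type, then text/plain) are my reading of the description above rather than the merged code; the extension table covers only the two suffixes I'm confident the pyproject.toml specification names.

```python
from pathlib import Path
from typing import Optional

# Extensions whose content type can be determined definitively.
_EXTENSION_TYPES = {
    ".md": "text/markdown",
    ".rst": "text/x-rst",
}


def resolve_content_type(
    declared: Optional[str] = None,
    path: Optional[str] = None,
    cli_override: Optional[str] = None,
) -> str:
    """Pick a readme content type, guessing text/plain as a last resort.

    `declared` stands for long_description_content_type from the project
    metadata; `cli_override` stands for the --readme-content-type option.
    """
    if declared:
        return declared
    if path is not None:
        guessed = _EXTENSION_TYPES.get(Path(path).suffix.lower())
        if guessed:
            return guessed
    if cli_override:
        return cli_override
    return "text/plain"
```

Under these assumptions, a raw string with no declared type and no command line option ends up as text/plain, which is the guess the poll replies favored.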

For some tests it will be handy to be able to create a custom project and access the Distribution object generated by running setup.py in that project directory. This commit splits out that functionality from Project.generate().

This commit changes the readme field tests to allow for writing out a readme dict with a `text` element, if the readme content appears not to come from a file, and to exercise the various ways of setting a content type (from setup.cfg or from the command line).

diazona · 2024-07-18T06:38:31Z

OK done! I can go ahead and merge this, since you already approved. (Which brings to mind the separate issue of whether approvals should be reset when the branch is rebased; I feel like it's more "technically correct" to do so, but it really doesn't matter in this context.)

diazona added the setuptools-fields label (Fields in the pyproject data structure that this project needs to support) Jul 7, 2024

diazona added this to the v0.3 milestone Jul 7, 2024

diazona mentioned this pull request Jul 15, 2024

Support passing command line arguments to setup.py from test code #157

Merged

diazona force-pushed the long-description/1/dev branch from bfc8911 to 97c9742 July 17, 2024 11:10

diazona marked this pull request as ready for review July 17, 2024 11:11

diazona requested a review from sjlongland July 17, 2024 11:13

sjlongland reviewed Jul 17, 2024

src/setuptools_pyproject_migration/__init__.py

sjlongland reviewed Jul 17, 2024

src/setuptools_pyproject_migration/_long_description.py

sjlongland approved these changes Jul 17, 2024

diazona mentioned this pull request Jul 17, 2024

Add a warning for nonstandard content types #159

Open

diazona added 8 commits July 17, 2024 23:35

Add tests for long description utilities

6c7aa6f

Allow accessing the Distribution object for a Project

7a05195

Add a changelog entry for new long description handling

46a835c

diazona force-pushed the long-description/1/dev branch from 97c9742 to 46a835c July 18, 2024 06:35

diazona merged commit f5a110f into main Jul 18, 2024
11 checks passed

diazona deleted the long-description/1/dev branch July 18, 2024 06:38

diazona linked an issue Jul 19, 2024 that may be closed by this pull request

KeyError: 'long_description' #132

Closed

diazona mentioned this pull request Jul 19, 2024

Apply xfail markers to individual test methods in external project tests #160

Merged
