Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more robust handling of long descriptions #156

Merged
merged 9 commits into from
Jul 18, 2024
Merged

Conversation

diazona
Copy link
Owner

@diazona diazona commented Jul 7, 2024

This PR makes some major improvements in how we handle the long_description field from setuptools:

  • Now if the long description is given as an inline string in setup.cfg or setup.py (as far as we can tell) rather than being loaded from a file, it will be kept as an inline string in pyproject.toml.
  • If there's no content type provided and one cannot be definitively determined from the file extension, the user can provide it with a new option --readme-content-type; otherwise the program will guess text/plain.
  • It fixes the error where the long description would sometimes be missing from the command options, closing KeyError: 'long_description' #132.
  • The code is refactored to be more modular and more testable.

Thanks to contributors on a Mastodon thread for some useful discussion.

@diazona diazona added the setuptools-fields Fields in the pyproject data structure that this project needs to support label Jul 7, 2024
@diazona diazona added this to the v0.3 milestone Jul 7, 2024
This commit moves type declarations into a new module, _types.py, which
has two benefits:

- It helps code organization and readability
- It lets us use the types from different modules, when additional
  modules are added in the future, without having to deal with circular
  imports.
@diazona diazona marked this pull request as ready for review July 17, 2024 11:11
@diazona
Copy link
Owner Author

diazona commented Jul 17, 2024

@sjlongland I'm going to want to come back and give this one more review with fresh eyes before actually merging it, but I think it should be ready. Figured there was no need to wait to ping you for review.

@diazona diazona requested a review from sjlongland July 17, 2024 11:13
@sjlongland
Copy link
Collaborator

@diazona No problems, well I can review and hold off on actual merging if you think there might be last-minute changes. :-)

Copy link
Collaborator

@sjlongland sjlongland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I'll hold off on the merge until you're happy with it all as well. :-)

@diazona
Copy link
Owner Author

diazona commented Jul 17, 2024

Ah you know what, it looks like I goofed up the commit organization (checked with git rebase --exec 'pre-commit run --all-files' main). I'll fix that (when I get time) and re-push - the overall diff will be the same, I just have to move a few patch fragments from one commit to another.

In upcoming commits I'm going to be changing the code that handles
the long description/readme fields in a pretty significant way, and that
code is going to get rather complex, so I decided to split it out into
a separate module to aid readability. I also converted it from a simple
block to a class and some helper functions. The class stores the text,
content type, and filesystem path of the readme element, and it provides
some validation of those values before they get serialized into the data
structure that will become pyproject.toml.

This commit doesn't include any change in functionality, it's just
refactoring the code so that future functional changes will be easier to
understand.
It turns out pyproject.toml does support putting the content of
the readme attribute directly inline, so this commit changes the logic
to do that when no filename can be determined. Since there are many
cases where setup.py reads the content of a file and gives that content
to the long_description argument of setup(), the code will look for
a file containing the readme text and use that if found.
The command_options attribute of a Distribution contains the CLI options
passed to that command, but it also contains project metadata specified
in setup.cfg, under the "metadata" key. So a project might have metadata
in this value that doesn't have the long_description key, if
the long_description argument was passed to setup.py instead.
Accordingly, we need to check both whether the "metadata" key exists,
and whether the "long_description" key exists within it, to avoid
getting an error. This commit implements that change.
pyproject.toml supports giving readme information in any of three forms:

- A filename with a content type
- A filename without a content type, but it's expected that the type
  will be inferred from the file's extension
- A raw string with a content type

setuptools, however, also allows giving the long description as a raw
string with no associated content type. In this case it's ambiguous what
the corresponding data structure in pyproject.toml should be.

This commit adds an option for the user to specify the content type
explicitly and resolve the ambiguity. I chose this approach based in
part on a Mastodon poll:
  https://techhub.social/@diazona/112736102428118730
The leading option in the poll was to make the user give an explicit
content type, but on the other hand most of the people who replied
suggested that it'd be better to guess, falling back to text/plain if no
other content type should be determined. We could certainly take either
approach, but ultimately I was more swayed by the responses. Plus, the
program already tries to guess a readme file (although admittedly that's
a more educated guess than for the content type), so we kind of have
a precedent for making guesses to fill in unknown values rather than
failing and forcing the user to provide them explicitly.
For some tests it will be handy to be able to create a custom project
and access the Distribution object generated by running setup.py in
that project directory. This commit splits out that functionality from
Project.generate().
This commit changes the readme field tests to allow for writing out
a readme dict with a `text` element, if the readme content appears not
to come from a file, and to exercise the various ways of setting
a content type (from setup.cfg or from the command line).
@diazona
Copy link
Owner Author

diazona commented Jul 18, 2024

OK done! I can go ahead and merge this, since you already approved. (Which brings to mind the separate issue of whether approvals should be reset when the branch is rebased; I feel like it's more "technically correct" to do so, but it really doesn't matter in this context.)

@diazona diazona merged commit f5a110f into main Jul 18, 2024
11 checks passed
@diazona diazona deleted the long-description/1/dev branch July 18, 2024 06:38
@diazona diazona linked an issue Jul 19, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
setuptools-fields Fields in the pyproject data structure that this project needs to support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

KeyError: 'long_description'
2 participants