Skip to content

Allow case sensitive enum values #725

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

expobrain
Copy link
Collaborator

@expobrain expobrain commented Feb 20, 2023

Added setting case_sensitive_enums to allow case sensitive enum values in the generated code.

This solve the issue in #587 .

Also, to avoid collisions, changes the prefix for values not starting with alphanumeric characters from VALUE_ to LITERAL_.

How could it happen

Consider the case [1, "1bc"], the previous code would assign the enum VALUE_1 to the first item (aka the VALUE_ prefix plus the int value of the item) and the same enum VALUE_1 for the second item (the prefix VALUE_ plus the position of the non-alphanumeric value in the list).

The new code avoids that and use different prefixes for integer values (VALUE_{value}) and string starting with non-alphanumeric characters (LITERAL_{sanitised_value}).

@dbanty dbanty added ✨ enhancement New feature or improvement 🥚breaking This change breaks compatibility labels Feb 21, 2023
@codecov
Copy link

codecov bot commented Feb 21, 2023

Codecov Report

Merging #725 (656e9af) into main (47cfdec) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #725   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           49        49           
  Lines         1971      1976    +5     
=========================================
+ Hits          1971      1976    +5     
Impacted Files Coverage Δ
openapi_python_client/config.py 100.00% <100.00%> (ø)
...penapi_python_client/parser/properties/__init__.py 100.00% <100.00%> (ø)
...i_python_client/parser/properties/enum_property.py 100.00% <100.00%> (ø)
openapi_python_client/utils.py 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@dbanty dbanty added this to the 0.14.0 milestone Apr 13, 2023
@@ -66,10 +66,19 @@ def fix_reserved_words(value: str) -> str:
return value


def case_insensitive_snake_case(value: str) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be called case_sensitive_snake_case instead?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, fixed

@eli-bl
Copy link
Collaborator

eli-bl commented Aug 21, 2024

I had submitted #1095 before seeing that this fixes the same issue. I've closed mine, but it did include one other change that I think might be good to add to this PR: skipping duplicate values.

It is totally valid in OpenAPI (though pointless) to have a list of enum values like ["A", "B", "B", "C"]; since that is just saying "the value must be A or B or B or C", it would make sense for the generator to treat it exactly the same as ["A", "B", "C"] (by using a set to keep track of what values have been seen), but currently it throws an error instead.

@eli-bl
Copy link
Collaborator

eli-bl commented Aug 21, 2024

Also, if I understand correctly, the only reason this PR is flagged as "breaking" is the change to the prefix logic (VALUE_, LITERAL_). I agree with the rationale behind that change, but to avoid unnecessarily breaking people's code that is relying on the current behavior, maybe it would be best to make that part an opt-in via a config option?

@dbanty dbanty removed this from the 0.14.0 milestone Aug 25, 2024
@expobrain expobrain force-pushed the case_sensitive_enum_values branch 6 times, most recently from 63859f4 to bd77766 Compare August 28, 2024 22:02
@expobrain expobrain force-pushed the case_sensitive_enum_values branch from bd77766 to 5560f54 Compare August 28, 2024 22:03
@expobrain
Copy link
Collaborator Author

@eli-bl about your comments above:

@expobrain expobrain requested a review from eli-bl August 29, 2024 07:18
@eli-bl
Copy link
Collaborator

eli-bl commented Aug 29, 2024

@expobrain:

about having a config to handle the new enum behaviour and avoid breaking changes, I've already added this in the ConfigFile here

I see the new config option, but it looks to me like you're only using it to control the new case-sensitivity behavior.

The other new behavior you added was using LITERAL_ instead of VALUE_ for string values that start with a non-alphanumeric character (here). I understand the rationale for that, but I think it is a breaking change— since the PR is applying that new logic always, not just if the config option is set.

(Btw, you've added me as a reviewer but I'm not a maintainer of the repo, just someone else who has submitted PRs; I commented on this one because I had started working on the same thing before I saw yours)

@eli-bl
Copy link
Collaborator

eli-bl commented Aug 29, 2024

And... at the risk of being nitpicky and overthinking things, I do have some other thoughts about the VALUE_ vs. LITERAL_ thing.

First, I do think that it should be a config option because it's a breaking change... but it really doesn't have anything to do with case-sensitivity. It's just that you happened to implement it in the same PR. If I were someone trying to use the generator and having this [1, "1bc"] type of problem, I probably wouldn't think of setting config.case_sensitive_enums.

Second, although your decision to use VALUE_{index} for numeric values vs. LITERAL_{value} for string values has the advantage of preserving the existing behavior for some values, I'm not sure that that is an advantage, because the current existing behavior of VALUE_{index} is (in my opinion) kind of counter-intuitive for numeric values. That is, if I have an enum whose values are 1, 5, and 6, it's not very useful to have the symbols be VALUE_0, VALUE_1, and VALUE_2 (especially since if I later changed the list to [0, 1, 5, 6], all of those constants would suddenly refer to different values). Having the actual value be part of the name, as you're doing for strings, would be both more intuitive/useful, and more consistent with how it works for all other values.

I don't feel super strongly about this, and I'm guessing that "string property whose allowable values all start with digits" is not a very common pattern anyway, but if you think I have a point, one way you could address it would be something like:

  • For numeric values, use VALUE_{value} (not VALUE_{index}).
  • For string values that don't start with an alpha character, use VALUE_STRING_{value}.
  • Both of these new behaviors are gated by a single config option that is not the case-sensitivity option. I can't think of a great name, but the concept it would be trying to convey is "enum constant names always contain the value, even if the value isn't a valid name by itself".

So, this would be a bigger breaking change in the sense that my [1, 5, 6] example would now produce VALUE_1, VALUE_5, and VALUE_6, but I think that is arguably desirable, and using VALUE_STRING_ instead of LITERAL_ makes it a bit clearer why there would be two different prefixes (numeric values are literals too).

Of course none of these approaches can really avoid all collisions, since someone could have actual enum values that are equal to VALUE_1 or VALUE_1bc or LITERAL_1bc. The only way to be absolutely sure of being able to work with all possible values would be to use rigorous string escaping plus some kind of type prefixes, but I don't think anyone would want that, since it would make the symbols more awkward for pretty much everyone. So it's a compromise either way.

ps. There's also a way that you could avoid having to have a config option for the case-sensitivity part: turn on the case-sensitive enum behavior only if you detect that there would be conflicts otherwise. That is, take a pass through the list and see if there are any pairs like "A"/"a"; if not, use the old behavior. That way something like ["a", "b"] would still produce A and B like before, so it's not a breaking change. And something like ["a", "b", "A"] would now produce a and b and A, but that wouldn't be a breaking change because prior to the change, the generator would've simply been unusable for that spec.

@knope-bot knope-bot bot mentioned this pull request Oct 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Oct 20, 2024
> [!IMPORTANT]
> Merging this pull request will create this release

## Features

- update Ruff to >=0.2,<0.8 (#1137)
- Add UUID string format. Thanks @estyrke! (#1140)
- Support OpenAPI 3.1 prefixItems property for arrays. Thanks @estyrke!
(#1141)

### Add `literal_enums` config setting

Instead of the default `Enum` classes for enums, you can now generate
`Literal` sets wherever `enum` appears in the OpenAPI spec by setting
`literal_enums: true` in your config file.

```yaml
literal_enums: true
```

Thanks to @emosenkis for PR #1114 closes #587, #725, #1076, and probably
many more.
Thanks also to @eli-bl, @expobrain, @theorm, @chrisguillory, and anyone
else who helped getting to this design!

## Fixes

- Typo in docstring (#1128)

### Use literal value instead of `HTTPStatus` enum when checking
response statuses

Python 3.13 renamed some of the `HTTPStatus` enum members, which means
clients generated with Python 3.13 may not work
with older versions of Python. This change stops using the `HTTPStatus`
enum directly when checking response statuses.

Statuses will still be checked for validity at generation time, and
transformed into `HTTPStatus` _after_ being checked
at runtime.

This may cause some linters to complain.

Co-authored-by: knope-bot[bot] <152252888+knope-bot[bot]@users.noreply.github.com>
@dbanty
Copy link
Collaborator

dbanty commented Oct 20, 2024

Superceded by #1114, which I think will add enough flexibility for anyone who needs it. Thanks for this early version!

@dbanty dbanty closed this Oct 20, 2024
@eli-bl
Copy link
Collaborator

eli-bl commented Oct 21, 2024

Superceded by #1114, which I think will add enough flexibility for anyone who needs it

How to judge "enough flexibility" is subjective of course, but for what it's worth, here are two counter-arguments for why #1114 could still be desirable.

  1. The enum pattern is idiomatic. If there's a parameter with several possible values, and the API client provides a type name for that, then being able to write something like Sort.ASCENDING makes it clear what's going on (and makes it very easy to click through to see how that type is defined).
  2. Since the "literal enums" option is global, there is a potentially big migration problem for people maintaining existing APIs. This definitely applies in my own use case: this API has a lot of enum types, and there are a lot of developers out there using a generated client for it, so they have a lot of references to things like Sort.ASCENDING. I now have the task of adding one enum type that happens to have the conflicting-names problem (due to values like ["A", "a"]). If my only option is to turn on "literal enums", then all of that existing code will break.

@dbanty
Copy link
Collaborator

dbanty commented Oct 22, 2024

If my only option is to turn on "literal enums", then all of that existing code will break.

Totally fair, and ideally we could enhance this option to be applicable to individual enums, rather than only globally. Maybe a match on $ref would be enough?

The enum pattern is idiomatic

Unfortunately it's not possible to render every OpenAPI document into idiomatic Python. For example, enum variants should, idiomatically, be screaming snake case. But obviously that auto-conversion doesn't always work.

It's possible we could allow one-off variant name overrides in another separate config so that folks can produce idiomatic Python values. Ideally, they'd also be able to use this to explain to consumers why "A" is different than "a".

@eli-bl
Copy link
Collaborator

eli-bl commented Oct 22, 2024

Unfortunately it's not possible to render every OpenAPI document into idiomatic Python. For example, enum variants should, idiomatically, be screaming snake case. But obviously that auto-conversion doesn't always work.

Sure, I get that. OpenAPI doesn't align smoothly with Python or lots of other languages, and the generator can't be expected to cover every use case in a way that'll please everyone. And the literal enum option is useful.

I should've made it clearer that what I'm arguing for is just a little more granularity in this feature. That was brought up early on, but then your latest comment suggested that just being able to turn it on or off globally was enough flexibility... so what I was getting at was that a lot of people have gotten used to using the formal enum class pattern, and I would find it hard to explain to them why they have to stop using it for 99% of my API because there is 1% that it wouldn't be a good fit for.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🥚breaking This change breaks compatibility ✨ enhancement New feature or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants