Enable comparison of C-like enums by identity #3061

Open
wants to merge 9 commits into
base: main

Conversation

ricohageman
Contributor

@ricohageman ricohageman commented Mar 24, 2023

Fix #3059 by changing the IntoPy implementation of enums.

@ricohageman ricohageman force-pushed the main branch 5 times, most recently from 634c01d to 4135929 Compare March 25, 2023 22:35
@ricohageman ricohageman force-pushed the main branch 5 times, most recently from bd3524a to e3eb10c Compare March 25, 2023 23:57
@ricohageman ricohageman force-pushed the main branch 2 times, most recently from f2a89b1 to 286ee1e Compare March 26, 2023 00:17
@ricohageman ricohageman marked this pull request as ready for review March 26, 2023 00:24
@ricohageman ricohageman force-pushed the main branch 2 times, most recently from 60ed4b9 to 4e85f63 Compare March 27, 2023 09:30
Member

@davidhewitt davidhewitt left a comment

Thanks for putting together this PR! I think there are a couple of things we need to think about. That said, I'm very glad to see improvements in this area, and as long as we have an idea of how we want to move forward in future, we don't need to solve everything right away 😄

quote! {
    impl _pyo3::IntoPy<_pyo3::PyObject> for #cls {
        fn into_py(self, py: _pyo3::Python) -> _pyo3::PyObject {
            static SINGLETON: _pyo3::once_cell::GILOnceCell<_pyo3::Py<#cls>> = _pyo3::once_cell::GILOnceCell::new();
Member

Use of GILOnceCell seems reasonable. At the same time, if we ever want to have a stab at #2274 (which is admittedly a long way off), it would be better not to introduce new statics containing Python objects.

I think an alternative implementation could look up variants as attributes on the enum's Python class object. However, there's potentially a chicken-and-egg problem in how we create those attributes, as I think that currently uses IntoPy.

Member

Given that we're still no closer to solving the statics problem, and that solving it will require new patterns which will probably apply to all statics equally, let's just stick with a static GILOnceCell for now and leave migrating away from it for the future 😆
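
For readers following this thread, the caching pattern under discussion can be sketched without PyO3 at all. The block below is only an analogy, not the PR's code: it assumes `std::sync::OnceLock` as a stand-in for `GILOnceCell` and `Arc<Color>` as a stand-in for `Py<#cls>`. `Color` and `shared` are hypothetical names introduced for illustration.

```rust
use std::sync::{Arc, OnceLock};

/// Hypothetical C-like enum standing in for a `#[pyclass]` enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Color {
    Red,
    Green,
}

impl Color {
    /// Returns the one shared heap object for this variant,
    /// creating it on first use (assumed analogue of a cached `into_py`).
    pub fn shared(self) -> Arc<Color> {
        // One cell per variant; `OnceLock` plays the role of `GILOnceCell`.
        static RED: OnceLock<Arc<Color>> = OnceLock::new();
        static GREEN: OnceLock<Arc<Color>> = OnceLock::new();
        let cell = match self {
            Color::Red => &RED,
            Color::Green => &GREEN,
        };
        // Every call after the first clones the same handle.
        Arc::clone(cell.get_or_init(|| Arc::new(self)))
    }
}
```

Calling `Color::Red.shared()` twice yields handles for which `Arc::ptr_eq` is true, which is roughly the property that comparison by identity needs on the Python side.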

@@ -72,10 +72,59 @@ fn test_enum_eq_incomparable() {
})
}

#[test]
fn test_enum_compare_by_identity() {
Member

So I noticed that a test above, test_enum_class_attr, contains the following line:

Py::new(py, MyEnum::Variant).unwrap();

This is potentially a problem: it creates a new instance of the MyEnum type, so an identity comparison against it will fail.

Now the question is whether that is a good thing. One advantage is that it gives users who need it an escape hatch. However, it is probably also a footgun.

Unfortunately, changing how Py::new works would require a refactoring of our initialization machinery. I'm sure there's plenty of scope for improvement in that area (e.g. #2384), however it would likely be a much bigger patch than this PR...
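
To make the concern concrete, here is a minimal sketch in plain Rust, assuming `Arc` as a stand-in for `Py` and `Arc::ptr_eq` as a stand-in for Python's `is`. The names `is_same_object` and `demo`, and the second variant `Other`, are hypothetical; this is not PyO3 code.

```rust
use std::sync::Arc;

/// Hypothetical enum; `Other` exists only so the type is not zero-sized.
#[derive(Debug, PartialEq, Eq)]
pub enum MyEnum {
    Variant,
    Other,
}

/// Identity comparison: true only when both handles point at the same allocation.
pub fn is_same_object(a: &Arc<MyEnum>, b: &Arc<MyEnum>) -> bool {
    Arc::ptr_eq(a, b)
}

/// Returns (alias is identical, fresh object is identical, values are equal).
pub fn demo() -> (bool, bool, bool) {
    let cached = Arc::new(MyEnum::Variant); // what a cached conversion would hand out
    let alias = Arc::clone(&cached); // a second handle to the same object
    let fresh = Arc::new(MyEnum::Variant); // assumed analogue of `Py::new(py, MyEnum::Variant)`
    (
        is_same_object(&cached, &alias),
        is_same_object(&cached, &fresh),
        *cached == *fresh,
    )
}
```

The freshly allocated object is equal by value but not by identity, which is the escape hatch (or footgun) described above.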

Contributor Author

I agree that this is a footgun which we should prevent. Maybe we should put this on hold for a bit until the scope of the changes to Py::new is defined?

Member

@davidhewitt davidhewitt Apr 1, 2023

By "put this on hold", do you mean solving this particular problem or the whole PR?

Contributor Author

@ricohageman ricohageman Apr 2, 2023

I meant the whole pull request, but I don't know what the best approach would be. Comparing enums by identity seems to be a desired feature, but there are also two concerns, #3061 (comment) and #3061 (comment), that would both require quite a refactor. I'm happy to help in that regard, but I don't know much about the existing problems or where to start. It seems this might not be the right time to make this change, as I can imagine you don't want to complicate that work further by including some changes now?

Member

So we've just had #3287 merged, which might be enough of a tweak to make this possible. Given that Py::new relies on Into<PyClassInitializer<T>>, we might be able to adjust that trait implementation for enums to somehow use PyClassInitializerImpl::Existing (by fetching the value from a GILOnceCell, which is itself initialized by a PyClassInitializerImpl::New invocation of Py::new).

quote! {
    impl _pyo3::IntoPy<_pyo3::PyObject> for #cls {
        fn into_py(self, py: _pyo3::Python) -> _pyo3::PyObject {
            static SINGLETON_PER_VARIANT: _pyo3::once_cell::GILOnceCell<::std::collections::HashMap<usize, _pyo3::Py<#cls>>> = _pyo3::once_cell::GILOnceCell::new();
Contributor

Is there a reason to use a HashMap here instead of a fixed-size array?

Member

@adamreichold adamreichold Apr 23, 2023

I think #cls::#variant_name as usize could be sparse, e.g.

enum Foo {
  Bar = 1,
  Baz = 999999,
}

So while we could store those as sorted pairs in a [(usize, Py<#cls>); 2], we would need a binary search over the variants instead of a hash table look-up, since direct indexing into a [(usize, Option<Py<#cls>>); 1000000] seems wasteful.

Personally, I see the simplicity of using the hash table but would probably opt for the sorted array in a follow-up. (But let's see whether this lands using the statics at all.)
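
As a rough illustration of the sorted-array idea, here is a sketch in plain Rust. It assumes a `&'static str` as a placeholder for the stored `Py<#cls>`; `SLOTS` and `lookup` are hypothetical names, and nothing here is generated by the PR.

```rust
/// Sparse enum mirroring the `Foo` example from the comment above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Foo {
    Bar = 1,
    Baz = 999_999,
}

/// (discriminant, object) pairs, sorted by discriminant.
/// The string is only a placeholder for a cached `Py<#cls>`.
pub static SLOTS: [(usize, &str); 2] = [(1, "Bar object"), (999_999, "Baz object")];

/// Finds the stored object for a variant via binary search on its discriminant.
pub fn lookup(value: Foo) -> Option<&'static str> {
    let key = value as usize;
    SLOTS
        .binary_search_by_key(&key, |&(discriminant, _)| discriminant)
        .ok()
        .map(|index| SLOTS[index].1)
}
```

The table stays the size of the variant count regardless of how far apart the discriminants are, at the cost of a logarithmic search per lookup.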

Contributor Author

The discriminant can indeed be sparse, which is also tested by test_custom_discriminant_comparison_by_identity. But using binary search in a sorted array is definitely a possible replacement for the hashmap.

Member

I think we can know from parsing the enum whether it's dense or sparse?

For the dense case we can just index into an array, no binary search needed, and for the sparse case we can generate a match in front to convert into a dense set of indices first.

Member

> and for the sparse case we can generate a match in front to convert into a dense set of indices first.

This would be really nice, as the compiler is AFAIK pretty clever about how it handles match, e.g. producing linear or binary searches depending on cardinality estimates.
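
A minimal sketch of that match-based mapping, in plain Rust, might look as follows. It assumes a macro would emit something like `dense_index` per enum; the names `dense_index`, `SLOTS` and `lookup_dense` are hypothetical, and the `&'static str` entries are placeholders for cached `Py<#cls>` values.

```rust
/// Sparse enum reused from the earlier example in this thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Foo {
    Bar = 1,
    Baz = 999_999,
}

/// What a macro could plausibly generate: each variant maps to its
/// declaration position, giving a dense 0..N index space.
pub const fn dense_index(value: Foo) -> usize {
    match value {
        Foo::Bar => 0,
        Foo::Baz => 1,
    }
}

/// One slot per variant, indexed by `dense_index`.
pub static SLOTS: [&str; 2] = ["Bar object", "Baz object"];

/// Direct array access; no hashing and no explicit search in user code.
pub fn lookup_dense(value: Foo) -> &'static str {
    SLOTS[dense_index(value)]
}
```

For an enum whose discriminants are already dense, the `match` could be skipped and the discriminant used as the index directly, as suggested earlier in the thread.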

@alex alex mentioned this pull request Jun 20, 2023
Development

Successfully merging this pull request may close these issues.

Comparing enums by identity
4 participants