Enable comparison of C-like enums by identity #3061
Thanks for putting together this PR! I think there's a couple things we need to think about, however I'm very glad for improvements in this area and so long as we have an idea how we want to move forward in future we don't need to solve everything right away 😄
pyo3-macros-backend/src/pyclass.rs
Outdated
quote! {
    impl _pyo3::IntoPy<_pyo3::PyObject> for #cls {
        fn into_py(self, py: _pyo3::Python) -> _pyo3::PyObject {
            static SINGLETON: _pyo3::once_cell::GILOnceCell<_pyo3::Py<#cls>> = _pyo3::once_cell::GILOnceCell::new();
Use of `GILOnceCell` seems reasonable, however at the same time if we ever want to have a stab at #2274 (which is admittedly a long way off) then it'd be better to not introduce new statics containing Python objects.
I think an alternative implementation could look up variants as attributes from the enum's Python class object, however I think there's potentially a chicken-or-egg problem of how we create those attributes (as I think that currently uses `IntoPy`).
I think given that we're still no closer to solving the statics problem, and it'll require coming up with new patterns which will probably apply to all statics equally, let's just stick with a static `GILOnceCell` for now and leave migration off that for the future 😆
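To make the static-cell idea concrete, here is a minimal std-only sketch of the pattern under discussion. It is not the PR's generated code: `std::sync::OnceLock` stands in for `GILOnceCell`, `Arc<MyEnum>` stands in for `Py<#cls>`, and `shared_handle` is a hypothetical name playing the role of `into_py`.

```rust
use std::sync::{Arc, OnceLock};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    Variant,
    OtherVariant,
}

/// Hypothetical stand-in for the generated `into_py`.
/// Assumption: `Arc` models a Python object handle and `OnceLock` models `GILOnceCell`.
fn shared_handle(value: MyEnum) -> Arc<MyEnum> {
    // One lazily initialized table holding a single shared allocation per variant.
    static SINGLETONS: OnceLock<[Arc<MyEnum>; 2]> = OnceLock::new();
    let table = SINGLETONS.get_or_init(|| {
        // Order matches the default discriminants 0 and 1.
        [Arc::new(MyEnum::Variant), Arc::new(MyEnum::OtherVariant)]
    });
    // Cloning the `Arc` hands out another reference to the same allocation.
    Arc::clone(&table[value as usize])
}
```

Every call for the same variant returns a handle to the same allocation, which is the property an identity comparison (`is` on the Python side) relies on.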
@@ -72,10 +72,59 @@ fn test_enum_eq_incomparable() {
    })
}

#[test]
fn test_enum_compare_by_identity() {
So I noticed a test above, `test_enum_class_attr`, with the following line:
Py::new(py, MyEnum::Variant).unwrap();
This is potentially a problem; that is a new instance of the `MyEnum` type and so it won't compare successfully by identity.
Now, the question is, is that a good thing? A pro could be that it gives users who need it an escape hatch. However, it is probably also a footgun.
Unfortunately, changing how `Py::new` works would require a refactoring of our initialization machinery. I'm sure there's plenty of scope for improvement in that area (e.g. #2384), however it would likely be a much bigger patch than this PR...
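The footgun can be illustrated without PyO3 at all. In the sketch below, `Arc` is only an analogy for a Python object handle (an assumption, not the real API), and `compare` is a hypothetical helper: a freshly allocated value is equal to the cached one by value but is a different object by identity.

```rust
use std::sync::Arc;

#[derive(Debug, PartialEq, Eq)]
enum MyEnum {
    Variant,
    OtherVariant,
}

/// Returns `(equal by value, same object by identity)` for two handles.
/// Assumption: `Arc::ptr_eq` plays the role of Python's `is`.
fn compare(a: &Arc<MyEnum>, b: &Arc<MyEnum>) -> (bool, bool) {
    (**a == **b, Arc::ptr_eq(a, b))
}
```

With `cached = Arc::new(MyEnum::Variant)`, a clone of `cached` compares as `(true, true)`, while a second `Arc::new(MyEnum::Variant)` compares as `(true, false)`, which mirrors what a fresh `Py::new` instance would do against the cached singleton.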
I agree that this possibility is a footgun which we should prevent. Maybe we should put this a bit on hold until the scope for changes to `Py::new` is defined?
By "this on hold" do you mean solving this particular problem or the whole PR?
I meant the whole pull request, but I don't know what the best approach would be. It seems that comparing enums by identity is a desired feature, but there are also two concerns, #3061 (comment) and #3061 (comment), that would both require quite a refactor. I'm also happy to help in that regard, but I don't know a lot about the existing problems and where to start. It seems that it might not be the right time to make this change, as I can imagine you don't want to complicate that work further by including some changes now?
So we've just had #3287 merged which might be enough of a tweak to make this possible. Given that `Py::new` relies on `Into<PyClassInitializer<T>>`, we might be able to adjust that trait implementation for enums to somehow use `PyClassInitializerImpl::Existing` (by fetching the value from a `GILOnceCell` which is itself initialized by a `PyClassInitializerImpl::New` invocation to `Py::new`).
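A toy model of that suggestion, using only the standard library, might look like the following. None of these types are PyO3's: `Initializer`, `CACHE`, and `new_handle` are hypothetical names, `Arc` stands in for `Py<T>`, and `OnceLock` stands in for `GILOnceCell`. The point is only the shape of the idea: the conversion into the initializer consults a cache, fills it once through the "new" path, and afterwards always yields the "existing" path.

```rust
use std::sync::{Arc, OnceLock};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyEnum {
    A,
    B,
}

/// Hypothetical model of an initializer that either allocates or reuses.
enum Initializer {
    New(MyEnum),
    Existing(Arc<MyEnum>),
}

impl Initializer {
    /// Turns the initializer into a handle, allocating only for `New`.
    fn create(self) -> Arc<MyEnum> {
        match self {
            Initializer::New(value) => Arc::new(value),
            Initializer::Existing(handle) => handle,
        }
    }
}

// One cell per variant, indexed by the (dense) discriminant.
static CACHE: [OnceLock<Arc<MyEnum>>; 2] = [OnceLock::new(), OnceLock::new()];

impl From<MyEnum> for Initializer {
    fn from(value: MyEnum) -> Self {
        // The first conversion goes through `New`; later ones reuse the cached handle.
        let handle = CACHE[value as usize].get_or_init(|| Initializer::New(value).create());
        Initializer::Existing(Arc::clone(handle))
    }
}

/// Hypothetical analogue of `Py::new`, generic over anything convertible to an initializer.
fn new_handle(value: impl Into<Initializer>) -> Arc<MyEnum> {
    value.into().create()
}
```

In this model, `new_handle(MyEnum::A)` called twice returns two handles to the same allocation, so the escape hatch described earlier in the thread disappears. Whether the real initialization machinery permits this is exactly the open question in the comment above.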
pyo3-macros-backend/src/pyclass.rs
Outdated
quote! {
    impl _pyo3::IntoPy<_pyo3::PyObject> for #cls {
        fn into_py(self, py: _pyo3::Python) -> _pyo3::PyObject {
            static SINGLETON_PER_VARIANT: _pyo3::once_cell::GILOnceCell<::std::collections::HashMap<usize, _pyo3::Py<#cls>>> = _pyo3::once_cell::GILOnceCell::new();
Is there a reason to use a `HashMap` here instead of a fixed-size array?
I think `#cls::#variant_name as usize` could be sparse, e.g.
enum Foo {
    Bar = 1,
    Baz = 999999,
}
so while we could store those as sorted pairs in a `[(usize, Py<#cls>); 2]`, we would need to do a binary search for the variants instead of a hash table look-up, as direct access into a `[(usize, Option<Py<#cls>>); 1000000]` seems wasteful.
Personally, I see the simplicity of using the hash table but would probably opt for the sorted array in a follow-up. (But let's see whether this lands using the `static`s at all.)
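As a rough sketch of the sorted-pairs alternative for the `Foo` example: the table below stores `(discriminant, payload)` pairs sorted by discriminant and finds a variant with `binary_search_by_key`. The payload is a `&'static str` purely for illustration (in the real macro output it would be something like `Py<#cls>`), and `SORTED` and `lookup` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Foo {
    Bar = 1,
    Baz = 999999,
}

// Pairs sorted by discriminant; the string payload is a placeholder for a cached object.
static SORTED: [(usize, &str); 2] = [(1, "Bar"), (999999, "Baz")];

/// Looks up the payload for a variant in O(log n) without a 1,000,000-slot array.
fn lookup(value: Foo) -> Option<&'static str> {
    let key = value as usize;
    SORTED
        .binary_search_by_key(&key, |&(discriminant, _)| discriminant)
        .ok()
        .map(|index| SORTED[index].1)
}
```

The table has exactly one entry per variant regardless of how far apart the discriminants are.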
The discriminant can indeed be sparse, which is also tested by `test_custom_discriminant_comparison_by_identity`. But using binary search in a sorted array is definitely a possible replacement for the hashmap.
I think we can know from parsing the enum whether it's dense or sparse?
For the dense case we can just index into an array, no binary search needed, and for the sparse case we can generate a `match` in front to convert into a dense set of indices first.
> and for the sparse case we can generate a match in front to convert into a dense set of indices first.
This would really be nice as the compiler is pretty clever in how it handles `match`, AFAIK, e.g. producing linear or binary searches depending on cardinality estimates.
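For the sparse `Foo` example, the generated `match` could look roughly like this. It is a hand-written sketch of what a macro might emit, with hypothetical names (`dense_index`, `NAMES`, `slot`) and a string payload standing in for the cached Python object.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Foo {
    Bar = 1,
    Baz = 999999,
}

/// Maps each variant to its position in declaration order, ignoring the discriminant value.
const fn dense_index(value: Foo) -> usize {
    match value {
        Foo::Bar => 0,
        Foo::Baz => 1,
    }
}

// Fixed-size table with one slot per variant; the payload is a placeholder.
static NAMES: [&str; 2] = ["Bar", "Baz"];

/// Direct array access after the `match`, with no hashing or explicit search.
fn slot(value: Foo) -> &'static str {
    NAMES[dense_index(value)]
}
```

Because the `match` is exhaustive over the enum, adding a variant without extending the mapping is a compile error, and the choice of lookup strategy is left to the compiler.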
Fix #3059 by changing the `IntoPy` implementation of enums.