Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add #[classattr] methods to define Python class attributes #905

Merged
merged 4 commits into from
May 8, 2020

Conversation

scalexm
Copy link
Contributor

@scalexm scalexm commented May 5, 2020

Here is my approach to #662: we add an attribute on methods defined in #[pymethods] impl blocks to allow defining class attributes (also named class variables).

A class attribute method cannot take arguments, and must return impl IntoPy<PyObject>.

For example, given the following Python code:

class Foo:
    foo = 5
    BAR = 'bar'

a translation in Rust with pyo3 could be given by:

#[pyclass]
struct Foo { }

#[pymethods]
impl Foo {
    #[classattr]
    fn foo() -> i32 {
        5
    }

    #[classattr]
    #[name = "BAR"]
    fn bar() -> String {
        "bar".to_string()
    }
}

@davidhewitt
Copy link
Member

Thanks for this PR, a way to add class attributes like this would be great. Couple questions (disclaimer, I haven't read the code yet):

Bike-shedding on the name of the attribute (currently #[attribute]) is welcome. Also, I'm not used to the Python C API so it's possible that my code has memory errors.

Can these things ever be settable? Maybe a pair of names like classgetter / classsetter? Though maybe these functions run only once during type object creation?

Note that in the latter case, returning an Err value will crash the Python interpreter

Did you explore mechanisms to prevent returning PyResult from these functions? It would be nice to make it impossible to compile code that's guaranteed to cause runtime crash, if that's viable.

Also, I wonder whether we could use syntax like associated constants instead of functions. But I think they're only stable for traits, not structs?

@programmerjake
Copy link
Contributor

It would be nice if we could also use associated constants (which are stable, i8::MAX (not to be confused with std::i8::MAX) is a good example):

#[pymethods]
impl MyClass {
    #[attribute]
    pub const MY_CONST: &str = "blah";

    #[attribute]
    pub fn my_const2(_py: Python) -> Vec<u8> {
        vec![1, 2, 3]
    }
}

@kngwyu
Copy link
Member

kngwyu commented May 6, 2020

Interesting, thank you @scalexm!

It would be nice if we could also use associated constants

How about using const fn instead of constants?

Copy link
Member

@kngwyu kngwyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, but generally looks good.

pyo3-derive-backend/src/pymethod.rs Outdated Show resolved Hide resolved
src/pyclass.rs Outdated Show resolved Hide resolved
tests/test_class_attributes.rs Outdated Show resolved Hide resolved
src/class/methods.rs Outdated Show resolved Hide resolved
@pganssle
Copy link
Member

pganssle commented May 6, 2020

Haven't dug into the implementation, but it's not clear - can you modify these at runtime? Either way I would add a test for the expected behavior if you try to assign a value to it.

Also, it's not immediately clear: Is the function evaluated when the class is constructed, or every time it is accessed?

@kngwyu
Copy link
Member

kngwyu commented May 6, 2020

Haven't dug into the implementation, but it's not clear - can you modify these at runtime? Either way I would add a test for the expected behavior if you try to assign a value to it.
Also, it's not immediately clear: Is the function evaluated when the class is constructed, or every time it is accessed?

In the current implementation, we just do type(PyClass).__dict__["name"] = var once when the type object is created.

  1. About returning an error
    Is there any use case where we need to return an error from attribute? If not, I think we should restrict the return type to IntoPy<PyObject>.

  2. About naming
    attribute is not so bad, but aren't they called class variable in Python?
    How about #[classvar]?

  3. Can these things ever be settable?

They are settable in Python.
But since in Rust, we need to use Cell/RefCell to share mutable variables and it's not a common pattern. So I don't think we have any strong reason to support setter for it.

  1. It would be nice if we could also use associated constants

Agreed.

@scalexm
Copy link
Contributor Author

scalexm commented May 6, 2020

Thanks for the reviews.

How about #[classvar]

Yes, that’s probably better. Always called that class attributes but indeed the official Python documentation calls them class variables :)

Is there any use case where we need to return an error from attribute

Python does support it, in which case an exception is raised when the module is imported. It’s indeed not clear how useful it is, so I guess we can disable it for now (only accept IntoPy<PyObject>) and think about it later if someone comes up with a use case.

About using associated constants, I can open a follow-up PR.

@davidhewitt
Copy link
Member

It would be nice if we could also use associated constants (which are stable, i8::MAX (not to be confused with std::i8::MAX) is a good example):

Nice, I was so sure they were stable but then when looking at documentation to confirm this last night I got bogged down in the docs for traits xD

They are settable in Python.

If they're settable in Python, but we don't want to allow setting these in Rust, I suggest we make it so that an exception is raised if a user attempts to set these attributes from Python. Otherwise the original Rust value and current Python value may become out of sync?

@pganssle
Copy link
Member

pganssle commented May 6, 2020

For comparison, here's what happens with a Cython cdef class defined like so:

cdef class MyClass:
    attr = 3
>>> from mymodule._backend import MyClass
>>> MyClass.attr
3
>>> MyClass.attr = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't set attributes of built-in/extension type 'mymodule._backend.MyClass'
>>> MyClass().attr
3
>>> MyClass().attr = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'mymodule._backend.MyClass' object attribute 'attr' is read-only

So it seems like they are called "attributes" and that it is reasonable for them to be read-only in both Python and Rust.

I think #[attribute] or #[classattr] or #[classattribute] are all reasonable spellings. I'm not terribly fond of the idea of #[classvar]. Class scoping in Python is super weird and there may be a distinction between "class variable" and "attribute" that I don't understand.

}
}

#[test]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned this in my comment, but adding an in-line note so it doesn't get lost: I think it would be a good idea to add tests for the error cases, to ensure that they raise the correct exception.

Could be done here or it could be done in the examples/rustapi_module tests, which are written in Python and where you can use pytest.raises.

@kngwyu
Copy link
Member

kngwyu commented May 6, 2020

I think #[attribute] or #[classattr] or #[classattribute] are all reasonable spellings. I'm not terribly fond of the idea of #[classvar]. Class scoping in Python is super weird and there may be a distinction between "class variable" and "attribute" that I don't understand.

Thank you for the feedback.

I suggest we make it so that an exception is raised if a user attempts to set these attributes from Python.

I checked the code that Cython produced and what it does is the same as this PR. They set attr to tp_dict.

  if (PyDict_SetItem((PyObject *)__pyx_ptype_4demo_MyClass->tp_dict, __pyx_n_s_attr, __pyx_int_3) < 0) __PYX_ERR(1, 2, __pyx_L1_error)

But I don't understand how it makes the attr read-only 🤔 ...

@programmerjake
Copy link
Contributor

But I don't understand how it makes the attr read-only ...

Because Python enforces that, for classes defined using the C API that PyO3 uses, type objects can't be changed (I'm assuming since they're shared across all interpreters in the program) and tp_dict is considered part of that type object.

@davidhewitt
Copy link
Member

+1 for #[classattr] as my favourite of the names suggested.

Because Python enforces that, for classes defined using the C API that PyO3 uses, type objects can't be changed (I'm assuming since they're shared across all interpreters in the program) and tp_dict is considered part of that type object.

Nice. I guess should make sure there are tests in this PR to verify that trying to modify a classattr raises an exception as described.

@scalexm
Copy link
Contributor Author

scalexm commented May 6, 2020

Current status:

  • applied @kngwyu comments: it's cleaner that way, thanks
  • removed the possibility to return a PyResult: this can always be added later if someone has a use case for it
  • added a test to showcase that class variables from an extension module are indeed immutable

Next steps:

let gil = Python::acquire_gil();
let py = gil.python();
let typeobj = py.get_type::<Foo>();
py_expect_exception!(py, typeobj, "typeobj.foo = 6", TypeError);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

@kngwyu
Copy link
Member

kngwyu commented May 7, 2020

Thank you for the update!

decide on the name: I personally don't have a strong opinion

Me too, so the top candidate is #[classattr]?(+1 by @davidhewitt)

add a paragraph to the guide

Yeah, please place some docs in class.md.

allow associated constants to define class attributes: I'd prefer to open a separate follow-up PR for that one

👍

@programmerjake
Copy link
Contributor

programmerjake commented May 7, 2020

For naming, I'm happy with any of the options other than #[classvar] since the attributes act like constants, not variables.

if self_.is_some() || !arguments.is_empty() {
return Err(syn::Error::new_spanned(
name,
"Class attribute methods cannot take arguments",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be helpful to optionally allow one argument, of type Python, for those who want to create python types as class attributes. See pymethod/impl_wrap_getter as an example of how we allow this for getters.

One caveat I am not sure about though: as we're currently in the middle of creating a type object, is it safe to run arbitrary Python code?

Copy link
Member

@kngwyu kngwyu May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it safe to run arbitrary Python code?

Maybe this code can cause SIGSEGV.
Oh I was wrong. This code works. Still investigating this can cause some odd errors, though.

#[pymethods]
impl MyClass {
    #[classattr]
    fn foo() -> MyClass { ... } 
}

I'll open a small PR to prevent this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kngwyu is your conclusion that this is safe?

kngwyu
kngwyu previously requested changes May 7, 2020
if !methods.is_empty() {
methods.push(ffi::PyMethodDef_INIT);
type_object.tp_methods = Box::into_raw(methods.into_boxed_slice()) as *mut _;
}

// class attributes
if !attrs.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering a recursive case(e.g.,

#[pymethods]
impl MyClass {
    #[classattr]
    fn foo() -> MyClass { ... } 
}

), could you please move this code after PyType_Ready?
Then an incomplete type object is never used.

Copy link
Contributor Author

@scalexm scalexm May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was precisely one of my use cases indeed. But I think it’s still possible to return MyOtherClass where MyOtherClass has not yet been initialized?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think it’s still possible to return MyOtherClass where MyOtherClass has not yet been initialized?

Yes, but I'm not sure there's no corner case 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to modify type_object after PyType_Ready ?

Copy link
Member

@kngwyu kngwyu May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that it cannot be a problem since PyCell::new doesn't use the type object at all.
Thanks @scalexm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll add a few « recursive » test cases to be sure that we don’t break that assumption in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to modify type_object after PyType_Ready ?

no, it's not safe, according to the C API docs

@kngwyu kngwyu dismissed their stale review May 7, 2020 08:53

It cannot be a problem

@scalexm scalexm changed the title Add #[attribute] methods to define Python class attributes Add #[classattr] methods to define Python class attributes May 7, 2020
@kngwyu
Copy link
Member

kngwyu commented May 8, 2020

LGTM, thank you for the update!

@kngwyu kngwyu merged commit 8aeae6c into PyO3:master May 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants