Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyObject<'p> and Send #15

Closed
dgrunwald opened this issue Jun 27, 2015 · 13 comments
Closed

PyObject<'p> and Send #15

dgrunwald opened this issue Jun 27, 2015 · 13 comments
Assignees

Comments

@dgrunwald
Copy link
Owner

When defining a python extension type in rust, it's possible for python code to retain a reference to instances of that type, and later use them on another (python) thread.
Thus, python extension types require that their contents are Send. (and also 'static, but we could potentially allow 'p if we make the 'p lifetime invariant and use a higher-order lifetime when acquiring the GIL)

The problem: PyObject itself is not Send, so it is not possible to create python extension objects that contain references to other python objects :(

So, why is PyObject not Send? The python runtime requires that python objects are only used while the global interpreter lock (GIL) is held. But it's no problem to hold a reference to a python object without having the GIL acquired -- as long as you don't access the python object until re-acquiring the GIL.
Currently, a PyObject<'p> is a *mut ffi::PyObject with two invariants:
a) The PyObject owns one reference to the python object (will call Py_DECREF() in the Drop impl)
b) The GIL is held for at least the lifetime 'p. (i.e. PyObject<'p> contains a Python<'p>)

Python<'p> represents "the current thread holds the GIL for lifetime 'p", and thus is fundamentally not Send. We could attempt to remove the Python<'p> from PyObject<'p>. This would require the user to explicitly pass in the Python<'p> for every operation on the PyObject. But on the other hand, it means we don't need a lifetime on PyObject (and transitively, PyResult etc.), so overall it should be an ergonomics win. It would open up some possibilities for a safe API for temporarily releasing the GIL during a long pure-rust computation (Py_BEGIN_ALLOW_THREADS).

But there's a big issue with this approach: the Drop impl. Calling Py_DECREF() requires holding the GIL, so we wouldn't be able to do that in Drop -- we'd have to provide an explicit fn release(self, Python), and any call to Drop would leak a reference count.
Forgetting to call release() would be a serious footgun.

This becomes less of an issue with static analysis to find missing release() calls: humpty_dumpty might be able to help; and hopefully Rust will get linear types in the future.

So, do we adopt this change? When writing python extension in rust, I'd say yes, it's worth it. But for embedding python in a rust program, trading a few lifetime annotations for a serious footgun is not exactly a good trade-off.

Maybe we could make Drop (re-)acquire the GIL? Locking the GIL recursively is allowed. That would reduce the footgun from a ref leak to a performance problem: instead of a simple decrement (which is getting inlined into the rust code), Drop would involve two calls into python library, both of which read from TLS and then do some cheap arithmetic+branching. Nothing too expensive, so this might be a workable approach (considering you can prevent the performance loss by explicitly calling release()).

@novocaine
Copy link
Contributor

I think the ability to unlock the GIL safely is really important - rust extensions are going to want to do I/O.

Maybe the annoyance of passing a Python<'p> to every PyObject method could be mitigated by a stateful macro that automatically passes it as the first argument. Like JS' with.

There are Python functions for checking if the current thread already holds the GIL which would help not acquire the GIL again unnecessarily (recursively)..?

My one concern is that invisibly locking the GIL in a destructor might lead to unexpected deadlock if the user isn't aware of what's happening, but I think if it is clearly documented that the GIL is held during destruction as well as during method calls it's okay.

@dgrunwald
Copy link
Owner Author

I thought some more about safely unlocking the GIL.

There are two semi-workable approaches:

  1. Require the closure that executes with the GIL unlocked to be Send:
impl <'p> Python <'p> {
    fn unlock_gil(&self, f: F) -> T where F : Send + FnOnce() -> T;
}

Because both Python and PyObject are !Send, the closure cannot call into the python interpreter while the GIL is unlocked.
This does not require removing the lifetime from PyObject; in fact it should work pretty much without any changes to rust-cpython.
In this model, both Python and PyObject contain the right to access the python interpreter, and both being !Send prevents transferring that right to another thread, or to the inside of the closure.

Of course, this approach is unnecessarily restrictive in that it prevents passing other non-Send types like Rc<String> into the closure.

More importantly, this approach is misusing what Send means. I worry that it's possible to use TLS to smuggle the Python or PyObject into the closure. Now, std::thread::LocalKey won't take Python<'p> due to the 'static bound, and ScopedKey only works with &'a T, not T<'a>. But a future version of ScopedKey using HKT might allow smuggling the Python<'p> into the closure.


  1. Don't have the Send requirement, and instead use a single Python instance to track who is allowed to access the python interpreter.
impl ... {
    fn unlock_gil(&mut self, f: F) -> T where F : FnOnce() -> T;
}

This approach only works if there's only a single Python access token. We need the ability to have multiple PyObject instances, so we'd have to remove the access token from the PyObject (which is what this issue originally was about). But we'd also have to remove Copy from Python.

But without Copy, how could users pass it to another function without losing access? The only way I see is to make use of automatic re-borrowing by defining

type Python<'p> = &'p mut GILGuard;

The downside is that sizeof(Python) increases from 0 bytes to pointer-size. And is no longer free to pass around. Or is there some way to create a struct that allows re-borrowing like &mut does? Playpen

Additionally, this approach means we have to prevent the user from creating a new GILGuard when he might already have one. Which means Python::acquire_gil() would have to fail when recursively acquiring the GIL.

In the case of a callback from python (Rust calls Python calls Rust), we could still synthesize an access token in the callback function: the existing access token is currently inaccessible because it is borrowed by the API call into the python runtime.

But if the Drop impl on PyObject acquires the GIL and calls Py_DECREF, that might end up calling a python __del__ implementation that might call back into an py_fn!(). Now the callback would synthesize a new access token even though the existing one might be still accessible. Using TLS, the outer access token might be smuggled into the callback, thus giving the callback access to two tokens.

By the way, the Display and Debug impls have the same issue as the Drop impl.


Both approaches should be safe unless rust gets scoped TLS that accepts Python<'p>, so I'm starting to wonder whether this is more of an argument against HKT scoped TLS than against these approaches...

@novocaine
Copy link
Contributor

In 1 - is "Because both Python and PyObject are Send, the closure cannot call into the python interpreter while the GIL is unlocked." supposed to be !Send I think?

1 is a clever solution, but I'm not sure about flexibility. What if I want to go grind in that closure, but then in the middle of the grinding I need to take the GIL back again? For example, say I want to log something using the python logging system .. (one of the applications I work on does exactly this).

How could this be accomplished..? If we pass a channel, we can get messages out, but they can't be received on the same thread until the closure exits and lock is re-taken, which isn't very nice for realtime logging. Unless you want to spawn another thread just to receive messages .. not great.

Does rust support something like yield? That might look reasonably elegant. But I have no clue how that would interact with Send.

I don't think you can solve this by passing a callback into the closure that re-takes the GIL, because that callback would have to be Send too ..? So couldn't contain any PyObjects to use.

@dgrunwald
Copy link
Owner Author

Yes, that was supposed to be !Send (I've edited the post).

The closure can still reacquire the GIL by calling Python::acquire_gil() again, and you could do logging that way. However, you still won't have access to any of the PyObjects in the outer scope (since those are not Send), so you'll be unable to pass object instances to the logging function (unless you can re-discover them in some way, e.g. through some module storing a dict from object ID to object).

But that still doesn't help with the original issue: create python extension objects that contain references to other objects. Approach 1 doesn't handle that case at all. The next best thing would be a hybrid approach, where both PyObject<'p> and SendablePyObject exist. That would reduce the problem with Drop to only those cases where the object really has to be Send.
Unfortunately the separate SendablePyObject doesn't combine well with the derived types like PyDict -- SendablePyObject<PyDict> isn't possible until Rust gets HKT. So this would require python extension objects to use cast_into() whenever they access another python object and need a more concrete type than PyObject. OK, that's not really a big deal...

But the hybrid approach would be strictly more complicated than approach 2 alone, and I'd like to get rid of the lifetime on PyObject so that function signatures don't have to constantly deal with lifetimes.

Consider a py_method!() callback function that forwards the call to another extension object:

// Approach 1
fn method<'p>(slf: &PyRustObject<'p, i32>, args: &PyTuple<'p>) -> PyResult<'p, PyObject<'p>> {
     // get() is SendablePyObject::get(Python) to get a regular PyObject
    another_method(slf.another_object.get(slf.python()).cast_to::<MyType>(), args)
}

// Approach 2
fn method(py: Python, slf: &PyRustObject<i32>, args: &PyTuple) -> PyResult<PyObject> {
    // Of the types in the function signature, only Python<'p> still has a lifetime, but that
    // can be elided as it's not used in the output type.
    another_method(py, &slf.another_object, args)
}

Edit: more realistic call syntax; as I don't think we'll be able to make py_method!() callable with rust method syntax.

@dgrunwald
Copy link
Owner Author

Actually in approach 2, the convention should be that the py parameter is always the last parameter, not the first. This is due to http://is.gd/Wjx06W. (unless rust fixes that soon? it's a rather pointless limitation in the borrow checker...)

@novocaine
Copy link
Contributor

I'm super-keen to see this move forward as the proof of concept I'd like to write involves a type that holds references to other Python objects.

I think the clearest design is to separate the concern of holding the GIL from PyObject and pass a token around explicitly as you have suggested in 2).

I'm not exactly sure why the lock token necessarily has to be mut, and would have thought you could just pass around borrowed references to a 0-size struct. I think it's conceptually the same thing as a MutexGuard, which seems to support that...? But I'm guessing there's a reason I am missing :)

@dgrunwald
Copy link
Owner Author

The &mut was for fn unlock_gil(&mut self, f: F) -> T where F : FnOnce() -> T; -- that function needs to ensure that the user gives up the token, not just one copy of the token. This means the type Python<'p> can't be Copy if we want the ability to temporarily unlock the GIL.
But of all non-Copy types, only &mut ... supports implicit re-borrowing, which is crucial for ergonomically passing around the Python<'p>.
But I'd hide the &mut behind a type alias (type Python<'p> = &'p mut Something), so that user code can continue to use py: Python<'p>.

Of course, Python<'p> can initially stay as-is for implementing option 2; this change is only necessary when adding unlock_gil later.

@dgrunwald
Copy link
Owner Author

What's interesting is that removing the lifetime from PyObject invalidates the some of the reasons why I kept the rustpy design of PyObject being a reference-counted pointer.

When I started with rust-cpython, I originally had a different design: The reference-counting pointer was PyPtr<PyObject>, and PyObject was the python object itself. &PyObject thus was a borrowed reference to a python object, while the current design uses a double indirection in this case (borrowed reference to owned reference to python object).

Going back to this kind of design could help with the hybrid approach between the two options:
A normal reference-counting pointer could be PyPtr<'p, PyObject> which wouldn't have to acquire the GIL on Drop, but we could additionally offer a SendablePyPtr<PyObject>.

So I see three ways of going forward:

Option PyPtr:

  • Owned reference: PyObject<'p> becomes PyPtr<'p, PyObject>.
  • Borrowed reference: &'a PyObject<'p> becomes &'a PyObject and no longer implies that the GIL is held.
  • All methods on PyObject will require explicitly passing in the Python<'p> argument.
  • The Python<'p> token remains a zero-sized struct that implements Copy
  • &'a PyObject will be Send
  • SendablePyPtr<PyObject> will also be Send, and will acquire the GIL in its destructor.
  • Recursively locking the GIL remains possible
  • Temporarily unlocking the GIL works as in the previously discussed option 1, using a Send bound for safety:
impl <'p> Python <'p> {
    fn unlock_gil(&self, f: F) -> T where F : Send + FnOnce() -> T;
}

Option 2: Python: !Copy:

  • Owned reference: PyObject<'p> becomes PyObject
  • Borrowed reference: &'a PyObject<'p> becomes &'a PyObject
  • Neither implies that the GIL is held.
  • All methods on PyObject will require explicitly passing in the Python<'p> argument.
  • The Python<'p> token will no longer be Copy. There can only be one Python<'p> instance at a time. Python<'p> will allow implicit re-borrowing so that it can be passed to multiple functions. This implies it will be a typedef for &mut ...; but we might return to a zero-sized struct if future Rust versions allow implicit re-borrowing for user-defined types.
  • &'a PyObject will be Send
  • PyObject itself will be Send, and will (if necessary) acquire the GIL in its destructor.
  • Recursively locking the GIL won't be possible, as it would violate the "one one Python<'p> token" invariant.
  • Temporarily unlocking the GIL relies on the single Python<'p> token for safety:
impl <'p> Python <'p> {
    fn unlock_gil(&mut self, f: F) -> T where F : FnOnce() -> T;
}

Option Python: Copy without PyPtr:

(a mixture of the two options above)

  • Owned reference: PyObject<'p> becomes PyObject
  • Borrowed reference: &'a PyObject<'p> becomes &'a PyObject
  • Neither implies that the GIL is held.
  • All methods on PyObject will require explicitly passing in the Python<'p> argument.
  • The Python<'p> token remains a zero-sized struct that implements Copy
  • &'a PyObject will be Send
  • PyObject itself will be Send, and will (if necessary) acquire the GIL in its destructor.
  • Recursively locking the GIL remains possible
  • Temporarily unlocking the GIL works as in the previously discussed option 1, using a Send bound for safety.

The different aspects of these options can't be freely mixed and matched, due to these constraints:

  • Removing the token and lifetime from PyObject requires either that PyObject has no destructor, or that its destructor acquires the GIL.
  • The approach with SendablePyPtr also requires reintroducing PyPtr and removing Python token from PyObject, as otherwise SendablePyPtr would require HKT.
  • The PyPtr type must contain a Python token so that the destructor is safe, which implies that Python must be Copy.

The PyPtr solution feels like the right way to model the low-level details of Python objects in Rust.

On the other hand, the other solutions win on ergonomics: They would allow the types PyErr and PyResult to get rid of the 'p lifetime. Compare the signature for a function that returns an object, or possibly an error:

fn f<'p>(py: Python<'p>) -> PyResult<'p, PyPtr<'p, PyObject>> { .. }

fn f(py: Python) -> PyResult<PyObject> { .. }

@dgrunwald
Copy link
Owner Author

After writing all that up, my impression is that ergonomics are important enough to rule out the PyPtr solution. Thus "Option Python: Copy without PyPtr" seems like a good first step; we can still decide later if we want to change Python to allow only one token at a time.

@dgrunwald dgrunwald self-assigned this Oct 25, 2015
@dgrunwald
Copy link
Owner Author

I've started implementing this, and discovered that there is a case were we really need Python to remain Copy: looping through sequences

for item in seq.iter(py) { .. }

The iterator needs to hold a copy of the token because it's not available in the next() method called by the compiler, but the token also has to be available in the loop body. -> py must be Copy.

dgrunwald added a commit that referenced this issue Oct 26, 2015
Since the `Python` token no longer is a part of `PyObject`,
lots of methods now require the token as additional argument.

This [breaking-change] breaks everything!
@ehiggs
Copy link
Contributor

ehiggs commented Oct 29, 2015

I think this requires a motivating example to see which ends up being the best. I suggest wrapping a value data type like a chrono::NaiveDateTime object (ignore TZ since it's just for display purposes) including constructors as static functions like from_datetime. Then wrap std::collections::BTree<NaiveDateTime, f32>. And make sure it's are picklable.

@dgrunwald
Copy link
Owner Author

Well; there's not much options left to check.

Option 2 (Python: !Copy) is out because it's incompatible with iterators.

The current code implements choice "Option Python: Copy without PyPtr". We could theoretically still add back PyPtr, but that would make usage of the library even less ergonomic than it currently is.

By the way, I finally added the impl Send for PyObject today; so it's finally possible to write an extension type constructor: custom_type.rs.
Of course, GILProtected<RefCell<Option<PyRustType<i32>>>> is quite a mouthful. I guess we'll have to hide the static holding the type behind a macro...

@dgrunwald
Copy link
Owner Author

I'm starting to really like the lifetime-freedom of the current approach. I don't think I'll try to reintroduce PyPtr.

PyObject is Send now, so I'm closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants