-
-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow declaring weak properties #211
Conversation
070091e
to
cdd4ee5
Compare
Hrmp, not sure why this segfaults in Python 3.5 :/ The tests all pass locally on Python 3.9. |
I am having trouble installing Python 3.5 on my own machine with macOS 11 but can confirm that the tests all run and pass on Python 3.6. |
FWIW, I can confirm I'm seeing the crash on Py3.5, but tests pass fine on 3.6-3.9. One possibility here would be to drop Python 3.5 support. Dropping support because of a bug we can't explain doesn't make me entirely comfortable, but if it turns out that Python 3.5 behavior is the underlying problem (which seems plausible, given the changes 3.6 introduced around dict ordering), then I'd be open to dropping 3.5 support. We've dropped Python 3.5 elsewhere in BeeWare, and 3.5 is no longer officially supported by Python itself. |
@freakboy3742, can you check which test exactly is causing the crash? It should (hopefully) be one of the two new tests which I have introduced. And is there a useful stacktrace? I agree that Python 3.5 support is not so important but as you say, it would still be good to understand why it crashes. |
re. the crashing tests on Python 3.5: We've already had a similar situation before in #201, where everything ran fine on newer Python versions, and only Python 3.5 sometimes crashed while running the tests. In that case, it turned out that we had incorrect reference management code in the tests (some test objects were released twice), which for some reason only caused visible crashes on Python 3.5. I would guess that the crash we're seeing here is similar - there's probably some sort of memory management bug in the new code, and Python 3.5 is the only version where we're noticing it. Maybe it's because we haven't added handling for weak properties in It's of course also possible that the crash comes from some difference between Python 3.5 and later versions, but I don't know what difference specifically could cause it. It's probably not related to dict ordering, because the code added/changed in this PR doesn't really use dicts. About Python 3.5 support in general - I'm not strongly attached to 3.5, and 3.6 has some nice features like f-strings, so in general I would be fine with dropping Python 3.5 support. But we should do that after we've figured out where the crashes here are coming from. Even if the crashes only happen on 3.5, the underlying issue could apply to later versions too. |
@samschott As best as I can make out, it is Digging deeper than that, it's the call to Full pdb trace
|
dba468f
to
7a40f08
Compare
Hm, I don't think so. Weak properties should not need any handling in dealloc, right? It's the strong properties where we need to explicitly call @freakboy3742, thanks for the investigation, this is very helpful. I can reproduce the segfault when adding a breakpoint just before accessing the |
Interestingly this (new) test also segfaults with the current master branch. It's likely is the manifestation of an existing issue, maybe introduced in #201. |
I've found the issue and its protracted. After adding the following debug code to the setter: def _objc_setter(objc_self, _cmd, new_value):
if not isinstance(new_value, self.vartype):
# If vartype is a primitive, then new_value may be unboxed. If that is the case, box it manually.
new_value = self.vartype(new_value)
old_value = get_ivar(objc_self, '_' + attr_name, weak=self.weak)
print("old_value", old_value.value)
print("new_value", new_value.value)
if new_value is old_value:
return
if issubclass(self.vartype, objc_id) and new_value and not self.weak:
# If the new value is a non-null object, retain it.
print("retaining", new_value.value)
send_message(new_value, 'retain', restype=objc_id, argtypes=[])
set_ivar(objc_self, '_' + attr_name, new_value, weak=self.weak)
if issubclass(self.vartype, objc_id) and old_value and not self.weak:
# If the old value is a non-null object, release it.
print("releasing", old_value.value)
send_message(old_value, 'release', restype=None, argtypes=[]) the test
So while setting the property value, it is both retained and released. I suspect this happens because the first access with
Setting the new value therefore directly changes Edit: It turns out that we always get |
1cdca01
to
1eb95fe
Compare
if weak: | ||
value = libobjc.objc_loadWeakRetained(obj.value + libobjc.ivar_getOffset(ivar)) | ||
return libobjc.objc_autoreleaseReturnValue(value) | ||
elif isinstance(vartype, objc_id): | ||
return cast(libobjc.object_getIvar(obj, ivar), vartype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is this cast required? Isn't it sufficient to to return a objc_id
which later gets converted to the correct ObjCInstance
?
If it is required, do we need to add a similar cast before returning the value in the weak case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that libobjc.object_getIvar
already returns an objc_id
. The most common case is that vartype
is exactly objc_id
, in which case the cast indeed does nothing. I think it's only needed for the less common case where vartype
is Class
(which we have declared as a subclass of objc_id
) so that the return value is cast to Class
instead of a plain objc_id
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok! So we should perform a similar cast when returning a value in the weak case as well.
b078dbf
to
4ea8131
Compare
Yes, strong properties should also be handled in
the runtime will set Object A's weak property to But as you've already figured out (thank you for the debugging!) the segfault we're seeing is caused by something else.
Yes, in fact this should always be the case, no matter if the ivar is currently if isinstance(vartype, objc_id):
return cast(libobjc.object_getIvar(obj, ivar), vartype) and |
Oh, I had not thought of that. I'll need to check. Incidentally, what is a good way of overriding
I think the problem lies with the |
4ea8131
to
e475b93
Compare
You are right about the cleanup of weak ivars. Testing the cycle which you have outlined prints the following warning:
|
this acts like add_method if the method does not exist, replaces it otherwise
95eff15
to
dd83d5d
Compare
d2cc6f9
to
2d50c47
Compare
rubicon/objc/runtime.py
Outdated
def replace_method(cls, selector, method, encoding): | ||
"""Add a new instance method to the given class or replace an existing instance method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't know if it's a good idea to always replace existing methods by default... In most cases where add_method
is called, there should be no existing method with the same name - and if there is, either Rubicon or the programmer probably did something wrong. So IMHO it would be better to keep add_method
with the current behavior of failing if the method already exists, and add replace_method
as a new separate function (or perhaps as a kwarg replace=True/False
, as the implementations are very similar).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. I can see the value of keeping add_method
and raising an error when it fails.
@@ -874,7 +886,7 @@ def add_method(cls, selector, method, encoding): | |||
|
|||
cfunctype = CFUNCTYPE(*signature) | |||
imp = cfunctype(method) | |||
libobjc.class_addMethod(cls, selector, cast(imp, IMP), types) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, it looks like our existing add_method
implementation doesn't check the return value of class_addMethod
, so the method would silently do nothing if a conflicting method already exists. So if we keep add_method
we should also fix it to raise an error if class_addMethod
returns false.
rubicon/objc/api.py
Outdated
old_dealloc_callable = cast(old_dealloc, cfunctype) | ||
old_dealloc_callable(objc_self, SEL("dealloc")) | ||
|
||
replace_method(class_ptr, 'dealloc', _new_delloc, [None, ObjCInstance, SEL]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are multiple properties defined on one class, this is going to re-wrap dealloc
once for each property, right? Although that works, IMO it would be better if we define dealloc
only once and have it clean up all of the object's properties in a simple loop. This would be easier to read and debug, and probably also a bit faster, because it avoids calling back and forth many times between Python and Objective-C.
Though with the current class_register
mechanism there's no way to implement this - class_register
is called once for each property, and the call has no information about other properties in the class. So this would probably require adding an extra method to objc_property
that implements deallocating a single property, and have ObjCClass._new_from_class_statement
define a dealloc
method that asks each property to deallocate itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, it's not particularly elegant. I was already thinking of having a single dealloc method in ObjCClass._new_from_class_statement
but was torn between having a simpler dealloc and keeping all the registration / reference management functionality contained in the objc_property
class.
rubicon/objc/api.py
Outdated
if self.weak: | ||
# Clean up weak reference. | ||
ivar = libobjc.class_getInstanceVariable(libobjc.object_getClass(objc_self), ivar_name.encode()) | ||
libobjc.objc_storeWeak(objc_self.value + libobjc.ivar_getOffset(ivar), None) | ||
elif issubclass(self.vartype, objc_id): | ||
# If the old value is a non-null object, release it. There is no need to set the actual ivar to nil. | ||
old_value = get_ivar(objc_self, ivar_name, weak=self.weak) | ||
send_message(old_value, 'release', restype=None, argtypes=[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be simplified somehow by calling _objc_setter
or set_ivar
? Especially in the weak case the logic for accessing the ivar is a bit more complex, so it would be better to use our existing functions instead of reimplementing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. At least, I think so. The current implementation ensures that the super dealloc
only gets called after we have cleaned up our ivars so the instance should generally still be in a sane state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember now why I did this. The current set_ivar
implementation raises a TypeError when setting the value to None
. This can be changed of course, it is common place in ObjC to set a value to nil. Edit: Or just convert None
to self.vartype
🤦
27caa2a
to
bd70e4e
Compare
bd70e4e
to
6701965
Compare
rubicon/objc/api.py
Outdated
# Invoke original dealloc. | ||
cfunctype = CFUNCTYPE(None, objc_id, SEL) | ||
old_dealloc_callable = cast(old_dealloc, cfunctype) | ||
old_dealloc_callable(objc_self, SEL("dealloc")) | ||
|
||
add_method(ptr, "dealloc", _new_delloc, [None, ObjCInstance, SEL], replace=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What behavior do we want if a class contains both a user-defined dealloc
method and properties with dealloc callbacks? The implementation here will call the dealloc callbacks before the user-defined dealloc
method. But in practice I think the opposite order would be more useful - that way the user-defined dealloc
could call methods on objects stored in properties before they are released by the callbacks (which could cause them to be deallocated right away).
Implementing this would be a bit more difficult. A user-defined dealloc
is expected to end with a call to the superclass dealloc
. But if a class has dealloc callbacks, those would need before the superclass dealloc
(but also after the user-defined dealloc
code). There's no way for Rubicon to insert the dealloc callbacks into the user's dealloc
method, so the user would have to manually call the dealloc callbacks before the super call (via an extra function/method provided by Rubicon).
A different solution would be to allow users to define their own dealloc callback that gets called before any dealloc callbacks from properties. Then users could use that callback instead of manually overriding dealloc
, and let Rubicon generate a dealloc
method that calls all callbacks in the right order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that calling the user's dealloc
before clearing any instance variables can make life at lot easier for the user. But I don't really like either of the proposed solutions, both look like workarounds that force the user to become aware of the memory management which we are otherwise performing behind the scenes.
I can possibly think of two alternative approaches that could provide a better user experience (but would complicate our own implementation):
-
We document that user should not call the
dealloc
of the super class in their own implementation, as in a proper ARC environment. We then call the following methods in order: First, any user-defined dealloc, second, our cleanup code and third the super class's dealloc. This would mean treating the dealloc definition differently from other method definitions, at least internally. It would however be transparent to the user. -
We call our own cleanup code after calling the old
delloc
implementation (which includes the user's code and any calls to the super dealloc). I'm not sure if this is possible to do reliably.
What do you think, is either of those options worth the effort?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Option 1 would be ideal, because it's simple to use and matches what you do in normal ARC Objective-C code. But implementing this behavior now would silently break existing code that overrides dealloc
and correctly calls the superclass dealloc
at the end - that would cause the superclass dealloc
to get called twice, which can lead to other objects being release
d too often and other similar problems. That's the main reason why I suggested adding a separate user-definable callback for the same purpose.
Rubicon is still before version 1.0, so we could still make breaking changes like this, especially if it improves usability in the long term. But if we do that, we should try to throw errors for code relying on the old behavior, to avoid silent double frees in code that was previously correct.
I honestly don't know if option 2 would be safe or not... There should be no way for the superclass dealloc
s to corrupt ivars defined by the subclass, but I wouldn't really rely on it. Especially NSObject
's top-level dealloc
could do things that make the entire object unusable somehow.
Another alternative would be to not touch user-defined dealloc
s at all, and only generate an automatic dealloc
if the user hasn't already overridden dealloc
manually. This would be fully compatible with existing code and should also be simple to implement. The disadvantage is that if the user really needs to add custom code to dealloc
, they then have to manually do all the cleanup that the dealloc callbacks would have done automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do prefer Option 1 from those choices, or not touching use-defined deallocs at all together with a clear documentation on manual cleanup. My only issue with Option 1 is, how can we raise an error if user does call the superclass dealloc? Is there an elegant way of doing so? We are not a compiler after all and don't want to inspect the actual code in the user's dealloc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can confirm that running our cleanup after the superclass dealloc does lead to trouble during the ivar cleanup. In particular, libobjc.class_getInstanceVariable()
returns None
. So this is not viable.
I've implemented the first option now: calling the user's dealloc, then our cleanup, then the superclass dealloc. For the time being, there is no special error handling if the user calls the superclass dealloc manually. It will however raise (obscure) errors when we run our own cleanup. Without our own cleanup code (for example when no properties are declared), dealloc is called twice and leads to a segfault. In either case, users will notice that something is wrong without knowing what it might be. Not ideal...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The best workaround I can come up with, short of actually inspecting the user's dealloc code when creating the class, is to print a warning when send_super
is called, before either segfaulting or failing to complete the dealloc. What do you think, is this acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply - had some uni stuff that I needed to finish in the last few days.
The implementation you wrote looks good IMO. You're right that we can't do much to guard against old code that still calls send_super
at the end of dealloc
. The only option I can think of is what you've already suggested - add a special case inside send_super
that warns whenever send_super(self, "dealloc", ...)
is called. If we do that, we just need to make sure that the warning doesn't appear when Rubicon itself calls send_super(self, "dealloc", ...)
- probably using an internal keyword to suppress the warning?
That way I think we could even "fix" the segfaults. If send_super(self, "dealloc", ...)
is called without the special internal keyword argument, we can make it show a warning and then return without actually calling the super method. That way, if a user-defined dealloc
calls send_super
, the super dealloc
won't actually be called yet. Then the user-defined dealloc
returns to Rubicon, which runs the dealloc callbacks then makes the real send_super
call (with the internal keyword argument, so that this time it actually calls the super method).
This isn't a very nice solution - but if it works, I would rather have some less nice code in Rubicon that shows a helpful warning about the change, rather than breaking existing code so that it causes unexplained segfaults.
Co-authored-by: dgelessus <dgelessus@users.noreply.github.com>
3998925
to
6a60e42
Compare
6a60e42
to
1b071f5
Compare
ca2ca6d
to
a82cf90
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can confirm that the send_super(..., "dealloc", ...)
warning works as expected. A couple of small things, then we are really done I think 🙂
rubicon/objc/runtime.py
Outdated
if not _allow_dealloc and selector.name == b"dealloc": | ||
warnings.warn( | ||
"You should not call the superclass dealloc manually when overriding dealloc. Rubicon-objc " | ||
"will call it for you after releasing objects stored in properties and ivars." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use stacklevel
here, so that the warning points to the source location of the send_super
call and not the warn
call:
"will call it for you after releasing objects stored in properties and ivars." | |
"will call it for you after releasing objects stored in properties and ivars.", | |
stacklevel=2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point
tests/test_core.py
Outdated
assert attr0.retainCount() == 2 | ||
assert attr1.retainCount() == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only noticed this now - our tests use plain standard unittest
and not PyTest, so these asserts should be written using self.assertEqual
and not the assert
keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. Yes, now that you say that...
tests/test_core.py
Outdated
assert obj._did_dealloc, "custom dealloc did not run" | ||
assert attr0.retainCount() == 1, "strong property value was not released" | ||
assert attr1.retainCount() == 1, "weak property value was released" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this should use self.assertEqual
and self.assertTrue
.
81992a8
to
2a514ea
Compare
What do you think about the expanded docs on memory management? Is the new section on reference cycles helpful? I was trying to balance being brief and giving all the necessary information + code snippets. |
Yes, the new docs are definitely a useful addition! That way people running into reference cycle problems can find out about weak properties more easily. I don't think the docs are too long or detailed either. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR adds support for declaring weak properties when creating custom Objective-C classes.
PR Checklist: