Allow declaring weak properties #211

samschott · 2021-06-27T21:16:47Z

This PR adds support for declaring weak properties when creating custom Objective-C classes.

PR Checklist:

All new features have been tested
All new features have been documented
I have read the CONTRIBUTING.md file
I will abide by the code of conduct

samschott · 2021-06-27T21:34:33Z

Hrmp, not sure why this segfaults in Python 3.5 :/ The tests all pass locally on Python 3.9.

samschott · 2021-06-28T22:57:14Z

I am having trouble installing Python 3.5 on my own machine with macOS 11 but can confirm that the tests all run and pass on Python 3.6.

freakboy3742 · 2021-06-28T23:47:48Z

FWIW, I can confirm I'm seeing the crash on Py3.5, but tests pass fine on 3.6-3.9.

One possibility here would be to drop Python 3.5 support.

Dropping support because of a bug we can't explain doesn't make me entirely comfortable, but if it turns out that Python 3.5 behavior is the underlying problem (which seems plausible, given the changes 3.6 introduced around dict ordering), then I'd be open to dropping 3.5 support. We've dropped Python 3.5 elsewhere in BeeWare, and 3.5 is no longer officially supported by Python itself.

samschott · 2021-06-29T08:40:03Z

@freakboy3742, can you check which test exactly is causing the crash? It should (hopefully) be one of the two new tests which I have introduced. And is there a useful stacktrace?

I agree that Python 3.5 support is not so important but as you say, it would still be good to understand why it crashes.

dgelessus · 2021-06-30T01:42:53Z

re. the crashing tests on Python 3.5: We've already had a similar situation before in #201, where everything ran fine on newer Python versions, and only Python 3.5 sometimes crashed while running the tests. In that case, it turned out that we had incorrect reference management code in the tests (some test objects were released twice), which for some reason only caused visible crashes on Python 3.5.

I would guess that the crash we're seeing here is similar - there's probably some sort of memory management bug in the new code, and Python 3.5 is the only version where we're noticing it. Maybe it's because we haven't added handling for weak properties in dealloc yet?

It's of course also possible that the crash comes from some difference between Python 3.5 and later versions, but I don't know what difference specifically could cause it. It's probably not related to dict ordering, because the code added/changed in this PR doesn't really use dicts.

About Python 3.5 support in general - I'm not strongly attached to 3.5, and 3.6 has some nice features like f-strings, so in general I would be fine with dropping Python 3.5 support. But we should do that after we've figured out where the crashes here are coming from. Even if the crashes only happen on 3.5, the underlying issue could apply to later versions too.

freakboy3742 · 2021-06-30T01:49:05Z

@samschott As best as I can make out, it is tests/test_core.py::RubiconTest::test_class_properties_lifecycle_strong that is failing. No stacktrace, unfortunately; just a segfault. Working through the test, it's the final properties.object attribute access on line 1014 that is causing the segfault.

Digging deeper than that, it's the call to object_isClass(object_ptr) (api.py, line 655):

Full pdb trace

tests/test_core.py::RubiconTest::test_class_properties_lifecycle_strong 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PDB set_trace (IO-capturing turned off) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> /Users/rkm/projects/beeware/rubicon/objc/tests/test_core.py(1015)test_class_properties_lifecycle_strong()
-> properties.object
(Pdb) s
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(744)__getattr__()
-> def __getattr__(self, name):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(797)__getattr__()
-> if not name.endswith('_'):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(798)__getattr__()
-> method = self.objc_class._cache_property_accessor(name)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(597)objc_class()
-> @property
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(603)objc_class()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(604)objc_class()
-> return super(ObjCInstance, type(self)).__getattribute__(self, "_objc_class")
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(604)objc_class()-><rubicon.objc...x7fa5d157a140>
-> return super(ObjCInstance, type(self)).__getattribute__(self, "_objc_class")
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(1142)_cache_property_accessor()
-> def _cache_property_accessor(self, name):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(1147)_cache_property_accessor()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(1148)_cache_property_accessor()
-> methods = self.instance_properties[name]
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(1152)_cache_property_accessor()
-> if methods:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(1153)_cache_property_accessor()
-> return methods[0]
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(1153)_cache_property_accessor()-><ObjCMethod: b'object' b'@@:'>
-> return methods[0]
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(799)__getattr__()
-> if method:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(800)__getattr__()
-> return ObjCBoundMethod(method, self)()
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(217)__init__()
-> def __init__(self, method, receiver):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(219)__init__()
-> self.method = method
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(220)__init__()
-> if type(receiver) == Class:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(223)__init__()
-> self.receiver = receiver
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(223)__init__()->None
-> self.receiver = receiver
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(229)__call__()
-> def __call__(self, *args, **kwargs):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(231)__call__()
-> return self.method(self.receiver, *args, **kwargs)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(97)__call__()
-> def __call__(self, receiver, *args, convert_args=True, convert_result=True):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(120)__call__()
-> if len(args) != len(self.method_argtypes):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(126)__call__()
-> if convert_args:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(127)__call__()
-> converted_args = []
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(128)__call__()
-> for argtype, arg in zip(self.method_argtypes, args):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(156)__call__()
-> result = send_message(
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(157)__call__()
-> receiver, self.selector, *converted_args,
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(158)__call__()
-> restype=self.restype, argtypes=self.method_argtypes,
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(679)send_message()
-> def send_message(receiver, selector, *args, restype, argtypes, varargs=None):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(720)send_message()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(721)send_message()
-> receiver = receiver._as_parameter_
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(725)send_message()
-> if not isinstance(receiver, objc_id):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(731)send_message()
-> if not isinstance(selector, SEL):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(734)send_message()
-> if len(args) != len(argtypes):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(740)send_message()
-> if varargs is None:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(741)send_message()
-> varargs = []
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(743)send_message()
-> send = _msg_send_for_types(restype, argtypes)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(641)_msg_send_for_types()
-> def _msg_send_for_types(restype, argtypes):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(651)_msg_send_for_types()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(653)_msg_send_for_types()
-> return _msg_send_cache[(restype, *argtypes)]
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(653)_msg_send_for_types()-><_FuncPtr obj...t 0x1031b79a8>
-> return _msg_send_cache[(restype, *argtypes)]
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(745)send_message()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(746)send_message()
-> result = send(receiver, selector, *args, *varargs)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(123)__new__()
-> def __new__(cls, init=None):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(128)__new__()
-> if isinstance(init, (bytes, str)):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(133)__new__()
-> self = super().__new__(cls, init)
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(134)__new__()
-> self._inited = False
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(135)__new__()
-> return self
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(135)__new__()->rubicon.objc.runtime.SEL(None)
-> return self
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(137)__init__()
-> def __init__(self, init=None):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(138)__init__()
-> if not self._inited:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(139)__init__()
-> super().__init__(init)
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(139)__init__()->None
-> super().__init__(init)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(395)_objc_getter()
-> def _objc_getter(objc_self, _cmd):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(396)_objc_getter()
-> if self.weak:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(399)_objc_getter()
-> value = get_ivar(objc_self, '_' + attr_name)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(951)get_ivar()
-> def get_ivar(obj, varname):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(964)get_ivar()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(965)get_ivar()
-> obj = obj._as_parameter_
(Pdb) 
AttributeError: 'objc_id' object has no attribute '_as_parameter_'
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(965)get_ivar()
-> obj = obj._as_parameter_
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(966)get_ivar()
-> except AttributeError:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(967)get_ivar()
-> pass
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(969)get_ivar()
-> ivar = libobjc.class_getInstanceVariable(libobjc.object_getClass(obj), ensure_bytes(varname))
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(567)ensure_bytes()
-> def ensure_bytes(x):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(573)ensure_bytes()
-> if isinstance(x, bytes):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(577)ensure_bytes()
-> return x.encode('utf-8')
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(577)ensure_bytes()->b'_object'
-> return x.encode('utf-8')
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(970)get_ivar()
-> vartype = ctype_for_encoding(libobjc.ivar_getTypeEncoding(ivar))
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/types.py(296)ctype_for_encoding()
-> def ctype_for_encoding(encoding):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/types.py(314)ctype_for_encoding()
-> encoding = encoding.lstrip(b"NORVnor")
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/types.py(316)ctype_for_encoding()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/types.py(318)ctype_for_encoding()
-> return _ctype_for_encoding_map[encoding]
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/types.py(318)ctype_for_encoding()-><class 'rubic...time.objc_id'>
-> return _ctype_for_encoding_map[encoding]
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(972)get_ivar()
-> if isinstance(vartype, objc_id):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(975)get_ivar()
-> return vartype.from_address(obj.value + libobjc.ivar_getOffset(ivar))
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(975)get_ivar()-><objc_id obje...t 0x1030097b8>
-> return vartype.from_address(obj.value + libobjc.ivar_getOffset(ivar))
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(403)_objc_getter()
-> if not isinstance(value, (Structure, Union)):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(404)_objc_getter()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(405)_objc_getter()
-> value = value.value
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(409)_objc_getter()
-> return value
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(409)_objc_getter()->140350158314784
-> return value
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(756)send_message()
-> if restype == c_void_p:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(758)send_message()
-> return result
(Pdb) 
--Return--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/runtime.py(758)send_message()-><objc_id obje...t 0x10300b950>
-> return result
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(161)__call__()
-> if not convert_result:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(165)__call__()
-> if self.restype is not None and issubclass(self.restype, objc_id):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(166)__call__()
-> result = ObjCInstance(result)
(Pdb) 
--Call--
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(620)__new__()
-> def __new__(cls, object_ptr, _name=None, _bases=None, _ns=None):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(640)__new__()
-> if not isinstance(object_ptr, objc_id):
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(644)__new__()
-> if not object_ptr.value:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(649)__new__()
-> try:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(650)__new__()
-> return cls._cached_objects[object_ptr.value]
(Pdb) 
--Call--
> /Users/rkm/.pyenv/versions/3.5.9/lib/python3.5/weakref.py(134)__getitem__()
-> def __getitem__(self, key):
(Pdb) 
> /Users/rkm/.pyenv/versions/3.5.9/lib/python3.5/weakref.py(135)__getitem__()
-> if self._pending_removals:
(Pdb) 
> /Users/rkm/.pyenv/versions/3.5.9/lib/python3.5/weakref.py(137)__getitem__()
-> o = self.data[key]()
(Pdb) 
KeyError: 140350158314784
> /Users/rkm/.pyenv/versions/3.5.9/lib/python3.5/weakref.py(137)__getitem__()
-> o = self.data[key]()
(Pdb) 
--Return--
> /Users/rkm/.pyenv/versions/3.5.9/lib/python3.5/weakref.py(137)__getitem__()->None
-> o = self.data[key]()
(Pdb) 
KeyError: 140350158314784
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(650)__new__()
-> return cls._cached_objects[object_ptr.value]
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(651)__new__()
-> except KeyError:
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(652)__new__()
-> pass
(Pdb) 
> /Users/rkm/projects/beeware/rubicon/objc/rubicon/objc/api.py(655)__new__()
-> if not issubclass(cls, ObjCClass) and object_isClass(object_ptr):
(Pdb) 
zsh: segmentation fault  pytest tests/test_core.py::RubiconTest::test_class_properties_lifecycle_stron

samschott · 2021-06-30T11:55:58Z

Maybe it's because we haven't added handling for weak properties in dealloc yet?

Hm, I don't think so. Weak properties should not need any handling in dealloc, right? It's the strong properties where we need to explicitly call release during dealloc, if I understand things correctly.

@freakboy3742, thanks for the investigation, this is very helpful. I can reproduce the segfault when adding a breakpoint just before accessing the properties.object attribute in test_class_properties_lifecycle_strong. So it's definitely a memory management bug somewhere.

samschott · 2021-06-30T12:00:28Z

Interestingly, this (new) test also segfaults with the current master branch. It's likely the manifestation of an existing issue, maybe introduced in #201.

samschott · 2021-06-30T12:30:28Z

I've found the issue and it's protracted. After adding the following debug code to the setter:

        def _objc_setter(objc_self, _cmd, new_value):

            if not isinstance(new_value, self.vartype):
                # If vartype is a primitive, then new_value may be unboxed. If that is the case, box it manually.
                new_value = self.vartype(new_value)

            old_value = get_ivar(objc_self, '_' + attr_name, weak=self.weak)

            print("old_value", old_value.value)
            print("new_value", new_value.value)

            if new_value is old_value:
                return

            if issubclass(self.vartype, objc_id) and new_value and not self.weak:
                # If the new value is a non-null object, retain it.
                print("retaining", new_value.value)
                send_message(new_value, 'retain', restype=objc_id, argtypes=[])

            set_ivar(objc_self, '_' + attr_name, new_value, weak=self.weak)

            if issubclass(self.vartype, objc_id) and old_value and not self.weak:
                # If the old value is a non-null object, release it.
                print("releasing", old_value.value)
                send_message(old_value, 'release', restype=None, argtypes=[])

the test test_class_properties_lifecycle_strong prints the following to stdout:

old_value None
new_value 140575283530480
retaining 140575283530480
releasing 140575283530480

So while setting the property value, it is both retained and released. I suspect this happens because the first access with get_ivar() returns None and per your documentation of get_ivar():

For non-object types (everything except :class:objc_id and subclasses), the returned data object is backed by the
ivar's actual memory. This means that the data object is only usable as long as the "owner" object is alive, and
writes to it will directly change the ivar's value.

Setting the new value therefore directly changes old_value and the check issubclass(self.vartype, objc_id) and old_value evaluates to True, triggering a release.

Edit: It turns out that we always get old_value.value == new_value.value after setting the ivar, even when the old value is a non-null object. I.e., old_value.value returns a pointer to the new value after the set_ivar() call. I think I have fixed the handling in the setter now, while still properly releasing any old value. However, I am probably missing something here. Shouldn't get_ivar() return the value stored in the ivar instead of the ivar's actual memory in case of a non-null objc_id?
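The aliasing described above can be reproduced with plain ctypes, without the Objective-C runtime. This is a minimal sketch, assuming that get_ivar() ended up in its from_address() branch; ivar_storage is a hypothetical stand-in for the ivar's memory, not a Rubicon name.

```python
from ctypes import addressof, c_void_p

# Hypothetical stand-in for the ivar's storage inside the Objective-C object.
ivar_storage = c_void_p(None)

# A view created with from_address() shares memory with the ivar.
old_value = c_void_p.from_address(addressof(ivar_storage))

# Copying the pointer value into a fresh object decouples it from the ivar.
snapshot = c_void_p(old_value.value)

before_set = old_value.value   # nil is reported as None

# Stand-in for set_ivar(): write a new pointer value into the ivar.
ivar_storage.value = 0x1234

after_set = old_value.value    # the "old" view now reports the new pointer
```

After the write, old_value is truthy and points at the new object, which is why the setter released the value it had just retained. The snapshot still holds the original (nil) pointer.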

samschott · 2021-06-30T14:39:14Z

rubicon/objc/runtime.py

+    if weak:
+        value = libobjc.objc_loadWeakRetained(obj.value + libobjc.ivar_getOffset(ivar))
+        return libobjc.objc_autoreleaseReturnValue(value)
+    elif isinstance(vartype, objc_id):
        return cast(libobjc.object_getIvar(obj, ivar), vartype)


When is this cast required? Isn't it sufficient to return an objc_id which later gets converted to the correct ObjCInstance?

If it is required, do we need to add a similar cast before returning the value in the weak case?

You're right that libobjc.object_getIvar already returns an objc_id. The most common case is that vartype is exactly objc_id, in which case the cast indeed does nothing. I think it's only needed for the less common case where vartype is Class (which we have declared as a subclass of objc_id) so that the return value is cast to Class instead of a plain objc_id.

Ah, ok! So we should perform a similar cast when returning a value in the weak case as well.
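To illustrate what the cast changes, here is a small ctypes-only sketch. The objc_id and Class classes below are simplified stand-ins for the types in rubicon/objc/runtime.py, and the address is made up.

```python
from ctypes import c_void_p, cast


class objc_id(c_void_p):
    """Simplified stand-in for rubicon.objc.runtime.objc_id."""


class Class(objc_id):
    """Simplified stand-in for rubicon.objc.runtime.Class."""


# Pretend this pointer came back from object_getIvar().
plain = objc_id(0x1000)

# The cast keeps the address but changes the Python-side type.
as_class = cast(plain, Class)

same_address = as_class.value == plain.value
```

The same cast applied to the weak branch's return value would give both branches the same return type.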

dgelessus · 2021-07-01T00:39:26Z

Hm, I don't think so. Weak properties should not need any handling in dealloc, right? It's the strong properties where we need to explicitly call release during dealloc, if I understand things correctly.

Yes, strong properties should also be handled in dealloc, but that's less of a problem. Not releasing the values of strong properties "just" causes a memory leak - it doesn't lead to crashes. Weak properties on the other hand could cause segfaults if not correctly cleaned up, I think, because a weak pointer gets set to nil if the object it points to is deallocated. So if something like this happens:

Object A (which has a weak property) is allocated
Object B is allocated
Object A's weak property is set to point to object B
Object A is deallocated, but its weak property is not cleaned up
Object B is deallocated

the runtime will set Object A's weak property to nil, but Object A has already been deallocated, so some other memory gets zeroed instead.

But as you've already figured out (thank you for the debugging!) the segfault we're seeing is caused by something else.

Shouldn't get_ivar() return the value stored in the ivar instead of the ivar's actual memory in case of a non-null objc_id?

Yes, in fact this should always be the case, no matter if the ivar is currently nil or not. I'm honestly really confused why you're getting different behavior. If the ivar type is objc_id, this branch of get_ivar should execute:

    if isinstance(vartype, objc_id):
        return cast(libobjc.object_getIvar(obj, ivar), vartype)

and libobjc.object_getIvar returns the ivar's value (i.e. the address stored in the ivar, not the address of the ivar) as an integer, which is then cast to objc_id. So the returned object should always be a fresh objc_id object that isn't coupled to any existing memory address.

samschott · 2021-07-01T10:42:35Z

Weak properties on the other hand could cause segfaults if not correctly cleaned up

Oh, I had not thought of that. I'll need to check. Incidentally, what is a good way of overriding dealloc at runtime? Could one use method_setImplementation for this?

If the ivar type is objc_id, this branch of get_ivar should execute:

I think the problem lies with the isinstance(vartype, objc_id) check. vartype is a class and not an instance so the check should probably be replaced with issubclass.
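A quick check of the difference, using a simplified stand-in for objc_id rather than the real Rubicon type:

```python
from ctypes import c_void_p


class objc_id(c_void_p):
    """Simplified stand-in for rubicon.objc.runtime.objc_id."""


# ctype_for_encoding() returns a class, not an instance of it.
vartype = objc_id

isinstance_check = isinstance(vartype, objc_id)   # a class is not an instance of itself
issubclass_check = issubclass(vartype, objc_id)   # a class counts as its own subclass
```

With isinstance the object branch is never taken, so get_ivar() always falls through to the from_address() path that returns a view onto the ivar's memory.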

samschott · 2021-07-01T12:25:21Z

You are right about the cleanup of weak ivars. Testing the cycle which you have outlined prints the following warning:

__weak variable at 0x7fed284f8e00 holds 0x800007fed267083b instead of 0x7fed26689610. This is probably incorrect use of objc_storeWeak() and objc_loadWeak(). Break on objc_weak_error to debug.

this acts like add_method if the method does not exist, replaces it otherwise

dgelessus · 2021-07-06T00:36:34Z

rubicon/objc/runtime.py

+def replace_method(cls, selector, method, encoding):
+    """Add a new instance method to the given class or replace an existing instance method.


Hmm, I don't know if it's a good idea to always replace existing methods by default... In most cases where add_method is called, there should be no existing method with the same name - and if there is, either Rubicon or the programmer probably did something wrong. So IMHO it would be better to keep add_method with the current behavior of failing if the method already exists, and add replace_method as a new separate function (or perhaps as a kwarg replace=True/False, as the implementations are very similar).

Fair point. I can see the value of keeping add_method and raising an error when it fails.

dgelessus · 2021-07-06T00:39:34Z

rubicon/objc/runtime.py

@@ -874,7 +886,7 @@ def add_method(cls, selector, method, encoding):

    cfunctype = CFUNCTYPE(*signature)
    imp = cfunctype(method)
-    libobjc.class_addMethod(cls, selector, cast(imp, IMP), types)


That said, it looks like our existing add_method implementation doesn't check the return value of class_addMethod, so the method would silently do nothing if a conflicting method already exists. So if we keep add_method we should also fix it to raise an error if class_addMethod returns false.
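A rough model of the behavior being discussed, as a pure-Python sketch. MethodTableSketch is a hypothetical stand-in for a class's method list; its two methods only mimic the return conventions of the runtime's class_addMethod (false if the selector already exists) and class_replaceMethod (previous implementation or nil). The replace keyword is the kwarg floated above, not an existing signature.

```python
class MethodTableSketch:
    """Hypothetical stand-in for an Objective-C class's method list."""

    def __init__(self):
        self._methods = {}

    def class_add_method(self, selector, imp):
        # Mimics class_addMethod: refuse to overwrite, report via return value.
        if selector in self._methods:
            return False
        self._methods[selector] = imp
        return True

    def class_replace_method(self, selector, imp):
        # Mimics class_replaceMethod: install imp, return the previous one.
        old = self._methods.get(selector)
        self._methods[selector] = imp
        return old


def add_method(table, selector, imp, replace=False):
    """Add a method; fail loudly on conflicts unless replace=True."""
    if replace:
        return table.class_replace_method(selector, imp)
    if not table.class_add_method(selector, imp):
        raise ValueError("method {!r} already exists".format(selector))
    return None


def first_imp():
    return "first"


def second_imp():
    return "second"


table = MethodTableSketch()
add_method(table, "dealloc", first_imp)

try:
    add_method(table, "dealloc", second_imp)
    conflict_raised = False
except ValueError:
    conflict_raised = True

previous = add_method(table, "dealloc", second_imp, replace=True)
```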

dgelessus · 2021-07-06T01:26:18Z

rubicon/objc/api.py

+            old_dealloc_callable = cast(old_dealloc, cfunctype)
+            old_dealloc_callable(objc_self, SEL("dealloc"))
+
+        replace_method(class_ptr, 'dealloc', _new_delloc, [None, ObjCInstance, SEL])


If there are multiple properties defined on one class, this is going to re-wrap dealloc once for each property, right? Although that works, IMO it would be better if we define dealloc only once and have it clean up all of the object's properties in a simple loop. This would be easier to read and debug, and probably also a bit faster, because it avoids calling back and forth many times between Python and Objective-C.

Though with the current class_register mechanism there's no way to implement this - class_register is called once for each property, and the call has no information about other properties in the class. So this would probably require adding an extra method to objc_property that implements deallocating a single property, and have ObjCClass._new_from_class_statement define a dealloc method that asks each property to deallocate itself.
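The shape of that design can be sketched in plain Python. All names here (PropertySketch, dealloc_callback, make_dealloc) are hypothetical and only model the control flow; no Objective-C calls are made.

```python
class PropertySketch:
    """Hypothetical stand-in for objc_property; models cleanup only."""

    def __init__(self, name, weak=False):
        self.name = name
        self.weak = weak

    def dealloc_callback(self, instance):
        # A real implementation would clear the weak reference or release
        # the strong one; this sketch only records the intended action.
        action = "clear weak" if self.weak else "release"
        instance.log.append("{} {}".format(action, self.name))


class InstanceSketch:
    """Hypothetical stand-in for the object being deallocated."""

    def __init__(self):
        self.log = []


def make_dealloc(properties, super_dealloc):
    """Build one dealloc that cleans up every property in a single loop."""
    def dealloc(instance):
        for prop in properties:
            prop.dealloc_callback(instance)
        super_dealloc(instance)
    return dealloc


props = [PropertySketch("delegate", weak=True), PropertySketch("data")]
dealloc = make_dealloc(props, lambda inst: inst.log.append("super dealloc"))

obj = InstanceSketch()
dealloc(obj)
```

Only one dealloc is installed per class, regardless of how many properties it declares.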

True, it's not particularly elegant. I was already thinking of having a single dealloc method in ObjCClass._new_from_class_statement but was torn between having a simpler dealloc and keeping all the registration / reference management functionality contained in the objc_property class.

dgelessus · 2021-07-06T01:37:12Z

rubicon/objc/api.py

+            if self.weak:
+                # Clean up weak reference.
+                ivar = libobjc.class_getInstanceVariable(libobjc.object_getClass(objc_self), ivar_name.encode())
+                libobjc.objc_storeWeak(objc_self.value + libobjc.ivar_getOffset(ivar), None)
+            elif issubclass(self.vartype, objc_id):
+                # If the old value is a non-null object, release it. There is no need to set the actual ivar to nil.
+                old_value = get_ivar(objc_self, ivar_name, weak=self.weak)
+                send_message(old_value, 'release', restype=None, argtypes=[])


Could this be simplified somehow by calling _objc_setter or set_ivar? Especially in the weak case the logic for accessing the ivar is a bit more complex, so it would be better to use our existing functions instead of reimplementing it.

Sure. At least, I think so. The current implementation ensures that the super dealloc only gets called after we have cleaned up our ivars so the instance should generally still be in a sane state.

I remember now why I did this. The current set_ivar implementation raises a TypeError when setting the value to None. This can be changed of course, it is commonplace in ObjC to set a value to nil. Edit: Or just convert None to self.vartype 🤦

dgelessus · 2021-07-10T02:31:00Z

rubicon/objc/api.py

+            # Invoke original dealloc.
+            cfunctype = CFUNCTYPE(None, objc_id, SEL)
+            old_dealloc_callable = cast(old_dealloc, cfunctype)
+            old_dealloc_callable(objc_self, SEL("dealloc"))
+
+        add_method(ptr, "dealloc", _new_delloc, [None, ObjCInstance, SEL], replace=True)


What behavior do we want if a class contains both a user-defined dealloc method and properties with dealloc callbacks? The implementation here will call the dealloc callbacks before the user-defined dealloc method. But in practice I think the opposite order would be more useful - that way the user-defined dealloc could call methods on objects stored in properties before they are released by the callbacks (which could cause them to be deallocated right away).

Implementing this would be a bit more difficult. A user-defined dealloc is expected to end with a call to the superclass dealloc. But if a class has dealloc callbacks, those would need to run before the superclass dealloc (but also after the user-defined dealloc code). There's no way for Rubicon to insert the dealloc callbacks into the user's dealloc method, so the user would have to manually call the dealloc callbacks before the super call (via an extra function/method provided by Rubicon).

A different solution would be to allow users to define their own dealloc callback that gets called before any dealloc callbacks from properties. Then users could use that callback instead of manually overriding dealloc, and let Rubicon generate a dealloc method that calls all callbacks in the right order.

I agree that calling the user's dealloc before clearing any instance variables can make life a lot easier for the user. But I don't really like either of the proposed solutions; both look like workarounds that force the user to become aware of the memory management which we are otherwise performing behind the scenes.

I can possibly think of two alternative approaches that could provide a better user experience (but would complicate our own implementation):

Option 1: We document that users should not call the dealloc of the superclass in their own implementation, as in a proper ARC environment. We then call the following methods in order: first, any user-defined dealloc; second, our cleanup code; and third, the superclass's dealloc. This would mean treating the dealloc definition differently from other method definitions, at least internally. It would however be transparent to the user.

Option 2: We call our own cleanup code after calling the old dealloc implementation (which includes the user's code and any calls to the super dealloc). I'm not sure if this is possible to do reliably.

What do you think, is either of those options worth the effort?

Option 1 would be ideal, because it's simple to use and matches what you do in normal ARC Objective-C code. But implementing this behavior now would silently break existing code that overrides dealloc and correctly calls the superclass dealloc at the end - that would cause the superclass dealloc to get called twice, which can lead to other objects being released too often and other similar problems. That's the main reason why I suggested adding a separate user-definable callback for the same purpose.

Rubicon is still before version 1.0, so we could still make breaking changes like this, especially if it improves usability in the long term. But if we do that, we should try to throw errors for code relying on the old behavior, to avoid silent double frees in code that was previously correct.

I honestly don't know if option 2 would be safe or not... There should be no way for the superclass deallocs to corrupt ivars defined by the subclass, but I wouldn't really rely on it. Especially NSObject's top-level dealloc could do things that make the entire object unusable somehow.

Another alternative would be to not touch user-defined deallocs at all, and only generate an automatic dealloc if the user hasn't already overridden dealloc manually. This would be fully compatible with existing code and should also be simple to implement. The disadvantage is that if the user really needs to add custom code to dealloc, they then have to manually do all the cleanup that the dealloc callbacks would have done automatically.
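A rough sketch of that "only generate a dealloc if none exists" rule, again in plain Python with hypothetical names (`finalize_class`, `Tracker`, `auto_cleanup`) rather than anything from Rubicon itself:

```python
class Tracker:
    """Stand-in base class that records which dealloc code ran."""

    def __init__(self):
        self.log = []


def auto_cleanup(self):
    self.log.append("auto cleanup")


def finalize_class(namespace, callbacks):
    """Return a copy of namespace, adding a generated dealloc only if none is defined."""
    namespace = dict(namespace)
    if "dealloc" not in namespace:

        def dealloc(self):
            for callback in callbacks:
                callback(self)

        namespace["dealloc"] = dealloc
    return namespace


def manual_dealloc(self):
    # A user who overrides dealloc is now responsible for all cleanup.
    self.log.append("manual dealloc")


Auto = type("Auto", (Tracker,), finalize_class({}, [auto_cleanup]))
Manual = type("Manual", (Tracker,), finalize_class({"dealloc": manual_dealloc}, [auto_cleanup]))

auto_obj, manual_obj = Auto(), Manual()
auto_obj.dealloc()
manual_obj.dealloc()
print(auto_obj.log, manual_obj.log)
```

The second class shows the disadvantage mentioned above: once the user supplies a dealloc, the automatic cleanup is skipped entirely.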

I do prefer Option 1 from those choices, or not touching user-defined deallocs at all, together with clear documentation on manual cleanup. My only issue with Option 1 is: how can we raise an error if the user does call the superclass dealloc? Is there an elegant way of doing so? We are not a compiler after all and don't want to inspect the actual code in the user's dealloc.

I can confirm that running our cleanup after the superclass dealloc does lead to trouble during the ivar cleanup. In particular, libobjc.class_getInstanceVariable() returns None. So this is not viable.

I've implemented the first option now: calling the user's dealloc, then our cleanup, then the superclass dealloc. For the time being, there is no special error handling if the user calls the superclass dealloc manually. It will however raise (obscure) errors when we run our own cleanup. Without our own cleanup code (for example when no properties are declared), dealloc is called twice and leads to a segfault. In either case, users will notice that something is wrong without knowing what it might be. Not ideal...

The best workaround I can come up with, short of actually inspecting the user's dealloc code when creating the class, is to print a warning when send_super is called, before either segfaulting or failing to complete the dealloc. What do you think, is this acceptable?

Sorry for the late reply - had some uni stuff that I needed to finish in the last few days.

The implementation you wrote looks good IMO. You're right that we can't do much to guard against old code that still calls send_super at the end of dealloc. The only option I can think of is what you've already suggested - add a special case inside send_super that warns whenever send_super(self, "dealloc", ...) is called. If we do that, we just need to make sure that the warning doesn't appear when Rubicon itself calls send_super(self, "dealloc", ...) - probably using an internal keyword to suppress the warning?

That way I think we could even "fix" the segfaults. If send_super(self, "dealloc", ...) is called without the special internal keyword argument, we can make it show a warning and then return without actually calling the super method. That way, if a user-defined dealloc calls send_super, the super dealloc won't actually be called yet. Then the user-defined dealloc returns to Rubicon, which runs the dealloc callbacks then makes the real send_super call (with the internal keyword argument, so that this time it actually calls the super method).

This isn't a very nice solution - but if it works, I would rather have some less nice code in Rubicon that shows a helpful warning about the change, rather than breaking existing code so that it causes unexplained segfaults.
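The warn-and-skip idea could be sketched roughly as follows. This is a simplified simulation: the `send_super` below is a stand-in with a made-up signature, not Rubicon's real function, and only the `_allow_dealloc` keyword name is taken from the diff in this PR.

```python
import warnings

calls = []


def super_dealloc(obj):
    calls.append("super dealloc")


def send_super(obj, selector, _allow_dealloc=False):
    """Simplified stand-in for a send_super-style helper (hypothetical signature)."""
    if selector == "dealloc" and not _allow_dealloc:
        warnings.warn(
            "You should not call the superclass dealloc manually when overriding dealloc.",
            stacklevel=2,
        )
        return None  # skip the real call; the generated dealloc makes it later
    if selector == "dealloc":
        super_dealloc(obj)
    return None


def legacy_user_dealloc(obj):
    calls.append("user dealloc")
    send_super(obj, "dealloc")  # old-style manual super call: warns and does nothing


def generated_dealloc(obj):
    legacy_user_dealloc(obj)
    calls.append("property cleanup")
    send_super(obj, "dealloc", _allow_dealloc=True)  # the one real super call


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    generated_dealloc(object())

print(calls, len(caught))
```

With this shape, old code that still ends its dealloc with a manual super call triggers one warning, and the superclass dealloc still runs exactly once.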

dgelessus

Can confirm that the send_super(..., "dealloc", ...) warning works as expected. A couple of small things, then we are really done I think 🙂

dgelessus · 2021-07-23T11:58:43Z

rubicon/objc/runtime.py

+    if not _allow_dealloc and selector.name == b"dealloc":
+        warnings.warn(
+            "You should not call the superclass dealloc manually when overriding dealloc. Rubicon-objc "
+            "will call it for you after releasing objects stored in properties and ivars."


We should use stacklevel here, so that the warning points to the source location of the send_super call and not the warn call:

Suggested change

-            "will call it for you after releasing objects stored in properties and ivars."
+            "will call it for you after releasing objects stored in properties and ivars.",
+            stacklevel=2,
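For reference, a small standalone illustration of what `stacklevel` changes, using only the standard `warnings` module. The function names are made up for the example.

```python
import warnings


def warn_default():
    warnings.warn("default stacklevel points at this line")


def warn_at_caller():
    warnings.warn("stacklevel=2 points at the caller", stacklevel=2)


def user_code():
    warn_at_caller()


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_default()
    user_code()

# Each recorded warning carries the line number it is attributed to.
for recorded in caught:
    print(recorded.lineno, recorded.message)
```

The first warning is attributed to the `warnings.warn` line inside `warn_default`, while the second is attributed to the call inside `user_code`, which is the location a user would actually need to fix.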

dgelessus · 2021-07-23T12:00:10Z

tests/test_core.py

+        assert attr0.retainCount() == 2
+        assert attr1.retainCount() == 1


Only noticed this now - our tests use plain standard unittest and not PyTest, so these asserts should be written using self.assertEqual and not the assert keyword.

Oops. Yes, now that you say that...

dgelessus · 2021-07-23T12:00:29Z

tests/test_core.py

+        assert obj._did_dealloc, "custom dealloc did not run"
+        assert attr0.retainCount() == 1, "strong property value was not released"
+        assert attr1.retainCount() == 1, "weak property value was released"


As above, this should use self.assertEqual and self.assertTrue.
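A sketch of how those assertions might look in unittest style. `FakeObject` is a hypothetical stand-in so the example runs without an Objective-C runtime; the real tests would use actual Rubicon objects.

```python
import unittest


class FakeObject:
    """Hypothetical stand-in for an Objective-C object with a retain count."""

    def __init__(self, count):
        self._count = count

    def retainCount(self):
        return self._count


class DeallocAssertionStyle(unittest.TestCase):
    def test_counts_after_dealloc(self):
        did_dealloc = True          # would be obj._did_dealloc in the real test
        attr0 = FakeObject(1)
        attr1 = FakeObject(1)

        self.assertTrue(did_dealloc, "custom dealloc did not run")
        self.assertEqual(attr0.retainCount(), 1, "strong property value was not released")
        self.assertEqual(attr1.retainCount(), 1, "weak property value was released")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DeallocAssertionStyle)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Besides matching the rest of the test suite, `assertEqual` reports both compared values on failure, and unlike the `assert` statement it is not stripped when Python runs with `-O`.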

samschott · 2021-07-23T13:09:08Z

What do you think about the expanded docs on memory management? Is the new section on reference cycles helpful? I was trying to balance being brief and giving all the necessary information + code snippets.

dgelessus · 2021-07-23T13:26:08Z

Yes, the new docs are definitely a useful addition! That way people running into reference cycle problems can find out about weak properties more easily. I don't think the docs are too long or detailed either.

samschott force-pushed the weak-properties branch 2 times, most recently from 070091e to cdd4ee5 Compare June 27, 2021 21:27

dgelessus reviewed Jun 30, 2021

samschott force-pushed the weak-properties branch 3 times, most recently from dba468f to 7a40f08 Compare June 30, 2021 11:40

samschott force-pushed the weak-properties branch from 1cdca01 to 1eb95fe Compare June 30, 2021 12:42

samschott commented Jun 30, 2021

samschott force-pushed the weak-properties branch 3 times, most recently from b078dbf to 4ea8131 Compare June 30, 2021 15:48

Sam Schott added 4 commits July 1, 2021 13:18

allow declaring weak properties

884960a

add tests for strong and weak property lifetcycles

041061d

add changelog entry

63ef1c3

fix ivar access for objc_id types

e475b93

samschott force-pushed the weak-properties branch from 4ea8131 to e475b93 Compare July 1, 2021 12:19

Sam Schott added 2 commits July 1, 2021 14:34

change add_method -> replace_method

f2da5ef

this acts like add_method if the method does not exist, replaces it otherwise

clean up properties in dealloc

e9f2d78

update docs

dd83d5d

samschott force-pushed the weak-properties branch from 95eff15 to dd83d5d Compare July 1, 2021 13:51

minor cleanup

2d50c47

samschott force-pushed the weak-properties branch from d2cc6f9 to 2d50c47 Compare July 1, 2021 13:57

dgelessus reviewed Jul 6, 2021

samschott force-pushed the weak-properties branch from 27caa2a to bd70e4e Compare July 6, 2021 09:48

Sam Schott added 3 commits July 6, 2021 10:50

revert replace_method to add_method with replace kwarg

e606a40

use set_ivar in dealloc

ffa7467

implement a single dealloc replacement for all properties

6701965

samschott force-pushed the weak-properties branch from bd70e4e to 6701965 Compare July 6, 2021 09:52

dgelessus reviewed Jul 10, 2021

Check all class attributes for dealloc callbacks

eaf0586

Co-authored-by: dgelessus <dgelessus@users.noreply.github.com>

samschott force-pushed the weak-properties branch from 3998925 to 6a60e42 Compare July 14, 2021 12:24

expand how-to section with notes on reference cycles

1b071f5

samschott force-pushed the weak-properties branch from 6a60e42 to 1b071f5 Compare July 14, 2021 12:27

perform any user-defined dealloc before our own cleanup

b3cdb75

samschott force-pushed the weak-properties branch from ca2ca6d to a82cf90 Compare July 16, 2021 10:29

dgelessus reviewed Jul 23, 2021

Sam Schott added 2 commits July 23, 2021 13:50

add tests for proper dealloc behavior

1f7f74a

warn the user when calling dealloc manually

2a514ea

samschott force-pushed the weak-properties branch from 81992a8 to 2a514ea Compare July 23, 2021 12:51

dgelessus approved these changes Jul 23, 2021

dgelessus merged commit 4dffad5 into beeware:master Jul 23, 2021

samschott deleted the weak-properties branch July 23, 2021 13:34
