-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python 3.13 problems #1204
Comments
They have made it very difficult to allocate memory with differing layouts. I tried to propse a fix, but it required putting in a PEP. Unfortunately I was almost immediately hit with an emergency at work and have been unable to find any time since. The solution to this problem would be to create an actual heap type first then use it to allocate the desired memory rather than this static type hack. Or they need to give back access to making gc allocations. Given there is a slot for allocators, removing the gc call private basic deprecates the slot function. I wish I had the time to persue the PEP but I have another 3 months of work before I can cycle back to major work here. |
What I'm having trouble understanding is why this is even necessary at all? Shouldn't you be doing something like this? Mostly copypasta from https://docs.python.org/3.13/extending/newtypes_tutorial.html#the-basics struct JPValueObject {
PyObject_HEAD;
JPValue slot;
};
static PyTypeObject JPValueType = {
.ob_base = PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "custom.Custom",
.tp_basicsize = sizeof(JPValueObject),
.tp_itemsize = 0,
.tp_flags = Py_TPFLAGS_DEFAULT,
.tp_new = PyType_GenericNew,
};
JPValue* PyJPValue_getJavaSlot(PyObject* self)
{
if (PyObject_IsInstance(self, (PyObject *)&JPValueType)) {
return &((JPValueObject *) self)->slot;
}
return nullptr;
} The only one I foresee causing trouble would be I'm sure there was a reason, if only at one point in time, for doing it the way it is, I'm just not seeing why. |
Unfortunately you missed something vital. The Java slot can't appear in the structure at all. We are using this as a mixin because we must be compatible with all possible Python concrete types. If you try to put it in the structure then it will conflict with int, float, exception etc which we must derive from as all base types are concrete. We thus used the alloc slot to adjust shift the Python structure down 8 bytes and added the needed data in front of the Python structure where we could always guarantee access at order 1. The issue is that Python developers decided to start placing their info in front to the Python object with no api to handle conflicts. This basically breaks the whole concept of the alloc slot as you have no way to alloc object memory with variable size. I wrote a PR to give an API but have no way to pursue it. |
So it seems to be a combination of problems. First is the one already discussed and the second I think is changes to how a type is created which makes I had hoped to be able to figure out the problem myself and have my employer submit the fix. Unfortunately it would probably require significant refactoring that would take me months, due to the lack of familiarity with the jpype internals, and I'm not able to justify it. It seems the best way would be to tell you what I know and let you handle it. The only thing I could think of that might not require a major rework would be to stick the Also, c++ not supporting c99 designated initializers is extremely annoying. |
Thanks, I will take a look.
…--Karl
|
I am still working this one, but thus far it looks hopeless. They have removed all access to the GC (alloc, link, but still have track). Thus creating a working allocation using the alloc slot means either calling into their available calls as a side effect to get an allocated and linked object or replicate their private structures and somehow gaining access to the current state. Then they used the custom allocation slot (which we have been using for a while) and allocated extra memory before the object and extra memory behind other objects (which was never reported to the allocator). The problem with JPClass is that types request 40 extra bytes at the end even though they have an item count of zero. Thus the crash was just making a call to the type code while using a custom allocator. My allocator does know that they need an extra 40 bytes (and that is where I was going to place my memory.) While other objects just allocate the correct space. I could go back to adding my extra memory in front of the GC head, but they have now moved the weak reference tables and the dictionary pointer there. And as the allocations are all hard coded I can't politely ask for a few extra bytes that I require. They were going to give a mechanism for user requested data, but it doesn't handle their broken object layout meaning that it is useless for much other than a generic object. I tried to send a fix, but they requested a PEP which given I have been going 60 hours a week for the last 9 months just can't happen. Thus I am back to horrible kludges. I was allocating my object by adding extra space to a temporary type then mutating it to my object. But that wont work now that they are using custom allocators that lie about the amount of space to allocate. My memory and their extra memory will collide and I would need a secret table of types to figure out which classes need custom allocators. I can't add to the front because they are using the front already, and they have barred access to the allocator, meaning unless I can post mutilate to move the GC head that route is out. Adding to dictionary (very slow) doesn't work because some types don't accept dictionaries. They already did bad things like reusing the ob_size field with integers as no one was accessing it (I was!) The more they mutate and kludge their type system without any long term plan of how someone who needs O(1) access to multiple inheritance, the harder it will be maintain something like JPype. The fact that they are using the allocator slots for their classes and I was just doing the same should have been a clue that they needed a formal API. Instead they have just rendered it impossible to carry out the same patterns they use internally. It would have cost nothing to just make PyObject_GC_Alloc and PyObject_GC_Link accessible to the user as those two are required to create custom memory allocation for GC types. @marscher I have found one pattern that appears to work, but I have to merge it and test with every prior version. Unfortunately if someone can't convince the Python developers to merge my patch for user supported data into CPython, we may be down to our last supported version of JPype. The developers have hidden all of the required functions to replicate the allocator functions, placed assertions that preview mutating the object during allocation to add a few extra bytes, and have taken over both the area in front and behind the object so without an API to support user data in multiple inheritance I am out of options. I tried to send a patch to prevent this outcome, but they sat on it for months, then told me that I had to write a PEP. Unfortunately by the time they got back to me it was a crisis at work and I have been at 60-80 hr work weeks without a break for months, and it likely won't let up soon. Someone else would have to step up to push this forward. |
I have some ideas that should create a path forward with less issues, it will just require a non-trivial refactor. The main caveat would be that each The biggest challenge I'll likely face is getting it working and finishing before I lose interest 😅 |
There were serious issues with the two object approach which is why it was abandoned in Jpype 0.7. The biggest being Isa relations. Primitives, exceptions and the like can't be duck typed to meet their contracts. The must be inherited directly from the Python objects they seek to emulate. They also have there own incomparable memory layouts. And for float and long you can't add anything to the base. You can make a special exception type for it's objects. This brings about the second issue which is the look up time. If you have many layouts then figuring out if it is a class derivative takes time. With the double object we used this was a substantial number of calls. And to test a method overload yous have to check for it for each combination of overloads. If we break that jpype gets rather slow. We could bypass this requirement by adding a slot offset to the type object meta class. All the other language binders have the same problem and thus a get user data method was to be added to basic Python object system, but the implementation missed as it was still incomplete with int, float, and exception (they didn't test can this inherit from every python type). The solution is to fix their type system by making a facility for any type to access user data in O(1). It isn't hard as I already submitted a patch. Trying to fix it at jpype end as they have constantly changing their type system is a losing battle. I would recommend taking the time for the for all solution which is trivial (negative baseline implies the data will be added to the preheader) rather than reworking jpype where you will get a dozen edge cases. |
I am tracking the conversation in the pull request. Hopefully things go in the right direction. For now I will sit down and be quiet. |
Sounds good. There are only three changes in Python that are needed to eliminate all of the black magic that we are doing.
Many language binding (ie C#) need these same pieces. Each isn't that complex, but it will take time and effort to press them forward. |
This is now problematic because
PyType_IS_GC
implies it is on the heap and the pointer gets cast toPyHeapTypeObject
. It will eventually read memory on the stack and interpret it to be some huge size and it will result in aMemoryError
.jpype/native/python/pyjp_value.cpp
Lines 51 to 59 in 653ccff
I have tried replacing it with PyUnstable_Object_GC_NewWithExtraData but it can't handle
nitems > 0
. So I tried the following:That works fine until here
jpype/native/python/pyjp_value.cpp
Line 215 in 653ccff
Unfortunately it then tries to access
(PyDictValues *)((char *)obj + sizeof(PyObject))
when it is not valid. The value atPyDictValues->valid
is not zero so its checks pass and it moves forward until it faults. I suspect the java slot is being put isomewhere python is using and causing memory corruption but I haven't been able to prove it or find out what is really wrong.The text was updated successfully, but these errors were encountered: