Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for 3.11 #1087

Merged
merged 9 commits into from
Oct 26, 2022
Merged

Fixes for 3.11 #1087

merged 9 commits into from
Oct 26, 2022

Conversation

Thrameos
Copy link
Contributor

@Thrameos Thrameos commented Sep 1, 2022

Fixes #1086

@Thrameos Thrameos mentioned this pull request Sep 1, 2022
@codecov
Copy link

codecov bot commented Sep 1, 2022

Codecov Report

Attention: Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.

Project coverage is 88.62%. Comparing base (82a7658) to head (64223ad).
Report is 266 commits behind head on master.

Files Patch % Lines
native/common/jp_exception.cpp 80.00% 0 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1087      +/-   ##
==========================================
- Coverage   88.64%   88.62%   -0.02%     
==========================================
  Files         112      111       -1     
  Lines       10247    10207      -40     
  Branches     4012     4014       +2     
==========================================
- Hits         9083     9046      -37     
+ Misses        705      701       -4     
- Partials      459      460       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pelson
Copy link
Contributor

pelson commented Sep 5, 2022

We still have compile errors:

ative/common/jp_exception.cpp: In function ‘PyObject* tb_create(PyObject*, PyObject*, const char*, const char*, int)’:
native/common/jp_exception.cpp:527:9: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘frame’; did you mean ‘cframe’?
  527 |   state.frame = (PyFrameObject*) prev.get();
      |         ^~~~~
      |         cframe
native/common/jp_exception.cpp:530:9: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘frame’; did you mean ‘cframe’?
  530 |   state.frame = NULL;
      |         ^~~~~
      |         cframe
native/common/jp_exception.cpp:540:84: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
  540 |  JPPyObject lasti = JPPyObject::claim(PyLong_FromLong(((PyFrameObject*)frame.get())->f_lasti));
      |                                                                                    ^~
In file included from /opt/_internal/cpython-3.11.0rc1/include/python3.11/Python.h:42,
                 from native/common/jp_exception.cpp:16:
/opt/_internal/cpython-3.11.0rc1/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
   22 | typedef struct _frame PyFrameObject;
      |                ^~~~~~
error: command '/usr/bin/ccache' failed with exit code 1

Perhaps worth enabling CI for Python 3.11 (you might have to use the Python available on the manylinux image, rather than what Azure provide). I note that currently a release of JPype would fail because it would try to build a Python 3.11 wheel (since we glob the Python versions available on manylinux), even though that isn't tested (something we could try to address another time).

@rapgro
Copy link

rapgro commented Sep 14, 2022

native/common/jp_exception.cpp:527:9: error: ‘PyThreadState’ {aka ‘struct _ts’} has no member named ‘frame’; did you mean ‘cframe’?

Changes of the [PyThreadState](https://docs.python.org/3.11/c-api/init.html#c.PyThreadState) structure members:

    frame: removed, use [PyThreadState_GetFrame()](https://docs.python.org/3.11/c-api/init.html#c.PyThreadState_GetFrame) (function added to Python 3.9 by [bpo-40429](https://bugs.python.org/issue?@action=redirect&bpo=40429)).

see https://docs.python.org/3.11/whatsnew/3.11.html#id6

@pelson
Copy link
Contributor

pelson commented Oct 4, 2022

see https://docs.python.org/3.11/whatsnew/3.11.html#id6

Thanks @rapgro, but the problem is that we are setting the frame, not getting the frame in this case. I appreciate that this hasn't been public API in any Python version from Python 2.7. Some interesting context on the motivation at #1071 (comment).

@Thrameos
Copy link
Contributor Author

@pelson I found the issue with the broken test. So I have a fully working copy for 3.11. Now we just need to get the multi version going and we should be good to go.

@Thrameos
Copy link
Contributor Author

@marscher Some background on this. Python 3.11 removed a number of internal methods that we were using. These methods were used because the Python API is completely deficient in some areas like handling stack frames or using custom allocators with GC objects. They are planning replacements for what we need but none of the replacements are in 3.11. So they gutted the interface we are standing on, and there is no replacement. Meaning unless I do something drastic we won't support 3.11 at all.

So I had to come up with a few creative kludges. The biggest danger is these kludges will introduce reference leaks as they are doing things like initializing an object twice and swapping the memory layout. They are safer in the sense that I managed to get them to use the public API, but in horrible ways (basically creating using side effects of the public implementations). Hopefully these last until the replacement API is released.

@marscher
Copy link
Member

Python 3.11 removed a number of internal methods that we were using. These methods were used because the Python API is completely deficient in some areas like handling stack frames or using custom allocators with GC objects. They are planning replacements for what we need but none of the replacements are in 3.11. So they gutted the interface we are standing on, and there is no replacement. Meaning unless I do something drastic we won't support 3.11 at all.

That's bad news, lets hope that 3.12 will add support for what is needed.

So I had to come up with a few creative kludges. The biggest danger is these kludges will introduce reference leaks as they are doing things like initializing an object twice and swapping the memory layout. They are safer in the sense that I managed to get them to use the public API, but in horrible ways (basically creating using side effects of the public implementations). Hopefully these last until the replacement API is released.

I wonder how bullet proof our leakage testing is. This could be the time to check and eventually improve this. I know this is hard to achieve, but would safe us lots of troubles some day.
If we now can compile and run the testsuite on 3.11m, I'd recommend adding it to the testing matrix.

@Thrameos
Copy link
Contributor Author

Our testing matrix checks common leak paths in objects. This particular spot is on types not general objects, so it only really leaks after the JVM is shutdown which is less of an issue than an object leak. Thus unless you were adding lots and lots of dynamically allocated types form Java then the leak may be a problem. But as we always cache types they are never freed in normal situations anyway.

Copy link
Contributor

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few change proposals in Thrameos#56.

In general: I propose that this is merged, and that we have a special v1.4.0.post0 Python3.11 wheel be uploaded to PyPI before the 2022-10-24 (Python 3.11 release day).

native/python/pyjp_value.cpp Outdated Show resolved Hide resolved
native/python/pyjp_value.cpp Outdated Show resolved Hide resolved
native/python/pyjp_value.cpp Show resolved Hide resolved
current = StrictVersion(platform.mac_ver()[0][:4])
if current >= StrictVersion('10.6') and current < StrictVersion('10.9'):
current = Version(platform.mac_ver()[0][:4])
if current >= Version('10.6') and current < Version('10.9'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is part of your distutils handling. I suggest that this can be a separate PR, or alternatively, you should add packaging as an explicit dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. This was required to get Python 3.11 to work for me so I think it is same theme. I will look at dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I dont see where it goes in setup.py. we never listed distutils as a requirement. Can you suggest the required line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it was supposed to have distutils there before.

https://packaging.python.org/en/latest/discussions/install-requires-vs-requirements/

I don't see the same format we have so I am guessing we want...

install_requires=['typing_extensions ; python_version< "3.8"',
        'packaging ; python_version< "3.10"'],

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

distutils is a standard library, which has been deprecated for a while now and will be removed in Python 3.12.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'packaging ; python_version< "3.10"'

Should just be

'packaging'

native/common/jp_exception.cpp Outdated Show resolved Hide resolved
setupext/test_java.py Show resolved Hide resolved
Copy link
Contributor

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The minor point about adding packaging as a dependency aside, looks good to me. Much better than what was there before in fact. Excellent work @Thrameos!

@pelson
Copy link
Contributor

pelson commented Oct 19, 2022

@Thrameos & @marscher - I'm unavailable for a bit now, and won't be back before the Python 3.11 release. Do you agree that we should merge and release a v1.4.0.post0 wheel, with only a Python 3.11 build? I believe this should work out just fine for existing users, and new Python 3.11 users will get the newer version.

@Thrameos
Copy link
Contributor Author

Thrameos commented Oct 19, 2022

@marscher I wpuld look to you regarding this decision.

I would be fine with releasing a new patch or new minor with the fixes.

I can start the build process once you decide which.

@potiuk potiuk mentioned this pull request Oct 26, 2022
11 tasks
potiuk added a commit to apache/airflow that referenced this pull request Oct 26, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Goofle cloud libraries)
@Thrameos Thrameos merged commit d83200a into jpype-project:master Oct 26, 2022
@marscher
Copy link
Member

I'm also fine with a micro patch release for this kind of change. Sorry for the delay!

@Thrameos
Copy link
Contributor Author

Will do. Thanks.

potiuk added a commit to apache/airflow that referenced this pull request Oct 27, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Goofle cloud libraries)
potiuk added a commit to apache/airflow that referenced this pull request Oct 27, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Goofle cloud libraries)
potiuk added a commit to apache/airflow that referenced this pull request Oct 31, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Goofle cloud libraries)
@vstinner
Copy link
Contributor

vstinner commented Nov 3, 2022

native/python/pyjp_value.cpp:

		// Horrible kludge because python lacks an API for allocating a GC type with extra memory
		// The private method _PyObject_GC_Alloc is no longer visible, so we are forced to allocate
		// a different type with the extra memory and then hot swap the type to the real one.

FYI @encukou is working on a Python API for that.

@Thrameos
Copy link
Contributor Author

Thrameos commented Nov 3, 2022

Yep. Have looked at that API, but it doesn't exist for Py 3.11 so we are stuck because our old method using private functions doesn't work. Worse it appears this kludge may break something in the garbage collector at least in one projects testing. So I need to investigate further this weekend.

@marscher
Copy link
Member

marscher commented Nov 7, 2022

Thanks for taking the shot!

Should we try to pin JPype to use Python<3.11 in the meantime, if we run into such horrible subtle errors like garbage collection issues?

@pelson
Copy link
Contributor

pelson commented Nov 7, 2022

Thanks for taking the shot!

Should we try to pin JPype to use Python<3.11 in the meantime, if we run into such horrible subtle errors like garbage collection issues?

IMO, if we can engage with the Python RC process, there should be enough time to avoid nasty surprises. If we find that we absolutely can't be compatible with Python 3.12, we can always release something then which has that constraint. In general, upper pins are discouraged (long read).

I think some more dialogue between the C API team and JPype should come out of this 🤞, so I truly hope things will get better, not worse, in the future.

@marscher
Copy link
Member

marscher commented Nov 7, 2022

A first step could be testing against unreleased versions of Python, right? I'm sure they provide a nightly build, which we can integrate into our CI process on a regular basis.

@pelson Thanks for the reference about version pinning. I wasn't aware, that is becoming a disease in the eco system :)
How do you suggest to get in touch with the C-API team? If the nightly Python job fails, we could get in touch like on their bug-tracker or IRC, whatever they use nowadays.

@encukou
Copy link

encukou commented Nov 7, 2022

The first Alpha of CPython 3.12 is already out. I think the alphas/betas would work bit better for you than nightly builds, since they're proper releases that pass the full test suite (i.e. stable buildbots) & have release blockers sorted out.

There's no formal “C API” team, but to get in touch with the relevant people, use the expert-C-API label on the CPython bug tracker, or the just-opened C-API category on Discourse: https://discuss.python.org/c/core-dev/c-api/30

@marscher
Copy link
Member

marscher commented Nov 7, 2022

Thanks a lot @encukou for these useful hints!

potiuk added a commit to apache/airflow that referenced this pull request Nov 24, 2022
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Goofle cloud libraries)
potiuk added a commit to potiuk/airflow that referenced this pull request Jan 19, 2023
Python 3.11 has been released as scheduled on October 25, 2022 and
this is the first attempt to see how far Airflow (mostly dependencies)
are from being ready to officially support 3.11.

So far we had to exclude the following dependencies:

- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: apache#27292
  and googleapis/python-bigquery#1386
- [ ] Databricks Provider:
  databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334
  and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider:
  snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30

We might decide to release Airflow in 3.11 with those providers
disabled in case they are lagging behind eventually, but for the
moment we want to work with all the projects in concert to be
able to release all providers (Google Provider requires quite
a lot of work and likely Google Team stepping up and community helping
with migration to latest Goofle cloud libraries)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Build for Python 3.11
6 participants