Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-41631: _ast module uses again a global state #21961

Merged
merged 3 commits into from
Sep 15, 2020
Merged

bpo-41631: _ast module uses again a global state #21961

merged 3 commits into from
Sep 15, 2020

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Aug 26, 2020

Using a different state per module instance is causing subtle
practical problems.

For example, the Mercurial project replaces the import() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

https://bugs.python.org/issue41631

@vstinner
Copy link
Member Author

cc @pablogsal @corona10 @encukou

@vstinner
Copy link
Member Author

Hum, I'm not sure that this change is correct. If a second instance of the _ast module is created, astmodule_clear() will clear the state which makes the first instance unusable :-(

@vstinner
Copy link
Member Author

I also reverted partially the commit b1cc6ba "Convert _ast extension to PEP 489": astmodule_clear() and astmodule_free() now do nothing, the global _ast module state is never cleared.

@corona10 corona10 self-requested a review August 26, 2020 11:00
Parser/asdl_c.py Outdated
@@ -1473,6 +1452,7 @@ def write_source(f, mod):
f.write('#include "Python.h"\n')
f.write('#include "%s-ast.h"\n' % mod.name)
f.write('#include "structmember.h" // PyMemberDef\n')
f.write('#include "pycore_pylifecycle.h" // export _PyAST_Fini()\n')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code line occurs Py_BUILD_CORE error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like extern void _PyAST_Fini(PyThreadState *tstate); is not needed to export the symbol. I just removed the include.

@encukou
Copy link
Member

encukou commented Aug 26, 2020

Unfortunately, now all the node types are heap types shared across interpreters.
An arbitrary attribute can be set on an AST node class to smuggle any object to another interpreter.

Can they be made static again? Or maybe the AST state can be part of interpreter state.

@encukou
Copy link
Member

encukou commented Sep 2, 2020

@vstinner @pablogsal: I've made the module state per-interpreter, i.e. part of the interpreter state, here: vstinner/cpython@revert_ast...encukou:revert_ast

The complication is that Python-ast.c is also used to build extension modules by pegen tests. See pegen/ast_dump.py for one note about this. Since there can be several such modules, they can't really use per-interpreter state. My solution is to make their state global, and make such modules loadable only once per process to avoid sharing the state.

@vstinner, do you want to use that commit for this PR?

@vstinner
Copy link
Member Author

I updated my PR to convert AST_type again to a static type to fix a corner case with subinterpreters: https://bugs.python.org/issue41631#msg376686

cc @encukou

@vstinner
Copy link
Member Author

@pablogsal @corona10: Would you mind to review this PR? It fix a release blocker for Python 3.9.0, a regression compared to 3.8: see https://bugs.python.org/issue41631 The deadline for 3.9.0rc2 is next monday :-(

@encukou proposes PR #21973 to workaround the regression and to fix the regression in a 3.9 bugfix release (like 3.9.1).

This PR is non-trivial, but _ast got tons of changes between 3.8 and 3.9, and I would prefer to avoid further large changes (like this PR) in the 3.9 stable branch after 3.9.0 final release.

I'm not sure what is the best option :-(

I suggest you to revert the two commits separately, since they are non-trivial:
https://github.com/python/cpython/pull/21961/commits

Note: If we choose to merge this PR and backport it to Python 3.9, do you think that it would be better to merge the second commit in a separated PR, since commits of this PR will be squashed?

@vstinner
Copy link
Member Author

Summary of my change: this PR reverts changes made after 3.8, to move back to the battle-tested 3.8 code. I reverted the changes which causes many bugs.

@encukou
Copy link
Member

encukou commented Sep 11, 2020

@encukou proposes PR #21973 to workaround the regression and to fix the regression in a 3.9 bugfix release (like 3.9.1).

Not necessarily in 3.9.1 – I'd be OK with fixing this in 3.10 as well.

@vstinner
Copy link
Member Author

Not necessarily in 3.9.1 – I'd be OK with fixing this in 3.10 as well.

If possible, I would prefer to https://bugs.python.org/issue41631 regression in Python 3.9.x (the best would be to fix it in 3.9.0).

@vstinner
Copy link
Member Author

I fixed the Windows build, but now the documentation job fails with:

Warning, treated as error:

/home/travis/build/python/cpython/Doc/library/string.rst:311:duplicate token description of format_spec, other instance in reference/lexical_analysis

It seems to be unrelated to my change: https://bugs.python.org/issue41762

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I locally run this PR and it looks good.

@vstinner
Copy link
Member Author

Close/reopen to attempt to magically fix https://bugs.python.org/issue41762

@pablogsal
Copy link
Member

@pablogsal @corona10: Would you mind to review this PR? It fix a release blocker for Python 3.9.0, a regression compared to 3.8: see https://bugs.python.org/issue41631 The deadline for 3.9.0rc2 is next monday :-(

@encukou proposes PR #21973 to workaround the regression and to fix the regression in a 3.9 bugfix release (like 3.9.1).

This PR is non-trivial, but _ast got tons of changes between 3.8 and 3.9, and I would prefer to avoid further large changes (like this PR) in the 3.9 stable branch after 3.9.0 final release.

I'm not sure what is the best option :-(

I suggest you to revert the two commits separately, since they are non-trivial:
https://github.com/python/cpython/pull/21961/commits

Note: If we choose to merge this PR and backport it to Python 3.9, do you think that it would be better to merge the second commit in a separated PR, since commits of this PR will be squashed?

I tried to review reviewed this today but time ran out so I will try to do a review tomorrow. In any case, it would be good to have regression test for the problems we have uncovered.

@encukou
Copy link
Member

encukou commented Sep 14, 2020

Partially revert commit ac46eb4:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.
@encukou
Copy link
Member

encukou commented Sep 15, 2020

This does not really move _ast back battle-tested 3.8 code. It does not revert everything, so it possibly has new bugs.

One of the bugs was already present in 3.8 – it's possible to objects across subinterpreters. We can agree to ignore that in 3.9.0\ and say that multiple interpreters still aren't safe. That's fine with me at this point; I opened ericsnowcurrently/multi-core-python#68 to track this.

But then, the second commit here doesn't solve anything: it makes only one of the types static, when are lots of heap types still remaining. Or is it supposed to solve a different problem?

If not, I say we should merge only the first commit here, plus any tests.

@vstinner
Copy link
Member Author

@pablogsal: I added 3 tests to test_ast which should cover all bugs listed in https://bugs.python.org/issue41631

With the new subinterpreter test, I noticed two reference leak bugs which are not introduced by my PR, but simply have not be seen previously. So I fixed them as part of this PR as well. Otherwise, this PR would make all Refleak buildbot failing.

@vstinner
Copy link
Member Author

One of the bugs was already present in 3.8 – it's possible to objects across subinterpreters. We can agree to ignore that in 3.9.0\ and say that multiple interpreters still aren't safe. That's fine with me at this point; I opened ericsnowcurrently/multi-core-python#68 to track this.

You are correct that _ast.AST subclasses like _ast.Constant can be modified and changes impact all interpreters.

I attempted to better isolate the _ast module in subinterpreter (fix ericsnowcurrently/multi-core-python#68 ). My intent was to fix some bugs by doing that, but I introduced other new bugs :-(

Example:

import _ast
from test.support import run_in_subinterp
run_in_subinterp('import _ast; _ast.Constant.x=1')
print(_ast.Constant.x)

This code displays 1 with Python 3.8 and with my PR (master branch): same behavior (not better, not worse).

Since it's not a regression compared to Python 3.8, I suggest to open a new issue if you consider that the _ast module should be enhanced.

But then, the second commit here doesn't solve anything: it makes only one of the types static, when are lots of heap types still remaining. Or is it supposed to solve a different problem?

My intent is to move back to Python 3.8 status quo (behavior) for the _ast.AST type. For example, with my change, it's not possible to add a new attribute to _ast.AST in the main interpreter or in a subinterpreter.

Example:

import _ast
_ast.AST.x = 1

This code raises TypeError: can't set attributes of built-in/extension type 'ast.AST' with Python 3.8 and with my PR (not better, not worse).

@vstinner
Copy link
Member Author

vstinner commented Sep 15, 2020

My intent is to move back to Python 3.8 status quo (behavior) for the _ast.AST type. For example, with my change, it's not possible to add a new attribute to _ast.AST in the main interpreter or in a subinterpreter.

I added a 4th test to check this behavior with the comment:

# [bpo-41631](https://bugs.python.org/issue41631): The _ast.AST type is static and cannot be modified.
# This test should be removed when the _ast.AST type will be converted
# to a heap type.

@encukou
Copy link
Member

encukou commented Sep 15, 2020

not better, not worse

Right. Commits 2 and 3 in this PR are not necessary.

To me, reverting to 3.8 would make sense as a reason if _ast was reverted all the way back to how it was in 3.8. But that's not the case here, so I don't see a reason for making _ast.AST static. IMO, it's better to stay closer to 3.9.0.rc1.

@vstinner
Copy link
Member Author

Right. Commits 2 and 3 in this PR are not necessary.

My first intent was to prevent changes of the _ast module in a subinterpreter which would impact other interpreters.

Sadly, I didn't notice that it was already possible to modify other interpreter using _ast.AST subtypes.

By the way, import ast does modify ast.Constant and ast.Tuple types on purpose, to add deprecated methods.

@vstinner
Copy link
Member Author

I'm fine with leaving _ast.AST as a heap type in Python 3.9 (remove my commits 2 and 3), and so keep this minor behavior change. It can be fixed later. As I wrote, there was already other ways to change the state of other interpreters using _ast.AST subtypes.

Do other people have an opinion on that?

Copy link
Member

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR is a blocker,
I would like to approve this PR if the mercurial issue is solved and other regression.
(And looks like this issue is solved)

And other minor behavior can be modified after this release.

* Only call _PyAST_Fini() in the main interpreter
* "import ast" only adds deprecated methods once to _ast.Constant and
  _ast.Tuple types.
@vstinner
Copy link
Member Author

I removed the commits which convert _ast.AST back to a static type.

@ambv ambv merged commit e5fbe0c into python:master Sep 15, 2020
@bedevere-bot
Copy link

@ambv: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington
Copy link
Contributor

Thanks @vstinner for the PR, and @ambv for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@miss-islington
Copy link
Contributor

Sorry, @vstinner and @ambv, I could not cleanly backport this to 3.9 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker e5fbe0cbd4be99ced5f000ad382208ad2a561c90 3.9

pablogsal pushed a commit to pablogsal/cpython that referenced this pull request Sep 15, 2020
Partially revert commit ac46eb4:
"bpo-38113: Update the Python-ast.c generator to PEP384 (pythongh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions..
(cherry picked from commit e5fbe0c)

Co-authored-by: Victor Stinner <vstinner@python.org>
@bedevere-bot
Copy link

GH-22258 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Sep 15, 2020
ambv pushed a commit that referenced this pull request Sep 15, 2020
…-22258)

Partially revert commit ac46eb4:
"bpo-38113: Update the Python-ast.c generator to PEP384 (gh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions..
(cherry picked from commit e5fbe0c)

Co-authored-by: Victor Stinner <vstinner@python.org>
@vstinner vstinner deleted the revert_ast branch September 16, 2020 10:04
xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
Partially revert commit ac46eb4:
"bpo-38113: Update the Python-ast.c generator to PEP384 (pythongh-15957)".

Using a module state per module instance is causing subtle practical
problems.

For example, the Mercurial project replaces the __import__() function
to implement lazy import, whereas Python expected that "import _ast"
always return a fully initialized _ast module.

Add _PyAST_Fini() to clear the state at exit.

The _ast module has no state (set _astmodule.m_size to 0). Remove
astmodule_traverse(), astmodule_clear() and astmodule_free()
functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants