There's more to changing Python's grammar than editing Grammar/python.gram.
Below is a checklist of things that may need to change.
Note: Many of these changes require regenerating some of the derived files. If things mysteriously don't work, it may help to run make clean.
- Grammar/python.gram: The grammar definition, with actions that build AST nodes. After changing it, run make regen-pegen (or build.bat --regen on Windows) to regenerate Parser/parser.c. (This runs Python's parser generator, Tools/peg_generator.)
- Grammar/Tokens is a place for adding new token types. After changing it, run make regen-token to regenerate Include/internal/pycore_token.h, Parser/token.c, Lib/token.py and Doc/library/token-list.inc. If you change both python.gram and Tokens, run make regen-token before make regen-pegen. On Windows, build.bat --regen will regenerate both at the same time.
- Parser/Python.asdl may need changes to match the grammar. Then run make regen-ast to regenerate Include/internal/pycore_ast.h and Python/Python-ast.c.
- Parser/lexer/ contains the tokenization code. This is where you would add a new type of comment or string literal, for example.
- Python/ast.c will need changes to validate AST objects involved with the grammar change.
- Python/ast_unparse.c will need changes to unparse AST nodes involved with the grammar change ("unparsing" is used to turn annotations into strings per PEP 563).
- The compiler may need to change when there are changes to the AST.
- _Unparser in the Lib/ast.py file may need changes to accommodate any modifications in the AST nodes.
- Doc/library/ast.rst may need to be updated to reflect changes to AST nodes.
- Add some usage of your new syntax to test_grammar.py.
- Certain changes may require tweaks to the library module pyclbr.
- Lib/tokenize.py needs changes to match changes to the tokenizer.
- Documentation must be written! Specifically, one or more of the pages in Doc/reference/ will need to be updated.
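
As a rough sanity check after rebuilding, a sketch like the following could exercise two of the pure-Python pieces from the checklist: Lib/tokenize.py and the _Unparser behind ast.unparse in Lib/ast.py. The helper names token_names and round_trips and the sample snippet are hypothetical, chosen only for illustration, and the sketch assumes Python 3.9 or later (where ast.unparse exists). It is not a substitute for adding real cases to test_grammar.py.

```python
import ast
import io
import token
import tokenize


def token_names(source):
    """Return the generic token type names that the tokenize module reports.

    Hypothetical helper: useful for eyeballing whether Lib/tokenize.py
    still agrees with the tokenizer after a change.
    """
    readline = io.StringIO(source).readline
    return [token.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


def round_trips(source):
    """Check that parse -> unparse -> parse yields an equivalent AST.

    Hypothetical helper: ast.unparse goes through _Unparser in Lib/ast.py,
    so a failure here may point at an unparser that was not updated.
    """
    tree = ast.parse(source)
    regenerated = ast.unparse(tree)
    # ast.dump omits line/column attributes by default, so only the
    # structure of the two trees is compared.
    return ast.dump(tree) == ast.dump(ast.parse(regenerated))


if __name__ == "__main__":
    # Replace this snippet with a small example of the new syntax.
    snippet = "x = 1 if y else 2\n"
    print(token_names(snippet))
    print(round_trips(snippet))
```

Swapping the snippet for a small example of the new syntax gives a quick signal on whether the tokenizer module and the unparser both understand it; a SyntaxError from ast.parse would instead suggest that the parser was not regenerated or rebuilt.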