
Development Methodology

Jason R. Coombs edited this page Aug 19, 2024 · 7 revisions

Several projects in the CPython stdlib follow this methodology for development:

These projects are maintained primarily in their external repositories (sometimes referred to as "backports"). Development is preferred in the external repositories due to a host of advantages:

  • The test suite runs under pytest, a richer test runner, affording more sophisticated behaviors even when running unittest-based tests.
  • Maintenance follows best practices as defined by the skeleton used by hundreds of other projects, deriving value from shared handling of common concerns such as updating support for Python versions.
  • Extra checks are performed to ensure consistent formatting, style, and safety (ruff, mypy, ...).
  • Doctests are checked.
  • Performance benchmarks can be measured and tested.
  • Code is tested against all supported Python versions.
  • Changes can be released quickly and receive rapid feedback, shifting validation earlier ("left") in the lifecycle.

As a result of this preference, the external projects are "upstream" of Python, and code is synced "downstream" into Python's stdlib.

These projects still accept contributions downstream in Python.

Regardless of the original source of a contribution, the two codebases should be kept in close sync, using techniques that minimize the diff between them. Here are some of the ways these projects achieve that minimum variance:

  • Code in the stdlib should be partitioned into folders pertaining to the shared functionality. That means that the abstract base classes for importlib resources should live in importlib.resources.abc and not importlib.abc. That is also why zipfile.Path is implemented as zipfile._path.Path.
  • The external project should implement "compatibility" shims in separate modules or (preferably) packages. For example, importlib_resources exposes future and compat packages for the external-specific behaviors. This behavior is excluded in the port to CPython.
  • Each project keeps a cpython branch that tracks the code in the same layout as it appears in the stdlib, such that one can cp -r the contents in either direction and there should be no diff when the projects are in sync.

Syncing

Ensure projects are in sync

  1. Check out the cpython branch of the external project (at $PROJECT).
  2. Check out cpython to the main branch (at $CPYTHON).
  3. Copy the project's contents over the CPython checkout: cp -r $PROJECT/* $CPYTHON.
  4. Ensure git -C $CPYTHON diff shows no diff.
  5. If there is a diff, track down the source of the changes and determine how to reconcile.
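
The check above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name check_in_sync is hypothetical, both checkouts are assumed to exist already on the right branches, and the project's cpython branch is assumed to mirror the stdlib layout.

```shell
# Sketch: verify that an external project's cpython branch matches CPython.
# check_in_sync is a hypothetical helper name, not part of any project.
check_in_sync() {
    project=$1   # path to the external project (cpython branch checked out)
    cpython=$2   # path to the CPython checkout (main branch checked out)

    # Copy the project's files over the CPython tree. The * glob does not
    # match dotfiles, so the project's .git directory is not copied.
    cp -r "$project"/* "$cpython"/ || return 2

    # git diff --quiet exits non-zero when tracked files differ.
    # Note: brand-new untracked files do not appear in git diff;
    # git status would list them.
    if git -C "$cpython" diff --quiet; then
        echo "in sync"
    else
        echo "out of sync:"
        git -C "$cpython" diff --stat
        return 1
    fi
}

# Usage:
#   check_in_sync "$PROJECT" "$CPYTHON"
```

A non-zero return from the function corresponds to step 5: inspect the reported files and decide which side needs to change.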

Sync external to stdlib

  1. Ensure projects are in sync.
  2. In the cpython branch of the project, merge changes from main with git merge main.
  3. Resolve conflicts. Some conflicts will be changes to deleted files that aren't needed for stdlib - just confirm the deletion. Other conflicts may be in code, so intelligently compare the changes for each branch.
  4. Ensure compatibility modules are not included. If new ones were added, delete them with git rm -f path/to/compat.
  5. Replace references to compatibility modules with their stdlib equivalent.
  6. Search for references to the external package by name (e.g. importlib_resources) and replace them.
  7. Commit the merge.
  8. Test the changes.
    1. Remove any incidental files (git clean -fdx).
    2. cp -r * $CPYTHON.
    3. Change directory to the CPython checkout.
    4. Build and test:
      1. Run the tests with ./python.exe -m test.test_importlib -v (or similar; the built binary is ./python on Linux).
    5. Address test failures in the external project, either in cpython or main (or another branch) as appropriate. Amend, commit, or re-merge the changes to the cpython branch and repeat "Test the changes."
  9. Push the cpython branch of the external project.
  10. Commit the changes and submit them for review following the usual CPython development process.
    1. If encountering issues with docs builds, consider running make check suspicious html in the Doc/ directory of CPython.
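
Steps 5 and 6 rely on finding leftover references to the external package or its compatibility modules. A minimal sketch of that search follows; the helper name find_external_refs is hypothetical, and restricting the search to .py files is an assumption (docs and other file types may also need checking).

```shell
# Sketch: list Python files under a directory that still mention a name,
# such as the external package name or a compatibility module.
# find_external_refs is a hypothetical helper name.
find_external_refs() {
    name=$1        # e.g. importlib_resources
    dir=${2:-.}    # directory to search; defaults to the current directory

    # -l prints only the names of matching files; -F treats the name as a
    # fixed string, so "importlib_resources" does not match
    # "importlib.resources".
    find "$dir" -name '*.py' -exec grep -lF -- "$name" {} +
}

# Usage, run from the project's cpython branch after the merge:
#   find_external_refs importlib_resources .
```

An empty result suggests the references have been replaced; any listed file still needs the stdlib equivalent substituted.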

Port stdlib to external

The best way to sync from stdlib to external is to find the relevant squashed commit from CPython's main branch and cherry-pick it to the external project. Find the relevant commit (often from a relevant pull request merge message) and note its hash as $COMMIT.

  1. Check out the project to the main branch.
  2. Fetch the CPython repo with git fetch https://github.com/python/cpython, optionally passing --depth 1000 to avoid fetching the full history. This fetches the CPython commits into the external project's repo.
  3. Cherry-pick the change with git cherry-pick $COMMIT. git will most likely be able to associate the changes with the relevant files (even though they're in different locations).
    1. Resolve creation/deletion conflicts (usually by electing to delete irrelevant files).
    2. Remove any files not relevant to this project (e.g. git rm -rf Misc for news fragments).
  4. (optional) Commit the merge as a checkpoint.
  5. Test the changes, make amendments or tweaks to make the code compatible, possibly as new commits.
  6. Push the changes.
  7. (optional) git prune to remove the CPython history.
  8. Merge the changes into the cpython branch.
  9. (optional) Check that projects are in sync.
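
The first three steps above can be sketched as follows. The function name port_commit and its argument order are hypothetical, the depth of 1000 is simply the value suggested above, and conflict resolution is left to be done by hand.

```shell
# Sketch: port one squashed CPython commit into the external project.
# port_commit is a hypothetical helper name.
port_commit() {
    project=$1   # path to the external project checkout
    commit=$2    # hash of the squashed commit on CPython's main branch
    remote=${3:-https://github.com/python/cpython}

    git -C "$project" checkout main || return 1

    # A shallow fetch avoids downloading the full CPython history.
    # The commit must lie within the fetched depth.
    git -C "$project" fetch --depth 1000 "$remote" || return 1

    # Cherry-pick stops with a non-zero status on conflicts; resolve them
    # manually, then run: git cherry-pick --continue
    git -C "$project" cherry-pick "$commit"
}

# Usage:
#   port_commit "$PROJECT" "$COMMIT"
```

If the cherry-pick stops on creation/deletion conflicts, handle them as described in step 3 before continuing.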