- Introduction
- Writing helpful bug reports
- Installing the latest version
- Setting up a local development environment
- Pull Requests (PRs)
- Documentation
- Jupyter notebook style guide
- Maintainer guide
Thank you for contributing to SHAP. SHAP is an open source collective effort, and contributions of all forms are welcome!
You can contribute by:
- Submitting bug reports and feature requests on the GitHub issue tracker,
- Contributing fixes and improvements via Pull Requests, or
- Discussing ideas and questions in the Discussions forum.
If you are looking for a good place to get started, look for issues with the good first issue label.
When submitting bug reports on the issue tracker, please include a good Minimal Reproducible Example (MRE). This is very helpful for the maintainers.
An MRE should be:
- Minimal: Use as little code as possible that still produces the same problem.
- Self-contained: Include everything needed to reproduce your problem, including imports and input data.
- Reproducible: Test the code you're about to provide to make sure it reproduces the problem.
For more information, see How To Craft Minimal Bug Reports.
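Version numbers often help when reproducing a problem, so it can be useful to list them alongside the MRE. The snippet below is a minimal sketch using only the standard library. The helper name `collect_versions` and the package list are illustrative assumptions, not part of shap.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Return a mapping of package name to installed version string."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            # Report missing packages explicitly rather than failing.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Illustrative package list: adjust it to whatever your example imports.
    for name, ver in collect_versions(["shap", "numpy", "scikit-learn"]).items():
        print(f"{name}: {ver}")
```

Paste the printed lines into the bug report next to the code that reproduces the problem.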
To get the very latest version of shap, you can pip-install the library directly
from the master
branch:
pip install git+https://github.com/shap/shap.git@master
This can be useful to test if a particular issue or bug has been fixed since the most recent release.
Alternatively, if you are considering making changes to the code you can clone the repository and install your local copy as described below.
Click this link to fork the repository on GitHub to your user area.
Clone the repository to your local environment, using the URL provided by the
green <> Code
button on your project's home page.
Create a new isolated environment for the project, e.g. with conda:
conda create -n shap python=3.11
conda activate shap
To build from source, you need a compiler to build the C extension.
- On Debian-based Linux distributions such as Ubuntu, you can install gcc with:
sudo apt install build-essential
- On Windows, one way of getting a compiler is to install mingw64.
Pip-install the project with the --editable
flag, which ensures that any
changes you make to the source code are immediately reflected in your
environment.
pip install --editable '.[test,plots,docs]'
The various pip extras are defined in pyproject.toml:
- test-core: a minimal set of dependencies to run pytest.
- test: a wider set of 3rd party packages for the full test suite such as tensorflow, pytest, xgboost.
- plots: includes matplotlib.
- docs: dependencies for building the docs with Sphinx.
Note: When installing from source, shap will attempt to build the C extension and the CUDA extension. If CUDA is not available, shap will retry the build without CUDA support.
Consequently, it is quite normal to see warnings such as WARNING: Could not compile cuda extensions
when building from source if you do not have CUDA
available.
We use pre-commit hooks to run code checks.
Enable pre-commit
in your local environment with:
pip install pre-commit
pre-commit install
To run the checks on all files, use:
pre-commit run --all-files
Ruff is used as a linter, and it is enabled as a
pre-commit hook. You can also run ruff
locally with:
pip install ruff
ruff check .
The unit test suite can be run locally with:
pytest
Before starting on a PR, please check for any duplicates and then make a proposal by opening an Issue. This isn't necessary for trivial PRs such as fixing a typo.
Keep the scope small. This makes PRs a lot easier to review. Separate functional code changes (such as bug fixes) from refactoring changes (such as style improvements). PRs should contain one or the other, but not both.
Open a Draft PR as early as possible; do not wait until the feature is
ready. Work on a feature branch with a descriptive name such as
fix/lightgbm-warnings
or doc/contributing
.
Use a descriptive title, such as:
FIX: Update parameters to remove DeprecationWarning in TreeExplainer
ENH: Add support for python 3.11
DOCS: Fix formatting of ExactExplainer docstring
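As a rough illustration of the title convention in the examples above, the sketch below checks that a title starts with one of the example prefixes followed by a colon and a description. The function name and the prefix list are assumptions drawn only from these three examples. The project may accept other prefixes, and this is not a check the repository is known to run.

```python
import re

# Prefixes taken from the example titles above; the project may accept others.
KNOWN_PREFIXES = ("FIX", "ENH", "DOCS")
TITLE_PATTERN = re.compile(rf"^({'|'.join(KNOWN_PREFIXES)}): \S.*$")


def has_descriptive_prefix(title: str) -> bool:
    """Return True if the title looks like 'PREFIX: description'."""
    return TITLE_PATTERN.match(title) is not None


if __name__ == "__main__":
    print(has_descriptive_prefix("ENH: Add support for python 3.11"))
    print(has_descriptive_prefix("fixed some stuff"))
```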
Before marking your PR as "ready for review" (by removing the Draft
status),
please ensure:
- Your feature branch is up-to-date with the master branch,
- All pre-commit hooks pass, and
- Unit tests have been added (if your PR adds any new features or fixes a bug).
The documentation is hosted at shap.readthedocs.io. If you have modified the docstrings or notebooks, please also check that the changes are rendered properly in the generated HTML files.
The documentation is built automatically on each Pull Request, to facilitate previewing how your changes will render. To see the preview:
- Look for "All checks have passed", and click "Show all checks".
- Browse to the check called "docs/readthedocs.org".
- Click the
Details
hyperlink to open a preview of the docs.
The PR previews are typically hosted on a URL of the form below, replacing
<pr-number>
with the number of your PR:
https://shap--<pr-number>.org.readthedocs.build/en/<pr-number>
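The substitution can be sketched as a small helper. The function name `preview_url` is hypothetical, and the URL pattern is only the "typical" form quoted above, so the actual link shown on the PR check takes precedence.

```python
def preview_url(pr_number: int) -> str:
    """Build the typical Read the Docs preview URL for a given PR number."""
    if pr_number <= 0:
        raise ValueError("PR number must be a positive integer")
    # The PR number appears twice: in the subdomain and in the path.
    return f"https://shap--{pr_number}.org.readthedocs.build/en/{pr_number}"


if __name__ == "__main__":
    # 1234 is a made-up PR number used purely for illustration.
    print(preview_url(1234))
```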
To build the documentation locally:
- Navigate to the docs directory.
- Run make html.
- Open "_build/html/index.html" in your browser to inspect the documentation.
Note that nbsphinx
currently requires the stand-alone program pandoc
. If you
get an error "Pandoc wasn't found", install pandoc
as described in the
nbsphinx installation
guide.
If you are contributing changes to the Jupyter notebooks in the documentation, please adhere to the following style guidelines.
Before committing your notebook(s),
- Ensure that you "Restart Kernel and Run All Cells...", making sure that cells are executed in order, the notebook is reproducible and does not have any hidden states.
- Ensure that the notebook does not raise syntax warnings in the Sphinx build logs as a result of your changes.
Include links in the notebooks wherever they give the reader more background or context on the topic at hand.
Here's an example of how you would accomplish this in a Markdown cell in the notebook:
# Force Plot Colors
The [scatter][scatter_doclink] plot creates Python matplotlib plots that can be customized at will.
[scatter_doclink]: ../../../generated/shap.plots.scatter.rst#shap.plots.scatter
where the link specified is a relative path to the rst file generated by Sphinx. Prefer relative links over absolute paths.
We use ruff
to perform code linting and auto-formatting on our notebooks.
Assuming you have set up pre-commit
as described
above, these checks will run automatically
whenever you commit any changes.
To run the code-quality checks manually, you can run, for example:
pre-commit run --files notebook1.ipynb notebook2.ipynb
replacing notebook1.ipynb
and notebook2.ipynb
with any notebook(s) you have modified.
Bug reports and feature requests are managed on the GitHub issue tracker. We use automation to help prioritise and organise the issues.
The good first issue
label should be assigned to any issue that could be
suitable for new contributors.
The awaiting feedback
label should be assigned if more information is required
from the author, such as a reproducible example.
The stale bot will mark issues and PRs that
have not had any activity for a long period of time with the stale
label, and
comment to solicit feedback from our community. If there is still no activity,
the issue will be closed after a further period of time.
We value feedback from our users very highly, so the bot is configured with long time periods before marking issues as stale.
Issues marked with the todo
label will never be marked as stale, so this label
should be assigned to any issues that should be kept open such as long-running
feature requests.
Pull Requests should generally be assigned a category label such as bug
,
enhancement
or BREAKING
. These labels are used to categorise the PR in the
release notes, as described below.
All PRs should have at least one review before being merged. In particular, maintainers should generally ensure that PRs have sufficient unit tests to cover any fixed bugs or new features.
PRs are usually completed with "squash and merge" in order to maintain a clear linear history and make it easier to debug any issues.
shap uses a PEP 440-compliant versioning scheme of MAJOR.MINOR.PATCH
. Like
numpy, shap does not use semantic versioning, and has
never made a major
release. Most releases increment minor
, and are typically made
every month or two. patch
releases are sometimes made for any important
bugfixes.
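To make the MAJOR.MINOR.PATCH scheme concrete, here is a minimal sketch of how the next version number could be derived from the current one. The helper `bump_version` is hypothetical and is not part of the release tooling, since the real version number comes from the git tag.

```python
def bump_version(current: str, part: str = "minor") -> str:
    """Return the next version string for a MAJOR.MINOR.PATCH version."""
    major, minor, patch = (int(piece) for piece in current.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        # Most releases fall into this case; the patch number resets to zero.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


if __name__ == "__main__":
    print(bump_version("0.43.0"))           # a regular minor release
    print(bump_version("0.43.0", "patch"))  # an important bugfix release
```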
Breaking changes are done with care, given that shap is a very popular package.
When breaking changes are made, the PR should be tagged with the BREAKING
label to ensure it is highlighted in the release notes. Deprecation cycles are
used to mitigate the impact on downstream users.
GitHub milestones can be used to track any actions that need to be completed for a given release, such as those relating to deprecation cycles.
We use setuptools-scm
to source the version number from the git history
automatically. At build time, the version number is determined from the git tag.
We try to use automation to make the release process reliable, transparent and reproducible. This also helps us make releases more frequently.
A release is made by publishing a GitHub Release, tagged with an appropriately incremented version number.
When a release is published, the wheels will be built and published to PyPI
automatically by the build_wheels
GitHub action. This workflow can also be
triggered manually at any time to do a dry-run of cibuildwheel.
In the run-up to a release, create a GitHub issue for the release such as [Meta issue] Release 0.43.0. This can be used to co-ordinate with other maintainers and agree to make a release.
Suggested release checklist:
- [ ] Dry-run cibuildwheel & test
- [ ] Make GitHub release & tag
- [ ] Confirm PyPI wheels published
- [ ] Conda forge published
The conda package is managed in a separate repo. The conda-forge bot will automatically make a PR to this repo to update the conda package, typically within a few hours of the PyPI package being published.
Release notes can be automatically drafted by GitHub using the titles and labels of PRs that were merged since the previous release. See the GitHub docs on automatically generated release notes for more information.
The generated notes will follow the template defined in .github/release.yml, arranging PRs into subheadings by label and excluding PRs made by bots. See the docs for the available configuration options.
It's helpful to assign labels such as BREAKING
, bug
, enhancement
or
skip-changelog
to each PR, so that the change will show up in the notes under
the right section. It also helps to ensure each PR has a descriptive name.
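The effect of labels on the drafted notes can be illustrated with a small sketch that groups PR titles into sections by label and skips anything labelled skip-changelog. This is not GitHub's implementation. The section titles, the "Other changes" fallback and the function name are assumptions made for illustration, and the real template lives in .github/release.yml.

```python
from collections import defaultdict

# Hypothetical section titles; the real ones are defined in .github/release.yml.
SECTIONS = {
    "BREAKING": "Breaking changes",
    "bug": "Bug fixes",
    "enhancement": "Enhancements",
}
OTHER = "Other changes"


def draft_notes(prs):
    """Group PR titles by section. `prs` is an iterable of (title, labels)."""
    grouped = defaultdict(list)
    for title, labels in prs:
        if "skip-changelog" in labels:
            continue  # excluded from the notes entirely
        # Use the first section whose label is present on the PR.
        section = next(
            (SECTIONS[label] for label in SECTIONS if label in labels), OTHER
        )
        grouped[section].append(title)
    return dict(grouped)


if __name__ == "__main__":
    example_prs = [
        ("FIX: Update parameters in TreeExplainer", ["bug"]),
        ("Bump CI dependencies", ["skip-changelog"]),
    ]
    print(draft_notes(example_prs))
```

An unlabelled PR ends up in the catch-all section in this sketch, which is one reason to label each PR before release.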
The notes can be edited (both before and after release) to remove information that is unlikely to be of high interest to users, such as maintenance updates.