Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-94759: Collect C-level coverage using llvm-cov #94760

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 56 additions & 0 deletions .github/workflows/c-coverage.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
name: C coverage

on:
schedule:
# Run this daily at 2400 UTC
- cron: '0 0 * * *'

permissions:
contents: read

jobs:
c-coverage:
name: 'C-level coverage (using llvm-cov)'
runs-on: ubuntu-20.04
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for using 20.04 instead of 22.04?

Suggested change
runs-on: ubuntu-20.04
runs-on: ubuntu-22.04

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what the current build.yml uses, and I think there is value in being consistent with it. build.yml is run in every PR, this will be in a nightly cronjob, so it would be annoying if a theoretical Ubuntu-version-specific bug crept in. (That said, we could consider upgrading all of the Ubuntu's in tandem in a separate PR, of course...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(That said, we could consider upgrading all of the Ubuntu's in tandem in a separate PR, of course...)

+1

env:
OPENSSL_VER: 1.1.1n
PYTHONSTRICTEXTENSIONBUILD: 1
steps:
- name: Install Dependencies
run: |
sudo ./.github/workflows/posix-deps-apt.sh
- name: Configure OpenSSL env vars
run: |
echo "MULTISSL_DIR=${GITHUB_WORKSPACE}/multissl" >> $GITHUB_ENV
echo "OPENSSL_DIR=${GITHUB_WORKSPACE}/multissl/openssl/${OPENSSL_VER}" >> $GITHUB_ENV
echo "LD_LIBRARY_PATH=${GITHUB_WORKSPACE}/multissl/openssl/${OPENSSL_VER}/lib" >> $GITHUB_ENV
- name: Restore OpenSSL build
id: cache-openssl
uses: actions/cache@v3
with:
path: ./multissl/openssl/${{ env.OPENSSL_VER }}
key: ${{ runner.os }}-multissl-openssl-${{ env.OPENSSL_VER }}
- name: Install OpenSSL
if: steps.cache-openssl.outputs.cache-hit != 'true'
run: python3 Tools/ssl/multissltests.py --steps=library --base-directory $MULTISSL_DIR --openssl $OPENSSL_VER --system Linux
- name: Add ccache to PATH
run: |
echo "PATH=/usr/lib/ccache:$PATH" >> $GITHUB_ENV
- name: Configure ccache action
uses: hendrikmuhs/ccache-action@v1
- name: Configure clang (with coverage)
run: |
echo "CC=clang" >> $GITHUB_ENV
echo "CXX=clang++" >> $GITHUB_ENV
- name: Configure CPython
run: ./configure --with-pydebug --with-openssl=$OPENSSL_DIR
- name: Generate coverage report
# Using "-j1" is important, or the Github Action runs out of memory
run: EXTRATESTOPTS=-j1 xvfb-run make coverage-report
- name: Publish coverage-report
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does every run clobber the previous run's report? What about CI runs on PRs and CI runs from release branches? Can this be configured to commit the coverage results into a branch in the repo matching the reponame+branchname? Or do branches not render in gh-pages? in which case it'd need to be subdirectories in the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this clobbers the previous run. That is easily changed with a flag, but then we could run into Github repository size limits (each result is around 100MB of HTML). It's probably possible to keep the last N commits, but the tool I'm using to publish doesn't support that directly.

This is currently configured to just run on the main branch once a day. We could do multiple branches that publish to subdirectories if you think there's a good use case for that. (Github pages only publishes a single branch, but subdirectories would work).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we would get some funding to run a persistent VM with public hostname from ... let's say Azure, then it would be trivial to create a buildbot worker and serve the LCOV results from HTTP server. They are static HTTP and JS files on the file system after all. Just saying :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind main branch only and once a day. Anyone working on coverage is presumably doing their own local coverage runs while creating PRs.

Regarding a buildbot configured to host the results, while I could simply set one up it'd probably make more sense for mdboom to do that and be an admin given who's driving this work. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd be happy to admin that if it comes to it. I think it's fine to go with the Github Action here as an MVP, and if the amount of history or frequency of runs isn't good enough, we can revisit migrating to our own VM down the road.

uses: JamesIves/github-pages-deploy-action@v4
with:
folder: coverage-report
repository-name: '' # TODO Destination
token: ${{ secrets.COVERAGE_DEPLOY_TOKEN }} # TODO: Use an organization-level token
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What permissions does the token need?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question, and it isn't terribly well-documented: https://github.com/JamesIves/github-pages-deploy-action#required-setup

I think the permission to push to the coverage results repo is all that's needed, but we may not know until we try.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it need two separate tokens then, one (with read permissions) for the cpython repo, and another (with write permissions) for the coverage results repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need a token for read permissions to the cpython repo. Just one to write to a repo under the python org.

single-commit: true
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -23,6 +23,7 @@
*.gc??
*.profclang?
*.profraw
*.profdata
*.dyn
.gdb_history
.purify
4 changes: 3 additions & 1 deletion Lib/test/test_subprocess.py
Original file line number Diff line number Diff line change
@@ -797,9 +797,11 @@ def is_env_var_to_ignore(n):
# This excludes some __CF_* and VERSIONER_* keys MacOS insists
# on adding even when the environment in exec is empty.
# Gentoo sandboxes also force LD_PRELOAD and SANDBOX_* to exist.
# LLVM coverage adds __LLVM_PROFILE_RT_INIT_ONCE variable.
return ('VERSIONER' in n or '__CF' in n or # MacOS
n == 'LD_PRELOAD' or n.startswith('SANDBOX') or # Gentoo
n == 'LC_CTYPE') # Locale coercion triggered
n == 'LC_CTYPE' or # Locale coercion triggered
n == '__LLVM_PROFILE_RT_INIT_ONCE')

with subprocess.Popen([sys.executable, "-c",
'import os; print(list(os.environ.keys()))'],
64 changes: 41 additions & 23 deletions Makefile.pre.in
Original file line number Diff line number Diff line change
@@ -50,6 +50,8 @@ PGO_PROF_USE_FLAG=@PGO_PROF_USE_FLAG@
LLVM_PROF_MERGER=@LLVM_PROF_MERGER@
LLVM_PROF_FILE=@LLVM_PROF_FILE@
LLVM_PROF_ERR=@LLVM_PROF_ERR@
LLVM_PROFDATA=@LLVM_PROFDATA@
LLVM_COV=@LLVM_COV@
DTRACE= @DTRACE@
DFLAGS= @DFLAGS@
DTRACE_HEADERS= @DTRACE_HEADERS@
@@ -310,12 +312,15 @@ HOST_GNU_TYPE= @host@
# PROFILE_TASK="-m test --pgo-extended"
PROFILE_TASK= @PROFILE_TASK@

# report files for gcov / lcov coverage report
COVERAGE_INFO= $(abs_builddir)/coverage.info
COVERAGE_REPORT=$(abs_builddir)/lcov-report
COVERAGE_LCOV_OPTIONS=--rc lcov_branch_coverage=1
COVERAGE_REPORT_OPTIONS=--rc lcov_branch_coverage=1 --branch-coverage --title "CPython $(VERSION) LCOV report [commit $(shell $(GITVERSION))]"

# parameters for coverage
COVERAGE_CC=@COVERAGE_CC@
COVERAGE_CFLAGS=@COVERAGE_CFLAGS@
COVERAGE_LDFLAGS=@COVERAGE_LDFLAGS@
COVERAGE_GEN_TARGET=@COVERAGE_GEN_TARGET@
COVERAGE_INFO=$(abs_builddir)/@COVERAGE_INFO@
COVERAGE_REPORT=$(abs_builddir)/coverage-report
COVERAGE_OPTIONS=@COVERAGE_OPTIONS@
COVERAGE_REPORT_OPTIONS=@COVERAGE_REPORT_OPTIONS@

# === Definitions added by makesetup ===

@@ -655,25 +660,41 @@ bolt-opt: @PREBOLT_RULE@
rm -f $(BUILDPYTHON).bolt_inst
mv $(BUILDPYTHON).bolt $(BUILDPYTHON)

# Compile and run with gcov
.PHONY=coverage coverage-lcov coverage-report
# Support generating coverage reports
.PHONY=coverage-report coverage coverage-generate-lcov coverage-generate-profdata
coverage-report: regen-token regen-frozen
@ # build with coverage info
$(MAKE) coverage
@ # run tests, ignore failures
@ # The LLVM_PROFILE_FILE must specify %m so results can be collected in parallel.
@ # Also, must be an absolute path since test suite changes working directory.
LLVM_PROFILE_FILE=${abs_builddir}/python%m.profraw $(TESTRUNNER) $(TESTOPTS) || true
@ # build lcov report
$(MAKE) $(COVERAGE_GEN_TARGET)
@echo
@echo "coverage report at $(COVERAGE_REPORT)/index.html"
@echo

# Compile and run generating coverage
coverage:
@echo "Building with support for coverage checking:"
$(MAKE) clean
$(MAKE) @DEF_MAKE_RULE@ CFLAGS="$(CFLAGS) -O0 -pg --coverage" LDFLAGS="$(LDFLAGS) --coverage"
$(MAKE) @DEF_MAKE_RULE@ CC="$(COVERAGE_CC)" \
CFLAGS="$(CFLAGS) $(COVERAGE_CFLAGS)" \
LDFLAGS="$(LDFLAGS) $(COVERAGE_LDFLAGS)"

coverage-lcov:
coverage-generate-lcov:
@echo "Creating Coverage HTML report with LCOV:"
@rm -f $(COVERAGE_INFO)
@rm -rf $(COVERAGE_REPORT)
@lcov $(COVERAGE_LCOV_OPTIONS) --capture \
@lcov $(COVERAGE_OPTIONS) --capture \
--directory $(abs_builddir) \
--base-directory $(realpath $(abs_builddir)) \
--path $(realpath $(abs_srcdir)) \
--output-file $(COVERAGE_INFO)
@ # remove 3rd party modules, system headers and internal files with
@ # debug, test or dummy functions.
@lcov $(COVERAGE_LCOV_OPTIONS) --remove $(COVERAGE_INFO) \
@lcov $(COVERAGE_OPTIONS) --remove $(COVERAGE_INFO) \
'*/Modules/_blake2/impl/*' \
'*/Modules/_ctypes/libffi*/*' \
'*/Modules/_decimal/libmpdec/*' \
@@ -688,18 +709,15 @@ coverage-lcov:
@genhtml $(COVERAGE_INFO) \
--output-directory $(COVERAGE_REPORT) \
$(COVERAGE_REPORT_OPTIONS)
@echo
@echo "lcov report at $(COVERAGE_REPORT)/index.html"
@echo

# Force regeneration of parser and frozen modules
coverage-report: regen-token regen-frozen
@ # build with coverage info
$(MAKE) coverage
@ # run tests, ignore failures
$(TESTRUNNER) $(TESTOPTS) || true
@ # build lcov report
$(MAKE) coverage-lcov
coverage-generate-profdata:
@echo "Creating Coverage HTML report with llvm-profdata/llvm-cov:"
@rm -f $(COVERAGE_INFO)
@rm -rf $(COVERAGE_REPORT)
@ # Merge coverage results
$(LLVM_PROFDATA) merge -sparse python*.profraw -o $(COVERAGE_INFO)
@ # Generate HTML
$(LLVM_COV) show -format=html -output-dir=$(COVERAGE_REPORT) -instr-profile=$(COVERAGE_INFO) $(COVERAGE_REPORT_OPTIONS) python .

# Run "Argument Clinic" over all source files
.PHONY=clinic
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
The ``coverage-report`` Makefile now supports both the ``gcc/lcov`` and ``clang/llvm-profdata`` stacks to generate coverage reports, and will select the correct one based on the compiler in use.
139 changes: 139 additions & 0 deletions configure

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

34 changes: 34 additions & 0 deletions configure.ac
Original file line number Diff line number Diff line change
@@ -2056,6 +2056,40 @@ case $CC in
;;
esac

# Coverage flags

case $CC in
*clang*)
COVERAGE_CC="\$(CC) -fprofile-instr-generate -fcoverage-mapping"
COVERAGE_CFLAGS=""
COVERAGE_LDFLAGS=""
COVERAGE_GEN_TARGET="coverage-generate-profdata"
COVERAGE_INFO="coverage.profdata"
COVERAGE_OPTIONS=""
COVERAGE_REPORT_OPTIONS="-show-branches=count -show-regions"
;;
*gcc*)
COVERAGE_CC="\$(CC)"
COVERAGE_CFLAGS="-O0 -pg --coverage"
COVERAGE_LDFLAGS="--coverage"
COVERAGE_GEN_TARGET="coverage-generate-lcov"
COVERAGE_INFO="coverage.info"
COVERAGE_OPTIONS="--rc lcov_brange_coverage=1"
COVERAGE_REPORT_OPTIONS="\$(COVERAGE_OPTIONS) --branch-coverage --title \"CPython \$(VERSION) LCOV report [commit \$(shell \$(GITVERSION))]\""
;;
*)
esac

AC_SUBST(COVERAGE_CC)
AC_SUBST(COVERAGE_CFLAGS)
AC_SUBST(COVERAGE_LDFLAGS)
AC_SUBST(COVERAGE_GEN_TARGET)
AC_SUBST(COVERAGE_INFO)
AC_SUBST(COVERAGE_OPTIONS)
AC_SUBST(COVERAGE_REPORT_OPTIONS)
AC_SUBST(LLVM_COV)
AC_PATH_TOOL(LLVM_COV, llvm-cov, '', ${llvm_path})

# XXX Shouldn't the code above that fiddles with BASECFLAGS and OPT be
# merged with this chunk of code?