Add SBOMs generation for Windows artifacts #99

Closed
109 changes: 107 additions & 2 deletions sbom.py
@@ -525,10 +525,115 @@ def create_sbom_for_source_tarball(tarball_path: str):
     return sbom_data


+def create_sbom_for_windows_artifact(exe_path):
+    exe_name = os.path.basename(exe_path)
+    cpython_version = re.match(r"^python-([0-9abrc.]+)(?:-|\.exe)", exe_name).group(1)
+    cpython_version_without_suffix = re.match(r"^([0-9.]+)", cpython_version).group(1)
+    exe_download_location = f"https://www.python.org/ftp/python/{cpython_version_without_suffix}/{exe_name}"

+    with open(exe_path, mode="rb") as f:
+        exe_checksum_sha256 = hashlib.sha256(f.read()).hexdigest()

+    # Start with the CPython source SBOM as a base
+    with open("Misc/externals.spdx.json") as f:
+        sbom_data = json.loads(f.read())

+    # Add all the packages from the source SBOM
+    # We want to skip the file information because
+    # the files aren't available in Windows artifacts.
+    with open("Misc/sbom.spdx.json") as f:
+        source_sbom_data = json.loads(f.read())
+        for sbom_package in source_sbom_data["packages"]:
+            sbom_data["packages"].append(sbom_package)

+    sbom_data["relationships"] = []
+    sbom_data["files"] = []

+    sbom_data.update({
+        "SPDXID": "SPDXRef-DOCUMENT",
+        "spdxVersion": "SPDX-2.3",
+        "name": "CPython SBOM",
+        "dataLicense": "CC0-1.0",
+        # Naming done according to OpenSSF SBOM WG recommendations.
+        # See: https://github.com/ossf/sbom-everywhere/blob/main/reference/sbom_naming.md
+        "documentNamespace": f"{exe_download_location}.spdx.json",
+        "creationInfo": {
+            "created": (
+                datetime.datetime.now(tz=datetime.timezone.utc)
+                .strftime("%Y-%m-%dT%H:%M:%SZ")
+            ),
+            "creators": [
+                "Person: Python Release Managers",
+                f"Tool: ReleaseTools-{get_release_tools_commit_sha()}",
+            ],
+            # Version of the SPDX License ID list.
+            # This shouldn't need to be updated often, if ever.
+            "licenseListVersion": "3.22",
+        },
+    })

+    # Create the SBOM entry for the CPython package. We use
+    # the SPDXID later on for creating relationships to files.
+    sbom_cpython_package = {
+        "SPDXID": "SPDXRef-PACKAGE-cpython",
+        "name": "CPython",
+        "versionInfo": cpython_version,
+        "licenseConcluded": "PSF-2.0",
+        "originator": "Organization: Python Software Foundation",
+        "supplier": "Organization: Python Software Foundation",
+        "packageFileName": exe_name,
+        "externalRefs": [
+            {
+                "referenceCategory": "SECURITY",
+                "referenceLocator": f"cpe:2.3:a:python:python:{cpython_version}:*:*:*:*:*:*:*",
+                "referenceType": "cpe23Type",
+            }
+        ],
+        "primaryPackagePurpose": "APPLICATION",
+        "downloadLocation": exe_download_location,
+        "checksums": [{"algorithm": "SHA256", "checksumValue": exe_checksum_sha256}],
+    }

+    # The top-level CPython package depends on every vendored sub-package.
+    for sbom_package in sbom_data["packages"]:
+        sbom_data["relationships"].append({
+            "spdxElementId": sbom_cpython_package["SPDXID"],
+            "relatedSpdxElement": sbom_package["SPDXID"],
+            "relationshipType": "DEPENDS_ON",
+        })

+    sbom_data["packages"].append(sbom_cpython_package)

+    # Final relationship, this SBOM describes the CPython package.
+    sbom_data["relationships"].append(
+        {
+            "spdxElementId": "SPDXRef-DOCUMENT",
+            "relatedSpdxElement": sbom_cpython_package["SPDXID"],
+            "relationshipType": "DESCRIBES",
+        }
+    )

+    # Apply the 'supplier' tag to every package since we're shipping
+    # the package in the tarball itself. Originator field is used for maintainers.
+    for sbom_package in sbom_data["packages"]:
+        sbom_package["supplier"] = "Organization: Python Software Foundation"
+        # Source packages have been compiled.
+        if sbom_package["primaryPackagePurpose"] == "SOURCE":
+            sbom_package["primaryPackagePurpose"] = "LIBRARY"

+    normalize_sbom_data(sbom_data)

+    return sbom_data


 def main() -> None:
-    tarball_path = sys.argv[1]
-    sbom_data = create_sbom_for_source_tarball(tarball_path)
+    artifact_path = sys.argv[1]
+    if artifact_path.endswith(".exe"):
+        sbom_data = create_sbom_for_windows_artifact(artifact_path)
+    else:
+        sbom_data = create_sbom_for_source_tarball(artifact_path)
     print(json.dumps(sbom_data, indent=2, sort_keys=True))


 if __name__ == "__main__":
     main()
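
A note on the version parsing in `create_sbom_for_windows_artifact`: both `re.match(...).group(1)` calls raise an `AttributeError` if the file name does not match the expected pattern. Below is a minimal sketch of the same parsing with an explicit error instead, assuming installer names such as `python-3.13.0a4-amd64.exe`. The helper name `parse_windows_artifact_name` is hypothetical and not part of this PR; the regular expressions and the URL layout are copied from the diff above.

```python
import os
import re


def parse_windows_artifact_name(exe_path):
    """Return (version, version_without_suffix, download_url) for an installer.

    Hypothetical helper for illustration; not part of the PR.
    """
    exe_name = os.path.basename(exe_path)

    # Same pattern as in the diff: the version ends at '-' or at '.exe'.
    version_match = re.match(r"^python-([0-9abrc.]+)(?:-|\.exe)", exe_name)
    if version_match is None:
        raise ValueError(f"Unrecognized Windows artifact name: {exe_name!r}")
    cpython_version = version_match.group(1)

    # Drop any pre-release suffix (a/b/rc) to get the download directory name.
    base_match = re.match(r"^([0-9.]+)", cpython_version)
    if base_match is None:
        raise ValueError(f"Unrecognized CPython version: {cpython_version!r}")
    version_without_suffix = base_match.group(1)

    download_url = (
        f"https://www.python.org/ftp/python/{version_without_suffix}/{exe_name}"
    )
    return cpython_version, version_without_suffix, download_url
```

With this sketch, a pre-release installer keeps its full version for `versionInfo`, while the download location uses the directory without the suffix.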
8 changes: 7 additions & 1 deletion windows-release/azure-pipelines.yml
@@ -146,6 +146,12 @@ stages:
       ${{ if and(parameters.SigningCertificate, ne(parameters.SigningCertificate, 'Unsigned')) }}:
         SigningCertificate: ${{ parameters.SigningCertificate }}

+- stage: SBOM
+  displayName: Create SBOMs
+  dependsOn: Build
+  jobs:
+  - template: stage-sbom.yml

 - stage: Layout
   displayName: Generate layouts
   dependsOn: Sign
@@ -218,7 +224,7 @@ stages:
 - ${{ if eq(parameters.DoMSI, 'true') }}:
   - stage: PublishPyDotOrg
     displayName: Publish to python.org
-    dependsOn: ['Test_MSI', 'Test']
+    dependsOn: ['SBOM', 'Test_MSI', 'Test']
     jobs:
     - template: stage-publish-pythonorg.yml
Member commented:

I assume we're going to get changes to this template as well to SSH the files up to the server?

sethmlarson (Collaborator, Author) commented on Feb 23, 2024:


Yeah that's the plan, I might do that in a follow-up PR though. Maybe I'll remove this dependsOn for now.


45 changes: 45 additions & 0 deletions windows-release/stage-sbom.yml
@@ -0,0 +1,45 @@
+jobs:
+- job: SBOM_Files
+  displayName: Create SBOMs for Python binaries

+  pool:
+    vmImage: windows-2022

+  workspace:
+    clean: all

+  strategy:
+    matrix:
+      win32:
+        Name: win32
+      amd64:
+        Name: amd64
+      arm64:
+        Name: arm64

+  steps:
+  - task: UsePythonVersion@0
+    displayName: 'Use Python 3.6 or later'
+    inputs:
+      versionSpec: '>=3.6'

+  - template: ./checkout.yml

+  - task: DownloadPipelineArtifact@1
+    displayName: 'Download artifact: bin_$(Name)'
+    inputs:
+      artifactName: bin_$(Name)
+      targetPath: $(Build.BinariesDirectory)\bin

+  - powershell: >
+      python
+      "$(Build.SourcesDirectory)\sbom.py"
+      (gci msi\*\python-*.exe | select -First 1)
Member commented:

Any reason we wouldn't/shouldn't just do all of them? (Omitting the select -First 1 should pass them all as separate args, and then sys.argv[1:] in Python can get them all.)
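
A rough sketch of what that could look like on the Python side, assuming every path is passed as a separate argument. The two `create_sbom_for_*` functions here are stand-ins for the real ones in `sbom.py`, and both `create_sboms` and the per-artifact `.spdx.json` output name are hypothetical choices, not something this PR defines.

```python
import json
import sys


def create_sbom_for_windows_artifact(exe_path):
    # Stand-in for the function added in this PR.
    return {"name": "CPython SBOM", "artifact": exe_path}


def create_sbom_for_source_tarball(tarball_path):
    # Stand-in for the existing function in sbom.py.
    return {"name": "CPython SBOM", "artifact": tarball_path}


def create_sboms(artifact_paths):
    """Map each artifact path to its SBOM, dispatching on the file extension."""
    sboms = {}
    for artifact_path in artifact_paths:
        if artifact_path.endswith(".exe"):
            sboms[artifact_path] = create_sbom_for_windows_artifact(artifact_path)
        else:
            sboms[artifact_path] = create_sbom_for_source_tarball(artifact_path)
    return sboms


def main(argv=None) -> None:
    # sys.argv[1:] picks up every artifact path given on the command line.
    artifact_paths = sys.argv[1:] if argv is None else argv
    for artifact_path, sbom_data in create_sboms(artifact_paths).items():
        # Assumption: one SBOM file per artifact, written next to it.
        with open(artifact_path + ".spdx.json", "w") as f:
            f.write(json.dumps(sbom_data, indent=2, sort_keys=True))
```

Printing several SBOMs to stdout would interleave the JSON documents, which is why this sketch writes one file per artifact instead.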

+    workingDirectory: $(Build.BinariesDirectory)
+    displayName: 'Create SBOMs for binaries'

+  - task: PublishPipelineArtifact@0
+    displayName: 'Publish artifact: sbom'
+    inputs:
+      targetPath: '$(Build.BinariesDirectory)\sbom'
+      artifactName: sbom
Member commented on lines +41 to +45:

Suggested change
-  - task: PublishPipelineArtifact@0
-    displayName: 'Publish artifact: sbom'
-    inputs:
-      targetPath: '$(Build.BinariesDirectory)\sbom'
-      artifactName: sbom
+  - publish: '$(Build.BinariesDirectory)\sbom'
+    artifact: sbom
+    displayName: 'Publish artifact: sbom'

This is the preferred format for simple cases now (should auto-update when they need to make changes to the publish task).