Verify Release URLs using publish attestations #2833

facutuesca · 2024-06-21T16:16:55Z

This PR adds a new field to Release, trusted_publisher_url, which stores the publisher URL used to create that release.

More specifically, this field is only populated when the first file uploaded (the one that creates the release) is uploaded using Trusted Publishing and has a PEP-740 publish attestation.

This allows PyPI to verify a release's URLs (the ones returned by the Release.urls property) by comparing them with trusted_publisher_url. URLs that pass this check can then be shown in the Verified details section of the project page.

(compare with the mockup provided in this comment: pypi#8635 (comment))

This is an alternative implementation to these two PRs:
pypi#15862
pypi#15891

cc @woodruffw

woodruffw · 2024-06-21T18:25:52Z

tests/unit/packaging/test_models.py

+    @pytest.mark.parametrize(
+        ("url", "trusted_publisher_url", "expected"),
+        [
+            (
+                "https://github.com/owner/project",
+                "https://github.com/owner/project",
+                True,
+            ),
+            (
+                "https://github.com/owner/project/",
+                "https://github.com/owner/project",
+                True,
+            ),
+            (
+                "https://github.com/owner/project/issues",
+                "https://github.com/owner/project",
+                True,
+            ),
+            ("https://github.com/owner/", "https://github.com/owner/project", False),
+            (
+                "https://gitlab.com/owner/project",
+                "https://github.com/owner/project",
+                False,
+            ),
+        ],
+    )


Nitpick: might be good to have some ActiveState URL examples in here as well 🙂

warehouse/forklift/legacy.py

woodruffw · 2024-06-21T18:27:53Z

warehouse/packaging/models.py

+    def verify_url(self, url: str) -> bool:
+        if not self.trusted_publisher_url:
+            return False
+        return url.startswith(self.trusted_publisher_url)


NB: This doesn't do case normalization or anything else, which is probably fine but might be worth calling out in a comment.

The new implementation now normalizes URLs before comparing them. I added a detailed docstring documenting the behavior.

facutuesca · 2024-06-24T13:31:04Z

@woodruffw While looking into URL normalization, I realized this implementation has a pretty bad security bug:

publisher_url = "https://github.com/org/project"
release_url = "https://github.com/org/project22"
verify_url(release_url) == True  # because release_url.startswith(publisher_url) is True

Which led me to look into urllib to see if we can parse and compare URL elements. But urllib doesn't do any validation on inputs, which means there's no guarantee a malicious input will not parse as a valid URL that matches the publisher URL.

Any ideas on how we can tackle URL comparison here?
edit: Would something like this be enough?:

    def verify_url(self, url: str) -> bool:
        if not self.trusted_publisher_url:
            return False
        # When checking with startswith, require the character right after
        # the publisher URL to be a forward slash.
        return url == self.trusted_publisher_url or url.startswith(
            self.trusted_publisher_url + "/"
        )
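
To make the difference between the two checks concrete, here is a minimal, self-contained sketch on plain strings. The function names naive_verify and slash_aware_verify are hypothetical and are not part of Warehouse; the sketch does no parsing, validation, or normalization, so it only illustrates the prefix problem described above.

```python
def naive_verify(url: str, publisher_url: str) -> bool:
    """Plain prefix check, as in the original verify_url (hypothetical name)."""
    return url.startswith(publisher_url)


def slash_aware_verify(url: str, publisher_url: str) -> bool:
    """Prefix check that requires an exact match or a '/' boundary (hypothetical name)."""
    return url == publisher_url or url.startswith(publisher_url + "/")


publisher_url = "https://github.com/org/project"
lookalike_url = "https://github.com/org/project22"
issues_url = "https://github.com/org/project/issues"

# The plain prefix check accepts a different repository whose name merely
# starts with the publisher's repository name.
print(naive_verify(lookalike_url, publisher_url))

# Requiring a '/' boundary rejects the lookalike but still accepts subpaths.
print(slash_aware_verify(lookalike_url, publisher_url))
print(slash_aware_verify(issues_url, publisher_url))
```

This still compares raw strings, so differences in case, default ports, or percent-encoding are not handled here.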

woodruffw · 2024-06-24T14:09:26Z

Which led me to look into urllib to see if we can parse and compare URL elements. But urllib doesn't do any validation on inputs, which means there's no guarantee a malicious input will not parse as a valid URL that matches the publisher URL.

Yeah, I think we should avoid urllib entirely for this, and use a more standards-conformant URL parser that offers strong validation. One good candidate is rfc3986 -- technically that'll be stricter than WHATWG requires for URL validation, but this shouldn't be an issue in practice (and when it is, it'll be a good validation check).

TL;DR: I think we should use rfc3986 + test the bejeezus out of it, including negative and backstop cases 🙂

facutuesca · 2024-06-24T15:56:14Z

TL;DR: I think we should use rfc3986 + test the bejeezus out of it, including negative and backstop cases 🙂

@woodruffw Done! The Release.verify_url now does more exhaustive checks, and I added several more tests for the error cases I could think of.

woodruffw · 2024-06-24T22:21:26Z

warehouse/packaging/models.py

@@ -22,6 +22,7 @@
 from github_reserved_names import ALL as GITHUB_RESERVED_NAMES
 from pyramid.authorization import Allow, Authenticated
 from pyramid.threadlocal import get_current_request
+from rfc3986 import api


Nitpick: api is a pretty generic import, so maybe we do import rfc3986.api instead and refer to things with that fully qualified path 🙂

woodruffw · 2024-06-24T22:25:30Z

warehouse/packaging/models.py

+        is_subpath = publisher_uri.path == user_uri.path or user_uri.path.startswith(
+            publisher_uri.path + "/"
+        )


NB: Probably need to check what happens when either URL doesn't have a path component; I suspect path will be None in that case 🙂

Good catch! Added a check, and a test
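
For illustration, one conservative way to guard the subpath comparison against a missing path is sketched below. It assumes the parser reports an absent path as None (or an empty string), as suspected above. The helper name is_subpath and the choice to reject whenever either path is missing are assumptions of this sketch, not necessarily what the PR does.

```python
from typing import Optional


def is_subpath(publisher_path: Optional[str], user_path: Optional[str]) -> bool:
    """Return True if user_path equals publisher_path or lies below it.

    Hypothetical helper: both arguments are already-parsed path components.
    A missing path (None or "") on either side is treated as "cannot verify".
    """
    if not publisher_path or not user_path:
        return False
    return user_path == publisher_path or user_path.startswith(publisher_path + "/")


# Missing path on either side, or on both, is rejected.
print(is_subpath("/owner/project", None))
print(is_subpath(None, "/owner/project"))
print(is_subpath(None, None))

# A real subpath is accepted; a lookalike sibling is not.
print(is_subpath("/owner/project", "/owner/project/issues"))
print(is_subpath("/owner/project", "/owner/project22"))
```

Rejecting a publisher URL without a path is the stricter option: otherwise a publisher URL consisting only of a host would match every URL on that host.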

woodruffw · 2024-06-25T14:58:33Z

tests/unit/packaging/test_models.py

+            (  # URL path component is empty
+                "https://github.com",
+                "https://github.com/owner/project",
+                False,


A testcase for the inverse (TP has path, expected has no path) would also be good! And same for both having no path.

facutuesca · 2024-07-02T13:11:15Z

Superseded by pypi#16205

woodruffw reviewed Jun 21, 2024

woodruffw reviewed Jun 21, 2024

facutuesca marked this pull request as draft June 24, 2024 13:25

woodruffw reviewed Jun 24, 2024

facutuesca force-pushed the upload-attestations branch from 50a69a9 to b9ff4ed Compare June 25, 2024 11:56

woodruffw reviewed Jun 25, 2024

facutuesca force-pushed the upload-attestations branch 4 times, most recently from b4ba06f to 6d8a0ab Compare June 26, 2024 14:54

facutuesca force-pushed the verify-project-urls branch from be61563 to 077a24e Compare June 26, 2024 15:36

facutuesca force-pushed the upload-attestations branch from 6d8a0ab to f01e393 Compare June 27, 2024 14:59

Add support for uploading attestations in legacy API

9345a6f

facutuesca force-pushed the upload-attestations branch from f01e393 to 9345a6f Compare July 1, 2024 11:44

Verify Release URLs using publish attestations

37af4ca

facutuesca force-pushed the verify-project-urls branch from 077a24e to 37af4ca Compare July 1, 2024 11:55

facutuesca force-pushed the upload-attestations branch 2 times, most recently from dd1b1bc to dc943b5 Compare July 1, 2024 20:39

di mentioned this pull request Jul 1, 2024

Create and populate verified field for ReleaseUrl pypi/warehouse#15891

Closed

facutuesca closed this Jul 2, 2024

facutuesca deleted the verify-project-urls branch August 16, 2024 16:04