RPM Registry: primary.xml <location> should not url encode caret (^) characters #32021

Closed
nephatrine opened this issue Sep 10, 2024 · 4 comments · Fixed by #32038

Comments

@nephatrine
Contributor

Description

When uploading a file with a name like hello-test-0.0.1^15.git7166d1f2-1.el9.x86_64.rpm to the RPM registry, I am able to see the file correctly in the Gitea packages UI and can manually download the RPM from there, but DNF cannot download it on an actual system. DNF does see that the package exists and tries to download it, but gets 404 errors:

Downloading Packages:
[MIRROR] hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm: Status code: 404 for https://example.com/api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm (IP: 136.56.234.199)
[MIRROR] hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm: Status code: 404 for https://example.com/api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm (IP: 136.56.234.199)
[FAILED] hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm: No more mirrors to try - All mirrors were already tried without success

In my gitea logs, I can see those 404 attempts.

172.17.0.1 - - [10/Sep/2024:09:40:55 -0400] "GET /api/packages/testuser/rpm/almalinux/el9/repodata/repomd.xml HTTP/1.0" 200 1244 "" "libdnf (AlmaLinux 9.4; generic; Linux.x86_64)"
172.17.0.1 - - [10/Sep/2024:09:40:57 -0400] "GET /api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm HTTP/1.0" 404 22 "" "libdnf (AlmaLinux 9.4; generic; Linux.x86_64)"
172.17.0.1 - - [10/Sep/2024:09:40:57 -0400] "GET /api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm HTTP/1.0" 404 22 "" "libdnf (AlmaLinux 9.4; generic; Linux.x86_64)"

If I put one of those URLs into my web browser, for example https://example.com/api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm, I do indeed get the message "package does not exist".

If, however, I change those instances of %255E in the URL to %5E, the URL works, so it seems the caret is being URL-encoded twice. Looking in the repodata/primary.xml.gz that Gitea produces, I see that the <location> field already has the caret encoded as %5E. In major RPM repositories like EPEL (https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/repodata/) this is not the case: the location does not have carets URL-encoded. In my own testing, a version of the Gitea repository files that does not have the caret pre-encoded in the <location> field works: the package manager can download the packages, and this seems to match the behaviour of other RPM repositories.
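To illustrate the double encoding described above, here is a minimal Go sketch. It is not Gitea's or libdnf's code; `escapeTimes` is a hypothetical helper that only applies the standard library's `url.PathEscape` repeatedly to the version string from this report.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeTimes is a hypothetical helper that applies url.PathEscape
// to s the given number of times.
func escapeTimes(s string, times int) string {
	for i := 0; i < times; i++ {
		s = url.PathEscape(s)
	}
	return s
}

func main() {
	version := "0.0.1^15.git7166d1f2-1.el9"

	once := escapeTimes(version, 1)  // the form found in primary.xml
	twice := escapeTimes(version, 2) // the form seen in the 404 requests

	fmt.Println(once)  // contains %5E
	fmt.Println(twice) // contains %255E

	// A server decodes the request path only once, so it ends up looking
	// for a version containing the literal text "%5E" rather than "^".
	decoded, err := url.PathUnescape(twice)
	if err != nil {
		panic(err)
	}
	fmt.Println(decoded == version) // false: the lookup cannot match
}
```

The second escape turns the `%` of `%5E` into `%25`, which is where the `%255E` in the failing URLs comes from.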

I have reproduced the issue on the demo site here: https://demo.gitea.com/nephatrine/-/packages/rpm/hello-test/0.0.1%5E15.git7166d1f2-1.el9

Gitea Version

1.22.2

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

2.45.2

Operating System

Alpine 3.20

How are you running Gitea?

I build Gitea myself and run it from my own docker container.

Database

SQLite

@nephatrine nephatrine changed the title RPM Registry: primary.xml <location> should not have url encoded caret (^) characters RPM Registry: primary.xml <location> should not url encode caret (^) characters Sep 10, 2024
@nephatrine
Contributor Author

This is the Fedora documentation on the usage of the caret in the package version (https://docs.fedoraproject.org/en-US/packaging-guidelines/Versioning/#_handling_non_sorting_versions_with_tilde_dot_and_caret) just to state that this is a real use case. There are packages on EPEL8 and EPEL9 that contain such characters as well.

I tested that the issue occurs using the standard DNF package manager on AlmaLinux 8 and AlmaLinux 9. Presumably actual RHEL and Rocky Linux 8/9 would have the same issue, as they all use DNF.

As RHEL 7 does not support the caret operator in the version to begin with, I did not test CentOS 7 or anything else that old, as it's a moot point.

It looks like openSUSE has different versioning guidelines for post-versions, so a caret theoretically wouldn't appear in packages intended for it. I did test on an openSUSE system, though, and Zypper has the same encoding behaviour as DNF, so it 404s trying to use a URL containing %255E instead of just %5E. It works fine if the <location> has the unencoded ^.

There might be some other bizarre RPM-based package manager that both allows a caret in the package version and requires it to be URL-encoded in the <location> field, but I am not aware of any. I do not think correcting this would break any extant RPM package manager, and it brings Gitea in line with how other RPM repos appear to function.

@KN4CK3R
Member

KN4CK3R commented Sep 12, 2024

I encode the names in the link:

Location: Location{
	Href: fmt.Sprintf("package/%s/%s/%s/%s",
		url.PathEscape(pd.Package.Name),
		url.PathEscape(packageVersion),
		url.PathEscape(pd.FileMetadata.Architecture),
		url.PathEscape(fmt.Sprintf("%s-%s.%s.rpm", pd.Package.Name, packageVersion, pd.FileMetadata.Architecture)),
	),
},

I could not find any information on whether this field should be URL-safe. It's rather unusual not to escape the URL. But it looks like the clients parse the URL and encode the parts (again)?!
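A rough way to see why a client would encode again: if the client treats the location as a plain relative path and only percent-encodes it when building the request URL, an already-escaped path gets escaped a second time. The sketch below mimics that with Go's `net/url`. `requestURL` is a hypothetical stand-in, and that clients work this way is an assumption based on the logs in this issue, not taken from the libdnf or Zypper sources.

```go
package main

import (
	"fmt"
	"net/url"
)

// requestURL is a hypothetical model of a client: it appends href to the
// repository base URL as an unencoded path and lets URL.String() do the
// percent-encoding. Real clients may differ in detail.
func requestURL(base, href string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	// Path holds the decoded form; clearing RawPath forces String()
	// to derive the encoded form from Path.
	u.Path = u.Path + "/" + href
	u.RawPath = ""
	return u.String(), nil
}

func main() {
	base := "https://example.com/api/packages/testuser/rpm/almalinux/el9"

	for _, href := range []string{
		"package/hello-test/0.0.1%5E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm", // pre-escaped
		"package/hello-test/0.0.1^15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1^15.git7166d1f2-1.el9.x86_64.rpm",     // raw path
	} {
		full, err := requestURL(base, href)
		if err != nil {
			panic(err)
		}
		fmt.Println(full)
	}
}
```

Under this model the pre-escaped href produces a URL containing `%255E`, while the raw href produces `%5E`, which is the form that works.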

@wxiaoguang
Contributor

It seems that the field "location" is just a relative path, not a URL, so it doesn't (shouldn't) need to be encoded.

See the official example:

image

@wxiaoguang
Contributor

One more example, no encoding for the "location" relative path:

image

@lunny lunny added this to the 1.22.3 milestone Sep 16, 2024
@lunny lunny closed this as completed in f528df9 Sep 16, 2024
GiteaBot pushed a commit to GiteaBot/gitea that referenced this issue Sep 16, 2024
lunny pushed a commit that referenced this issue Sep 17, 2024
Backport #32038 by @KN4CK3R

Fixes #32021

Do not escape the relative path.

Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
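
For readers who want to see what "do not escape the relative path" could look like, here is a minimal sketch. It is not the actual change from #32038; `locationHref` and its parameters are hypothetical names that only mirror the format strings quoted earlier in this thread.

```go
package main

import "fmt"

// locationHref is a hypothetical helper that builds the relative path for
// the <location> element without any URL escaping, leaving percent-encoding
// to the client that requests the file.
func locationHref(name, version, arch string) string {
	fileName := fmt.Sprintf("%s-%s.%s.rpm", name, version, arch)
	return fmt.Sprintf("package/%s/%s/%s/%s", name, version, arch, fileName)
}

func main() {
	// The caret stays as-is in the generated path.
	fmt.Println(locationHref("hello-test", "0.0.1^15.git7166d1f2-1.el9", "x86_64"))
}
```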
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Dec 15, 2024
project-mirrors-bot-tu bot pushed a commit to project-mirrors/forgejo-as-gitea-fork that referenced this issue Jan 23, 2025
Fixes go-gitea#32021

Do not escape the relative path.

(cherry picked from commit f528df9)
(cherry picked from commit 0cafec4)