Unsanitize user and org names in DB #4762

Open: wants to merge 24 commits into base: main

Conversation

pat-s
Contributor

@pat-s pat-s commented Jan 23, 2025

fix #3614

As discussed in chat, the preference is to use non-sanitized values everywhere.

  • Store user and org names using the casing provided by the forge
  • Allow searching for users / orgs by case-insensitive name
  • Removed all sanitization in the code
  • AFAICS only the org names were stored sanitized in the DB. User names and repo names are not affected.
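
A minimal sketch of the first two bullets, assuming a simplified in-memory store rather than the real datastore. `Org`, `OrgIndex`, `Put`, and `FindByName` are hypothetical names used only for illustration: the name is kept exactly as the forge reports it, and only the lookup key is lowercased.

```go
package main

import "strings"

// Org is a hypothetical, simplified stand-in for the server's org model.
type Org struct {
	ID   int64
	Name string // stored exactly as the forge reports it
}

// OrgIndex is a hypothetical in-memory store keyed by the lowercased name.
type OrgIndex struct {
	byLowerName map[string]Org
}

// NewOrgIndex returns an empty index.
func NewOrgIndex() *OrgIndex {
	return &OrgIndex{byLowerName: make(map[string]Org)}
}

// Put stores the org under a lowercased key while leaving Name untouched.
func (i *OrgIndex) Put(o Org) {
	i.byLowerName[strings.ToLower(o.Name)] = o
}

// FindByName matches regardless of casing and returns the stored casing.
func (i *OrgIndex) FindByName(name string) (Org, bool) {
	o, ok := i.byLowerName[strings.ToLower(name)]
	return o, ok
}
```

With this shape, looking up "woodpecker-ci" would find an org stored as "Woodpecker-CI", and the UI would still display the forge's spelling.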

@pat-s pat-s added bug Something isn't working server labels Jan 23, 2025

codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 40.00000% with 18 lines in your changes missing coverage. Please review.

Project coverage is 28.28%. Comparing base (fac744d) to head (e638c62).
Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ore/migration/024_unsanitize_org_and_user_names.go | 39.28% | 14 Missing and 3 partials ⚠️ |
| server/store/datastore/migration/common.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4762      +/-   ##
==========================================
- Coverage   28.29%   28.28%   -0.01%     
==========================================
  Files         398      399       +1     
  Lines       28295    28318      +23     
==========================================
+ Hits         8005     8011       +6     
- Misses      19580    19594      +14     
- Partials      710      713       +3     


@pat-s pat-s changed the title Sanitize user names in DB to match org names Unsanitize user and org names in DB Jan 23, 2025
@pat-s pat-s mentioned this pull request Jan 23, 2025
pat-s and others added 5 commits January 23, 2025 16:52
…ames.go

Co-authored-by: Robert Kaussow <mail@thegeeklab.de>
…ames.go

Co-authored-by: Robert Kaussow <mail@thegeeklab.de>
…ames.go

Co-authored-by: Robert Kaussow <mail@thegeeklab.de>
…ames.go

Co-authored-by: Robert Kaussow <mail@thegeeklab.de>
@xoxys
Member

xoxys commented Jan 23, 2025

Tests are failing.

server/store/datastore/migration/common.go (outdated, resolved)
server/store/datastore/org.go (outdated, resolved)
server/store/datastore/repo.go (outdated, resolved)
@xoxys
Member

xoxys commented Jan 24, 2025

After thinking about it again, I'm not sure that's the best approach. I can't really say why, but I see a lot of potential for issues and corner cases. It looks like this issue only occurs with Forgejo/Gitea: the API returns mixed capitalization while the forge only supports case-insensitive values. What do you think?

Edit: Tested it. You can create an org "Foo" in Gitea, and in that case it's even displayed as "Foo" in the URL. But you can't then create "foo"; it fails with the error "The organization name is already taken." This is somewhat inconsistent behavior upstream; however, I think we should switch back to the initial approach from @pat-s.

@pat-s
Contributor Author

pat-s commented Jan 27, 2025

If all user names were sanitized, the user list in the UI would also show the sanitized names, which is not ideal.

The simplest fix, which also wouldn't require a migration, would probably be to sanitize only the comparison of user.Login and org.Name?
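
A rough sketch of that comparison-only idea, assuming simplified `User` and `Org` types. `isPersonalOrg` is a hypothetical helper, and `strings.EqualFold` is swapped in here as a stand-in for whatever sanitization the comparison would actually use.

```go
package main

import "strings"

// User and Org are hypothetical, simplified stand-ins for the server models.
type User struct{ Login string }
type Org struct{ Name string }

// isPersonalOrg reports whether the org name matches the user's login,
// ignoring case. Nothing stored is modified; only the comparison is relaxed.
func isPersonalOrg(u User, o Org) bool {
	return strings.EqualFold(u.Login, o.Name)
}
```

Because only the comparison changes, existing rows keep whatever casing they already have, which is why no migration would be needed for this variant.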

@qwerty287
Contributor

Why does the license header of the migration start with // Excerpt from:?

@pat-s
Contributor Author

pat-s commented Jan 28, 2025

I'd appreciate a reply on what should be done now and which solution is preferred. There are several potentially working solutions, but I don't want to refactor back and forth.

@anbraten
Member

I would store the forge's casing in the database and let the API / URL routes (the DB find functions like orgFindByName) match names case-insensitively (similar to how the forges handle it).

@xoxys
Member

xoxys commented Jan 28, 2025

OK with it, but in that case I'm still wondering how we want to handle it if multiple case-insensitive matches are returned from the DB?

@anbraten
Member

anbraten commented Jan 28, 2025

> OK with it, but in that case I'm still wondering how we want to handle it if multiple case-insensitive matches are returned from the DB?

Why would they? Forges do not allow projects with the same name but different casing (at least the ones we have checked). We migrate case-insensitively and change the stored name to the casing from the forge.
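
As an illustration of "migrate case-insensitively and change to the casing from the forge", here is a small sketch over plain string slices. `restoreForgeCasing` is a hypothetical helper; how the real migration obtains the forge's names is not shown in this thread.

```go
package main

import "strings"

// restoreForgeCasing returns, for each stored name, the forge's spelling when
// a forge name matches case-insensitively. Names without a match are returned
// unchanged.
func restoreForgeCasing(stored, forgeNames []string) []string {
	byLower := make(map[string]string, len(forgeNames))
	for _, n := range forgeNames {
		byLower[strings.ToLower(n)] = n
	}

	out := make([]string, len(stored))
	for i, s := range stored {
		if n, ok := byLower[strings.ToLower(s)]; ok {
			out[i] = n
		} else {
			out[i] = s
		}
	}
	return out
}
```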

@xoxys
Member

xoxys commented Jan 28, 2025

Whether they would or not doesn't matter IMO. In the worst case we end up with "Foo" and "foo" in the DB (for whatever reason), because we store names case-sensitively but expect them to be case-insensitive. If we can add a constraint to the column to prevent this, I'm fine with it; otherwise this could lead to issues later.
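
To make the "Foo" and "foo" concern concrete, a sketch of a pre-check that could run before adding such a constraint. `caseCollisions` is a hypothetical helper, not part of the PR; it only groups names that are equal when lowercased.

```go
package main

import (
	"sort"
	"strings"
)

// caseCollisions groups names that are equal once lowercased and returns only
// the groups with more than one entry, keyed by the lowercased name.
func caseCollisions(names []string) map[string][]string {
	groups := make(map[string][]string)
	for _, n := range names {
		key := strings.ToLower(n)
		groups[key] = append(groups[key], n)
	}
	for key, g := range groups {
		if len(g) < 2 {
			delete(groups, key) // no collision for this name
			continue
		}
		sort.Strings(g) // stable output order within a group
	}
	return groups
}
```

An empty result would mean a case-insensitive unique constraint could be added without conflicting with existing rows.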

@anbraten
Member

anbraten commented Jan 28, 2025

I don't think this should be an issue. If we check whether an org exists by its case-insensitive name or forge ID before updating or creating it, we shouldn't have any trouble here. The unique constraint should actually be forge-id + org-id anyway, as multiple forges could use the same org names. I will check the code for both situations.
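
One way to read the comment above is an update-or-create keyed by forge plus case-insensitive name. The sketch below assumes that reading; `OrgStore`, `orgKey`, and `Upsert` are hypothetical, and the real code may key on a different identifier.

```go
package main

import "strings"

// Org is a hypothetical, simplified org record.
type Org struct {
	ID      int64
	ForgeID int64
	Name    string
}

// orgKey scopes the lowercased name to a single forge, so two forges may use
// the same org name without clashing.
type orgKey struct {
	forgeID   int64
	lowerName string
}

// OrgStore is a hypothetical in-memory store.
type OrgStore struct {
	nextID int64
	orgs   map[orgKey]*Org
}

// NewOrgStore returns an empty store.
func NewOrgStore() *OrgStore {
	return &OrgStore{orgs: make(map[orgKey]*Org)}
}

// Upsert returns the existing org when the same forge already has a
// case-insensitive match, updating its name to the casing passed in.
// Otherwise it creates a new record.
func (s *OrgStore) Upsert(forgeID int64, name string) *Org {
	key := orgKey{forgeID: forgeID, lowerName: strings.ToLower(name)}
	if o, ok := s.orgs[key]; ok {
		o.Name = name
		return o
	}
	s.nextID++
	o := &Org{ID: s.nextID, ForgeID: forgeID, Name: name}
	s.orgs[key] = o
	return o
}
```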

@pat-s
Contributor Author

pat-s commented Jan 31, 2025

Can somebody please summarize the resolution here now, so I can make the required adjustments?

@xoxys
Member

xoxys commented Jan 31, 2025

> I would store the forge's casing in the database and let the API / URL routes (the DB find functions like orgFindByName) match names case-insensitively (similar to how the forges handle it).

That's the way to go.

I would like to ensure we cannot store the same forge/org combination twice with different casing. @anbraten wanted to check the code, see the last comment.

@pat-s
Contributor Author

pat-s commented Jan 31, 2025

So this PR is waiting for a review from @anbraten now, is this correct? AFAICS the PR is currently

  • storing the raw casing for org/usernames
  • sanitizing API calls

Please let me know if this needs more time overall, then I'll build a standalone image and apply this as a hotfix, so the issue on CB is solved (as there are already a few pending users waiting for a fix before they can be onboarded).

@xoxys
Member

xoxys commented Jan 31, 2025

For someone to verify that we have proper unique constraints in place.

@anbraten
Member

anbraten commented Feb 1, 2025

I checked the code and can't see a place where it would add an org twice. I did detect an issue with multiple forges mixing orgs, but that's something else.

@xoxys
Member

xoxys commented Feb 1, 2025

OK, I don't have time to look into this on my own. Feel free to go on if you think the current state is fine.

@xoxys
Member

xoxys commented Feb 1, 2025

In the best case we add a test case: try to write "org1/foo" and "org1/Foo" to the DB. If the second write fails, we are good to go; if not, we should still add DB-level restrictions.
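
The check proposed above could look roughly like this. `nameTable` is a hypothetical stand-in for a table with a unique constraint on the lowercased name, and `rejectsCaseVariant` is an illustrative helper, not the test that was later added to the PR.

```go
package main

import (
	"errors"
	"strings"
)

// errDuplicate is returned when a name already exists, ignoring case.
var errDuplicate = errors.New("name already exists (ignoring case)")

// nameTable is a hypothetical stand-in for a DB table that enforces
// uniqueness on the lowercased name.
type nameTable struct {
	rows map[string]string // lowercased name -> stored spelling
}

func newNameTable() *nameTable {
	return &nameTable{rows: make(map[string]string)}
}

// insert stores the name as given, or fails if a case variant already exists.
func (tbl *nameTable) insert(name string) error {
	key := strings.ToLower(name)
	if _, exists := tbl.rows[key]; exists {
		return errDuplicate
	}
	tbl.rows[key] = name
	return nil
}

// rejectsCaseVariant writes first, then variant, and reports whether the
// second write was refused. A non-nil error means the first write failed.
func rejectsCaseVariant(insert func(string) error, first, variant string) (bool, error) {
	if err := insert(first); err != nil {
		return false, err
	}
	return insert(variant) != nil, nil
}
```

Calling `rejectsCaseVariant(tbl.insert, "org1/foo", "org1/Foo")` against the real store instead of `nameTable` would answer the question directly: a `false` result means DB-level restrictions are still needed.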

@pat-s
Contributor Author

pat-s commented Feb 1, 2025

Added two tests which explicitly check that attempting to add a duplicate that differs only in casing, for orgs and for users, results in an error.

@xoxys
Member

xoxys commented Feb 1, 2025

Thanks!

xoxys previously approved these changes Feb 1, 2025
@xoxys xoxys dismissed their stale review February 1, 2025 12:44

Comments

@woodpecker-bot
Contributor

woodpecker-bot commented Feb 2, 2025

Deployment of preview was successful: https://woodpecker-ci-woodpecker-pr-4762.surge.sh

@@ -183,29 +183,34 @@ func HandleAuth(c *gin.Context) {
// create or set the user's organization if it isn't linked yet
if user.OrgID == 0 {
// check if an org with the same name exists already and assign it to the user if it does
if org, err := _store.OrgFindByName(user.Login); err == nil && org != nil {
// TODO: find the org by name and forgeID directly
org, err := _store.OrgFindByName(user.Login)
Member


I've noticed this bug as the error was not handled by the if below as it was inline in the previous if.
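
A simplified sketch of the scoping problem described here, under the assumption that the lookup can fail for reasons other than "not found". `lookupInline`, `lookupChecked`, `errNotFound`, and `errBackend` are hypothetical names for illustration, not the actual handler code.

```go
package main

import "errors"

// Org is a hypothetical, simplified org record.
type Org struct{ Name string }

var (
	// errNotFound stands in for the store's "record does not exist" error.
	errNotFound = errors.New("org not found")
	// errBackend stands in for any other lookup failure.
	errBackend = errors.New("backend unavailable")
)

// lookupInline mirrors the earlier shape: err is scoped to the if statement,
// so a failed lookup is indistinguishable from "no org" afterwards.
func lookupInline(find func(string) (*Org, error), login string) (*Org, error) {
	if org, err := find(login); err == nil && org != nil {
		return org, nil
	}
	// err is out of scope here, so the failure is silently dropped.
	return nil, nil
}

// lookupChecked hoists the call so the error can be inspected: "not found"
// is tolerated, anything else is returned to the caller.
func lookupChecked(find func(string) (*Org, error), login string) (*Org, error) {
	org, err := find(login)
	if err != nil && !errors.Is(err, errNotFound) {
		return nil, err
	}
	return org, nil
}
```

With a lookup that returns `errBackend`, the inline variant reports no error at all, while the hoisted variant surfaces it.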

Member


Good catch

@anbraten anbraten requested a review from xoxys February 2, 2025 14:13
@anbraten anbraten mentioned this pull request Feb 2, 2025
@xoxys
Copy link
Member

xoxys commented Feb 2, 2025

Code LGTM, have you tested the PR locally?

Labels
bug Something isn't working server
Development

Successfully merging this pull request may close these issues.

CamelCase usernames get added as lower-case into DB and result in access issues
5 participants