-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make table column branch.name case sensitive #28633
Conversation
f3a702b
to
e4dd341
Compare
d0ebd57
to
f361ab4
Compare
f361ab4
to
dca09f8
Compare
CI passes, although this is only a quick/partial fix. |
I think XORM should do this automatically |
FWIW, branch names aren't the only thing that break with case insensitivity. And there are a fair number of other, similar issues present. You may also want to fix |
Maybe we should also change the COLLATE from utf8mb4_general_ci to utf8mb4_bin on |
@algernon Thank you for the information, it's thoughtful. My opinions for these problems are:
It's hard to say it is right or wrong. For example, some people depends on the case-insensitive behavior to do database I would try to propose a complete PR later, to leave a chance to end users, while make the behavior consistent by default for most users.
You are right, running
This PR's purpose is just to make a partial/quick fix for the branch name case conflicting, including MSSQL, so the migration has the same behavior at the moment. There are more TODOs IMO. To address the concern "doctor convert by mistake", I think this PR could do more, eg: skip the tables which are already "utf8mb4" |
7a0a789
to
7e7326a
Compare
But it works differently when using another DB engine, so it's a bug. Either because it is case sensitive in other engines, or because it is case insensitive with MySQL. Inconsistency between engines is a bug, I hope we can agree on that. It needs to be the same on all engines. So either it needs to use something like If you want case insensitive behaviour, that should be explicit, like it's done for usernames, and in a couple of other places. Gitea has helper functions like So the real fix here is to make sure that the database is case sensitive, once and for all. For cases where case insensitivity is desired, that should be made explicit, just like it is done for usernames, or issue search (where searching by title or content are explicitly case insensitive, unlike searching by label name, which is case sensitive [but that depends on the database engine, which I assert is a bug]).
Again, the problem is that the behaviour is different between DB engines. If you "leave a chance to end users", that inconsistency will remain. To fix this problem, you need to get rid of the inconsistency in the first place, either by making every engine case sensitive, or making them all case insensitive. Anything else is just magical handwaving that essentially does nothing to address the root cause. You'll just be playing cat and mouse, trying to catch all the instances of the same root cause.
Consider this scenario: the instance has no case sensitive branch names, and the database is using a case insensitive collation. You run the migration, so it gets converted to case sensitive. Later on, the user runs Thus,
Or, you could just do a complete fix that addresses the root cause (case insensitive collation), rather than playing cat and mouse trying to fix instances of the same problem. Like I said, it's not only about branch names. It affects issue labels, and a whole lot more. Fixing it on the column or table level also leaves you open to the same issue in the future, as soon as someone forgets to guard case sensitivity on a new table or column. Adjusting Yes, it does not fix MSSQL (but patching each instance of the problem doesn't, either, only temporarily, at great cost). I suspect there's a reasonable fix there too, but I can't test that, so someone else will have to figure that part out.
Yeah, that'd make it so it doesn't break the database. But if the collation would be changed in |
That's also one of the root problem. TBH, |
Yes. That (and changing the |
Due to |
It isn't perfect, indeed, but it is better than Changing the database collation is cheap, because it doesn't convert any tables, it just sets a default, pretty much, for future tables. So we can use this to figure out which collation is supported. That, or there's |
OK, I didn't post all my plans because I haven't got a full picture. Since we already have a deep discussion (thank you very much), I would like to write all what I thought: My basic rules:
So this PR does point 1: only fix the problem temporarily, and then when we have a complete fix, this migration (and TableCollation) could be removed. Then about the complete fix in my mind: It should never report fatal errors to end users. It should warn users that "your database collation seems wrong, please do ....." The detailed steps are:
I think (just a brief idea) by this approach, everything should be fine eventually. |
I like the proper fix plan for the most part. I don't think we can easily distinguish MySQL from MariaDB from the settings alone, so setting a different default for them is going to be problematic. This is one of the reasons - and user convenience - I suggested auto-detecting it. Having a I already have most of this written, by the way (I started working on the fix yesterday), including plenty of test cases, and I can adjust it to have configurable default charset / collation. If you're interested, I can submit my work in a couple of hours, once I finish it up. |
I have read your PR, I am not sure whether it could implement that plan. And it lacks MSSQL support. Another thing is that I think the tests don't need to be that complex. As long as we can guarantee the database tables are using expected collation, I think "model" related tests could be removed, or just simplify them and put them into their own(related) test file. TBH, the code in my mind might be quite different from your PR 🤣 In history, some F* guys submitted buggy PRs and I made some changes, then they wrote long posts to criticize me and Gitea. I am really afraid of co-working with other PRs now. |
I have not pushed my latest changes yet, because it's a work in progress, and there's still a few things to adjust. Locally, I have:
For docker, running
I like tests that exercise slightly different problems: for example, the branch name test (unpushed atm) tests that with case insensitivity, you can't create branches that differ only in case, while the issue search test tests that you can have unexpected results returned. Same underlying problem, different symptom, so I think testing both is worth the (minimal, really) effort.
Your call, really. I'll submit my PR anyway, and you can judge whether its worth using it. It just feels like a waste of time to do it again, when I already did most of it. |
Thank you. I will close this one and spend more time for a complex fix.
Do you mean I can co-work and edit your PR? |
Yes. Just give me an hour or two to push my WIP, and you can have a go at it, and edit it at your leisure. |
No hurry, I think for existing users they could manually bypass the bug (run the There is enough time before 1.22 release. |
The bug was introduced by #22743 , because some databases are default to "case-insensitive"
Fix #28131
Fix #25909
and more
This PR is only a quick fix, it only fixes the "branch" case-insensitive bug. It is not a complete fix for the database collation problem.
To fix various potential "case" bugs in the future, it's better to make all database collation case-sensitive (mainly MySQL/MSSQL)