-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Use intermediate table to compute user permissions #9787
Conversation
models/user_repo_unit.go
Outdated
// This list may be updated as needed. Requisites are: | ||
// - New units need to abide by the same rules as the others | ||
// - Units from all repos must be rebuilt after this change | ||
supportedUnits = []UnitType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this list for? Why only these units?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list is for choosing what kinds of units we will actually be checking using this table (e.g. in listings, link generation, and such). There are some units that don't make much sense caching, like UnitTypeExternalTracker
, perhaps? Or surely UnitTypeReleases
, that can't be disabled in a user by user basis. Since I didn't get to the part where I'm actually using this information, I still don't know which units makes sense to include, but I'd like to avoid processing more units than necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with this change you would still keep the Access
table?
If you included all units you can use it in getUserRepoPermission
, you don't think it is worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with this change you would still keep the
Access
table?
If you included all units you can use it ingetUserRepoPermission
, you don't think it is worth it?
Yes, this will end as a replacement of the access
table. And yes, all units we expect to use with getUserRepoPermission
should go in here. The thing is (AFAIU) that some units exists for repositories to have, not for users. From the back of my head I believe we can't deny UnitTypeReleases
to a particular user or group, so that unit I'd leave out. But including one unit is only a matter of adding it to the list. If we were to include all units, we'd just leave userRepoUnitSupported
empty.
I'm just trying to optimize storage and SQL operations, but if we think that the performance hit won't be that big of a deal, we can certainly enable them all. Those are easy changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least team setting page allow specifying access to Release unit.
models/user_repo_unit.go
Outdated
log.Trace("RebuildUserUnits: rebuilding permissions for repository %d on batch %d", repo.ID, batchID) | ||
|
||
// Make sure we start from scratch; we intend to recreate all pairs | ||
_, err = e.Delete(&UserRepoUnit{RepoID: repo.ID}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a concurrency issue. It should only be done in the end of the batch while copying from the work table in same session?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the concurrency problem. Moving the DELETE
down until the last moment will certainly help. However, I'm thinking of implementing some locking mechanism as well. Since we're not enforcing TRANSACTION MODE = SERIALIZED
(or whatever name this mode has in every other database), source information can change in the middle of our calculations (this is true for other processes in Gitea as well). I'm still evaluating options for that (a queue, an auxiliary table, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use one db session and commit in the end, would that suffice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not according to my tests. e.g. in PostgreSQL:
Initial state of table a
id | value |
---|---|
1 | 'hello' |
2 | 'bye bye' |
- Sesion 1:
BEGIN TRANSACTION;
- Sesion 1:
SELECT value FROM a WHERE ID = 1;
gets'hello'
.- Sesion 2:
BEGIN TRANSACTION;
- Sesion 2:
UPDATE a SET value = 'good bye' WHERE id = 2;
- Sesion 2:
COMMIT;
- Sesion 2:
- Sesion 1:
SELECT value FROM a WHERE ID = 2;
gets'good bye'
, expected'bye bye'
.
I mean, the problem is that there will potentially be certainly be several sessions updating data on the instance (a rare case, but two admins might be creating teams for the same organization at the same time).
Updated so it would (hopefully) passed the tests. |
Since all permissions check are now from And I would like we could add a guest user which userid = 0 to store this table so that we could public some units on private repositories. |
@lunny this change is most useful for listing pages (e.g.
Yes, this does that, but I've set two different cases apart:
Restricted users (recently merged) are supported as well. |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 months. Thank you for your contributions. |
This pull request has been automatically closed because of inactivity. You can re-open it if needed. |
should we realy close this? |
Reopening to bring the discussion back for 1.13, but this PR is still WIP and very much outdated. |
we may not actually completely need an intermediate table. A table that is effectively a superset of:
may simply be enough. That is the columns would be:
We then rationalise collaborator permissions into a collaborator team etc. The locking situation is no worse. It's still a bit of a mess to work with it's better than we currently have. |
@zeripath the proposed table kind of is a superset of those tables. My goal was to get a table that allows for 100% permissions matching between user, repository and unit (type of access). See this function to get what I mean: Lines 401 to 446 in 3bcfbc5
The goal is to eliminate a lot of guessing in the code about what should and shouldn't be allowed so we can maintain a single file for calculating what can and can't be done by any user and thus prevent possible bugs regarding permissions. Although these changes allow for an important improvement on listing queries, it doesn't strictly translates to all possible listing optimizations (e.g. one may want to list repositories that are private or not private, or semi-public, or from org private but public, etc.); those other attributes I believe belong to a different set of data (e.g. an additional column on the repository table, as mentioned in Discord). IMHO we should maintaining permission optimization separated from query optimization in order to avoid painting ourselves into a corner. 😄 |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 months. Thank you for your contributions. |
This is in progress for a while now, I'm closing it. Please reopen when it's ready for review. |
This PR will implement a new table (
user_repo_unit
) for easy check of permissions on listing and actions as described in #9613. This table would make theaccess
table obsolete.The main goal is to simplify and speed up permission checking, which is performed significantly more often than changing any permissions.
I did my best in order to document via comments for easier analysis and future maintenance. I've also tested most of the statements in all the supported databases.
Work in progress
user_repo_unit
remains in sync.access.go
functions to useuser_repo_unit
as the main source of information.user_repo_unit
(some of the most used in this PR, others can be gradually modified in future PRs).Will close #9613.