-
-
Notifications
You must be signed in to change notification settings - Fork 4.7k
fix(systemtags): remove duplicates, cleanup existing tags #56079
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
f8eaffa to
6e099a3
Compare
e30d0af to
5ceb404
Compare
5ceb404 to
7927592
Compare
b659742 to
19aa6fc
Compare
|
Ok @artonge and @CarlSchwan I think I addressed most of the performance issues. |
Signed-off-by: skjnldsv <skjnldsv@protonmail.com>
19aa6fc to
59871af
Compare
| $objectCounts = []; | ||
| foreach ($this->getAllTagObjectCounts() as $tagId => $count) { | ||
| $objectCounts[$tagId] = $count; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this be done in getAllTags by joining both tables and stored as a property in each $tag array?
| foreach ($tags as $tag) { | ||
| $sanitizedMap[$tag['sanitizedName']][] = $tag; | ||
| $totalTags++; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defeat a bit the purpose of having an iterator as we store all the $tag in memory in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might be fine with loading everything in memory, as this is a manually run repair step, so there are not a lot of risks with it failing.
If needed due to memory exhaustion in potential instances with a lot of tags, the approach would probably be to be optimistic by updating all tag names, and only do the merging when it is failing.
Something like:
$tags = $getAllTags();
foreach ($tags as $tag) {
try {
sanitizeName($tag);
} catch ($ex) {
mergeTags($tag);
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, worse case scenario, we can adjust/optimise further as follow-ups as this is an optoinal step anyway
| use OCP\SetupCheck\ISetupCheck; | ||
| use OCP\SetupCheck\SetupResult; | ||
|
|
||
| class RepairSanitizeSystemTagsAvailable implements ISetupCheck { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m afraid of the perf impact of this setup step on an instance with lots of tags.
All tags will be read on each setup check run, even if the repair step was run already. Is it really necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a one time repair anyway, once it's done, it should never happens again.
Should we add an app config flag to return true once it's done ?
Or we could also add a timeout for frontend setup check and only says to run this with cli ?
I dunno 🤷
Maybe we're over thinking it 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would simply remove the setup check I think.
We should not add a setup check for each repair step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when will it be run? Never if the admins are not aware it exists 😕
Summary
Some tags can be created with empty whitespace or
\tor\r... etcThis leads to confusion on the handling of tags.
Details
Important
Testing
Here is a SQL script to populate the tables with fake data (click to expand)
- Sanitize and merge duplicate system tags - Starting sanitization of system tags... - Found 21 tags with 10 unique sanitized names. - WARNING: Cannot merge tag group 'Confidential': tags have different visibility or editable settings. Manual verification required. Tag IDs: 9, 8, 7 - Merged tags [17, 16, 15] into ID 14 (sanitized: 'ToDo') - Merged tags [3, 2] into ID 1 (sanitized: 'Important') - WARNING: Cannot merge tag group 'Archive': tags have different visibility or editable settings. Manual verification required. Tag IDs: 11, 10 - Merged tags [19] into ID 18 (sanitized: 'Internal') - Sanitized tag ID 13: 'Meeting Notes ' → 'Meeting Notes' - Merged tags [21] into ID 20 (sanitized: 'Popular') - Sanitized tag ID 12: 'Project Alpha ' → 'Project Alpha' - Merged tags [6] into ID 4 (sanitized: 'Urgent') - System tag sanitization and merge completed.Checklist
3. to review, feature component)stable32)