Run existing nemesis with 90% storage utilization test function #9155
Notes:
|
Unsupported nemeses due to tablets constraints:
Tested nemesis status:
Nemeses expected to fail due to 90% utilization:
Expected Skipped Nemeses:
SLA Nemeses are blocked by #9671:
Last updated: 13/01/2025 |
Please post updates directly here; I do not think we need yet another document. |
I think this effort needs to be handled with several approaches in parallel:
The main challenge here is that if you use a workload that also writes (including overwriting) data during the longevity, the disk space it will consume is very hard to predict once compaction overhead and tombstones are accounted for.
|
@roydahan, many of the above suggestions are already addressed in one way or another and were tested this weekend. I waited for the test results in order to update the issue.
Not sure I follow the idea in (1): we don't want to run each nemesis separately due to the long setup time. |
I believe we are doing replace with our writes for the other cases. Does replace simply not work (i.e. is the space reclamation too slow) when hit with a nemesis? |
It would be difficult to count on space reclamation, I think, unless perhaps the writes are to a really small token range. |
Adding @Lakshmipathi @cezarmoise, since their test cases work with mixed workloads |
@pehala , @roydahan , |
Please look at what the nemesis is doing. If it is creating a large MV, then I would say that is expected |
Adding an MV basically doubles the space of the original table. |
@roydahan, it depends on which columns are selected. In this case only 1 out of 8 columns is selected for the MV:
But perhaps 1 out of 8 is still too big for the remaining capacity. |
the nemesis of
|
If the data is spread equally among those 8 columns, then 1/8 of additional data would bring us above 100%, so it makes sense |
I think we could include it in the list and run it in the test. The nemesis doesn't have a fundamental problem with 90%, so it might uncover a bug |
OK, @pehala, please let me know if we want to open issues for such cases. |
We definitely do want to file a bug for this, but given that you couldn't replicate it since, I think it is enough to file the bug when you encounter it again. I would mark the nemesis in the table as "unstable" and continue with the investigation of the others |
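A quick back-of-the-envelope check of the arithmetic above, as a minimal sketch. It assumes the base table accounts for all of the 90% used space, that data is spread evenly across the 8 columns, and it ignores the primary key columns the view also stores as well as compaction overhead. The function name `projected_utilization` is hypothetical and not part of the test framework.

```python
def projected_utilization(current: float, mv_fraction: float) -> float:
    """Estimate disk utilization after adding a materialized view.

    current      -- utilization before the view is created (0.90 means 90%)
    mv_fraction  -- size of the view relative to the base table
                    (1/8 if one of eight equally sized columns is selected,
                    1.0 if the view duplicates the whole table)

    Assumption: the base table is the only significant consumer of disk space.
    """
    return current * (1.0 + mv_fraction)


if __name__ == "__main__":
    # One of eight columns selected, and the full-copy case for comparison.
    for label, fraction in (("1 of 8 columns", 1 / 8), ("full copy", 1.0)):
        result = projected_utilization(0.90, fraction)
        status = "exceeds capacity" if result > 1.0 else "fits"
        print(f"{label}: {result:.4f} ({status})")
```

Under these assumptions, 0.90 × (1 + 1/8) = 1.0125, so even a view over a single column lands slightly above 100%, which matches the failure seen here.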
To summarize the main testing blockers - |
Is this expected? I am not sure why a simple service restart could increase storage space utilization |
Regardless, this sounds like an issue that needs to be resolved on the Scylla end, or with test expectations. I don't think machinery to track disk utilization and clear it is beneficial to testing |
opened scylladb/scylladb#22020 |
The nemesis of
but I'm not sure what actually happened. There were 4 nodes. I don't know if one of them is zero-token.
Then
|
Not all nemeses are going to work, but we should try running with as many as possible.