Storage: Enforce size based retention for topic #2179
Comments
Hi @sehz, I'd like to take a look at this issue. At first glance it looks like a good first issue that goes beyond a simple CLI fix. Would it be a good idea to work on this one as a new contributor?
Sounds good
@sehz Do we really want a max size per topic, or will a max size per partition suffice? A per-topic size calculation requires communication between nodes, whereas a per-partition calculation can be done locally.
Per partition makes sense, but it should be specified at the topic level.
This PR includes the following changes:

1. Added the `max_partition_size` parameter to `fluvio-cli` and to the topic and partition specs.
2. `Cleaner` in `fluvio-storage` is responsible for enforcing replica size. Replica size is index size + log file size. If the size is exceeded, the first segment is removed.
3. `Cleaner` was moved from the segment level to the replica level so that it can access the replica size.
4. Introduced the `Size64` type, as the partition size can easily overflow the current `Size`.
5. The `human_bytes` crate is replaced by `bytesize`. The latter can convert bytes to a string and parse them from a string, whereas the former only converts to a string.

Closes #2179
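To illustrate the enforcement rule in item 2, here is a minimal sketch. It is not the actual `fluvio-storage` code: `Segment`, `Replica`, and `enforce_max_size` are hypothetical names, sizes are plain `u64` values standing in for `Size64`, and two behaviors are assumptions of the sketch rather than facts from the PR: it keeps removing the oldest segment in a loop until the replica fits, and it never removes the last (active) segment.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for one storage segment: a log file plus its index.
#[derive(Debug, Clone, PartialEq)]
struct Segment {
    base_offset: u64,
    log_bytes: u64,
    index_bytes: u64,
}

impl Segment {
    /// Segment size = log file size + index size.
    fn size(&self) -> u64 {
        self.log_bytes + self.index_bytes
    }
}

/// Hypothetical replica: segments ordered oldest first,
/// with the last entry treated as the active segment.
struct Replica {
    segments: VecDeque<Segment>,
}

impl Replica {
    /// Replica size is the sum of all segment sizes (u64 to avoid overflow).
    fn size(&self) -> u64 {
        self.segments.iter().map(Segment::size).sum()
    }

    /// Removes the oldest segments while the replica exceeds
    /// `max_partition_size`. Assumption: the active (last) segment is
    /// always kept, so the replica may still exceed the limit afterwards.
    /// Returns the removed segments, oldest first.
    fn enforce_max_size(&mut self, max_partition_size: u64) -> Vec<Segment> {
        let mut removed = Vec::new();
        while self.size() > max_partition_size && self.segments.len() > 1 {
            if let Some(segment) = self.segments.pop_front() {
                removed.push(segment);
            }
        }
        removed
    }
}
```

Because the check uses only the local replica's segments, it needs no communication between nodes, which matches the per-partition approach agreed on above.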
The current retention policy supports only a time-based property. This may result in an incorrect estimate of the total data size. We should add a maximum data size property as part of retention.
The topic should have a new property, `max_size`, that indicates the maximum amount of data allowed to accumulate.