High disk space utilization #24
Any idea on a timeline for this? If you can point me in the right direction I'd be open to helping out.
I have some code ready, and I'll push a branch for testing in about a week. I have to change the way the data is organized on disk so that it avoids wasting space and provides durability at the same time. Instead of a single data file updated in place, Pogreb will have multiple append-only data files (which can periodically be compacted). The indexing part will remain mostly the same, but data file management will change.
@AusIV I pushed changes to the datalog branch (https://github.com/akrylysov/pogreb/tree/datalog), feel free to test it. I'll need to run more benchmarks and do more testing before merging it.
@akrylysov Hey, sorry for the delay in getting back to you here. We've been occupied, but I would like to check this out when I have some downtime, if you haven't tested it already. Have you had a chance to run any benchmarks on this yet?
@samuelmattjohnston the benchmarks look very promising; the space utilization issue was fixed.
Hm. Is there some parameter I should be passing to the database for lower disk utilization? A geth database created with the same code from the GitHub issue you linked seems to be using 7.5 GB, where a normal sync with leveldb seems to be using 2.5 GB -- this was on goerli. I am running another test right now to see what it looks like on mainnet ethereum, and will post back. Hopefully it isn't more than 1 TB (a normal sync is 250 GB). I'm using the datalog branch as of today, in case there is another branch with more fixes I should be testing with.
If you have deletes or updates you should run the Compact method after loading the data; background compaction is not enabled by default on this branch (take a look at the fields of the Options struct). Could you please tell me more about your workload? Ratio of puts/updates/deletes, and the average size of keys and values?
I'm not a database guy, unfortunately. I am a DevOps guy, but I can work with you to get that information if you've got time to help me through the code. I can send an email to you to work out something if you have time.
@akrylysov - I can answer those questions for @samuelmattjohnston. The vast majority of our operations are puts. We have about 2 updates for every 200 puts, and those are both for a key / value pair where both the key and the value are 32 bytes. Our keys are almost all 32 bytes (a small handful are smaller), and the values average 150 bytes, though they can range from 32 bytes to over 1 MB (the theoretical maximum is about 2.2 MB, though we haven't seen many over 1 MB, and the 99th percentile is around 30KB). We pretty much don't delete anything ever. For a bit of context, we're managing Ethereum nodes, which track blockchain data. Since a blockchain is an append-only structure, we don't delete old data. The few updates we have are for a small handful of pointers that track things like the latest block and latest header. |
@AusIV thanks for the detailed answer! Pogreb should work perfectly for append-only workloads. I recommend having two databases: one for the blockchain transactions that never change and another for tracking metadata like the pointers you describe. This way you will never need to run compaction on the first database. When opening the second database, enable background compaction (see the Options struct). I'm positive we can make Pogreb work for your use cases, happy to help!
I have geth running Pogreb in my current fork/branch: https://github.com/samuelmattjohnston/go-ethereum/tree/pogreb -- completely experimental, and not tested much, but it is able to get up to speed with goerli. The server on mainnet seems to be taking its time to get up to speed. If you want to run it yourself:
to run mainnet:
Assuming Linux. Mainnet is going to have larger data for each block than goerli. Goerli gets much less traffic than mainnet, but is easier and quicker to test on. Database code lives in:
Thanks. I tested storing 100k items using https://github.com/akrylysov/pogreb-bench. The results are:
I had already changed that to the datalog branch, but for some reason it got reset. Thanks for the catch. I'll redo my testing now.
Just a quick update: 5 GB usage now for Pogreb on goerli. I've just pushed up the go.mod fix.
Mainnet is giving me issues. I seem to be getting some errors when there is a rollback of headers:
If I try to resume:
I'm not sure what the cause is at the moment, but if I get time I'll be working on it again.
Thanks, I'll test your branch of go-ethereum this weekend. Here is an updated design document if you are curious: https://github.com/akrylysov/pogreb/blob/datalog/docs/design.md.
For what it's worth, /var/lib/ethereum is one mount, /tmp/ is another. If there's a way we can tell pogreb to use the /var/lib/ethereum mount for recovery I think that's what we need. |
Thanks for the bug report, I updated the branch to not use
I'm having difficulties testing go-ethereum. geth panics after some time. Pogreb is not in the stack trace, so I'm not sure if it's related:
Replaces the unstructured data file for storing key-value pairs with a write-ahead log.
- In the event of a crash or a power loss the database is automatically recovered (#27).
- Fixes disk space overhead when storing small keys and values (#24).
- Optional background compaction allows reclaiming disk space occupied by overwritten or deleted keys (#28).
See docs/design.md for more details.
Fixed in version 0.9.0.
Details: ethereum/go-ethereum#20029.
When storing small keys/values Pogreb wastes too much space by making all writes 512-byte aligned.