Replies: 4 comments 4 replies
-
Further suggestions.
-
Thanks a lot @stevenj for the detailed proposal! And my apologies for not having suggested writing a discussion in the first place :)
-
Thanks @stevenj for this idea! I think it could work and save substantial time when restoring a node. We had already been thinking about creating an incremental signature of the database, but it has not been implemented yet: the idea was to sign every immutable file in a Merkle tree, store the files in separate archives, and provide a download/verification mechanism for a range. Leveraging rsync looks like a very interesting alternative. I am not convinced that the aggregator should run the rsync server itself, but we could create another node with that responsibility. The main modifications would probably be on the client side.
Regarding the compression level, we could probably start a proof of concept in order to test an implementation 👍
-
I love the idea of supporting some form of delta snapshot download, transferring only what is needed to go from the last snapshot downloaded to the current one. I also love rsync and have used it to great effect for many years in multiple IT situations. I'm not 100% convinced that rsync is the correct approach to take here, though, or at least not a traditional rsync server/client model. It might align with plans for how aggregators will operate; I'm not fully aware of their current functionality or future plans, but my understanding is that they simply point to the locations where snapshots are available and the client picks one place to download from.

What comes to mind for me is the centralization that a traditional rsync client/server model produces, although that is similar to what the client does today. Once the blockchain grows by a few hundred GB more, or reaches multiple TB, I think this could become a limiting factor when a new relay or node does a complete sync of a snapshot.

What I have in mind is essentially a hybrid approach: a torrent-like protocol for decentralization and rsync-style deltas, something akin to BitTorrent Sync, where the client benefits from a decentralized download of the entire snapshot, or of the delta between the most recent snapshot it downloaded and the current one. The aggregator would act like a torrent tracker (which it kind of does now) and provide the list of "swarm" servers (places that have current snapshots available). Instead of the client picking a single server to download from, it would distribute the load over multiple (as many as possible?) snapshot providers while only requesting the chunks that it does not already have.
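Purely to illustrate the shape of that, a sketch where the aggregator endpoint, the mirror URLs, and the chunk split are all invented, and plain HTTP range requests stand in for a real swarm protocol:

```
# Everything here is hypothetical: the endpoint, the mirrors, and the chunk
# split are invented to illustrate a swarm-style download.
curl -s https://aggregator.example.org/snapshots/latest/locations
#   -> a list of providers currently hosting the same snapshot archive

# Pull different byte ranges of the same archive from different providers.
curl -s -r 0-1073741823  https://mirror-a.example/preprod-latest.tar.zst -o chunk0 &
curl -s -r 1073741824-   https://mirror-b.example/preprod-latest.tar.zst -o chunk1 &
wait
cat chunk0 chunk1 > preprod-latest.tar.zst
zstd -t preprod-latest.tar.zst   # verify the embedded checksum
```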
-
Mithril snapshots are large and getting larger, but the majority of data between consecutive snapshots is identical.
This means that fetching the latest snapshot when you already have a previous one wastes a lot of bandwidth and time re-downloading data you already have.
Solutions like epoch-sized chunks have been discussed, but that is a different packaging mechanism: it would require considerable tooling changes and would not be compatible with current processes.
The proposal here is simply to build the entire archive in a way that maximizes redundancy between successive archives, so that an aware downloader only needs to fetch the changes and can reconstruct the full file from that delta download.
This would save a lot of bandwidth on the download server and make it faster to sync updates to the Mithril snapshots; in my testing this method also seemed to produce smaller archives overall.
I have made a simple POC that could enable delta Mithril snapshots to be distributed, so that a later snapshot only requires downloading the differences from the previous one a user has.
This is fully symmetrical: if I had a snapshot made the way I propose and then tried to sync a snapshot that was 10 versions newer, I would only need to download the differences that occurred across those 10 snapshots from the latest snapshot. The TLDR for making the archive is this:
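A sketch of what that could look like; the `db` directory layout and archive name are assumptions, while the zstd flags are the ones explained in the notes below:

```
# Sketch only: directory layout and archive name are assumptions; the zstd
# flags are the ones discussed in the notes further down.
cd db
# Immutable files first, in sorted order, then everything else, so unchanged
# data always lands at the same offsets in every archive.
{ find immutable -type f | sort
  find . -type f -not -path './immutable/*' | sort
} > /tmp/archive-order.txt
tar -cf - --files-from=/tmp/archive-order.txt \
  | zstd -T0 -22 --ultra --check --rsyncable -B1048576 \
      -o ../preprod-latest.tar.zst
```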
The archive made this way is repeatable: in a test I ran locally, only 3 bytes in the first 1MB block of the latest snapshot changed, and the next change occurred after 1.3GB of data.
Technically, the easiest way to utilize this is with rsync: publish a "preprod-latest.tar.zst" and rsync will happily update a local copy to the latest version any time it changes.
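For example (a hedged sketch; the rsync host and module name are made up, only the usage pattern matters):

```
# Hypothetical host/module. rsync compares the published archive against the
# existing local copy and only transfers the blocks that differ.
rsync -av --partial --progress \
  rsync://snapshots.example.org/mithril/preprod-latest.tar.zst \
  ./preprod-latest.tar.zst
```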
Still, it would also be possible to build this functionality into the mithril CLI itself: the client could send the hash of the snapshot it has locally to the server over an API call, and the backend could compare that snapshot against the latest and return the list of 1MB blocks that need to be replaced (in this case [0, 1244+]).
Using HTTP range requests to download partial chunks of a file, we could then fetch block 0, fetch all blocks from 1244 onwards, reconstruct the new "latest" archive locally, and save ourselves roughly 1.3GB of transfer in the process.
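A sketch of that reconstruction, using the numbers from the example above (1MB blocks, changed blocks [0, 1244+]); the URL and file names are illustrative:

```
# Illustrative URL and file names; block size and block list follow the
# example above (1MB blocks, changed blocks [0, 1244+]).
BS=1048576
URL=https://snapshots.example.org/mithril/preprod-latest.tar.zst

# Block 0 changed: fetch its bytes.
curl -s -r 0-$((BS - 1)) "$URL" -o block0.bin
# Everything from block 1244 onwards changed: fetch from that offset to EOF.
curl -s -r $((1244 * BS))- "$URL" -o tail.bin

# Splice: new head + unchanged middle from the old local archive + new tail.
{ cat block0.bin
  dd if=old-latest.tar.zst bs=$BS skip=1 count=1243 status=none
  cat tail.bin
} > preprod-latest.tar.zst

# The --check flag means the result can be verified with zstd itself.
zstd -t preprod-latest.tar.zst
```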
This would be very efficient and very quick compared to fetching the otherwise 100% redundant 1.3GB of data every time.
On mainnet this would be a very large performance win.
It's also fully backward compatible with existing processes: you could still pull a named snapshot every time, and it doesn't matter that it has been constructed this way.
Notes:
- Use `find` to ensure the immutable data always appears first in the archive and is sorted, so that data that doesn't change is always in exactly the same place in the archive. This is critical to maximize overlap in the compressed data.
- In zstd I used `-22 --ultra`, because why not use the strongest compression available (this turned the latest snapshot from 1.7GB to 1.4GB in my testing on a late preprod snapshot).
- `--check` adds a zstd checksum at the end of the file, so once the new latest has been synced it is easy to check whether the archive still has integrity.
- `--rsyncable` is the flag that is necessary to ensure the file's blocks are orderly and there is maximum redundancy between archives.
- `-B1048576` sets the rsync block to 1MB, so it is efficient to just check for any change in a 1MB block and pull a copy of that block, rather than trying to do a delta diff on every byte.

Doing this could save a massive amount of time, and it could make it feasible to run a full local node simply on delta snapshots, especially if the node is just being used to populate dbsync or the like. New snapshots are posted every 1:15 hours, and fetching a delta snapshot would take minutes on any reasonable connection, which could be more than reasonable timeliness of data for many applications.
This could also be done in reverse: when a new Mithril archive is made, it could be sent to the aggregator as a delta from the previous archive. Saving this bandwidth on the aggregator could reduce its operational costs and load.
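A hedged sketch of that direction, assuming the aggregator (or a companion node) exposes a writable rsync module; the host, module, and file names are made up:

```
# Hypothetical destination. rsync uses the previous archive already present
# on the aggregator as the delta basis, so only changed blocks cross the
# wire; --inplace updates that copy rather than writing a temporary file.
rsync -av --inplace \
  ./preprod-latest.tar.zst \
  rsync://aggregator.example.org/snapshots/preprod-latest.tar.zst
```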