Opt-in decentralized backup for s3 #69

Open
dzhelezov opened this issue Jul 6, 2023 · 0 comments

Currently, we rely heavily on the S3 file system to manage chunks. Replacing it with IPFS is hard, but we can make IPFS sourcing and verification opt-in:

  • publish a metadata file each time the S3 storage is updated with new blocks
  • the metadata file may optionally list the S3 buckets that hold each CID, for fast access
  • the metadata file will contain the whole tree structure of the S3 storage
  • at the leaves, store the IPFS CID of each parquet file stored in S3
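
A minimal sketch of what such a metadata file could look like, assuming a nested JSON object in which directories are objects and leaves are CID strings. The field names (`version`, `buckets`, `tree`), the helper names, and the example keys are hypothetical and not part of this proposal; the CIDs are placeholders.

```python
import json


def build_tree(entries):
    """Build a nested dict from (s3_key, cid) pairs.

    Directories become dicts; each leaf maps a file name to its CID string.
    """
    root = {}
    for key, cid in entries:
        parts = key.strip("/").split("/")
        node = root
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = cid
    return root


def make_metadata(entries, buckets=()):
    """Serialize the tree, plus an optional list of S3 buckets, to JSON."""
    doc = {
        "version": 1,              # hypothetical schema version
        "buckets": list(buckets),  # optional fast-access S3 locations
        "tree": build_tree(entries),
    }
    return json.dumps(doc, sort_keys=True)


def ls(tree, path=""):
    """List the names directly under `path`, as an S3 listing would."""
    node = tree
    for part in (p for p in path.strip("/").split("/") if p):
        node = node[part]
    return sorted(node)
```

With a tree like this, a client could answer directory listings from the metadata file alone, for example `ls(json.loads(text)["tree"], "some/prefix")`.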

When downloading, replace the `ls` method with fetching the metadata file from an IPFS gateway (which will be fast) and traversing the tree it contains. For the data itself, one can still download from S3 (which is fast), with the additional step of verifying the CID (this can be done locally without running an IPFS node).
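
A sketch of local CID verification using only the Python standard library. It assumes the simplest case: the file was imported into IPFS as a single raw block with a SHA-256 multihash, so the CIDv1 is just a fixed prefix plus the digest, base32-encoded. Larger files are chunked into a DAG whose root CID depends on the import settings, so verifying them would require reproducing the same chunking; that case is not covered here. The function names are hypothetical.

```python
import base64
import hashlib


def raw_cid_v1(data: bytes) -> str:
    """Compute the CIDv1 (raw codec, sha2-256, base32) of a single block."""
    digest = hashlib.sha256(data).digest()
    # 0x01 = CID version 1, 0x55 = raw codec,
    # 0x12 = sha2-256 multihash code, 0x20 = digest length (32 bytes)
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded  # "b" is the multibase prefix for lowercase base32


def verify(data: bytes, expected_cid: str) -> bool:
    """Check bytes downloaded from S3 against the CID from the metadata file."""
    return raw_cid_v1(data) == expected_cid
```

The check needs only the downloaded bytes and the CID recorded in the metadata file, so no IPFS node is involved.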

For data ingestion, we need a separate process that periodically monitors the S3 buckets and:

  • seeds the new files on IPFS by their CIDs
  • places a storage order on the Crust Network
  • updates the metadata file on-chain
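
One pass of such a monitoring process could be sketched as follows. The IPFS seeding, the Crust storage order, and the on-chain metadata update are passed in as callables because this issue does not specify those APIs; every name here (`ingest_once`, `add_to_ipfs`, `place_order`, `publish`) is hypothetical. The tree uses the same assumed shape as above: nested dicts with CID strings at the leaves.

```python
from typing import Callable, Iterable, Iterator, List


def known_keys(tree: dict, prefix: str = "") -> Iterator[str]:
    """Yield the full S3 key of every leaf already recorded in the tree."""
    for name, child in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        if isinstance(child, dict):
            yield from known_keys(child, path)
        else:
            yield path


def set_leaf(tree: dict, key: str, cid: str) -> None:
    """Record `cid` at the leaf for `key`, creating directories as needed."""
    parts = key.split("/")
    node = tree
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = cid


def ingest_once(
    s3_keys: Iterable[str],
    tree: dict,
    add_to_ipfs: Callable[[str], str],   # seeds a file, returns its CID
    place_order: Callable[[str], None],  # e.g. a Crust storage order
    publish: Callable[[dict], None],     # e.g. the on-chain metadata update
) -> List[str]:
    """Process S3 keys that are missing from the tree; return those keys."""
    seen = set(known_keys(tree))
    new = sorted(k for k in s3_keys if k not in seen)
    for key in new:
        cid = add_to_ipfs(key)
        place_order(cid)
        set_leaf(tree, key, cid)
    if new:
        publish(tree)  # publish once per pass, only when something changed
    return new
```

Running `ingest_once` on a timer gives the periodic behavior; a pass that finds no new files publishes nothing.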