S3 storage backend is not sync-friendly #60
Oh man, I am so sorry you ran into this issue, and that you had to spend so much time debugging it. Thank you for getting to the bottom of it, though. I had no idea that my storage key scheme would affect S3 in this horrible way. It really sucks that a key cannot be a substring of another. That blows.

I'm actually working on a brand new version of my pixl-server-storage module (which Cronicle uses for all file or S3 storage), which has a new feature where you can fully customize the keys by way of a "template" string. The original idea was to prepend some hash directories, for S3 performance reasons, but you could also use this to add a suffix to all keys, which I think would effectively work around this issue. Here is a snippet from my updated docs:

S3 Key Template: Note that Amazon recommends adding a hash prefix to all your S3 keys, for performance reasons. To that end, you can specify a key template such as "keyTemplate": "##/##/[key]". This would replace the 4 hash marks with the first 4 characters of the key's MD5, followed by the full key itself. Besides hash marks, the special [key] macro expands to the full key.

Hi, I'm back. So for this particular S3 issue you found, you can basically force a suffix onto the end of every key by setting the template config property to something like this: "keyTemplate": "[key]/foo". In this case no key ends up being a substring of another.

The other piece needed here is a storage upgrade script, which I still have to write. This would accept two different storage configurations and transfer all data from one to the other. That would allow you to "upgrade" to the new key template, for example. It would be Cronicle-specific. I don't have an official timeline on my new storage module release (it's version 2.0.0 and has a ton of new features and breaking changes, so I'm taking it nice and slow), but it should be soon, maybe 2 months or less. Thanks again for catching this and reporting the issue.
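The key-template scheme described above can be sketched in a few lines of Python. This is an illustrative reimplementation of the documented behavior only; the function name and details are assumptions, not the actual pixl-server-storage code:

```python
import hashlib

def expand_key_template(template: str, key: str) -> str:
    """Expand a storage key template as described in the docs snippet above.

    Each '#' is replaced with successive characters of the key's MD5 hex
    digest, and the '[key]' macro is replaced with the full key itself.
    Illustrative sketch only, not the real pixl-server-storage code.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    out = []
    i = 0  # position within the MD5 hex digest
    for ch in template:
        if ch == "#":
            out.append(digest[i])
            i += 1
        else:
            out.append(ch)
    return "".join(out).replace("[key]", key)

# "##/##/[key]" prepends a 2+2 character hash prefix for S3 performance.
print(expand_key_template("##/##/[key]", "servers"))
# "[key]/foo" forces a suffix, so no expanded key is a prefix of another.
print(expand_key_template("[key]/foo", "servers"))
```

With the "[key]/foo" template, the object for `servers` becomes `servers/foo`, which is no longer a prefix of `servers/0/foo`, which is the property the workaround relies on.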
Hey @aldanor, this should be all fixed now, as of Cronicle v0.8.4. You can now include a special property, documented here: https://github.com/jhuckaby/pixl-server-storage#s3-file-extensions Note that since you already have a working Cronicle installation, you can't just enable the property after the fact. You will need to "migrate" your storage to a new location. It can simply be a new S3 key prefix, a new S3 bucket, or a different AWS region. Full docs on migrating are here: https://github.com/jhuckaby/Cronicle#storage-migration-tool Thanks again for reporting this issue! - Joe
This took me a good while to figure out: I was moving my S3-backed Cronicle db from one S3 storage provider to another, and decided to sync the whole thing with rclone. Everything was seemingly fine, except that the UI was stuck in a "waiting for a master" state. After a fair bit of digging and debugging, I figured out that the sync was only partially complete, and looked like this in the destination bucket:
... and like this in the source bucket:
The core problem here is that there's both a "folder" `servers` and a "file" `servers` (which apparently serves as a sort of metadata header for the "folder"). As soon as the sync tool discovers the "file", it no longer considers it a "folder" and doesn't go digging any further, hence ignoring everything that looks like `servers/*`. I think this happened with s3cmd as well in the past, not just rclone, but I hadn't given it any thought back then, so it might need additional checking. You could say it's an rclone problem (which it is, partially), but given that I couldn't find a single other person running into this issue with either rclone or s3cmd, it looks like people just don't name their S3 objects this way, so the CLI tools don't bother to handle it.

Another side of the same problem would be trying to sync the S3 tree to a local disk: since you can't have a file and a folder with the same name, what would you expect it to do? (This is most likely the reason the sync tools behave the way they do, ignoring `servers/0`.)

I understand that this might be too much of a change, and even if a workaround was implemented (like an optional suffix/prefix for the metadata object so it doesn't collide with the contents), some thought would have to be given to compatibility and migration questions. I'd be happy to discuss this, though, if it helps anything.
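To make the file-vs-folder collision concrete, here is a minimal Python sketch (with a hypothetical temp directory standing in for the sync destination) showing why mirroring both a `servers` object and a `servers/0` object to a local filesystem must fail:

```python
import os
import tempfile

def sync_to_disk(root):
    """Mimic what a sync tool must do when mirroring this bucket layout
    to a local directory: write the metadata object as a file named
    'servers', then create the 'servers/' directory for 'servers/0'.
    Returns the name of the OS error raised, or None if it worked.
    """
    # The metadata header object becomes a plain file named "servers".
    with open(os.path.join(root, "servers"), "w") as f:
        f.write("metadata")
    # S3's flat keyspace happily stores "servers/0" alongside "servers",
    # but a POSIX filesystem cannot nest anything under a regular file.
    try:
        os.makedirs(os.path.join(root, "servers", "0"))
    except OSError as exc:  # typically NotADirectoryError
        return type(exc).__name__
    return None

print(sync_to_disk(tempfile.mkdtemp()))
```

The collision is unavoidable at the filesystem level, which is presumably why the sync tools silently pick one interpretation (the "file") and drop the rest.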
If allowing this convention to be changed happens to be an absolute no-no, maybe there should at least be a note somewhere saying "don't ever try to sync your Cronicle S3 installation with any of the standard S3 command-line tools like rclone; it will fail miserably (and, worst of all, quietly)".
Thanks!