Set "Remove" operation removes all tags repeatedly #38
Comments
Thanks for looking into this! I think your intuitions are correct and this is an optimization we can probably do. I want to sleep on it, but my thinking now is that the order imposed by the DAG allows us to do this without consequences, even when multiple branches are present, and when they are being synced in parallel (something that does not happen now but that I would like to address some day). As mentioned, this should improve delta sizes for scenarios where a key has a large number of tombstones because it is being added and removed repeatedly.
Thanks for the quick feedback. I'll dig into this more; I just wanted to make sure I wasn't missing something. From what I can tell (and obviously, since I created the issue), we should be able to do away with the duplicate tombstoning without issue. The only concern would be #23, but that is independent of this and more of a global issue across the entire design. I'll follow up after I do some tests with this optimization.
Yes, I was thinking of #23 too, and had the same thought.
@jsimnz do you have a patch you want to submit for this?
I do! I got caught up with some other work, but I can open a PR.
Fixed in #48. Thanks very much for the contribution!
Currently, calling set.Rmv(key) runs an element prefix query for the given key and tombstones all the results it gets. This means a full element removal every time a Rmv() operation is run, which results in redundant tombstones ending up in the Delta. Should it not tombstone only the elements that haven't yet been tombstoned?
Example (assume a merge operation is called between each action):
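The sequence below is an illustrative reconstruction: the key and values are hypothetical, the call signatures are simplified, and Qa/Qb stand for the block IDs of the two added elements.

```go
set.Add("k1", []byte("v1")) // element stored under key "k1", tagged with block ID Qa
set.Add("k1", []byte("v2")) // element stored under key "k1", tagged with block ID Qb
set.Rmv("k1")               // tombstones Qa and Qb
set.Rmv("k1")               // tombstones Qa and Qb again -> redundant tombstones in the delta
```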
In the example above, Qa and Qb were tombstoned twice (once for each Rmv() call). Additionally, this problem only gets worse if you were to make several hundred updates to a key, followed by removals. Given that the entire BlockID is included in a tombstone Delta, this substantially inflates the storage requirements.

Given that each Merge call updates the local state key-value store, the record of which keys have been tombstoned is already available. In the case where we are syncing from a remote store, we already sync the Merkle Clock DAG and its deltas as well, so again we have a consistent view of which keys have been tombstoned and when. In either of these cases, do we not already have the necessary information to only add tombstones for elements which haven't actually been tombstoned yet?
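A minimal sketch of the kind of check being proposed, under the assumption that the local state maintained by Merge can answer whether an element is already tombstoned. The types and helper names here are illustrative, not the actual go-ds-crdt internals:

```go
package main

import "fmt"

// Illustrative types standing in for the set's internal state; the real
// implementation stores elements and tombstones in a datastore.
type elem struct {
	ID  string // block ID of the delta that added the element
	Key string
}

type delta struct {
	Tombstones []elem
}

type set struct {
	elemsByKey map[string][]elem // what the element prefix query would return
	tombstoned map[string]bool   // tombstone records written by previous Merge calls
}

// Rmv tombstones only the elements for key that are not already tombstoned,
// instead of re-tombstoning everything the prefix query returns.
func (s *set) Rmv(key string) delta {
	var d delta
	for _, e := range s.elemsByKey[key] {
		if s.tombstoned[e.ID] {
			continue // already tombstoned: skip the redundant tombstone
		}
		d.Tombstones = append(d.Tombstones, e)
		s.tombstoned[e.ID] = true // in practice this would happen when the delta is merged
	}
	return d
}

func main() {
	s := &set{
		elemsByKey: map[string][]elem{"k1": {{ID: "Qa", Key: "k1"}, {ID: "Qb", Key: "k1"}}},
		tombstoned: map[string]bool{},
	}
	fmt.Println(len(s.Rmv("k1").Tombstones)) // 2: Qa and Qb tombstoned once
	fmt.Println(len(s.Rmv("k1").Tombstones)) // 0: nothing new to tombstone
}
```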
Is this an intended design for the Merkle-CRDT semantics of an ORSet, or was it taken verbatim from the original design in the delta-state paper?