Potential eventual consistency issues with SharedDirectory subdirectory deletion/creation #6069

Closed
ChumpChief opened this issue May 7, 2021 · 6 comments
Labels: area: dds (Issues related to distributed data structures)

Comments

@ChumpChief
Contributor

Assume a subdirectory /foo exists, and suppose two users do the following:

User1: Sets key "bar" to value "baz" inside subdirectory "/foo"
User2: Deletes subdirectory /foo
User2: Creates a new subdirectory /foo

// At this point, User1 sees 1 key in /foo and User2 sees 0 keys, which is OK since the messages have not been ordered/transmitted yet.

Messages for the above are ordered and transmitted

// At this point, User1 sees 0 keys in /foo and User2 sees 1 key, which is not OK. Expected is that both see 0 keys.
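
The divergence can be reproduced with a deliberately simplified model. This is only a sketch: `NaiveReplica`, `submit`, `process`, and `runNaiveScenario` are hypothetical names, not SharedDirectory APIs. It assumes each client applies local ops optimistically, skips the acks of its own ops, and applies remote ops to whatever subdirectory currently exists at the path.

```typescript
// Simplified model for illustration only; not the SharedDirectory implementation.
type Op =
    | { type: "set"; path: string; key: string; value: string }
    | { type: "deleteSubDirectory"; path: string }
    | { type: "createSubDirectory"; path: string };

interface SequencedOp {
    clientId: string;
    op: Op;
}

class NaiveReplica {
    // Subdirectory path -> key/value storage.
    private readonly subdirs = new Map<string, Map<string, string>>();
    public readonly outbox: SequencedOp[] = [];

    constructor(public readonly clientId: string, initialPaths: string[]) {
        for (const path of initialPaths) {
            this.subdirs.set(path, new Map<string, string>());
        }
    }

    private apply(op: Op): void {
        switch (op.type) {
            case "set":
                // Applied to whatever subdirectory exists at the path right now.
                this.subdirs.get(op.path)?.set(op.key, op.value);
                break;
            case "deleteSubDirectory":
                this.subdirs.delete(op.path);
                break;
            case "createSubDirectory":
                if (!this.subdirs.has(op.path)) {
                    this.subdirs.set(op.path, new Map<string, string>());
                }
                break;
        }
    }

    // Local edits are applied optimistically and queued for ordering.
    submit(op: Op): void {
        this.apply(op);
        this.outbox.push({ clientId: this.clientId, op });
    }

    // Acks of our own ops are skipped; remote ops are applied as they arrive.
    process(msg: SequencedOp): void {
        if (msg.clientId !== this.clientId) {
            this.apply(msg.op);
        }
    }

    keyCount(path: string): number {
        return this.subdirs.get(path)?.size ?? 0;
    }
}

// Returns [User1 key count, User2 key count] before and after ordering.
function runNaiveScenario(): { before: [number, number]; after: [number, number] } {
    const user1 = new NaiveReplica("user1", ["/foo"]);
    const user2 = new NaiveReplica("user2", ["/foo"]);

    user1.submit({ type: "set", path: "/foo", key: "bar", value: "baz" });
    user2.submit({ type: "deleteSubDirectory", path: "/foo" });
    user2.submit({ type: "createSubDirectory", path: "/foo" });
    const before: [number, number] = [user1.keyCount("/foo"), user2.keyCount("/foo")];

    // The ordering service sequences User1's set ahead of User2's delete and create.
    const ordered = [...user1.outbox, ...user2.outbox];
    for (const msg of ordered) {
        user1.process(msg);
        user2.process(msg);
    }
    const after: [number, number] = [user1.keyCount("/foo"), user2.keyCount("/foo")];
    return { before, after };
}
```

In this model `runNaiveScenario()` yields `before = [1, 0]` and `after = [0, 1]`, matching the two observations above.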

The issue is that User2 doesn't realize it saw the "bar" set before it saw the ack of its own subdirectory delete, so it happily applies the set operation to the new subdirectory it just created rather than tossing it out as applying to the already-deleted subdirectory.
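
One possible way to "toss out" such ops, sketched under assumptions and not necessarily how a real fix would look: each client counts its own unacked subdirectory deletes per path and drops any remote op that targets that path (or anything nested below it) until the delete is acked. Such a remote op must have been sequenced before the delete, so the delete will erase its effect on every other replica anyway. `GuardedReplica`, `pendingDeletes`, and `runGuardedScenario` are hypothetical names.

```typescript
// Hypothetical sketch of a pending-delete guard; not the actual SharedDirectory code.
type DirOp =
    | { type: "set"; path: string; key: string; value: string }
    | { type: "deleteSubDirectory"; path: string }
    | { type: "createSubDirectory"; path: string };

interface SequencedDirOp {
    clientId: string;
    op: DirOp;
}

// True when `path` is `ancestor` itself or nested below it.
const isAtOrUnder = (path: string, ancestor: string): boolean =>
    path === ancestor || path.startsWith(ancestor + "/");

class GuardedReplica {
    private readonly subdirs = new Map<string, Map<string, string>>();
    // Path -> number of local deletes that have not been acked yet.
    private readonly pendingDeletes = new Map<string, number>();
    public readonly outbox: SequencedDirOp[] = [];

    constructor(public readonly clientId: string, initialPaths: string[]) {
        for (const path of initialPaths) {
            this.subdirs.set(path, new Map<string, string>());
        }
    }

    private apply(op: DirOp): void {
        switch (op.type) {
            case "set":
                this.subdirs.get(op.path)?.set(op.key, op.value);
                break;
            case "deleteSubDirectory":
                // Remove the subdirectory and everything nested below it.
                for (const existing of Array.from(this.subdirs.keys())) {
                    if (isAtOrUnder(existing, op.path)) {
                        this.subdirs.delete(existing);
                    }
                }
                break;
            case "createSubDirectory":
                if (!this.subdirs.has(op.path)) {
                    this.subdirs.set(op.path, new Map<string, string>());
                }
                break;
        }
    }

    submit(op: DirOp): void {
        if (op.type === "deleteSubDirectory") {
            this.pendingDeletes.set(op.path, (this.pendingDeletes.get(op.path) ?? 0) + 1);
        }
        this.apply(op);
        this.outbox.push({ clientId: this.clientId, op });
    }

    process(msg: SequencedDirOp): void {
        if (msg.clientId === this.clientId) {
            // Ack of our own op: already applied, only settle the delete bookkeeping.
            if (msg.op.type === "deleteSubDirectory") {
                const left = (this.pendingDeletes.get(msg.op.path) ?? 1) - 1;
                if (left > 0) {
                    this.pendingDeletes.set(msg.op.path, left);
                } else {
                    this.pendingDeletes.delete(msg.op.path);
                }
            }
            return;
        }
        // Remote op seen while our delete is unacked: it targets the old instance, drop it.
        const targetPath = msg.op.path;
        const shadowed = Array.from(this.pendingDeletes.keys()).some((pending) =>
            isAtOrUnder(targetPath, pending),
        );
        if (!shadowed) {
            this.apply(msg.op);
        }
    }

    keyCount(path: string): number {
        return this.subdirs.get(path)?.size ?? 0;
    }
}

// Replays the scenario above; returns [User1 key count, User2 key count].
function runGuardedScenario(): [number, number] {
    const user1 = new GuardedReplica("user1", ["/foo"]);
    const user2 = new GuardedReplica("user2", ["/foo"]);
    user1.submit({ type: "set", path: "/foo", key: "bar", value: "baz" });
    user2.submit({ type: "deleteSubDirectory", path: "/foo" });
    user2.submit({ type: "createSubDirectory", path: "/foo" });
    for (const msg of [...user1.outbox, ...user2.outbox]) {
        user1.process(msg);
        user2.process(msg);
    }
    return [user1.keyCount("/foo"), user2.keyCount("/foo")];
}
```

With this guard both replicas end at 0 keys in the scenario. The prefix check is meant to cover ops aimed at nested subdirectories as well, but the sketch has only been reasoned through for this one ordering.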

Note that this could happen in other variations too:

  1. Deeper subdirectory nesting, e.g. /foo/bar/baz/woz and User2 deletes/recreates the full chain of bar/baz/woz while User1 sets a key in woz.
  2. Creation of subdirectories rather than key sets -- e.g. User1 is trying to create subdirectory /foo/bar while User2 is deleting/recreating /foo.

Discovered by code inspection while looking at #6015.

@ghost ghost added the triage label May 7, 2021
@ChumpChief ChumpChief added the area: dds Issues related to distributed data structures label May 7, 2021
@vladsud
Contributor

vladsud commented May 7, 2021

DDS implementations are essentially OT algorithms, and thus the complexity grows as the square of the number of supported operation types (actual types of ops). Two generic questions:

  • Are we simplifying by reducing the number of operations supported?
  • Are we testing every combination x ordering of them?
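
To make the "combination x ordering" count concrete, here is a small sketch that enumerates every ordered pair of op types. The op type list is an assumption made for illustration, and `enumerateOrderedPairs` is a hypothetical helper, not part of any test suite in the repo.

```typescript
// Op type names assumed for illustration; the real op set may differ.
const directoryOpTypes = [
    "set",
    "delete",
    "clear",
    "createSubDirectory",
    "deleteSubDirectory",
] as const;

interface InteractionCase<T> {
    first: T; // op sequenced first
    second: T; // concurrent op sequenced second
}

// Every ordered pair, including an op type racing against itself: n * n cases.
function enumerateOrderedPairs<T>(types: readonly T[]): InteractionCase<T>[] {
    const cases: InteractionCase<T>[] = [];
    for (const first of types) {
        for (const second of types) {
            cases.push({ first, second });
        }
    }
    return cases;
}

const interactionCases = enumerateOrderedPairs(directoryOpTypes);
console.log(interactionCases.length); // 25 for the five op types assumed above
```

Each case would still need to be run against relevant starting states (for example, target subdirectory present or absent), so the pair count is a lower bound on the test matrix.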

@vladsud
Contributor

vladsud commented Jan 25, 2022

I think it's a corner case of #8823 and should be closed as a dup (with #8823 potentially being renamed to reflect that it's not just about events).

@ChumpChief
Contributor Author

This bug tracks making subdirectory creation/deletion eventually consistent, regardless of whether events are added. I'd prefer to track separately as the former is a data corruption bug whereas the latter is a feature ask to expand scenario support, so it's not obvious to me that we'd prioritize both equally or fix both in a single PR. I've noted in #8823 that its prerequisite is this bug.

@vladsud
Contributor

vladsud commented Jan 25, 2022

That's fine, but then I think #8823 should not start with a description of the consistency issue, and all such discussions should move here.
To me, events are simple, and they reflect local state: users can poke at the directory at any moment in time and learn whether a sub-directory exists or not. Each transition (exists <-> does not exist) is the point where an event should fire, and it really does not depend on the presence or absence of eventual consistency bugs.
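
A minimal sketch of that idea, assuming a hypothetical `SubdirectoryExistenceTracker` that is told about every local state change (from local edits or remote ops alike) and fires only on an actual exists <-> does-not-exist transition. The class, listener shape, and method names are invented for illustration.

```typescript
type ExistenceListener = (path: string, exists: boolean) => void;

// Hypothetical helper: tracks local existence and fires only on transitions.
class SubdirectoryExistenceTracker {
    private readonly existing = new Set<string>();
    private readonly listeners: ExistenceListener[] = [];

    on(listener: ExistenceListener): void {
        this.listeners.push(listener);
    }

    // Called after any change to local state, regardless of where it came from.
    setExists(path: string, exists: boolean): void {
        if (this.existing.has(path) === exists) {
            return; // no transition, so no event
        }
        if (exists) {
            this.existing.add(path);
        } else {
            this.existing.delete(path);
        }
        for (const listener of this.listeners) {
            listener(path, exists);
        }
    }

    has(path: string): boolean {
        return this.existing.has(path);
    }
}

// Records the events produced by a create, redundant create, delete, create sequence.
function recordTransitions(): string[] {
    const log: string[] = [];
    const tracker = new SubdirectoryExistenceTracker();
    tracker.on((path, exists) => log.push(`${path}:${exists ? "created" : "deleted"}`));
    tracker.setExists("/foo", true);
    tracker.setExists("/foo", true); // already exists locally: no event
    tracker.setExists("/foo", false);
    tracker.setExists("/foo", true);
    return log;
}
```

Here `recordTransitions()` returns three entries: created, deleted, created. The events describe only what the local replica observes and say nothing about whether replicas converge.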

Maybe another way to put it: if we are afraid to add events in the current state, then we should remove the ability to learn whether a sub-directory exists or not :) I.e., there should be no createSubDirectory() / getSubDirectory() / deleteSubDirectory() APIs, only getOrCreateSubDirectory(): IDirectory, and no events (no way to observe the existence of the tree structure, only leaf nodes). That's basically a map, with some sugar coating.
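
A rough sketch of what "a map, with some sugar coating" could look like, assuming keys and subdirectory names never contain "/". `LeafOnlyDirectory` and `PrefixedDirectory` are hypothetical and are not the IDirectory interface; only the `getOrCreateSubDirectory` name comes from the comment above.

```typescript
// Hypothetical interface: only leaf values are observable, never the tree structure.
interface LeafOnlyDirectory {
    get(key: string): string | undefined;
    set(key: string, value: string): void;
    getOrCreateSubDirectory(name: string): LeafOnlyDirectory;
}

// Assumption of this sketch: names and keys are non-empty and contain no "/".
function assertPlainName(name: string): void {
    if (name.length === 0 || name.indexOf("/") !== -1) {
        throw new Error(`Invalid key or subdirectory name: "${name}"`);
    }
}

// A view over one flat map; a "subdirectory" is just a longer key prefix.
class PrefixedDirectory implements LeafOnlyDirectory {
    constructor(
        private readonly store: Map<string, string>,
        private readonly prefix: string = "/",
    ) {}

    get(key: string): string | undefined {
        assertPlainName(key);
        return this.store.get(this.prefix + key);
    }

    set(key: string, value: string): void {
        assertPlainName(key);
        this.store.set(this.prefix + key, value);
    }

    // Nothing is stored for the subdirectory itself, so there is nothing to create,
    // delete, or raise events about.
    getOrCreateSubDirectory(name: string): LeafOnlyDirectory {
        assertPlainName(name);
        return new PrefixedDirectory(this.store, `${this.prefix}${name}/`);
    }
}

const flatStore = new Map<string, string>();
const rootDirectory = new PrefixedDirectory(flatStore);
rootDirectory.getOrCreateSubDirectory("foo").set("bar", "baz");
console.log(rootDirectory.getOrCreateSubDirectory("foo").get("bar")); // "baz"
```

Because the only replicated state is the flat map, the delete/recreate race from this issue cannot arise in this shape; the trade-off is that empty subdirectories and subdirectory deletion are not expressible.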

@ChumpChief
Contributor Author

> That's basically a map, with some sugar coating.

My understanding is that longer-term, the Tree DDS may become the recommendation for most or all of the scenarios that sound Directory-like, at which point we may want to consider dropping Directory in order to present a clear Map/Tree choice for customers. I have no particular attachment to Directory; my priority is to ensure customers who say "I have hierarchical data that I want to collaborate on" have a clear best choice available.

@vladsud
Contributor

vladsud commented Feb 1, 2022

Content moved to #8823, which now tracks all eventual consistency issues / discussions.
I've opened #8969 specifically to track addition of events.

@vladsud vladsud closed this as completed Feb 1, 2022