Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RCORE-2029 Fix crash when removing dictionary key #7569

Merged
merged 2 commits into from
Apr 15, 2024

Conversation

danieltabacaru
Copy link
Collaborator

What, How & Why?

Syncing removal of a dictionary key can cause a crash if the dictionary key is already removed locally. We've relaxed the instruction applier so removing a key that does not exist anymore is not an error (it also aligns with removing elements from a set).

Fixes #7488.

☑️ ToDos

  • 📝 Changelog update
  • 🚦 Tests (or not relevant)
  • [ ] C-API, if public C++ API changed
  • [ ] bindgen/spec.yml, if public C++ API changed

Copy link

Pull Request Test Coverage Report for Build daniel.tabacaru_788

Details

  • 48 of 48 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 92.103%

Totals Coverage Status
Change from base Build 2207: 0.02%
Covered Lines: 245284
Relevant Lines: 266314

💛 - Coveralls

@danieltabacaru
Copy link
Collaborator Author

danieltabacaru commented Apr 10, 2024

An alternative to this would be to always discard the incoming erase if it matches a local erase (but it should be changed on the server too). What do you think @tkaye407? I personally like the current proposal since it aligns with sets (and even removing objects)

@tkaye407
Copy link
Contributor

I agree in the sense that I think its nice for things to be a bit more lenient in general and we seem to have already chosen that behavior for EraseObject / SetErase as you mention above. I worry about adding too much logic into OT to solve seemingly not super-important issues (this one being that changing the applier behaviour seems easier, simpler, and more flexible)

@jbreams
Copy link
Contributor

jbreams commented Apr 10, 2024

What's the summary of the sets of instructions involved here? Is the issue that we have an instruction that erased the dictionary element but its timestamp came before the timestamp from the server so it gets discarded?

@danieltabacaru
Copy link
Collaborator Author

danieltabacaru commented Apr 10, 2024

I worry about adding too much logic into OT to solve seemingly not super-important issues

I fully agree. It was just an alternative (not worth considering imo for all the reasons mentioned).

@danieltabacaru
Copy link
Collaborator Author

What's the summary of the sets of instructions involved here? Is the issue that we have an instruction that erased the dictionary element but its timestamp came before the timestamp from the server so it gets discarded?

indeed that's the case

Copy link
Contributor

@michael-wb michael-wb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I'm surprised we haven't seen more issues reported about this before now...

CHANGELOG.md Outdated Show resolved Hide resolved
@jbreams
Copy link
Contributor

jbreams commented Apr 10, 2024

Does the server already have the behavior where trying to remove an element from a dictionary that doesn't exist silently fails? My first inclination is that OT should never result in a changeset that cannot be applied cleanly and that this is a correctness bug. Also seems like another place where having Update mean many things leads to subtle bugs. I could be convinced this is okay though.

@danieltabacaru
Copy link
Collaborator Author

My first inclination is that OT should never result in a changeset that cannot be applied cleanly and that this is a correctness bug.

I agree in principle. I see OT as part of the puzzle. It needs to guarantee correctness and convergence, but being able to apply the instructions is all about the local data and the other instructions. Moving all that context (and special casing) from the InstructionApplier will bloat the rules. I personally believe it doesn't belong to the rules.

Does the server already have the behavior where trying to remove an element from a dictionary that doesn't exist silently fails?

@tkaye407 do you have any input on this?

CHANGELOG.md Outdated Show resolved Hide resolved
Copy link
Contributor

@jbreams jbreams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline with @danieltabacaru . I still think there's value in making the OT rules produce changesets we're guaranteed to be able to integrate without errors, but am okay with this work-around for now so we don't crash the process when this happens.

@danieltabacaru danieltabacaru merged commit 979a4d2 into master Apr 15, 2024
3 of 4 checks passed
@danieltabacaru danieltabacaru deleted the dt/remove_dict_key_crash branch April 15, 2024 07:06
jedelbo pushed a commit that referenced this pull request Apr 15, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash when integrating removal of already removed dictionary key
4 participants