Implementation of strong consistency using locks #387

Draft
wants to merge 11 commits into master

Conversation

peterzeller (Member)

This PR adds support for strongly consistent (serializable) transactions using locks.

Current status:

There is still one failing test case:
The cluster_failure_test_2 test sometimes fails. The test crashes a node holding the locks and restarts it after some time. After the restart, another DC should be able to acquire the locks again, but the crashed DC is sometimes unable to read its CRDT state after restarting. I still have to debug this case - I am not yet sure whether this is a problem with the lock implementation or a general problem in Antidote. Unfortunately, the bug occurs less frequently the more log messages I add for debugging.

The rest of the implementation should be working.

Documentation is available at https://github.com/AntidoteDB/antidote/blob/strong-consistency-3/src/antidote_locks.md
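
For illustration, client-side usage might look roughly like the sketch below. This is an assumption-laden sketch, not the PR's confirmed API: the property name exclusive_locks and the lock identifier format are guesses and should be checked against antidote_locks.md.

%% Hypothetical sketch: start a transaction that first acquires a lock,
%% so that everything inside it is serializable with respect to other
%% transactions requesting the same lock.
{ok, TxId} = antidote:start_transaction(ignore,
    [{exclusive_locks, [<<"my_lock">>]}]),        % lock property name is assumed
Obj = {<<"key">>, antidote_crdt_counter_pn, <<"bucket">>},
{ok, [_Val]} = antidote:read_objects([Obj], TxId),
ok = antidote:update_objects([{Obj, increment, 1}], TxId),
{ok, _CommitClock} = antidote:commit_transaction(TxId).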

@bieniusa (Contributor)

From the documentation:

The call to start_transaction may fail if Antidote is unable to acquire the lock.

What message is then sent to the client?
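
(Presumably the failure surfaces as the usual Erlang error tuple from start_transaction; a hypothetical client-side pattern, where the shape of Reason is exactly what this question asks about:)

case antidote:start_transaction(ignore, [{exclusive_locks, [<<"my_lock">>]}]) of
    {ok, TxId} ->
        %% lock acquired; proceed with the transaction
        do_work(TxId);
    {error, Reason} ->
        %% what does Reason look like when lock acquisition fails?
        {error, Reason}
end.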

@@ -0,0 +1,58 @@
%% -------------------------------------------------------------------
%%
%% Copyright <2013-2018> <
Contributor

Shouldn't this become 2019?

peterzeller (Member Author)

It's the same in other files. I can create a separate PR to update the date everywhere.

read_write_process = spawn_link(fun() -> read_write_process(Self) end)
}}.

%%check_lock_state_process(Pid) ->
Contributor

Can be deleted?


- *Dynamic Membership*: Adding and removing DCs is currently not supported.
- *Fault tolerance*: In the following scenarios, the current implementation will not work correctly:
- CRDT state cannot be updated locally: Implementation will wait for the update indefinitely.
Contributor

I don't understand this comment: Which CRDT state update is blocking here?

peterzeller (Member Author)

When a lock is updated, I use a CRDT update operation. This update might fail, e.g. if the partition holding the lock is currently not available. There is no error handling for this case yet.
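
(A minimal sketch of what bounded error handling might look like; the helper apply_lock_update/1 and the retry policy are hypothetical, not part of this PR:)

%% Hypothetical: retry the lock-CRDT update a bounded number of times
%% instead of waiting for it indefinitely.
update_lock_with_retry(_LockUpdate, 0) ->
    {error, partition_unavailable};
update_lock_with_retry(LockUpdate, Retries) when Retries > 0 ->
    case apply_lock_update(LockUpdate) of   % assumed wrapper around the CRDT update
        ok -> ok;
        {error, _Reason} ->
            timer:sleep(100),               % simple fixed backoff
            update_lock_with_retry(LockUpdate, Retries - 1)
    end.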

{Actions, NewState}.


%%% computes lock request actions for the locally waiting locks
Contributor

Is this still required?

@albsch (Member) commented Sep 21, 2020

Using -erl_args -hidden causes every suite to restart and reconnect every node. This causes an infinite log_read request loop in the clocksi_SUITE, which indicates either a problem with the hidden parameter or an inter-dc communication problem (which goes back to even before my inter-dc simplification).
