
Fix cft election #1641
Merged: 40 commits, Sep 30, 2020
Conversation

@achamayou (Member) commented Sep 22, 2020

Fixes #589

In a nutshell, the change amounts to:

  1. Publishing a last committable index on election, and the term of that committable index
  2. Resolving the election on that basis, rather than using the last seen commit index and its associated term (a sketch of the resulting vote rule follows this list)
  3. Only rolling back uncommittable transactions, rather than all transactions that are not yet committed
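
A minimal sketch of the vote rule implied by points 1 and 2, with hypothetical names (Index, Term, last_committable_idx and grant_vote are illustrative, not the actual raft.h identifiers):

```cpp
#include <cstdint>

using Index = uint64_t;
using Term = uint64_t;

struct VoteRequest
{
  Term term;                  // candidate's current term
  Index last_committable_idx; // published instead of the last seen commit index
  Term committable_term;      // term in which that committable index was produced
};

struct NodeState
{
  Term current_term;
  Index last_committable_idx;
  Term committable_term;
};

// Standard Raft "at least as up to date" comparison, but computed over the
// committable history rather than the committed one: the vote is granted
// only if the candidate's committable suffix is no older than ours.
bool grant_vote(const NodeState& self, const VoteRequest& req)
{
  if (req.term < self.current_term)
    return false;

  if (req.committable_term != self.committable_term)
    return req.committable_term > self.committable_term;

  return req.last_committable_idx >= self.last_committable_idx;
}
```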

@ghost commented Sep 22, 2020

fix_cft_election@13369 (aka 20200930.8) vs master: ewma over 50 builds, from 12767 to 13349
[benchmark comparison images]

@jumaffre (Contributor) commented

As recovery.py mentions #589, could you remove the check there as well?

CCF/tests/recovery.py

Lines 59 to 63 in 3ca2946

# In theory, check_commit should be sufficient to guarantee that the new primary
# will know about all the recovery shares submitted so far. However, because of
# https://github.com/microsoft/CCF/issues/589, we have to wait for all nodes
# to have committed all transactions.
recovered_network.wait_for_all_nodes_to_catch_up(primary)

src/consensus/aft/raft.h (review thread, outdated, resolved)
@jumaffre (Contributor) commented

The solution in this PR seems simpler than what is described in #589 (comment). I think we should record the "why" of the changes in this PR (either in the original ticket or in this PR) as well as the "how" as this is an important change.

@achamayou (Member, Author) commented

@eddyashton pointed out that this probably isn't right, and that the election should strictly be based on committable_idx/term(committable_idx). I don't have time just now, but I'll update the PR and describe that case later.

@eddyashton mentioned this pull request Sep 23, 2020
@achamayou (Member, Author) commented

PR is up to date, please review. There are a few more tests I want to add before it's ready.

achamayou and others added 2 commits September 23, 2020 14:58
@achamayou marked this pull request as ready for review September 24, 2020 16:11
@achamayou requested a review from a team as a code owner September 24, 2020 16:11
@achamayou (Member, Author) commented

Looking at the test failure, the election change invalidates our current view history reconstruction logic. In particular, we can no longer assume that the commit index is in the term being replayed, and we cannot issue signature transactions until we have "caught up" with previous terms (see the sketch below).

I am making the necessary changes to address this and will change the PR status again once it is ready.
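
A hedged sketch of the "caught up" guard described above (hypothetical names, not the actual history.h interface):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Index = uint64_t;

// view_start_idx[v] holds the first index produced in view v
// (illustrative layout for the reconstructed view history).
struct ViewHistory
{
  std::vector<Index> view_start_idx;
};

// A newly elected primary defers signature transactions until its log has
// reached the start of its own view, i.e. it has finished replaying every
// entry belonging to earlier terms. Until then, the commit index may still
// lie in a previous term, so signing would bake in an inaccurate history.
bool may_emit_signature(
  const ViewHistory& history, size_t current_view, Index last_idx)
{
  if (current_view >= history.view_start_idx.size())
    return false; // this view is not recorded yet: still catching up

  return last_idx >= history.view_start_idx[current_view];
}
```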

@achamayou (Member, Author) commented

raft_test is still failing: we replicate transactions that are marked as committable but do not deserialise as signatures, so committable indices aren't correct on followers. And now that our term history depends on committable rather than committed indices, it is also inaccurate (see the sketch below). Will fix tomorrow.
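
An illustrative sketch of the fix (hypothetical names): a follower only advances its committable index when the replicated entry actually deserialises as a signature, rather than trusting a committable flag alone.

```cpp
#include <cstdint>

using Index = uint64_t;

enum class EntryKind
{
  Ordinary,
  Signature
};

struct Entry
{
  Index idx;
  EntryKind kind;
};

// Advance the committable index only for entries that really are
// signatures; otherwise the term history derived from committable
// indices drifts, which is what raft_test was catching.
Index on_entry_replicated(const Entry& e, Index last_committable_idx)
{
  if (e.kind == EntryKind::Signature && e.idx > last_committable_idx)
    return e.idx;

  return last_committable_idx;
}
```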

DOCTEST_CHECK(r2.get_last_idx() == 3);
}
}
}
@achamayou (Member, Author) replied:

Axed in favour of the end-to-end committable.py for now. The existing raft test scaffolding doesn't have a way to accurately replicate signatures at the moment.

@achamayou marked this pull request as ready for review September 29, 2020 13:41
src/consensus/aft/raft.h (review thread, resolved)
src/node/history.h (review thread, outdated, resolved)
# Suspend three of the backups to prevent commit
backups[1].suspend()
backups[2].suspend()
backups[3].stop()
Reviewer (Member):

What's the difference between suspend() and stop() here? (We suspend 2 and stop 1?)

@achamayou (Member, Author) replied:

suspend() allows resuming the node later; stop() kills it. We do not need backups[3] anymore, since we want only three nodes left to force a unanimous election, and it's unnecessary for it to keep operating after that. We only ever start backup 4 because we want f=2 (the sizing arithmetic is sketched below).
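
The sizing arithmetic behind this, as generic CFT/Raft accounting rather than CCF test-harness code: tolerating f crash faults takes 2f + 1 nodes, and an election needs a majority of f + 1 votes, so with exactly three live nodes out of five an election can only succeed unanimously.

```cpp
#include <cstdio>

int main()
{
  const int f = 2;             // desired crash-fault tolerance
  const int nodes = 2 * f + 1; // 5 nodes in the network
  const int majority = f + 1;  // 3 votes needed to win an election

  // When only `majority` nodes remain live, every live node must vote
  // for the candidate: the election is forced to be unanimous.
  std::printf("nodes=%d, votes needed=%d\n", nodes, majority);

  return 0;
}
```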

tests/committable.py (review thread, outdated, resolved)
uc.post("/app/log/private", {"id": 100 + i, "msg": "Hello world"})
)

# Wait for a signature to ensure those transactions are committable
Reviewer (Member):
This kind of sleep is something we do in several places, and it now potentially doesn't do what was intended: if we're in the fancy election state and Raft is not returning a signable index, we're simply skipping signatures rather than producing them at regular intervals.

This is an argument in favour of continuing to produce signatures (making a maximum-length block of transactions committable) without any global commit claim in them, but that's uglier for ledger parsing. I'm happy to keep the current behaviour (signatures are sometimes skipped) for this PR and discuss alternative approaches offline/in future.

@achamayou (Member, Author) replied:

Yes, that's true, although it will not cause the test to silently pass.

The issue here is that we do not have a way to observe what is committable; the real condition would be that the last of those transactions is committable on backup[1]. Even if we shortened the sig-tx-interval, we wouldn't know that a signature was produced and replicated to backup[1].

@achamayou (Member, Author) commented

Reconfiguration suite failure in nightly

45: 20:22:39.960 | INFO     | reconfiguration:node_configs:17 - 200 {"0":{"address":"127.65.38.252:45699"},"1":{"address":"127.114.112.214:42871"},"2":{"address":"127.114.247.179:39983"},"3":{"add + 69 chars
45: 20:22:39.960 | ERROR    | __main__:run:88 - Test reconfiguration.test_retire_primary failed
45: Traceback (most recent call last):
45: 
45:   File "/mnt/build/2/s/tests/e2e_suite.py", line 163, in <module>
45:     run(args)
45:     │   └ Namespace(app_script=None, binary_dir='.', consensus='cft', debug_nodes=[], domain=None, enclave_type='release', enforce_reqs...
45:     └ <function run at 0x7fe830d0d1f0>
45: 
45: > File "/mnt/build/2/s/tests/e2e_suite.py", line 78, in run
45:     new_network = test(network, args)
45:                   │    │        └ Namespace(app_script=None, binary_dir='.', consensus='cft', debug_nodes=[], domain=None, enclave_type='release', enforce_reqs...
45:                   │    └ <infra.network.Network object at 0x7fe8292eb3d0>
45:                   └ <function test_retire_primary at 0x7fe8293598b0>
45: 
45:   File "/mnt/build/2/s/tests/suite/test_requirements.py", line 20, in wrapper
45:     return func(*args, **kwargs)
45:            │     │       └ {}
45:            │     └ (<infra.network.Network object at 0x7fe8292eb3d0>, Namespace(app_script=None, binary_dir='.', consensus='cft', debug_nodes=[]...
45:            └ <function test_retire_primary at 0x7fe8293599d0>
45: 
45:   File "/mnt/build/2/s/tests/suite/test_requirements.py", line 43, in wrapper
45:     return func(network, args, *nargs, **kwargs)
45:            │    │        │      │        └ {}
45:            │    │        │      └ ()
45:            │    │        └ Namespace(app_script=None, binary_dir='.', consensus='cft', debug_nodes=[], domain=None, enclave_type='release', enforce_reqs...
45:            │    └ <infra.network.Network object at 0x7fe8292eb3d0>
45:            └ <function test_retire_primary at 0x7fe829359940>
45: 
45:   File "/mnt/build/2/s/tests/reconfiguration.py", line 125, in test_retire_primary
45:     pre_count = count_nodes(node_configs(network))
45:                 │           │            └ <infra.network.Network object at 0x7fe8292eb3d0>
45:                 │           └ <function node_configs at 0x7fe8293569d0>
45:                 └ <function count_nodes at 0x7fe829356a60>
45: 

Successfully merging this pull request may close these issues: Raft persistence bug (#589).
3 participants