Adding a propose request vote message to speed up some reconfigurations #5697

Merged: 28 commits, Oct 5, 2023

@achamayou (Member) commented Sep 29, 2023

The purpose of this change is to speed up code updates by skipping the election timeout, but it extends more generally to any reconfiguration in which the primary is retired. Instead of waiting for a node in the next configuration to time out, the retiring primary can nudge a plausible candidate [*] to succeed it.

[*] Plausible is currently defined as having the highest match index, with ties broken in favour of the node that belongs to the more recent configuration. This favours the "quickest/most up to date" node, even though it may itself be about to retire, because it has the best chance of winning the election and can itself do another quick handover later.
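
As a rough illustration of the rule in the footnote above, the selection could be sketched in Python as follows. The `Peer` type, its fields, and `pick_successor` are hypothetical names invented for this sketch; they are not taken from the CCF code base, and the real implementation may track this state quite differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Peer:
    """Hypothetical view the retiring primary has of another node."""
    node_id: str
    match_index: int        # highest log index known to be replicated on the node
    latest_config_idx: int  # index of the most recent configuration containing it


def pick_successor(peers: Sequence[Peer]) -> Optional[Peer]:
    """Pick the peer with the highest match index, breaking ties in
    favour of the peer that belongs to the more recent configuration."""
    if not peers:
        return None
    return max(peers, key=lambda p: (p.match_index, p.latest_config_idx))


peers = [
    Peer("a", match_index=10, latest_config_idx=1),
    Peer("b", match_index=10, latest_config_idx=2),
    Peer("c", match_index=9, latest_config_idx=3),
]
print(pick_successor(peers).node_id)
```

In this sketch `c` loses despite being in the newest configuration, because match index is compared first; `b` beats `a` only on the tie breaker.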

An alternative heuristic, discussed in a comment, would be to prioritise nodes in the latest configuration, even if they aren't as up to date, so long as they are up to date enough to potentially win an election. I suspect this would be a worse choice, because it increases the chance of a multi-round election.
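
For comparison, the alternative heuristic might look like the sketch below. Everything here is an assumption made for illustration: `Candidate`, `pick_latest_config_first`, and in particular the `min_index` threshold, which stands in for "up to date enough to potentially win an election" (the PR does not define that condition precisely).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical view the retiring primary has of another node."""
    node_id: str
    match_index: int        # highest log index known to be replicated on the node
    latest_config_idx: int  # index of the most recent configuration containing it


def pick_latest_config_first(
    nodes: Sequence[Candidate], min_index: int
) -> Optional[Candidate]:
    """Among nodes whose match index is at least min_index (assumed
    eligibility threshold), prefer the most recent configuration and
    only then the highest match index."""
    eligible = [n for n in nodes if n.match_index >= min_index]
    if not eligible:
        return None
    return max(eligible, key=lambda n: (n.latest_config_idx, n.match_index))


nodes = [
    Candidate("a", match_index=10, latest_config_idx=1),
    Candidate("b", match_index=8, latest_config_idx=2),
]
print(pick_latest_config_first(nodes, min_index=8).node_id)
print(pick_latest_config_first(nodes, min_index=9).node_id)
```

The two calls show the trade-off: with a permissive threshold the less up-to-date node in the newer configuration is chosen, which is the case suspected above of leading to multi-round elections.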

Quick comparison with the rotation test, doing two leader retirements in a row:

Without:

amchamay@amchamay:~/CCF/build-main$ ./tests.sh -VV -R rotation_test_cft | grep -i "new primary" 
62: 13:10:10.813 | INFO     | infra.network:wait_for_new_primary:1389 - Waiting up to 75.0s for a new primary different from 0 (51291d9b801bd6bf568f3640efa0cc9945ce8ec2cdccdce2d2ee63e8b28ff68d) to be elected...
62: 13:10:13.888 | INFO     | infra.network:wait_for_new_primary:1406 - New primary after 3.07s is 2 (64f3181bea310b99480561c0eaa70707e83f1e5bd7bced547a09c2ca369428fc) in term 3
...
62: 13:10:22.582 | INFO     | infra.network:wait_for_new_primary:1389 - Waiting up to 75.0s for a new primary different from 2 (64f3181bea310b99480561c0eaa70707e83f1e5bd7bced547a09c2ca369428fc) to be elected...
62: 13:10:24.735 | INFO     | infra.network:wait_for_new_primary:1406 - New primary after 2.15s is 1 (ad0f0cee5a96800378d3f6a9d205f8a42e0e31b82df7c485f52ac1948cd8af43) in term 4
...

With:

amchamay@amchamay:~/CCF/build$ ./tests.sh -VV -R rotation_test_cft | grep -i "new primary" 
62: 13:00:00.061 | INFO     | infra.network:wait_for_new_primary:1389 - Waiting up to 75.0s for a new primary different from 0 (f742655a06c574d0b727c551f5bd3b1516f5b458181443037b946530d2185d02) to be elected...
62: 13:00:00.146 | INFO     | infra.network:wait_for_new_primary:1406 - New primary after 0.08s is 1 (731a4a7753e5360086697dfe22edfd57dedc6a94e06d9ec5bc110d9e1592872e) in term 3
...
62: 13:00:08.496 | INFO     | infra.network:wait_for_new_primary:1389 - Waiting up to 75.0s for a new primary different from 1 (731a4a7753e5360086697dfe22edfd57dedc6a94e06d9ec5bc110d9e1592872e) to be elected...
62: 13:00:08.588 | INFO     | infra.network:wait_for_new_primary:1406 - New primary after 0.09s is 2 (96474d47986161d1826427eb0b3a7556c89d19943bbea0d5661b0141d725f364) in term 4
...
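
To compare the two runs above without reading timestamps by eye, the handover durations can be pulled out of the `New primary after ...` messages. This is a small sketch, not part of the CCF test infrastructure: `handover_times` is a hypothetical helper, and the sample strings are message fragments from the logs above with the node IDs elided.

```python
import re
from statistics import mean

# Matches the "New primary after <seconds>s is <node> (<id>) in term <term>" message.
NEW_PRIMARY = re.compile(
    r"New primary after (\d+(?:\.\d+)?)s is (\d+) \(.*?\) in term (\d+)"
)


def handover_times(lines):
    """Return the election durations, in seconds, found in the given log lines."""
    times = []
    for line in lines:
        match = NEW_PRIMARY.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


# Fragments of the messages above; node IDs replaced by "..." for brevity.
without_change = [
    "New primary after 3.07s is 2 (...) in term 3",
    "New primary after 2.15s is 1 (...) in term 4",
]
with_change = [
    "New primary after 0.08s is 1 (...) in term 3",
    "New primary after 0.09s is 2 (...) in term 4",
]

print("without:", mean(handover_times(without_change)), "s")
print("with:   ", mean(handover_times(with_change)), "s")
```

On these two runs the handover drops from a few seconds to under 0.1 s; two samples per run are of course too few to say more than that.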

@ghost commented Sep 29, 2023

propose_vote_request@76920 aka 20231004.43 vs main ewma over 20 builds from 76474 to 76915
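
The header above compares one branch build against an exponentially weighted moving average (EWMA) of recent `main` builds. A minimal sketch of such an average follows; the smoothing factor the bot actually uses is not stated here, so `alpha = 2 / (span + 1)` is an assumption borrowed from a common convention, and `ewma` is a hypothetical helper.

```python
def ewma(values, span=20):
    """Exponentially weighted moving average of values, oldest first.

    Assumes alpha = 2 / (span + 1); the real smoothing factor is unknown.
    """
    if not values:
        raise ValueError("need at least one value")
    alpha = 2.0 / (span + 1)
    average = values[0]
    for value in values[1:]:
        # Newer builds pull the average towards themselves by a factor alpha.
        average = alpha * value + (1.0 - alpha) * average
    return average


# "Commit latency factor" of the first three main builds in the table.
commit_latency_factor = [0.766724, 0.790189, 0.811302]
print(ewma(commit_latency_factor))
```

A branch value would then be compared against this smoothed baseline rather than against any single, possibly noisy, `main` build.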

main

| build_id | build_number | pi_basic_mt_sgx_cft^ | pi_basic_mt_sgx_cft_mem | pi_basic_mt_virtual_cft^ | Commit latency factor | tpcc_virtual_cft^ | ls_virtual_cft^ | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | pi_ls_virtual_cft^ | pi_basic_virtual_cft^ | pi_basic_js_virtual_cft^ | ls_jwt_virtual_cft^ | pi_ls_jwt_virtual_cft^ | ls_sgx_cft^ | ls_sgx_cft_mem | ls_js_virtual_cft^ | pi_ls_sgx_cft^ | pi_ls_sgx_cft_mem | ls_full_js_virtual_cft^ | pi_basic_sgx_cft^ | pi_basic_sgx_cft_mem | ls_js_jwt_virtual_cft^ | pi_basic_js_sgx_cft^ | pi_basic_js_sgx_cft_mem | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | hist_sgx_cft^ | pi_ls_jwt_sgx_cft^ | pi_ls_jwt_sgx_cft_mem | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 76474 | 20230927.1 | 27575.6 | 2.30851e+07 | 69514.1 | 0.766724 | 17281.2 | 45879.6 | 5608.81 | 8.59996e+07 | 47971.1 | 53380.4 | 4463 | 17307.8 | 19704 | 14057 | 1.88908e+07 | 17087.1 | 14233.1 | 1.05021e+07 | 14716.8 | 15637.3 | 1.46964e+07 | 9919.16 | 1437 | 1.25993e+07 | 6861.92 | 1.67936e+07 | 45630.4 | 6967.8 | 6.30784e+06 | 5795.6 | 1.67936e+07 | 5771.67 | 1.67936e+07 | 3999.03 | 1.67936e+07 | 831956 | 1.17776e+06 | 8.15585e+06 | 3.07203e+07 |
| 76488 | 20230927.7 | 28284.8 | 2.30851e+07 | 63452.7 | 0.790189 | 17354.7 | 45823.4 | 5550.33 | 8.59996e+07 | 47430.7 | 54927.3 | 4447.6 | 17260.7 | 19870.8 | 13976.3 | 1.88908e+07 | 17176.1 | 14123.5 | 1.05021e+07 | 14914.4 | 15542.1 | 1.25993e+07 | 9796.07 | 1436.7 | 1.25993e+07 | 7237.01 | 1.67936e+07 | 41180.1 | 6956.7 | 6.30784e+06 | 5812.98 | 1.67936e+07 | 5448.81 | 1.67936e+07 | 3998.52 | 1.67936e+07 | 831215 | 1.18125e+06 | 8.14751e+06 | 3.09361e+07 |
| 76520 | 20230928.1 | 27622.7 | 2.30851e+07 | 65695.6 | 0.811302 | 17264.1 | 45732.8 | 5528.73 | 8.59996e+07 | 47735.2 | 54858.6 | 4433.4 | 17130.3 | 19066.5 | 13953 | 1.88908e+07 | 16924.4 | 13992.5 | 1.05021e+07 | 16923.5 | 15250 | 1.46964e+07 | 9805.85 | 1419.2 | 1.25993e+07 | 6863.35 | 1.88908e+07 | 42824 | 6818.3 | 6.30784e+06 | 5785.81 | 1.67936e+07 | 5461.2 | 1.67936e+07 | 3978.65 | 1.67936e+07 | 831666 | 1.18348e+06 | 8.15342e+06 | 3.10661e+07 |
| 76543 | 20230928.11 | 27915.4 | 2.51822e+07 | 72582.5 | 0.803093 | 17380.6 | 45620.7 | 5559.9 | 8.59996e+07 | 47866.8 | 54588.7 | 4463.5 | 17525.7 | 19217.1 | 14011.7 | 1.88908e+07 | 16974.1 | 13946.2 | 1.05021e+07 | 16888.9 | 15425.8 | 1.46964e+07 | 9918.21 | 1423.5 | 1.25993e+07 | 6834.74 | 1.88908e+07 | 43403 | 6929.7 | 6.30784e+06 | 5778.19 | 1.67936e+07 | 5724.81 | 1.88908e+07 | 3988.81 | 1.67936e+07 | 836597 | 1.17951e+06 | 8.15283e+06 | 3.14936e+07 |
| 76558 | 20230928.18 | 27885.8 | 2.30851e+07 | 88575.4 | 0.830452 | 17234 | 45454.2 | 5604.7 | 8.59996e+07 | 47870 | 54772.5 | 4433.7 | 17341.5 | 19225.2 | 14047.9 | 1.88908e+07 | 17098.4 | 14110.8 | 1.05021e+07 | 14922.3 | 15503.8 | 1.25993e+07 | 9863.52 | 1430.7 | 1.25993e+07 | 6843.02 | 1.67936e+07 | 42154.9 | 6962.7 | 6.30784e+06 | 5769.09 | 1.67936e+07 | 5491.12 | 1.67936e+07 | 3993.7 | 1.67936e+07 | 831276 | 1.18056e+06 | 8.13202e+06 | 3.02641e+07 |
| 76583 | 20230928.28 | 27848.7 | 2.51822e+07 | 65434.1 | 0.801501 | 17308 | 45863.7 | 5537.92 | 8.59996e+07 | 47715.4 | 54659.8 | 4426.7 | 17042.4 | 19021.5 | 13999.1 | 1.88908e+07 | 17126.6 | 14007.6 | 1.05021e+07 | 16809.9 | 15378.9 | 1.46964e+07 | 9867.79 | 1419.9 | 1.25993e+07 | 6870.19 | 1.88908e+07 | 44707.7 | 6977.8 | 6.30784e+06 | 5781.9 | 1.67936e+07 | 5471.22 | 1.67936e+07 | 3964.63 | 1.67936e+07 | 831168 | 1.17972e+06 | 8.14965e+06 | 3.15285e+07 |
| 76590 | 20230928.31 | 27807.9 | 2.51822e+07 | 84636.6 | 0.776866 | 17442.6 | 45792.5 | 5620.03 | 8.59996e+07 | 48114.9 | 54905.7 | 4462.3 | 17281.7 | 19342.4 | 14032.4 | 1.88908e+07 | 17404.5 | 14144 | 1.05021e+07 | 14968.6 | 15646.5 | 1.46964e+07 | 9799.22 | 1431.9 | 1.25993e+07 | 6892.39 | 1.88908e+07 | 46452.5 | 7205.4 | 6.30784e+06 | 5782.23 | 1.67936e+07 | 5744.12 | 1.67936e+07 | 4005.35 | 1.67936e+07 | 830337 | 1.17868e+06 | 8.1539e+06 | 3.27507e+07 |
| 76597 | 20230928.34 | 27790.7 | 2.30851e+07 | 65237.8 | 0.810287 | 17299.4 | 45785.5 | 5572.97 | 8.59996e+07 | 47963.7 | 54568 | 4407.2 | 17445.4 | 19516.4 | 13976.1 | 1.88908e+07 | 17451.8 | 14061.9 | 1.05021e+07 | 15049.7 | 15478.4 | 1.25993e+07 | 9773.48 | 1421.7 | 1.25993e+07 | 6790.65 | 1.88908e+07 | 44717.7 | 6979.9 | 6.30784e+06 | 5802.39 | 1.67936e+07 | 5473.96 | 1.67936e+07 | 3989.84 | 1.67936e+07 | 830892 | 1.17553e+06 | 8.13929e+06 | 3.15329e+07 |
| 76603 | 20230929.1 | 27929.8 | 2.30851e+07 | 74029.2 | 0.812755 | 17276.6 | 43947.2 | 5509.39 | 8.59996e+07 | 40355.5 | 54828.9 | 4361.2 | 17117.4 | 19493.2 | 13963.9 | 1.88908e+07 | 16917 | 14003.2 | 1.05021e+07 | 14630.2 | 15460.2 | 1.25993e+07 | 9936.59 | 1421.9 | 1.25993e+07 | 6808.22 | 1.88908e+07 | 45786.1 | 6853.3 | 6.30784e+06 | 5760.56 | 1.67936e+07 | 5446.9 | 1.67936e+07 | 3984.27 | 1.67936e+07 | 828785 | 1.17327e+06 | 8.14904e+06 | 3.05535e+07 |
| 76674 | 20230929.32 | 28135.8 | 2.51822e+07 | 79416.4 | 0.824286 | 17258 | 44130.7 | 5650.04 | 8.59996e+07 | 47868.2 | 54216.1 | 4439.1 | 17162.5 | 19472.3 | 14044.4 | 1.88908e+07 | 17470.5 | 14151.8 | 1.05021e+07 | 14776.5 | 15668.6 | 1.46964e+07 | 10176.7 | 1441.1 | 1.25993e+07 | 6848.37 | 1.88908e+07 | 42393.6 | 7030.2 | 6.30784e+06 | 5779.43 | 1.67936e+07 | 5785.88 | 1.67936e+07 | 4002.96 | 1.67936e+07 | 834800 | 1.17705e+06 | 8.15572e+06 | 3.08671e+07 |
| 76681 | 20231002.1 | 27742.7 | 2.51822e+07 | 77661.1 | 0.778316 | 17317.1 | 45761.1 | 5610.63 | 8.59996e+07 | 47867.7 | 54907.1 | 4426.1 | 17211.6 | 19449.2 | 14016.2 | 1.67936e+07 | 17053.9 | 14116.8 | 1.05021e+07 | 14750.7 | 15668.2 | 1.46964e+07 | 9744.29 | 1441.3 | 1.25993e+07 | 7241.2 | 1.67936e+07 | 43611.5 | 6972.2 | 6.30784e+06 | 5780.61 | 1.67936e+07 | 5740.75 | 1.67936e+07 | 4007.12 | 1.67936e+07 | 832338 | 1.182e+06 | 8.17173e+06 | 3.15261e+07 |
| 76724 | 20231002.20 | 27904.4 | 2.30851e+07 | 65839 | 0.836357 | 17138.9 | 45994 | 5500.05 | 8.59996e+07 | 47515 | 54332.6 | 4407.9 | 17567.5 | 19677.7 | 13942.2 | 1.88908e+07 | 17216.8 | 14010.7 | 1.05021e+07 | 15003.6 | 15376 | 1.46964e+07 | 9764.04 | 1426.4 | 1.25993e+07 | 6865.24 | 1.88908e+07 | 38966.2 | 6825.1 | 6.30784e+06 | 5791.28 | 1.67936e+07 | 5466.41 | 1.67936e+07 | 3982.56 | 1.67936e+07 | 827863 | 1.17557e+06 | 8.17144e+06 | 3.07438e+07 |
| 76730 | 20231003.1 | 28013.8 | 2.51822e+07 | 71051.7 | 0.81692 | 17191.7 | 45513.1 | 5607.21 | 8.59996e+07 | 48269.1 | 54442.6 | 4481.1 | 17361.9 | 19564.9 | 14019 | 1.88908e+07 | 17099.1 | 14147.9 | 1.05021e+07 | 14813.6 | 15694.6 | 1.25993e+07 | 9839.85 | 1444.4 | 1.25993e+07 | 6811.01 | 1.88908e+07 | 45281.2 | 7029 | 6.30784e+06 | 5811.58 | 1.67936e+07 | 5768.4 | 1.67936e+07 | 3971.63 | 1.67936e+07 | 833206 | 1.18303e+06 | 8.14586e+06 | 3.06784e+07 |
| 76737 | 20231003.4 | 27593.5 | 2.30851e+07 | 64150.5 | 0.819855 | 17191.7 | 45892.7 | 5602.23 | 8.59996e+07 | 48119 | 54628 | 4416 | 17162 | 19516.8 | 14099.4 | 1.67936e+07 | 17353 | 14185.4 | 1.05021e+07 | 14663.4 | 15720.3 | 1.46964e+07 | 9885.05 | 1442.7 | 1.25993e+07 | 6888.2 | 1.88908e+07 | 45195.8 | 6947.2 | 6.30784e+06 | 5809.12 | 1.67936e+07 | 5745.68 | 1.67936e+07 | 3987.27 | 1.67936e+07 | 833711 | 1.17516e+06 | 8.15378e+06 | 3.0875e+07 |
| 76798 | 20231003.31 | 27510.7 | 2.51822e+07 | 81597 | 0.832841 | 17353.1 | 45905.3 | 5507.34 | 8.59996e+07 | 48154 | 53871 | 4439.9 | 17065.3 | 19658.3 | 13926.2 | 1.88908e+07 | 17561 | 14064.4 | 1.05021e+07 | 15029.1 | 15503.8 | 1.25993e+07 | 10165.7 | 1426.1 | 1.25993e+07 | 6835.05 | 1.88908e+07 | 43728.3 | 6934.2 | 6.30784e+06 | 5797.3 | 1.67936e+07 | 5480.27 | 1.67936e+07 | 3989.1 | 1.67936e+07 | 828433 | 1.17822e+06 | 8.15582e+06 | 3.11739e+07 |
| 76817 | 20231004.1 | 27941.5 | 2.51822e+07 | 69954.7 | 0.780939 | 17068.1 | 43943.3 | 5626.78 | 8.59996e+07 | 47753 | 54465.2 | 4457.9 | 17248 | 19168.1 | 13971.1 | 1.88908e+07 | 17214.9 | 14077.8 | 1.05021e+07 | 16923.5 | 15496.4 | 1.46964e+07 | 9820.54 | 1437 | 1.05021e+07 | 7251.7 | 1.67936e+07 | 41295.8 | 6941.9 | 6.30784e+06 | 5773.46 | 1.67936e+07 | 5738.5 | 1.67936e+07 | 3993.81 | 1.67936e+07 | 841933 | 1.17864e+06 | 8.15569e+06 | 3.07938e+07 |
| 76832 | 20231004.7 | 27629.2 | 2.30851e+07 | 68893.5 | 0.791761 | 17248.6 | 43721.4 | 5615.29 | 8.59996e+07 | 48138.9 | 46993.4 | 4444.2 | 17038.4 | 19713.4 | 13975.4 | 1.67936e+07 | 17646.2 | 14056.3 | 1.05021e+07 | 14882.5 | 15601.7 | 1.25993e+07 | 9901.95 | 1434 | 1.25993e+07 | 6873.72 | 1.88908e+07 | 42527.8 | 7089.4 | 6.30784e+06 | 5806.39 | 1.67936e+07 | 5726.45 | 1.67936e+07 | 3976.51 | 1.67936e+07 | 835979 | 1.17914e+06 | 8.15303e+06 | 3.17647e+07 |
| 76869 | 20231004.22 | 27762.1 | 2.51822e+07 | 70744.9 | 0.806157 | 17245.9 | 45658.4 | 5552.49 | 8.59996e+07 | 48077.4 | 54561.1 | 4464.3 | 17322.8 | 18923.3 | 14003.1 | 1.88908e+07 | 17202.1 | 14065.4 | 1.05021e+07 | 15082.4 | 15372.4 | 1.46964e+07 | 9913.2 | 1427 | 1.25993e+07 | 6868.91 | 1.88908e+07 | 45382.2 | 7022 | 6.30784e+06 | 5804.63 | 1.67936e+07 | 5756.12 | 1.67936e+07 | 3991.19 | 1.67936e+07 | 834342 | 1.18098e+06 | 8.14978e+06 | 3.08076e+07 |
| 76888 | 20231004.30 | 27780.7 | 2.30851e+07 | 67387.1 | 0.810034 | 17390.9 | 45776.9 | 5604.3 | 8.59996e+07 | 48301 | 51883.1 | 4390.1 | 17189.2 | 20048.9 | 13985 | 1.88908e+07 | 15378.1 | 14115.9 | 1.05021e+07 | 14886.4 | 15556.5 | 1.46964e+07 | 10356.8 | 1434.9 | 1.25993e+07 | 6863.79 | 1.67936e+07 | 42002.2 | 7032.1 | 6.30784e+06 | 5815.76 | 1.67936e+07 | 5751.88 | 1.67936e+07 | 3969.63 | 1.67936e+07 | 823286 | 1.18056e+06 | 8.1555e+06 | 3.03318e+07 |
| 76915 | 20231004.42 | 27485.2 | 2.51822e+07 | 80712.1 | 0.774478 | 17274.8 | 46033.9 | 5624.97 | 8.59996e+07 | 48315.9 | 53167.5 | 4402.4 | 17273.2 | 19310.2 | 14042.4 | 1.88908e+07 | 17384.9 | 14174.9 | 1.05021e+07 | 14995.5 | 15627.4 | 1.25993e+07 | 10159.5 | 1433.2 | 1.25993e+07 | 6910.89 | 1.88908e+07 | 40358.8 | 6998.2 | 6.30784e+06 | 5784.96 | 1.67936e+07 | 5748.49 | 1.67936e+07 | 4003.02 | 1.67936e+07 | 835408 | 1.17924e+06 | 8.12898e+06 | 3.07817e+07 |

propose_vote_request

| build_id | build_number | pi_basic_mt_sgx_cft^ | pi_basic_mt_sgx_cft_mem | pi_basic_mt_virtual_cft^ | Commit latency factor | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | tpcc_virtual_cft^ | ls_virtual_cft^ | pi_ls_virtual_cft^ | pi_basic_virtual_cft^ | ls_sgx_cft^ | ls_sgx_cft_mem | pi_ls_sgx_cft^ | pi_ls_sgx_cft_mem | pi_basic_js_virtual_cft^ | pi_basic_sgx_cft^ | pi_basic_sgx_cft_mem | ls_jwt_virtual_cft^ | pi_ls_jwt_virtual_cft^ | ls_js_virtual_cft^ | ls_full_js_virtual_cft^ | ls_js_jwt_virtual_cft^ | pi_basic_js_sgx_cft^ | pi_basic_js_sgx_cft_mem | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | pi_ls_jwt_sgx_cft^ | pi_ls_jwt_sgx_cft_mem | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | hist_sgx_cft^ | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 76850 | 20231004.13 | 27895.3 | 2.51822e+07 | 74550.6 | 0.831391 | 5591.91 | 8.59996e+07 | 17250.8 | 43556.7 | 47494 | 53917.8 | 14070 | 1.88908e+07 | 14172.9 | 1.05021e+07 | 4367.7 | 15657 | 1.25993e+07 | 17200.7 | 19618.6 | 17582.3 | 14887 | 9896.67 | 1439.2 | 1.25993e+07 | 6868.54 | 1.88908e+07 | 6974.3 | 6.30784e+06 | 5796.7 | 1.67936e+07 | 45566.2 | 5787.46 | 1.67936e+07 | 3979.55 | 1.67936e+07 | 830019 | 1.1792e+06 | 8.13279e+06 | 3.07461e+07 |
| 76852 | 20231004.14 | 27678.8 | 2.51822e+07 | 87384.6 | 0.793135 | 5596.69 | 8.59996e+07 | 17307.4 | 43738.3 | 47438.8 | 54231.6 | 14090.3 | 1.88908e+07 | 14158.4 | 1.05021e+07 | 4428.9 | 15518.6 | 1.46964e+07 | 17611.7 | 19744.1 | 17277 | 14808.8 | 9838.27 | 1422.7 | 1.25993e+07 | 6882.69 | 1.67936e+07 | 6935.5 | 6.30784e+06 | 5768.21 | 1.67936e+07 | 42765.2 | 5482.49 | 1.88908e+07 | 4001.88 | 1.67936e+07 | 832722 | 1.18086e+06 | 8.15053e+06 | 3.13385e+07 |
| 76887 | 20231004.29 | 27712 | 2.30851e+07 | 88291.6 | 0.797916 | 5562.37 | 8.59996e+07 | 17282.4 | 43738.6 | 47673.5 | 53697.4 | 14043.3 | 1.88908e+07 | 14112.6 | 1.05021e+07 | 4417.1 | 15592.6 | 1.25993e+07 | 17292.7 | 19558.4 | 17514.1 | 15194.1 | 9921.39 | 1436.7 | 1.25993e+07 | 6762.93 | 1.67936e+07 | 7006.3 | 6.30784e+06 | 5799.83 | 1.67936e+07 | 46253.9 | 5487.6 | 1.67936e+07 | 3976.75 | 1.67936e+07 | 840555 | 1.18326e+06 | 8.15365e+06 | 3.07725e+07 |
| 76920 | 20231004.43 | 27706.1 | 2.51822e+07 | 79991.4 | 0.820236 | 5524.98 | 8.59996e+07 | 17286.3 | 45861.2 | 48950.5 | 54235.3 | 14006.8 | 1.88908e+07 | 14089.1 | 1.05021e+07 | 4464.4 | 15544.8 | 1.25993e+07 | 17313.8 | 19759.7 | 17385.8 | 14755 | 9926 | 1422.1 | 1.25993e+07 | 6882.55 | 1.88908e+07 | 7042.3 | 6.30784e+06 | 5767.28 | 1.67936e+07 | 42741.5 | 5505.52 | 1.88908e+07 | 4007.18 | 1.67936e+07 | 827380 | 1.18522e+06 | 8.13512e+06 | 3.12205e+07 |

@achamayou marked this pull request as ready for review September 29, 2023 16:25
@achamayou requested a review from a team September 29, 2023 16:25
@achamayou changed the title from "Initial draft of a propose request vote message" to "Adding a propose request vote message to speed up reconfigurations" Sep 29, 2023
@achamayou changed the title from "Adding a propose request vote message to speed up reconfigurations" to "Adding a propose request vote message to speed up some reconfigurations" Sep 29, 2023
@achamayou merged commit 438ff87 into microsoft:main Oct 5, 2023
22 checks passed