Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STR-734: OPERATOR REASSIGNMENT #557

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

voidash
Copy link
Contributor

@voidash voidash commented Dec 23, 2024

Description

Functional test where we make deposits, then trigger the withdrawal. client associated with assigned operator id is stopped. We produce l1 blocks so that the dispatch assignment duration is over, and later check if new operator is being assigned or not

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature/Enhancement (non-breaking change which adds functionality or enhances an existing one)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactor
  • New or updated tests
  • Dependency Update

Notes to Reviewers

Checklist

  • I have performed a self-review of my code.
  • I have commented my code where necessary.
  • I have updated the documentation if needed.
  • My changes do not introduce new warnings.
  • I have added tests that prove my changes are effective or that my feature works.
  • New and existing tests pass with my changes.

Related Issues

STR-734

@voidash voidash requested review from a team as code owners December 23, 2024 07:27
Copy link

codecov bot commented Dec 23, 2024

Codecov Report

Attention: Patch coverage is 0% with 32 lines in your changes missing coverage. Please review.

Project coverage is 57.27%. Comparing base (506fd5c) to head (613ba96).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...n/bridge-client/src/modes/operator/task_manager.rs 0.00% 18 Missing ⚠️
bin/bridge-client/src/modes/operator/bootstrap.rs 0.00% 8 Missing ⚠️
crates/bridge-exec/src/handler.rs 0.00% 3 Missing ⚠️
crates/chaintsn/src/transition.rs 0.00% 3 Missing ⚠️
@@            Coverage Diff             @@
##             main     #557      +/-   ##
==========================================
- Coverage   57.33%   57.27%   -0.06%     
==========================================
  Files         309      309              
  Lines       31510    31520      +10     
==========================================
- Hits        18067    18054      -13     
- Misses      13443    13466      +23     
Files with missing lines Coverage Δ
bin/bridge-client/src/args.rs 0.00% <ø> (ø)
crates/bridge-exec/src/handler.rs 0.00% <0.00%> (ø)
crates/chaintsn/src/transition.rs 82.86% <0.00%> (ø)
bin/bridge-client/src/modes/operator/bootstrap.rs 0.00% <0.00%> (ø)
...n/bridge-client/src/modes/operator/task_manager.rs 0.00% <0.00%> (ø)

... and 3 files with indirect coverage changes

Copy link
Member

@storopoli storopoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should create a proper parent class for tests that needs withdrawal/deposits that can have all the yet-to-move utils.py functions that I commented below in the NITs.

Then all the DRT, get balances, withdraw etc functionality is encapsulated in it and we can create proper concrete classes for the other tests that inherit from this parent class.

Sorry to push all this refactoring into your plate.
Since you're already editing these files it is convenient.
However, feel free to reject and open a ticket for this so that we can do it later.

functional-tests/fn_bridge_reassignment.py Outdated Show resolved Hide resolved
Comment on lines 7 to 13
from fn_bridge_withdraw_happy import (
DEPOSIT_AMOUNT,
check_initial_eth_balance,
deposit_twice,
do_withdrawal,
setup_services,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should move these to utils?
What you think?
They are being used elsewhere than fn_bridge_withdraw_happy.

Comment on lines +37 to +41
# Check initial balance is 0
check_initial_eth_balance(rethrpc, el_address, self.debug)

# Perform two deposits
deposit_twice(ctx, web3, seqrpc, rethrpc, el_address, self.debug)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there's a way for you to setup an environment with already balances in BTC and Strata, AFAIK @evgenyzdanovich implemented that feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This aligns with the "minimal resources" principle above.


def __init__(self, ctx: flexitest.InitContext):
# Example: we spin up 5 operators
ctx.set_env(testenv.BasicEnvConfig(101, n_operators=5))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that you can also disable auto-generate BTC blocks. This is important to avoid flaky tests when you need total control of the block production (i.e. waiting for 64 blocks below).

functional-tests/fn_bridge_withdraw_happy.py Show resolved Hide resolved
functional-tests/fn_bridge_withdraw_happy.py Show resolved Hide resolved
functional-tests/fn_bridge_withdraw_happy.py Show resolved Hide resolved
Comment on lines 64 to 73
# Re-check duties
duties = seqrpc.strata_getBridgeDuties(0, 0)["duties"]
withdraw_duty = [d for d in duties if d["type"] == "FulfillWithdrawal"][0]
new_assigned_op_idx = withdraw_duty["payload"]["assigned_operator_idx"]
new_assigned_operator = ctx.get_service(f"bridge.{new_assigned_op_idx}")

# Ensure a new operator is assigned
assert new_assigned_operator != assigned_operator, "No new operator was assigned"

return True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are testing only 2 out of 3 things that are highlighted in the original ticket (STR-734):

  1. Is the user’s withdrawal request confirmed in a Strata batch?
  2. After the first operator fails to initiate the withdrawal, is the withdrawal request properly assigned to another operator?
  3. After the withdrawal request is assigned to another operator, is it signed by all other operators, broadcast, and confirmed on signet?

We need to find a way to test number 3 above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess once we observe that it's been reassigned we can restart the operator we shut down and then get it signed.

@delbonis
Copy link
Contributor

I think it's reasonable to include the fix for this ticket in this PR too, since it's all related.

"""
# Get bridge pubkey
bridge_pk = utils.get_bridge_pubkey(seqrpc)
if debug_fn:
Copy link
Contributor

@bewakes bewakes Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not just omit this if clause? Its annoying to use this here and there. The default value in the debug_fn argument indicates that this clause will always return true, unless None is passed(but this can be avoided by enforcing argument type).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! The way to enable debug messages like these should be to set the LOG_LEVEL and not pass in an explicit argument at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are test specific log, so this is the easiest way to inject logs or we have to pass test name to getLogger

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I second the approach by @bewakes i.e., annotating with Callable[str, None] perhaps and removing the if clause.

@voidash voidash requested review from a team as code owners December 24, 2024 10:50
@voidash voidash requested a review from a team as a code owner December 24, 2024 11:05
Comment on lines 83 to 88

// if none of the duties failed, update the duty index so that the
// next batch is fetched in the next poll.
//
// otherwise, don't update the index so that the current batch is refetched and
// ones that were not executed successfully are executed again.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to move these comments on top of the if !any_failed check as well.

@@ -297,6 +297,9 @@ where
if let Ok(msg) = msg {
let payload = msg.payload();
let parsed_payload = borsh::from_slice::<Payload>(payload);
let scope = msg.scope();
let parsed_scope = borsh::from_slice::<Scope>(scope);
debug!(msg=?msg, payload=?parsed_payload, scope=?parsed_scope, "ABC: got message from the L2 Client");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
debug!(msg=?msg, payload=?parsed_payload, scope=?parsed_scope, "ABC: got message from the L2 Client");
debug!(msg=?msg, payload=?parsed_payload, scope=?parsed_scope, "got message from the L2 Client");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also fix the formatting here. If the name of the log field is the same as the variable being used for it, you can just omit the field name and make it ?msg (or %msg for Display print). In any case, there should be spaces around the =. Rustfmt can't understand the macro conventions so it can't automatically fix this.

Comment on lines +75 to +89
# Re-check duties
def my_val():
btcrpc.proxy.generatetoaddress(64, UNSPENDABLE_ADDRESS)
duties = seqrpc.strata_getBridgeDuties(0, 0)["duties"]
withdraw_duty = [d for d in duties if d["type"] == "FulfillWithdrawal"][0]
new_assigned_op_idx = withdraw_duty["payload"]["assigned_operator_idx"]
new_assigned_operator = ctx.get_service(f"bridge.{new_assigned_op_idx}")
self.debug(f"new assigned operator is {new_assigned_op_idx}")
return new_assigned_operator

new_assigned_operator = wait_until_with_value(
my_val,
predicate=lambda v: v != assigned_operator,
timeout=10,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check can be removed now since the reassignment logic is supposed to always compute a different assignee than the current one (by computing offset).

# Ensure a new operator is assigned
assert new_assigned_operator != assigned_operator, "No new operator was assigned"
assigned_operator.start()
abc = assigned_operator.create_rpc()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
abc = assigned_operator.create_rpc()
bridge_rpc = assigned_operator.create_rpc()

"""

def __init__(self, ctx: flexitest.InitContext):
# Example: we spin up 5 operators
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Example: we spin up 5 operators
# Example: we spin up 3 operators

Actually, you can just remove this comment as it is very self-evident.

@@ -4,5 +4,6 @@ pub(super) const DEFAULT_RPC_PORT: u32 = 4781; // first 4 digits in the sha256 o

pub(super) const DEFAULT_RPC_HOST: &str = "127.0.0.1";

pub(super) const DEFAULT_DUTY_TIMEOUT_DURATION: u64 = 65;
Copy link
Contributor

@Rajil1213 Rajil1213 Dec 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should increase this default value to at least 600. I know this is a temporary measure but this is to avoid breaking stuff if we ever deploy this change to our signet (whether accidentally or not).

} else {
// If there is 0 or 1 operator, we remain with the current assignee.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If there is 0 or 1 operator, we remain with the current assignee.
// If there is only a single operator, we remain with the current assignee.

There can't technically be 0 operators or dstate.assignee() wouldn't work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we refuse to start the client with 0 operators.

Makes two DRT deposits, then triggers the withdrawal.
The bridge client associated with assigned operator id is stopped.
After the dispatch assignment duration is over,
Check if new operator is being assigned or not
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Check if new operator is being assigned or not
Check if new operator is being assigned or not
Ensure that the withdrawal resumes and completes successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants