Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correcting order for actions occuring within the same block for the same account #1784

Merged
merged 14 commits into from
Aug 19, 2024

Conversation

45930
Copy link
Contributor

@45930 45930 commented Jul 31, 2024

The community has raised this issue where their reducer cannot correctly derive the action hash given the response from the archive node. This only happens when multiple actions are dispatched within the same block.

The zk noid team shared this minimal repro: https://github.com/aii23/actions_issue_poc

The issue is that events from the archive node arrive in the wrong order for producing the actions hash.

For example, running their script against a light node can produce a graphQL response like:

[
  {
    accountUpdateId: '3',
    data: [
      '23293912831932499872034135232525254163343702531942971872613724597787050731163',
      '1'
    ]
  }
]
[
  {
    accountUpdateId: '5',
    data: [
      '19698539030616983614904223910467017498097850769349730675723779555348012696719',
      '0'
    ]
  }
]
[
  {
    accountUpdateId: '7',
    data: [
      '23344670544599826568585832059189005555232761671965680539590523406038183782503',
      '1'
    ]
  },
  {
    accountUpdateId: '6',
    data: [
      '23379676726753250474018172284972553782392077412995035117267089127972991216710',
      '1'
    ]
  }
]

accountUpdateIds 3 and 5 were dispatched in separate blocks, while accountUpdateIds 6, and 7 were dispatched within the same block. Note the order of the actions top to bottom is [3, 5, 7, 6].

This PR corrects the order by sorting the array by account update id within a single block of actions to correctly represent the order as [3, 5, 6, 7]. I confirmed this fix by running the zk noid script locally against this build of o1js.

@45930 45930 changed the title proposed fix for multiple actions issue Correcting order for actions occuring within the same block for the same account Jul 31, 2024
@45930 45930 marked this pull request as ready for review July 31, 2024 20:15
Copy link
Collaborator

@mitschabaude mitschabaude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome that you fixed this @45930, I have a couple of small comments on code style

src/lib/fixtures/fetchActionsResponse.ts Outdated Show resolved Hide resolved
src/lib/fixtures/fetchActionsResponse.ts Outdated Show resolved Hide resolved
src/lib/mina/fetch.test.ts Outdated Show resolved Hide resolved
src/lib/mina/fetch.ts Outdated Show resolved Hide resolved
src/lib/mina/fetch.ts Outdated Show resolved Hide resolved
@asimaranov
Copy link

We found out that fix works not in all the scenarios. We used this patch on our backend. It started working. Then it thrown the same issue.
We checked and the previous fetch function started to work while the patched stopped.
Address on devnet: B62qmbX7xkXxLhKQpi4iabLzEtzyXi9uA5pFU1PbACoXCMt6DEC3t1g
Not patched fetch works on it
Patched fetch throws

Error: Failed to derive correct actions hash for B62qmbX7xkXxLhKQpi4iabLzEtzyXi9uA5pFU1PbACoXCMt6DEC3t1g.
        Derived hash: 18711700257838647448684440256433891881288423153285399691473064599228530899996, expected hash: 23151667407356824933654059015358591226495725733024978946167172163412083267949).

@45930
Copy link
Contributor Author

45930 commented Aug 1, 2024

@asimaranov could you please recreate the graphQL JSON response that fails and the order that the patched o1js arrives at?

@asimaranov
Copy link

@asimaranov could you please recreate the graphQL JSON response that fails and the order that the patched o1js arrives at?

archive_gql_resp.json

You mean this?

@45930
Copy link
Contributor Author

45930 commented Aug 1, 2024

@asimaranov No, I mean can you log out the response from makeGraphqlRequest in fetchActions, then log out the actionsList generated by createActionsList in this patched version of o1js. If we're still ordering the actions incorrectly, it's helpful to know what the inputs and outputs are.

There are other reasons why we may fail to derive the hash, so this will help us pinpoint the issue.

@aii23
Copy link

aii23 commented Aug 1, 2024

Tried to recreate issue, that @asimaranov mentioned with current fix. Run on background script that dispatch several actions at once first, and then dispatch 40 more actions one by one.
So after dispatching several actions fetch is working, but after 40 more actions it is starting to fail with the same error.
Script is located here https://github.com/aii23/actions_issue_poc/blob/fix_test/lightnet_reproduce_2.sh.
I will investigate it further later, to find a which number of transactions it starts to fail(Because 40 is obviously overkill)

@45930
Copy link
Contributor Author

45930 commented Aug 1, 2024

Thanks @aii23! 40 should still be a processable number so we want to get to the bottom of it. When you do the test with 40, do all 40 actions show up in the graphql response?

@45930
Copy link
Contributor Author

45930 commented Aug 1, 2024

@aii23 , there is another theory that perhaps the archive node is inconsistent about the most recent actions for actions within the same block. If you do the same test but wait for 2-3 blocks for the archive node to get settled, does that work? Perhaps this can work in prod by setting an endActionState to a couple blocks in the past.

@aii23
Copy link

aii23 commented Aug 1, 2024

I will check this later today, or maybe tomorrow, to find exact condition for it to happen.

@dfstio
Copy link

dfstio commented Aug 1, 2024

Tried to recreate issue, that @asimaranov mentioned with current fix. Run on background script that dispatch several actions at once first, and then dispatch 40 more actions one by one. So after dispatching several actions fetch is working, but after 40 more actions it is starting to fail with the same error. Script is located here https://github.com/aii23/actions_issue_poc/blob/fix_test/lightnet_reproduce_2.sh. I will investigate it further later, to find a which number of transactions it starts to fail(Because 40 is obviously overkill)

I believe from my experience that this number of actions depends upon the time when you check it again, for 128 actions you will see this problem even after an hour, for few actions it should go away almost immediately, so it is better to check the actions immediately after the block is created.

@dfstio
Copy link

dfstio commented Aug 2, 2024

I would also propose adding pagination to the action fetching logic, as it will be impossible to fetch a large number of actions from the archive node at once.

You can see it, for example, trying to fetch events on mainnet for the contract B62qjwDWxjf4LtJ4YWJQDdTNPqZ69ZyeCzbpAFKN7EoZzYig5ZRz8JE:

import { fetchEvents, UInt32 } from "o1js";

const contractAddress =
  "B62qjwDWxjf4LtJ4YWJQDdTNPqZ69ZyeCzbpAFKN7EoZzYig5ZRz8JE";

describe("Should get events", () => {
  it(`should get events`, async () => {
    console.log("Fetching events for", contractAddress);
    let blockHeight = 375561;
    const interval = 1000;
    while (blockHeight > 359609) {
      try {
        const events1 = await fetchEvents(
          { publicKey: contractAddress },
          "https://api.minascan.io/archive/mainnet/v1/graphql",
          {
            from: UInt32.from(blockHeight - interval),
            to: UInt32.from(blockHeight),
          }
        );
        console.log("Events1:", blockHeight, events1?.length);
      } catch (e) {
        console.error("Error fetching events for blocks", blockHeight, e);
      }
      blockHeight -= interval;
      await sleep(1000);
    }
  });
});

function sleep(ms: number) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

Increasing the interval to 2000 gives the error:

[11:45:42 AM] Fetching events for B62qjwDWxjf4LtJ4YWJQDdTNPqZ69ZyeCzbpAFKN7EoZzYig5ZRz8JE
Error fetching events for blocks 375561 Error: Gateway Time-out
    at fetchEvents2 (/Users/mike/Documents/DeFi/MinaNFT/minanft-tools/node_modules/o1js/dist/node/index.cjs:22299:11)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Object.<anonymous> (/Users/mike/Documents/DeFi/MinaNFT/minanft-tools/tests/punk.test.ts:14:25)

@aii23
Copy link

aii23 commented Aug 2, 2024

Still trying to find exact conditions on witch all of this happening. Is there a way to configure lightnode, so it produce block faster. Because now to ensure that all transaction goes into different blocks, I need to wait hell lot of time.

@45930
Copy link
Contributor Author

45930 commented Aug 2, 2024

@aii23

I'd appreciate if you would give the latest commit a spin: d2c9320

Instead of always reversing the order, this commit attempts to find the right order by sorting on account update id. It may solve the edge cases you were hitting.

@@ -14,8 +14,7 @@ case $TEST_TYPE in

"Reducer integration tests")
echo "Running reducer integration tests"
./run src/examples/zkapps/reducer/actions-as-merkle-list.ts --bundle
./run src/examples/zkapps/reducer/actions-as-merkle-list-iterator.ts --bundle
./run src/examples/zkapps/reducer/run.ts --bundle
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change here is because previously these files would run a local test as they were imported. I wrapped the tests in an export and call them explicitly in this run file. This allows the run-live file to import the same smart contracts without too many excess diffs.

// push some actions

tic('dispatch transactions');
let dispatchTx = await Mina.transaction(senderSpec, async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this transactions, one sender creates many actions to reduce in the same block


tic('building');
const txs = [];
let dispatchTx1 = await Mina.transaction(senderSpec, async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these transactions, different senders create actions to reduce within the same block

@@ -21,6 +21,8 @@ DEX_PROC=$!
FETCH_PROC=$!
./run src/tests/transaction-flow.ts --bundle | add_prefix "TRANSACTION_FLOW" &
TRANSACTION_FLOW_PROC=$!
./run src/examples/reducer/run-live.ts --bundle | add_prefix "REDUCER" &
REDUCER_FLOW_PROC=$!
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New reducer test in CI handles some action serialization cases using lightnet graphql API. The coverage on this is not ultra high, but it's a starting point, and covers the issue that zk noid originally isolated in their repo.

@45930 45930 requested a review from mitschabaude August 2, 2024 22:02
Copy link
Member

@Trivo25 Trivo25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sure to update the bindings to the proper commit from main

Copy link
Member

@Trivo25 Trivo25 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, good job! Left some comments - do you think we should test some other action-ranges? eg. maxing out all action field sizes and lengths?

src/examples/utils/randomAccounts.ts Outdated Show resolved Hide resolved
src/examples/utils/randomAccounts.ts Outdated Show resolved Hide resolved
src/examples/utils/randomAccounts.ts Outdated Show resolved Hide resolved
src/examples/zkapps/reducer/actions-as-merkle-list.ts Outdated Show resolved Hide resolved
src/examples/zkapps/reducer/run-live.ts Outdated Show resolved Hide resolved
src/examples/zkapps/reducer/run-live.ts Outdated Show resolved Hide resolved
src/examples/zkapps/reducer/run-live.ts Outdated Show resolved Hide resolved
src/lib/mina/fixtures/fetchActionsResponse.ts Outdated Show resolved Hide resolved
src/lib/mina/mina.ts Show resolved Hide resolved
@45930
Copy link
Contributor Author

45930 commented Aug 14, 2024

make sure to update the bindings to the proper commit from main

@Trivo25 I don't know what this means. Is there a script to run or something?

@Trivo25
Copy link
Member

Trivo25 commented Aug 14, 2024

make sure to update the bindings to the proper commit from main
@Trivo25 I don't know what this means. Is there a script to run or something?

git submodule update --init --recursive initialises all submodules (recursively), you can then go into the bindings submodule via cd src/bindings and check out specific commits like normal git checkout .. and then add that commit to the parent in o1js

but it looks like in this particular instance we can just merge main into this branch to fix the discrepancy

@45930 45930 requested a review from Trivo25 August 14, 2024 18:14
@45930
Copy link
Contributor Author

45930 commented Aug 14, 2024

do you think we should test some other action-ranges?

In general, that seems reasonable. I don't want to bloat the scope of this fix though. This fix is currently blocking usage of actions for devs,so I want to get the corrected functionality out ASAP.

@45930 45930 merged commit 721ca6a into main Aug 19, 2024
24 checks passed
@45930 45930 deleted the multi-actions-order branch August 19, 2024 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants