After a restart, a node doesn't respond to chunk requests that have non-empty list of requested receipts #2916

Closed
SkidanovAlex opened this issue Jun 27, 2020 · 0 comments
Assignees
Labels
A-chain Area: Chain, client & related C-bug Category: This is a bug

Comments

@SkidanovAlex
Collaborator

No description provided.

@SkidanovAlex SkidanovAlex added the A-chain Area: Chain, client & related label Jun 27, 2020
@SkidanovAlex SkidanovAlex self-assigned this Jun 27, 2020
@ilblackdragon ilblackdragon added the C-bug Category: This is a bug label Jun 29, 2020
SkidanovAlex added a commit that referenced this issue Jul 24, 2020
SkidanovAlex added a commit that referenced this issue Jul 24, 2020
SkidanovAlex added a commit that referenced this issue Jul 30, 2020
SkidanovAlex added a commit that referenced this issue Jul 30, 2020
nearprotocol-bulldozer bot pushed a commit that referenced this issue Jul 31, 2020
…#3036)

After this change stress.py node_restart passes relatively consistently, and is reintroduced to nightly.

Nearcore fixes:

- We had a bug in the syncing logic (with a low chance of being
triggered in the wild): if a block is produced and between 1/3 and 2/3
of the block producers receive it while the rest do not, the system
stalls, because no 2/3 of block producers share the same head, yet
nobody is two blocks behind the highest peer, which is what it takes to
start syncing. Fixing it by forcing sync if we've been 1 block behind
for too long. stress.py was reproducing this issue in every run.

- (#2916) We had an issue where a node that produced a chunk and then
crashed was not able to serve it on recovery, because the storage from
which we recover cache entries in the shards manager did not hold all
of the parts and receipts. Fixing it by always storing all the parts
and receipts (redundantly) for chunks in the shards we care about.
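
The stall-breaking rule from the syncing fix above can be sketched as follows. This is a minimal illustration, not nearcore's actual code: `SyncDecider`, `sync_threshold`, and `max_one_behind_secs` are hypothetical names, and the timeout value is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncDecider:
    """Hypothetical helper deciding when a node should start syncing."""

    sync_threshold: int = 2             # assumed: normal sync starts this far behind
    max_one_behind_secs: float = 10.0   # assumed timeout, not a nearcore constant
    _behind_since: Optional[float] = None

    def should_sync(self, own_height: int, highest_peer_height: int, now: float) -> bool:
        gap = highest_peer_height - own_height
        if gap >= self.sync_threshold:
            # Far enough behind: the regular rule already triggers sync.
            self._behind_since = None
            return True
        if gap <= 0:
            # Caught up: forget any pending "slightly behind" timer.
            self._behind_since = None
            return False
        # Slightly behind (one block with the default threshold): start the
        # timer on first sight, then force sync once it has run too long.
        if self._behind_since is None:
            self._behind_since = now
        return now - self._behind_since >= self.max_one_behind_secs
```

A caller would invoke `should_sync` on each sync tick with the current time; a node stuck one block behind is then forced to sync once the timeout elapses instead of waiting indefinitely.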
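
The idea behind the #2916 fix (persist every part and receipt so that requests can still be answered after a restart) can be sketched like this. The classes below are hypothetical stand-ins written for illustration; they do not mirror nearcore's shards manager or its storage layout.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (chunk hash, part ordinal or target shard id)


class ChunkStore:
    """Stand-in for on-disk storage: the only state that survives a restart."""

    def __init__(self) -> None:
        self.parts: Dict[Key, bytes] = {}
        self.receipts: Dict[Key, bytes] = {}


class ShardsManagerSketch:
    """Hypothetical, heavily simplified shards manager."""

    def __init__(self, store: ChunkStore) -> None:
        self.store = store

    def persist_chunk(self, chunk_hash: str, parts: List[bytes],
                      receipts: Dict[int, bytes]) -> None:
        # Store every part and every receipt, even ones this node would not
        # otherwise need, so any request can be answered after a crash.
        for part_ord, part in enumerate(parts):
            self.store.parts[(chunk_hash, part_ord)] = part
        for shard_id, receipt in receipts.items():
            self.store.receipts[(chunk_hash, shard_id)] = receipt

    def serve_request(self, chunk_hash: str, part_ords: List[int],
                      receipt_shards: List[int]) -> Optional[Dict[str, Dict[int, bytes]]]:
        try:
            return {
                "parts": {o: self.store.parts[(chunk_hash, o)] for o in part_ords},
                "receipts": {s: self.store.receipts[(chunk_hash, s)] for s in receipt_shards},
            }
        except KeyError:
            return None  # something requested was never persisted
```

Constructing a fresh `ShardsManagerSketch` over the same `ChunkStore` simulates a restart: a request with a non-empty receipt list is still served, because nothing it needs lived only in memory.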

Test fixes

[v] Fixing a scenario in which a failure to send a transaction to all
validators resulted in recording an incorrect tx hash alongside the tx.
Later, checking balances with the incorrect hash returned an incorrect
success value, and thus incorrect corrections were applied to the
expected balances.

[v] Changing the order of magnitude of staking transactions, so that the
validator set actually changes.
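
One way to avoid the tx-hash bug described in the first test fix is to derive the recorded hash from the transaction itself, so that a failed send cannot change it. The sketch below only illustrates that idea and is not the code in stress.py: `tx_hash`, `send_and_record`, and the use of a hex SHA-256 digest are assumptions made for the example.

```python
import hashlib
from typing import Callable, List, Tuple


def tx_hash(payload: bytes) -> str:
    # Assumption for this sketch: the hash depends on the payload alone.
    return hashlib.sha256(payload).hexdigest()


def send_and_record(payload: bytes,
                    senders: List[Callable[[bytes], None]],
                    log: List[Tuple[str, bytes]]) -> int:
    """Try every validator, record (hash, tx) once, return the delivery count."""
    delivered = 0
    for send in senders:
        try:
            send(payload)
            delivered += 1
        except ConnectionError:
            continue  # a failed validator must not affect the recorded hash
    log.append((tx_hash(payload), payload))
    return delivered
```

Because the logged hash never depends on which sends succeeded, a later balance check looks up the same transaction that was actually submitted.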

Other issues discovered while fixing stress.py:
- #2906
bowenwang1996 pushed a commit that referenced this issue Aug 14, 2020
…#3036)

3 participants