Security: Handle gossiped blocks that are a long way ahead of the current tip #1389

teor2345 · 2020-11-25T07:19:38Z

Is your feature request related to a problem? Please describe.

In #1328, Zebra downloads and verifies blocks gossiped via AdvertiseBlock requests.

But these blocks can be a long way ahead of the local chain tip. If they are, they wait on:

- their range being added to the checkpoint queue,
- their parent being added to the non-finalized state queue, or
- their UTXO requests from the block verifier (consensus: add timeout to UTXO queries #1391).

Keeping these blocks around might fill up memory with blocks and tasks that won't complete for a long time.

These issues can also happen if peers send incorrect ObtainTips or ExtendTips responses to the syncer, but that's much less likely - and it should be handled by sync restarts.

Describe the alternatives you've considered

There are a number of different solutions to this issue:

- if a downloaded block is more than the lookahead limit (configurable) ahead of the current non-finalized tip, drop it
  - this limits the size of the non-finalized queue, preventing memory denial of service
- if the checkpoint queue grows too large, drop a checkpoint's worth (400) of the highest blocks
- put a timeout on UTXO requests (consensus: add timeout to UTXO queries #1391)
- put a timeout on block download and verification requests (Add sync and inbound timeouts #1586)
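
As a rough illustration of the first option, here is a minimal sketch of a lookahead check. The `Height` alias, the `GossipDecision` enum, and `check_gossiped_block` are hypothetical names made up for this example, not Zebra's actual types or API, and the limit is passed in because the issue says it should be configurable.

```rust
/// Hypothetical block height type for this sketch (not Zebra's height type).
type Height = u32;

/// What to do with a gossiped block, based only on its height.
#[derive(Debug, PartialEq, Eq)]
enum GossipDecision {
    /// The block is close enough to the tip to be queued for verification.
    Queue,
    /// The block is more than the lookahead limit ahead of the tip.
    DropTooFarAhead,
}

/// Decide whether a gossiped block should be queued or dropped.
///
/// `tip` is the current non-finalized tip height, `block` is the height of
/// the gossiped block, and `lookahead_limit` is the configurable limit.
fn check_gossiped_block(tip: Height, block: Height, lookahead_limit: u32) -> GossipDecision {
    // Blocks at or below the tip have a distance of zero, so they are
    // never dropped by this particular check.
    let distance_ahead = block.saturating_sub(tip);

    if distance_ahead > lookahead_limit {
        GossipDecision::DropTooFarAhead
    } else {
        GossipDecision::Queue
    }
}
```

Because the check depends only on heights, it can run before any queue entry or task is created for the block, which is what bounds memory use.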

We'll need a solution for each different kind of queue.
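
For the checkpoint queue specifically, the "drop the highest blocks" option could look something like the sketch below. It assumes the queue is a map ordered by height; the `trim_highest` function, the `max_len` threshold, and the use of `BTreeMap` are assumptions for illustration, not a description of Zebra's checkpoint verifier. Only the batch size of 400 comes from the issue text.

```rust
use std::collections::BTreeMap;

/// A checkpoint's worth of blocks, as mentioned in the issue.
const CHECKPOINT_BATCH: usize = 400;

/// If the queue holds more than `max_len` entries, drop up to one
/// checkpoint's worth of the highest blocks and return how many were dropped.
///
/// The queue is keyed by block height; `B` stands in for whatever is
/// stored per queued block (a hypothetical placeholder in this sketch).
fn trim_highest<B>(queue: &mut BTreeMap<u32, B>, max_len: usize) -> usize {
    if queue.len() <= max_len {
        return 0;
    }

    let mut dropped = 0;
    while dropped < CHECKPOINT_BATCH {
        // `pop_last` removes the entry with the greatest key, which is
        // the highest block height in the queue.
        if queue.pop_last().is_none() {
            break;
        }
        dropped += 1;
    }
    dropped
}
```

Dropping from the high end keeps the blocks closest to the current tip, which are the ones most likely to be verified soon.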

Additional context

This change isn't a priority unless we actually see memory usage issues in practice.

But it's a DoS risk, so we might want to fix it before the first stable release.


hdevalence · 2020-11-25T19:00:45Z

I believe we'll get the behavior we want as a side effect of other work we have to do (#1390).

The state service API says explicitly that AwaitUTXO requests should be coupled with a timeout layer. I didn't add this when I was testing and fixing the UTXO lookup code (#1348, #1358) because causing zebrad to hang on a failed dependency was useful for identifying cases where the code wasn't useful (and then inspecting execution traces). As a side effect, I believe this closes #1389, because far-future gossiped blocks will have their UTXO lookups time out, though we may wish to do other work as part of debugging the combined sync+gossip logic.
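
To make the timeout idea concrete, here is a minimal standard-library sketch of waiting on a pending lookup with a deadline. It is not the timeout layer mentioned above: it uses a blocking `std::sync::mpsc` channel rather than an async service, and `LookupError` and `await_with_timeout` are hypothetical names for this example.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why a pending lookup did not produce a value (hypothetical error type).
#[derive(Debug, PartialEq, Eq)]
enum LookupError {
    /// No answer arrived before the deadline, so the caller should give up.
    TimedOut,
    /// The side that would have answered went away.
    Cancelled,
}

/// Wait for the result of a pending lookup, giving up after `timeout`.
///
/// A lookup for an output that will not exist for a long time (for example,
/// one spent by a far-future gossiped block) ends with `TimedOut` instead of
/// holding its resources indefinitely.
fn await_with_timeout<T>(pending: &Receiver<T>, timeout: Duration) -> Result<T, LookupError> {
    match pending.recv_timeout(timeout) {
        Ok(value) => Ok(value),
        Err(RecvTimeoutError::Timeout) => Err(LookupError::TimedOut),
        Err(RecvTimeoutError::Disconnected) => Err(LookupError::Cancelled),
    }
}
```

On `TimedOut`, the caller would fail verification of the waiting block and drop it, which frees the block and its task.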

teor2345 · 2020-11-26T23:45:28Z

> I believe we'll get the behavior we want as a side effect of other work we have to do (#1390).

I think #1390 addresses the UTXO case, but not the other internal queues.

See my detailed response in #1391 (comment)

> The state service API says explicitly that AwaitUTXO requests should be coupled with a timeout layer. I didn't add this when I was testing and fixing the UTXO lookup code (#1348, #1358) because causing zebrad to hang on a failed dependency was useful for identifying cases where the code wasn't useful (and then inspecting execution traces). As a side effect, this ticket resolves most of the hangs in #1389, because far-future gossiped blocks will have their UTXO lookups time out, though we may wish to do other work as part of debugging the combined sync+gossip logic.

teor2345 added C-enhancement Category: This is an improvement S-needs-triage Status: A bug report needs triage labels Nov 25, 2020

teor2345 added this to the First stable / major version release 📦 milestone Nov 25, 2020

hdevalence mentioned this issue Nov 25, 2020

Timeouts for UXTO lookup in script verification #1390

Closed

hdevalence mentioned this issue Nov 25, 2020

consensus: add timeout to UTXO queries #1391

Merged

This was referenced Jan 13, 2021

Add sync and inbound timeouts #1586

Merged

Tune RocksDB memory usage #1486

Closed

Tracking: sync correctness #884

Closed

mpguerra removed the S-needs-triage Status: A bug report needs triage label Feb 4, 2021

dconnolly added A-rust Area: Updates to Rust code P-Medium labels Feb 4, 2021

teor2345 added C-bug Category: This is a bug and removed C-enhancement Category: This is an improvement labels Feb 4, 2021

mpguerra removed this from the First stable / major version release 📦 milestone May 28, 2021

mpguerra mentioned this issue Nov 23, 2021

Epic: Zebra Release Candidate #3096

Closed

76 tasks

teor2345 changed the title ~~Handle gossiped blocks that are a long way ahead of the current tip~~ Security: Handle gossiped blocks that are a long way ahead of the current tip Dec 6, 2021

teor2345 added C-security Category: Security issues I-unbounded-growth Zebra keeps using resources, without any limit and removed C-bug Category: This is a bug labels Dec 6, 2021

teor2345 added this to the 2021 Sprint 25 - Last Sprint of 2021 milestone Dec 6, 2021

teor2345 self-assigned this Dec 7, 2021

This was referenced Dec 7, 2021

Security: Drop blocks that are a long way ahead of the tip #3167

Merged

Fix syncer download order and add sync tests #3168

Merged

mpguerra modified the milestones: 2021 Sprint 25 - Last Sprint of 2021, 2021 Sprint 24 Dec 9, 2021

mpguerra modified the milestones: 2021 Sprint 24, 2021 Sprint 25 - Last Sprint of 2021 Dec 17, 2021

conradoplg closed this as completed in #3167 Dec 17, 2021
