HTTPSubscriber: detect when list is not append-only #39

Open
Tracked by #126 ...
lidel opened this issue Apr 22, 2024 · 2 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@lidel
Member

lidel commented Apr 22, 2024

Extracted from https://github.com/protocol/badbits.dwebops.pub/issues/32733

We are working on simplifying denylist handling at ipfs.io/dweb.link (https://github.com/ipshipyard/waterworks-infra/issues/113) and want to rely solely on nopfs support in rainbow, where a denylist is passed via RAINBOW_DENYLISTS=https://badbits.dwebops.pub/badbits.deny.

Problem

Right now the list is sorted, so new double-hashed entries can be added in the middle of the denylist.

This runs into a current limitation of HTTPSubscriber.downloadAndAppend(), which assumes every list is append-only and makes a blind range request for bytes beyond the ones it already has.

This means subscribing to the current badbits list, or to any third-party list that is not append-only, is error-prone: the client will miss updates that were inserted in the middle of the file rather than appended at the end (example).

Proposed solution

HTTPSubscriber could remember the last rule, and if the same rule is seen again in the Range response beyond the end of the old file, we know the update has likely inserted some entries earlier.

In such a case we would discard the Range response and refresh the entire file to ensure we don't miss any updates.

The file would be downloaded only on an actual update once #38 is also implemented.

@hsanjuan
Collaborator

It is not sustainable to re-download and re-process everything every minute because a single line was added to a file (as happens now for the ipfs.io gateways). Adding such a feature allows people to keep using non-append-only lists. What I'd like is to force people to use append-only lists.

It also means more code, when no more code is really necessary.

@hsanjuan
Collaborator

Also, it is simpler to fix badbits to publish append-only lists than to implement dealing with sorted lists.

@lidel lidel mentioned this issue Aug 28, 2024
34 tasks
@gammazero gammazero mentioned this issue Oct 17, 2024
28 tasks
@lidel lidel mentioned this issue Nov 14, 2024
47 tasks

2 participants