Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: handle incoming blob filters #956

Merged
merged 10 commits into from
Nov 20, 2024
Merged

feat: handle incoming blob filters #956

merged 10 commits into from
Nov 20, 2024

Conversation

EvanHahn
Copy link
Contributor

@EvanHahn EvanHahn commented Nov 7, 2024

I recommend reviewing this PR one commit at a time.

When you receive a blob filter from another peer, this updates their sync states. For example, if you receive a blob filter that says "I only want photo thumbnails", that peer's "wants" bitfield will be reduced.

Closes #682 and #905.

src/blob-store/downloader.js Outdated Show resolved Hide resolved
This change should have no impact.
…Stream`

This is a types-only change that should have no functionality impact.
This change should have no impact on functionality. It makes the
following changes to `PeerState`:

- Instead of a `#wants` bitfield and a `#wantAll` boolean, we only store
  one value: `#wants`, which can be `null` or a bitfield. This helps
  avoid states where you want everything *and* the bitfield has data
  inside.

- `setWantRange` is renamed to `addWantRange`.

- `addWantRange` takes two numbers (a start and length) instead of an
  object. This is for performance and consistency with future code.

- `clearWantRanges` is a new method that resets the bitfield.

I think these are useful changes on their own, but will make upcoming
changes smaller too.
This should have no functionality impact.
We weren't doing this when `SyncApi` was instantiated. Now we are.
This is the bulk of the work required in [#905]. It takes incoming blob
filters and converts them to want bitfields.

It starts in `MapeoProject` and gets all the way down to `PeerState`.

[#905]: #905
@EvanHahn EvanHahn marked this pull request as ready for review November 12, 2024 16:52
src/blob-store/index.js Outdated Show resolved Hide resolved
Copy link
Member

@gmaclennan gmaclennan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks good, good work on this, just needs some work managing the lifecycle of the entries stream.

src/mapeo-project.js Show resolved Hide resolved
src/mapeo-project.js Show resolved Hide resolved
*/
setPeerWants(peerId, ranges) {
addWantRange(peerId, start, length) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried about performance (because of the cost of calculating state), but I think the way this is implemented, the throttle on SyncState should avoid state calculation happening too frequently, so I think it is ok to just call this with every range (there will be a lot of calls to this).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a simple benchmark script with Deno. On my machine, I could call setRange 7,364 times per second, at least with my simple benchmark. I realize my computer is probably much faster than most phones, but even a 100x slowdown is probably okay here.

I'm going to leave this as is, but let me know if that's wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the lack of clarity, I was thinking out loud. I was concerned about the performance not because of the cost of this function, but because it triggers a state update, and reading state is expensive (when cores get large, which they will do for the blob core). However, the way we currently trigger state updates, we don't actually calculate state as part of the update event, and then in SyncState we throttle the handler for update events so that getState() is only called ever 200ms, so with the current implementation this is all fine. The reason this was originally ranges was to avoid triggering a state-recalculation for every range set, but our throttle in SyncState solves that problem for now. We just need to remember this in the future.

Copy link
Member

@gmaclennan gmaclennan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, fantastic to have this working.

@EvanHahn EvanHahn merged commit 5ae541d into media-manager-v1 Nov 20, 2024
9 checks passed
@EvanHahn EvanHahn deleted the want-bitfield branch November 20, 2024 14:34
@EvanHahn EvanHahn mentioned this pull request Nov 20, 2024
EvanHahn added a commit that referenced this pull request Nov 20, 2024
This is a squashed commit of:

- #940
- #957
- #956

Co-authored-by: Gregor MacLennan <gmaclennan@digital-democracy.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants