Skip to content
This repository has been archived by the owner on Apr 29, 2020. It is now read-only.

perf: switch out pull-block for bl #12

Merged
merged 7 commits into from
Jan 4, 2019
Merged

Conversation

achingbrain
Copy link
Collaborator

@achingbrain achingbrain commented Dec 21, 2018

Swaps out pull-block for bl

  • We could see a bit more of a gain by using bl.shallowSlice in the this.queue calls but it returns instances of BufferList instead of Buffer. These fail Buffer.isBuffer which is used by the protons module further down the pull-stream pipeline so until we refactor or replace we can't use it. Thankfully we have forked protons so can improve it, though in WIP: new ipld format api ipld/js-ipld-dag-pb#105 @vmx has started to use pbf instead of protons so we'll have to examine the performance impact of that change.
  • When we've consumed part of the buffer we can either call bl.consume to remove the bytes from the front of the buffer list, or just create a new buffer list and slice in the bytes we've yet to consume - from my testing, creating a new buffer list was faster so that's what this PR does. Using bl.consume was a little slower than using pull-block.

Results (lower is better):

image

  • bl.consume was about 1% slower than pull-block
  • newing up BufferLists and bl.shallowSliceing to remove consumed data was about 12% faster than the current pull-block based implementation

It uses the test from #11

@ghost ghost assigned achingbrain Dec 21, 2018
@ghost ghost added the in progress label Dec 21, 2018
@achingbrain
Copy link
Collaborator Author

cc @mcollina

@mcollina
Copy link

Can you do a flamegraph of that benchmark?

The constructor of bl initialize a readable stream, and that is not cheap. You might want to send a PR to BL to move to a lazy-initialization, like we do in core for some modules: https://github.com/nodejs/node/blob/master/lib/internal/streams/lazy_transform.js.

Alternative, introduce a require('bl/compact') that does everything that BL does, minus the streams (and play some prototype magic to avoid code duplication). I'm a maintainer of BL, and I think I can get that out quickly (2-3 days after I'm back from vacation).

@achingbrain
Copy link
Collaborator Author

@mcollina download it and take the .txt extension off: flamegraph.html.txt

@mcollina
Copy link

Just checked, all is well and Readable initialization is not an issue.

I concur that using shallowSlice instead of slice while enqueuing would lead to a performance improvement, especially if protons could understand and compose its messages using BL as well.

Copy link

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@achingbrain
Copy link
Collaborator Author

Readable initialization is not an issue

It didn't seem that way to me either.

I concur that using shallowSlice instead of slice while enqueuing would lead to a performance improvement

I hacked in support for Buffer-like objects to protons quickly locally so we can use shallowSlice:

image

Like this bl with slice is 12% faster than pull-block, with shallowSlice it's more like almost 19% faster, so worth doing!

@achingbrain achingbrain merged commit 4e5b618 into master Jan 4, 2019
@ghost ghost removed the in progress label Jan 4, 2019
@achingbrain achingbrain deleted the swap-pull-block-for-bl branch January 4, 2019 17:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants