Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(performance): file-source testing and benchmarks #6742

Merged
merged 8 commits into from
Mar 16, 2021
Merged

chore(performance): file-source testing and benchmarks #6742

merged 8 commits into from
Mar 16, 2021

Conversation

blt
Copy link
Contributor

@blt blt commented Mar 12, 2021

This PR commit is the start of a process to address #6730 by way of implementation of RFC 3480. The major changes introduced here are criterion benchmarks for an internal component of the file-source crate and modification of the testing regime to make use of Quickcheck. The results are nothing ground shaking: the new model tests do not turn up any flaw in the read_until... function and the benchmarks show it to be reasonably fast. That's good. Now we know. The more recent indications from vector-test-harness are that our issue centers not on any individually expensive operation but on the cooperative, fair scheduling of this crate. Approaches to that are beyond the scope of this PR.

Signed-off-by: Brian L. Troutwine brian@troutwine.us

@binarylogic binarylogic changed the title fix(performance): Refactor file-source fix(performance): Refactor file source Mar 12, 2021
Comment on lines 261 to 267
/// The return is unusual. In the Err case this function has not written into
/// `buf` and the caller should not examine its contents. In the Ok case if the
/// inner value is None the caller should retry the call as the buffering read
/// hit the end of the buffer but did not find a `delim` yet, indicating that
/// we've sheered a write or that there were no bytes available in the `reader`
/// and the `reader` was very sure about it. If the inner value is Some the
/// interior `usize` is the number of bytes written into `buf`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's more context on why it's in this unusual state in the description of #4089. Very much a classic "we'll clean this up soon" 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good friend of mine was prone to say "there's nothing permanent like temporary". 😂

@blt blt changed the title fix(performance): Refactor file source chore(performance): file-source testing and benchmarks Mar 12, 2021
@blt blt added source: file Anything `file` source related domain: performance Anything related to Vector's performance labels Mar 12, 2021
@blt blt marked this pull request as ready for review March 12, 2021 22:43
@blt blt requested review from a team, bruceg and ktff and removed request for a team March 12, 2021 22:43
Copy link
Contributor

@ktff ktff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The errors seem to originate from upgrading bytes from 0.5 to 1.0, other than that this looks good.

lib/file-source/src/buffer.rs Outdated Show resolved Hide resolved
@blt
Copy link
Contributor Author

blt commented Mar 13, 2021

@ktff indeed. It looks like upgrading bytes is non-trivial and should be sequenced after #6513 since there's no pressing need to get bytes upgraded and most of the build issues are related to tokio concerns. I'll flip back to 0.5.

@@ -0,0 +1 @@

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We sure do not. Just junk I accidentally left in-tree.

blt added 8 commits March 15, 2021 15:44
This commit is the start of a process to address #6730. The major changes
introduced in this commit so far are application of clippy suggestions and model
checks of the `read_until_with_max_size` test. I have a pretty good idea of how
that function works now and will be introducing benchmarks.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
In this commit I have introduce a benchmark for the read_until etc
function. To do this I've had to make it part of the public API of the crate,
but since the crate sits inside a larger project I'm less chuffed about this. I
have fiddled with the test layout some as well.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
This crate looks to be challenging to upgrade. Best to do once
tokio is updated in this project.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
@blt blt merged commit bef52ef into vectordotdev:master Mar 16, 2021
@blt blt deleted the blt-file_performance branch March 16, 2021 04:06
@pablosichert
Copy link
Contributor

A merge conflict has summoned me, just came to confirm: bytes 1.x is included in #5872.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: performance Anything related to Vector's performance source: file Anything `file` source related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants