This repository has been archived by the owner on Oct 30, 2021. It is now read-only.
This branch optimizes the `coalesce` function in these ways:

- Pre-filters `context` objects using the `relev` score to mitigate the size of the `std::vector<Context>` object before sorting.
- Removes calls to `vector.reserve`
that were, based on profiling, incurring more overhead than efficiency gains because they were overallocating memory that was never written to. The rule of thumb should be to only call `reserve` when we know for sure that we'll use all the memory. Otherwise, incurring re-allocation overhead will be less of a cost than over-allocating up front.

@apendleton @aaaandrea - I've published
`dev` binaries for this, but have not done more. Could you take over here to:

Open questions:
- Is the `relev` pre-filtering okay, or could it break something? If it breaks something downstream, does that mean we are missing critical test coverage here?
- Most of the time is spent in `__getmatching`
and this PR does nothing to speed this up - no ideas there.
- The threads appear to be only ~20% busy in `__getmatching` and 80% idle. I'm not sure why the threads are not more busy - could memory allocations or locking in RocksDB be slowing down the ability to dispatch more work to the threads?
- I no longer see `carmen::ContextSortByRelev`
as a significant bottleneck like I did previously in production (refs "Speed up sorting / reduce overhead of sorting", #120). Does this mean that things have changed, or that the benchmark does not represent the production load that was placed on the machines when we got the traces for #120?