-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up synthetic keyword, ip, and text fields #87930
Speed up synthetic keyword, ip, and text fields #87930
Conversation
This speeds up synthetic source, especially when there are many fields in the index that are declared in the mapping but don't have values. This is fairly common with ECS, and the tsdb rally track uses that. And this improves fetch performance of that track: ``` | 50th percentile service time | default | 6.24029 | 4.85568 | ms | -22.19% | | 90th percentile service time | default | 7.89923 | 6.52069 | ms | -17.45% | | 99th percentile service time | default | 12.0306 | 16.435 | ms | +36.61% | | 100th percentile service time | default | 14.2873 | 17.1175 | ms | +19.81% | | 50th percentile service time | default_1k | 158.425 | 25.3236 | ms | -84.02% | | 90th percentile service time | default_1k | 165.46 | 30.8655 | ms | -81.35% | | 99th percentile service time | default_1k | 168.954 | 33.3342 | ms | -80.27% | | 100th percentile service time | default_1k | 174.341 | 34.8344 | ms | -80.02% | ``` There's a slight increase in the 99th and 100th percentile service time for fetching ten document which think is unlucky jitter. Hopefully. The average performance of fetching ten docs improves anyway so I think we're ok. Fetching a thousand documents improves 80% across the board which is lovely. This works by doing three things: 1. Teach the "leaf" layer of source loader to detect when the field is empty in that segment and remove it from the synthesis process entirely. This brings most of the speed up in tsdb. 2. Replace `hasValue` with a callback when writing the first value. `hasValue` was resulting in a 2^n-like number of calls that really showed up in the profiler. 3. Replace the `ArrayList` of leaf loaders with an array. Before fixing the other two issues the `ArrayList`'s iterator really showed up in the profiling. Probably much less worth it now, but it's small. All of this brings synthetic source much closer to the fetch performance of standard _source: ``` | 50th percentile service time | default_1k | 11.4016 | 25.3236 | ms | +122.11% | | 90th percentile service time | default_1k | 13.7212 | 30.8655 | ms | +124.95% | | 99th percentile service time | default_1k | 15.8785 | 33.3342 | ms | +109.93% | | 100th percentile service time | default_1k | 16.9715 | 34.8344 | ms | +105.25% | ``` One important thing, these perf numbers come from fetching *hot* blocks on disk. They mostly compare CPU overhead and not disk overhead.
Pinging @elastic/es-analytics-geo (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! The extra messiness is pretty well contained in FetchPhase and that's something we can look at separately.
leafReaderContext = context.searcher().getIndexReader().leaves().get(leafIndex); | ||
endReaderIdx = endReaderIdx(context, leafReaderContext, index, docs); | ||
int[] docIdsInLeaf = docIdsInLeaf(index, endReaderIdx, docs, leafReaderContext.docBase); | ||
if (leafReaderContext.reader()instanceof SequentialStoredFieldsLeafReader lf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: spacing before instanceof
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't! There's a bug in our formatter that needs, like, a fix upstream. The delivery folks are managing it. But, for now, you can't have a space there!
LeafNestedDocuments leafNestedDocuments = null; | ||
CheckedBiConsumer<Integer, FieldsVisitor, IOException> fieldReader = null; | ||
boolean hasSequentialDocs = hasSequentialDocs(docs); | ||
SourceLoader.Leaf leafSourceLoader = null; | ||
int leafIndex = -1; | ||
LeafReaderContext leafReaderContext = null; | ||
int endReaderIdx = -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this still isn't pretty but FetchPhase as a whole is ugly already so I don't think this is making it actively worse. One day I'll wrap this up into a PerSegmentFetchPhase or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the LeafBlahBlah
model instead of the setNextLeaf
model. Not that I'm an expert, but I think the code is generally more readable.
@@ -583,12 +583,12 @@ protected SyntheticSourceSupport syntheticSourceSupport() { | |||
: DateFieldMapper.DEFAULT_DATE_TIME_NANOS_FORMATTER; | |||
|
|||
@Override | |||
public SyntheticSourceExample example() { | |||
public SyntheticSourceExample example(int maxValues) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this directly related to the main change or is it part of a separate refactor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related. I can cut it to it's own change. But it's related.
The new behavior only kicks in if you fetch more than one doc. And, in general, now that there is more state on the SourceLoader it's important to test more than one doc. So I wrote that "more than one doc" version of the test. To test the new behavior and out of paranoia. And that needs this. It needs it so that the "more than one doc" can consistently force singleton values. If we didn't have this then it'd be super rare to get singletons. But with this the "more than one doc" test can do int maxValue = randomBoolean() ? 1 : 5;
up front. And gets singletons half the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, sounds sensible to me, I'm happy to keep it as part of the same change
This speeds up synthesizing _source for keywords, text, and ip fields by
converting from ordinal to utf-8 one time. When loading from disk blocks
that are hot in the page cache the speed up is 7-10%:
This should be much more substantial when loading from scattered disk
blocks because it takes care to load the ordinals in increasing order.