
Speed up synthetic keyword, ip, and text fields #87930

Merged · 14 commits · Jun 24, 2022

Conversation

@nik9000 (Member) commented Jun 22, 2022

This speeds up synthesizing _source for keyword, text, and ip fields by
converting each ordinal to UTF-8 only once. When loading disk blocks that
are hot in the page cache, the speedup is 7-10%:

```
|  50th percentile service time | default_1k | 39.4338 | 35.6935 | ms |  -9.49% |
|  90th percentile service time | default_1k | 41.3508 | 38.3602 | ms |  -7.23% |
|  99th percentile service time | default_1k | 50.6734 | 45.4686 | ms | -10.27% |
| 100th percentile service time | default_1k | 61.3771 | 56.4886 | ms |  -7.96% |
```

This should be much more substantial when loading from scattered disk
blocks because it takes care to load the ordinals in increasing order.
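
For readers unfamiliar with the trick, here is a minimal sketch of the idea, assuming Lucene's `SortedSetDocValues` API; the class and method names are hypothetical, not the actual loader code:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

import org.apache.lucene.index.SortedSetDocValues;

/** Hypothetical sketch: convert each ordinal to UTF-8 once, in increasing order. */
class OrdinalToUtf8Sketch {
    /** {@code docs} must be in increasing docID order, since doc values iterate forward only. */
    static Map<Long, String> convertOnce(SortedSetDocValues dv, int[] docs) throws IOException {
        // Pass 1: collect the distinct ordinals the requested docs use.
        TreeSet<Long> usedOrds = new TreeSet<>();
        for (int doc : docs) {
            if (dv.advanceExact(doc) == false) {
                continue; // this doc has no value in this segment
            }
            for (long ord = dv.nextOrd(); ord != SortedSetDocValues.NO_MORE_ORDS; ord = dv.nextOrd()) {
                usedOrds.add(ord);
            }
        }
        // Pass 2: look each ordinal up in increasing order, so the terms
        // dictionary is read mostly sequentially, and decode to UTF-8 one time.
        Map<Long, String> ordToValue = new HashMap<>();
        for (long ord : usedOrds) {
            ordToValue.put(ord, dv.lookupOrd(ord).utf8ToString());
        }
        return ordToValue;
    }
}
```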

nik9000 added 11 commits June 20, 2022 19:42
This speeds up synthetic source, especially when there are many fields
in the index that are declared in the mapping but don't have values.
This is fairly common with ECS, and the tsdb rally track uses that. And
this improves fetch performance of that track:
```
|  50th percentile service time |    default |   6.24029 |  4.85568 | ms | -22.19% |
|  90th percentile service time |    default |   7.89923 |  6.52069 | ms | -17.45% |
|  99th percentile service time |    default |  12.0306  | 16.435   | ms | +36.61% |
| 100th percentile service time |    default |  14.2873  | 17.1175  | ms | +19.81% |
|  50th percentile service time | default_1k | 158.425   | 25.3236  | ms | -84.02% |
|  90th percentile service time | default_1k | 165.46    | 30.8655  | ms | -81.35% |
|  99th percentile service time | default_1k | 168.954   | 33.3342  | ms | -80.27% |
| 100th percentile service time | default_1k | 174.341   | 34.8344  | ms | -80.02% |
```

There's a slight increase in the 99th and 100th percentile service times
for fetching ten documents, which I think is unlucky jitter. Hopefully. The
average performance of fetching ten docs improves anyway, so I think
we're ok. Fetching a thousand documents improves 80% across the board,
which is lovely.

This works by doing three things:
1. Teach the "leaf" layer of source loader to detect when the field is
   empty in that segment and remove it from the synthesis process
   entirely. This brings most of the speed up in tsdb.
2. Replace `hasValue` with a callback invoked when writing the first value.
   `hasValue` was resulting in a 2^n-like number of calls that really
   showed up in the profiler. (There's a sketch of this shape after the list.)
3. Replace the `ArrayList` of leaf loaders with an array. Before fixing
   the other two issues the `ArrayList`'s iterator really showed up in
   the profiling. Probably much less worth it now, but it's small.
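
A hedged sketch of the shape points 2 and 3 describe; the interface and class names below are hypothetical stand-ins, not the real `SourceLoader` API. The parent opens its object lazily the first time any child reports a value, so nothing ever polls `hasValue`, and empty objects are never written:

```java
/** Hypothetical leaf-level loader; the names here are illustrative only. */
interface LeafLoaderSketch {
    /** Write this field's values for {@code doc}, calling {@code onFirstValue} before the first one. */
    void write(int doc, StringBuilder out, Runnable onFirstValue);
}

/** A parent object that holds its children in a plain array (point 3). */
class ObjectLoaderSketch implements LeafLoaderSketch {
    private final LeafLoaderSketch[] children;

    ObjectLoaderSketch(LeafLoaderSketch... children) {
        this.children = children;
    }

    @Override
    public void write(int doc, StringBuilder out, Runnable onFirstValue) {
        boolean[] opened = new boolean[1];
        for (LeafLoaderSketch child : children) {
            child.write(doc, out, () -> {
                if (opened[0] == false) {
                    onFirstValue.run(); // propagate upwards exactly once
                    out.append('{');    // open this object lazily
                    opened[0] = true;
                }
            });
        }
        if (opened[0]) {
            out.append('}'); // only close what we opened; empty objects never appear
        }
    }
}
```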

All of this brings synthetic source much closer to the fetch performance
of standard _source:
```
|  50th percentile service time | default_1k |  11.4016  | 25.3236  | ms | +122.11% |
|  90th percentile service time | default_1k |  13.7212  | 30.8655  | ms | +124.95% |
|  99th percentile service time | default_1k |  15.8785  | 33.3342  | ms | +109.93% |
| 100th percentile service time | default_1k |  16.9715  | 34.8344  | ms | +105.25% |
```

One important caveat: these perf numbers come from fetching *hot* blocks
on disk. They mostly compare CPU overhead, not disk overhead.
@nik9000 nik9000 requested a review from romseygeek June 22, 2022 18:18
@nik9000 nik9000 added the >non-issue and :StorageEngine/TSDB labels Jun 22, 2022
@nik9000 nik9000 marked this pull request as ready for review June 22, 2022 18:18
@elasticmachine elasticmachine added the Team:Analytics label Jun 22, 2022
@elasticmachine (Collaborator) commented

Pinging @elastic/es-analytics-geo (Team:Analytics)

@nik9000 nik9000 mentioned this pull request Jun 22, 2022
@romseygeek (Contributor) left a comment

LGTM! The extra messiness is pretty well contained in FetchPhase and that's something we can look at separately.

```java
leafReaderContext = context.searcher().getIndexReader().leaves().get(leafIndex);
endReaderIdx = endReaderIdx(context, leafReaderContext, index, docs);
int[] docIdsInLeaf = docIdsInLeaf(index, endReaderIdx, docs, leafReaderContext.docBase);
if (leafReaderContext.reader()instanceof SequentialStoredFieldsLeafReader lf
```
@romseygeek (Contributor)

nit: spacing before instanceof

@nik9000 (Member, Author)

You can't! There's a bug in our formatter that needs, like, a fix upstream. The delivery folks are managing it. But, for now, you can't have a space there!

```java
LeafNestedDocuments leafNestedDocuments = null;
CheckedBiConsumer<Integer, FieldsVisitor, IOException> fieldReader = null;
boolean hasSequentialDocs = hasSequentialDocs(docs);
SourceLoader.Leaf leafSourceLoader = null;
int leafIndex = -1;
LeafReaderContext leafReaderContext = null;
int endReaderIdx = -1;
```
@romseygeek (Contributor)
Yeah this still isn't pretty but FetchPhase as a whole is ugly already so I don't think this is making it actively worse. One day I'll wrap this up into a PerSegmentFetchPhase or something.

@nik9000 (Member, Author)
I really like the LeafBlahBlah model instead of the setNextLeaf model. Not that I'm an expert, but I think the code is generally more readable.
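
For readers following along, a hedged sketch of the two shapes being contrasted; the interface names are illustrative, not the actual Elasticsearch API:

```java
import org.apache.lucene.index.LeafReaderContext;

// Stateful shape: one long-lived loader is repointed at each segment,
// so every caller must remember to call setNextLeaf at the right time.
interface SetNextLeafStyle {
    void setNextLeaf(LeafReaderContext ctx);
    void load(int docId);
}

// "LeafBlahBlah" shape: asking for a segment hands back a fresh per-segment
// object, so per-segment state can be final and there is nothing to reset.
interface PerLeafStyle {
    Leaf leaf(LeafReaderContext ctx);

    interface Leaf {
        void load(int docIdInLeaf);
    }
}
```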

```diff
@@ -583,12 +583,12 @@ protected SyntheticSourceSupport syntheticSourceSupport() {
         : DateFieldMapper.DEFAULT_DATE_TIME_NANOS_FORMATTER;

     @Override
-    public SyntheticSourceExample example() {
+    public SyntheticSourceExample example(int maxValues) {
```
@romseygeek (Contributor)
Is this directly related to the main change or is it part of a separate refactor?

@nik9000 (Member, Author)
Related. I can cut it into its own change. But it's related.

The new behavior only kicks in if you fetch more than one doc. And, in general, now that there is more state on the SourceLoader, it's important to test more than one doc. So I wrote that "more than one doc" version of the test, both to cover the new behavior and out of paranoia. And that test needs this change: without it, getting singleton values would be super rare. With it, the "more than one doc" test can do `int maxValues = randomBoolean() ? 1 : 5;` up front and gets singletons half the time.
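
A hedged sketch of the test shape described above; all names here are hypothetical stand-ins, not the real test infrastructure:

```java
import java.util.Random;

/** Hypothetical outline of the "more than one doc" test described above. */
class MultiDocSyntheticSourceTestSketch {
    private final Random random = new Random();

    void runOnce(int numDocs) {
        // Decide up front whether this run forces singletons, so the
        // single-valued code paths are exercised half the time.
        int maxValues = random.nextBoolean() ? 1 : 5;
        for (int d = 0; d < numDocs; d++) {
            String example = exampleWithAtMost(maxValues); // stand-in for example(int maxValues)
            // index `example`, synthesize _source, assert it round-trips
        }
    }

    private String exampleWithAtMost(int maxValues) {
        int count = 1 + random.nextInt(maxValues); // between 1 and maxValues values
        return count + " value(s)";
    }
}
```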

@romseygeek (Contributor)
+1, sounds sensible to me, I'm happy to keep it as part of the same change

@nik9000 nik9000 merged commit bcca9d1 into elastic:master Jun 24, 2022
Labels: >non-issue · :StorageEngine/TSDB ("You know, for Metrics") · Team:Analytics · v8.4.0