Add an option to _search to force synthetic source #87068
Pinging @elastic/es-analytics-geo (Team:Analytics)
This adds `?force_synthetic_source` to, well, force running the fetch phase with synthetic source. If the mapping is incompatible with synthetic source it'll throw a 400 error.
This actually won't "turn off" the _source - it'll still be decompressed. So this will show a "worst case" for latency - it'll include the synthetic source. It'll just load the |
Heya @nik9000 can you expand on what the use case is for this feature?
Pinging @elastic/es-search (Team:Search)
I've edited the description to add it - the short version is that it's a way to evaluate synthetic source without reindexing. I don't think it's a "forever" thing, but I've talked to quite a few folks that would find it useful.
Pinging @elastic/clients-team (Team:Clients)
Is this option going to be deprecated and removed in the future? If so, it might be good to document it that way. Also, is this option for human debugging only, or are we planning on users using it in applications?
I think so.
Debugging. Though maybe not just humans.
How can I document this option in this way? Can I just not document it? There is a very real chance it'll live behind the feature flag for its entire life and be removed before the feature flag.
@nik9000 Gotcha! In that case can we document that this parameter is experimental and will be removed in a future version? I think that's enough warning for a parameter that's behind a feature flag and undocumented elsewhere.
It can via |
"'getRoot()' is not public in 'org.elasticsearch.index.mapper.Mapping'. Cannot be accessed from outside package"
Fair point... But! SourceFieldMapper is in the same package, and we can change the signature of |
I'll have a look.
Yeah, that works. @romseygeek, have another look when you get a chance.
LGTM, thanks @nik9000!
@sethmlarson I've added a flag to the spec to mark this parameter as behind the feature flag. Is that OK? Am I going to break everyone?
The change to the rest-api-spec schema.json LGTM.
@@ -308,6 +321,9 @@ public void writeTo(StreamOutput out) throws IOException {
                    + "] or greater."
            );
        }
        if (out.getVersion().onOrAfter(Version.V_8_4_0)) {
            out.writeBoolean(forceSyntheticSource);
        }
Shouldn't we have an else block here and throw? Otherwise this fails silently when communicating with older nodes, and it breaks the assumption that the CCS compatibility flag is based on (that an exception is thrown whenever a new option is used that a node in the previous minor does not support).
Yikes. Sorry, yeah. I should have had it.
I opened a fix for both of these in #87481.
            out.writeBoolean(forceSyntheticSource);
        } else {
            if (forceSyntheticSource) {
                throw new IllegalArgumentException("force_synthetic_source is not supported before 8.3.0");
you mention 8.3 in the error message, but 8.4 in the conditional above. I suspect you did that on purpose but I am not following: is there a way to force synthetic source on 8.3 which does not support the flag?
A mistake!
Unbelievable, I could not believe my eyes hence I convinced myself you must have done it on purpose :)
This fixes a missing check for unsupported version logic that'd cause `force_synthetic_source` to be silently dropped when sent to nodes before 8.4. It also fixes an incorrect version number in an error message. Relates to #87068
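The version-gating pattern discussed in this review can be sketched in isolation. This is a minimal, self-contained illustration and not the Elasticsearch code: `WireVersion` and `VersionGatedWrite` are hypothetical stand-ins for `Version` and the request's `writeTo`, and a plain `ByteArrayOutputStream` stands in for `StreamOutput`. It only shows the shape of the fix: write the flag for nodes that understand it, and throw instead of silently dropping it for older nodes.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical stand-in for a node's wire version; not org.elasticsearch.Version. */
final class WireVersion {
    static final WireVersion V_8_3_0 = new WireVersion(8, 3);
    static final WireVersion V_8_4_0 = new WireVersion(8, 4);

    final int major;
    final int minor;

    WireVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    boolean onOrAfter(WireVersion other) {
        return major > other.major || (major == other.major && minor >= other.minor);
    }
}

/** Hypothetical sketch of a version-gated write with an explicit failure branch. */
final class VersionGatedWrite {
    static void writeForceSyntheticSource(ByteArrayOutputStream out, WireVersion remote, boolean forceSyntheticSource) {
        if (remote.onOrAfter(WireVersion.V_8_4_0)) {
            // The receiving node knows the field, so serialize it as a single byte.
            out.write(forceSyntheticSource ? 1 : 0);
        } else if (forceSyntheticSource) {
            // Dropping the flag silently would change what the request means,
            // so fail loudly. The message names the same version as the check.
            throw new IllegalArgumentException("force_synthetic_source is not supported before 8.4.0");
        }
        // Older node and flag unset: nothing to write, nothing lost.
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeForceSyntheticSource(out, WireVersion.V_8_4_0, true);
        System.out.println("bytes written for 8.4.0: " + out.size());
        try {
            writeForceSyntheticSource(new ByteArrayOutputStream(), WireVersion.V_8_3_0, true);
        } catch (IllegalArgumentException e) {
            System.out.println("older node rejected: " + e.getMessage());
        }
    }
}
```

Keeping the version in the error message next to the version in the conditional makes a mismatch like the 8.3/8.4 one above easier to spot in review.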
This adds the option to force synthetic source to the GET API. See #87068 for more discussion on why you'd want to do that - the short version is to get an upper bound on the performance cost of using synthetic source in GET.
This adds the option to force synthetic source to the MGET API. See #87068 for more discussion on why you'd want to do that - the short version is to get an upper bound on the performance cost of using synthetic source in MGET.
This adds `?force_synthetic_source` to, well, force running the fetch phase
with synthetic source. If the mapping is incompatible with
synthetic source it'll throw a 400 error. This should be useful for
folks looking to evaluate synthetic source.
Folks can use it to check if their current mapping supports synthetic
source. Given how picky synthetic source is, their mapping probably
doesn't support it now, but this gives us a very quick way to test
it, which should help shorten the testing cycle.
If their mapping does support synthetic source it'll fetch the _source
as though synthetic source were enabled so folks can have a look at
the result and see if it's done a good job. Synthetic source makes a
lot of assumptions about the original layout of the _source and some
of those assumptions are destructive to, well, the layout. If you
need the layout you can try to use the results this produces.
Finally, this gives you an upper bound on the performance hit of synthetic
source. This feature will still load and decompress the original
_source bytes, so you won't get any of the fetch-time performance
advantages of synthetic source - but it throws out the _source and
rebuilds it from doc values - so you get all of the fetch-time
performance disadvantages. If the fetch performance is still
acceptable then you know synthetic source is ok for you. If it is
marginal then you'd need a full scale test, reindexing everything
with synthetic source. If the fetch performance is nowhere near ok
then you know synthetic source isn't good for you, at least not yet.
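As a rough illustration of the mapping check described above, the sketch below builds a `_search` URI carrying the parameter and interprets the response status. Everything beyond the parameter name and the 400 behavior is an assumption: the base URL, the index name `my-index`, the `=true` value, and the class and enum names are hypothetical, and the cluster is assumed to have the relevant feature flag enabled. The actual HTTP call is left out so the example stays self-contained.

```java
import java.net.URI;

/** Hypothetical helper for probing an index with force_synthetic_source. */
final class ForceSyntheticSourceProbe {
    enum Outcome { SUPPORTED, REJECTED, OTHER }

    /** Builds the _search URI with the parameter set; base URL and index are caller-supplied. */
    static URI searchUri(String baseUrl, String index) {
        // Assumption: the parameter is passed as a boolean query-string value.
        return URI.create(baseUrl + "/" + index + "/_search?force_synthetic_source=true");
    }

    /** Interprets the HTTP status code returned by the probe request. */
    static Outcome classify(int status) {
        if (status == 200) {
            // The fetch phase ran with synthetic source; inspect the returned _source.
            return Outcome.SUPPORTED;
        }
        if (status == 400) {
            // Per the description, an incompatible mapping yields a 400. Other bad
            // requests also return 400, so check the error body before concluding.
            return Outcome.REJECTED;
        }
        return Outcome.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(searchUri("http://localhost:9200", "my-index"));
        System.out.println(classify(400));
    }
}
```

Timing the same query with and without the parameter would then give the upper bound on fetch cost that the last paragraph talks about.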