-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ES|QL: add support for include_execution_metadata parameter #134446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ES|QL: add support for include_execution_metadata parameter #134446
Conversation
Hi @luigidellaquila, I've created a changelog YAML for you. |
Pinging @elastic/es-analytical-engine (Team:Analytics) |
private boolean columnar; | ||
private boolean profile; | ||
private boolean includeCCSMetadata; | ||
private Boolean includeCPSMetadata; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we need both of these and can't just have include_cps_metadata
option turn on includeCCSMetadata
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only practical reason is that I'm doing the validation at a later stage, in TransportEsqlQueryAction.java (just because I already have the cluster state there, and I can tell if we are in serverless), so at that point I need to know which of the flags was set.
I don't know if that's a strong enough reason, maybe we can try to inject the cluster state in RequestXContent.objectParserCommon() and do the validation at parse time, but it seems a bit inconvenient
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I just wonder how important it is that include_cps_metadata
would only work in CPS instead of just being a synonym for the other option and make the whole thing simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong preference on this.
In S2D38 it was decided that it will be available only on Serverless, so I'm just sticking to this decision
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wonder how important it is that include_cps_metadata would only work in CPS instead of just being a synonym for the other option and make the whole thing simple.
As Luigi said, we don't want to expose it in stateful. And ideally I think we would put in the API spec that
include_ccs_metadata
is only present in statefulinclude_cps_metadata
is only present in serverless
so that clients/users generally use the correct parameter.
However, practically we would accept include_ccs_metadata
in serverless if the user puts that (a hidden option), but we wouldn't accept include_cps_metadata
in stateful. I think this is consistent with Luigi's current implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @quux00, I like the idea of having a more generic name that we can also apply to Stateful, but the concept of "execution" in ES|QL is pretty broad, and includes compute details (that today you can retrieve with profile=true
, and I'm pretty sure we don't want in this context).
I wonder if include_execution_info
could be confusing for this reason, and if we could find a more specific name. include_execution_metadata
could be an option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @luigidellaquila, execution
sounds too broad and could mean a lot of things. It sounds like we are trying to find a common name for a remote project and remote cluster. I wonder if we can use include_remote_metadata
? Possibly we need to come up with a name for operations happening across multiple clusters/projects without explicitly mentioning cluster/projects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@quux00 @luigidellaquila @idegtiarenko @naj-h
I like the idea of using a generic name, but I also agree that include_execution_info
is a very, very broad name. In the context of ESQL, "execution" can mean a lot of things, and is far too broad about query execution in linked projects.
If we need a new name, include_execution_info
is not my preference. Let's find something else.
I do not mind include_execution_metadata
, even though that is also pretty generic.
include_project_metadata
may be an option too.
QQ: do we still use _clusters
in the response? This doesn't feel super compatible with project model in CPS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include_remote_metadata
Would be problematic because this section also includes the local cluster info and a potential future goal of this additional info is to include it even when it is a local only search, as I mentioned above, as not doing it is a violation of flat world principles in serverless.
include_project_metadata
Would be problematic as it has the same problem of include_cps_metadata
, namely the "project" concept doesn't exist in stateful and we want a generic name that can work in both.
include_execution_metadata
This is fine with me.
QQ: do we still use _clusters in the response?
We asked this of Product and the decision was to leave it, since modifying it would result in different response bodies between serverless and stateful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am OK with include_execution_metadata
|
||
public Boolean includeCPSMetadata() { | ||
return includeCPSMetadata; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a validation check either in this class or in the RestEsqlQueryAction
that the user hasn't set both include_ccs_metadata
and include_cps_metadata
? It's a minor edge case, but we probably don't want users in serverless to be able to set them both and potentially to different values, so we should throw an exception if both are set (in serverless) with an error message like "Both include_cps_metadata and include_ccs_metadata query parameters are set. Use only include_cps_metadata."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It reduces chances of user errors, so ++, let me add it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ both should not be allowed at the same time
Let's also do an API spec ticket for this in https://github.com/elastic/elasticsearch-specification I think we should docuement there that:
so that clients/users generally use the correct parameter. |
* the functionality to do it the right way is not yet ready -- replace this code when it's ready. | ||
*/ | ||
this.canUseSkipUnavailable = settings.getAsBoolean("serverless.cross_project.enabled", false) == false; | ||
this.crossProjectEnabled = settings.getAsBoolean("serverless.cross_project.enabled", false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm no longer using this for this PR, but I'll need it for include_execution_metadata
, so I'm keeping it here.
return CollectionUtils.appendToCopy(super.nodePlugins(clusterAlias), CpsPlugin.class); | ||
} | ||
|
||
protected EsqlQueryResponse runQuery(String query, Boolean cpsMetadataInResponse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT:
protected EsqlQueryResponse runQuery(String query, Boolean cpsMetadataInResponse) { | |
protected EsqlQueryResponse runQuery(String query, Boolean includeCpsMetadata) { |
} | ||
} | ||
|
||
private static final Setting<String> CpsEnableSetting = Setting.simpleString( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: should it be named CPS_ENABLED_SETTING
? I am a bit surprised spotless did not complain about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I borrowed that code from other tests, but I agree with you that CPS_ENABLED_SETTING
would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed off-line, I'm moving this test to Serverless
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
## Summary ES is adding a new parameter for retrieving ccs metadata. This works the same as the `include_ccs_metadata` but they want us to use the new one instead elastic/elasticsearch#134446. The reasons they did this change are: - they are trying to avoid the "ccs" term in serverless as much as possible - to be consistent with the flat world model of "all projects should behave the same in a search", in the future we might be changing include_execution_metadata to also return info for local-only searches, which it doesn't do right now. Both params will be available in serverless and stateful
Add support for parameter in ES|QL request.
Add support for
include_cps_metadatainclude_execution_metadata
parameter in ES|QL request.It will be supported both in Stateful and in Serverless, as a synonym for
include_ccs_metadata
(that will remain for bwc)