Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Transform] Log and audit a warning if all the group-by fields are runtime fields #68050

Merged

Conversation

przemekwitek
Copy link
Contributor

@przemekwitek przemekwitek commented Jan 27, 2021

This PR adds logging and auditing for the case when all the fields that transform uses as group-by (in pivot) or unique key (in latest) are search runtime fields.

This PR is marked as >non-issue as it enhances an unreleased feature.

Relates #65147

@przemekwitek przemekwitek force-pushed the search_runtime_fields_warning branch 6 times, most recently from ed93fc1 to 2fe34a8 Compare January 28, 2021 11:26
@przemekwitek przemekwitek removed the WIP label Jan 28, 2021
@przemekwitek przemekwitek marked this pull request as ready for review January 28, 2021 12:18
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@@ -77,6 +79,11 @@ public void validateConfig(ActionListener<Boolean> listener) {
listener.onResponse(true);
}

@Override
public List<String> getPerformanceCriticalFields() {
return config.getGroupConfig().getGroups().values().stream().map(SingleGroupSource::getField).collect(toList());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure all of these would be performance critical. it almost seems like they would be performance critical IF they didn't support missing buckets right?

If the group config supported missing buckets, we don't get the benefit of filtering for changes or something? I remember there was a trade off.

Right @hendrikmuhs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the group config supported missing buckets, we don't get the benefit of filtering for changes or something? I remember there was a trade off.

missing bucket runs a bit slower in composite aggs, but I think we can ignore that, its like 10-20% improvement. Problematic are group_by with scripts, but we already have a warning for that: We warn if change detection is completely disabled, which means we do a full re-run on every checkpoint.

@hendrikmuhs
Copy link

Can you also add a check if the timestamp field configured for sync is a search runtime field?

Verified

This commit was signed with the committer’s verified signature.
ggwpez Oliver Tale-Yazdi
@przemekwitek przemekwitek force-pushed the search_runtime_fields_warning branch from 2fe34a8 to 64adc54 Compare February 2, 2021 09:30

Verified

This commit was signed with the committer’s verified signature.
ggwpez Oliver Tale-Yazdi
@przemekwitek
Copy link
Contributor Author

Can you also add a check if the timestamp field configured for sync is a search runtime field?

Done.

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@przemekwitek przemekwitek merged commit 21b3946 into elastic:master Feb 2, 2021
@przemekwitek przemekwitek deleted the search_runtime_fields_warning branch February 2, 2021 14:19
przemekwitek added a commit to przemekwitek/elasticsearch that referenced this pull request Feb 2, 2021
przemekwitek added a commit that referenced this pull request Feb 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants