Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Transform] scripted group_by fails for continuos transform #57332

Closed
hendrikmuhs opened this issue May 29, 2020 · 3 comments · Fixed by #60724
Closed

[Transform] scripted group_by fails for continuos transform #57332

hendrikmuhs opened this issue May 29, 2020 · 3 comments · Fixed by #60724
Assignees
Labels

Comments

@hendrikmuhs
Copy link

hendrikmuhs commented May 29, 2020

Affected versions: 7.7 - 7.9.1 (fixed >= 7.9.2)

Continuous transform minimize the amount of updates by querying only changed buckets, the logic uses a combination of query features for that. For terms it relies on terms query.

If a script is used in group_by instead of a field, this change detection logic causes the transform to fail (after it switches to continuous mode, not directly after start but after checkpoint 1) as it expects a field:

task encountered irrecoverable failure: field name cannot be null

Because scripts offer freedom to build a bucket key, we can't use them to detect changes on them. We could construct a script query, however this would be very expensive.

Mitigation: don't use scripts in group_by together with continuous transform. Scripts in queries are very expensive, so independent of change detection it's highly recommended to not use scripts in production, but only in the development/data exploration phase.

Possible solutions

Update: We finally decided to go with option B, keeping the other ideas for documentation.

## A Disallow scripted group_by in continuous mode

This would be easiest, however if you have only 1 scripted group_by but n non scripted group_by's this would limit functionality unnecessary.

B Disable change detection for scripted group_by

This would let continuous mode do a full rerun if all group_by's are using a script. On larger scale this leads to performance problems. If there are other non-scripted group_by's or the amount of data is small this might still be an acceptable solution.

B.2: For this solution we could also consider using _update instead of index.

## C Implement change detection based on scripted query

I am not 100% sure this is possible. This solution would use a script query instead of a terms query. This solution might not be better than solution B: disabling change detection.

General solution remarks

Because data might be small A would limit functionality, I therefore tend towards B. It would be possible to disallow certain combinations like only disallow if no group_by implements change detection, but again this sounds like to much of a restriction. Instead we should warn the user about potential problems. Solution C might be good in the long run, but takes significant more time to verify, implement and test, so B seems to be the best short term solution.

The user that reported the issue made a workaround for #48243, therefore supporting missing_bucket should be prioritized.

Update

If transform can not apply any change detection a warning (job message + log) is raised regarding performance.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@hendrikmuhs hendrikmuhs self-assigned this Jul 29, 2020
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this issue Aug 5, 2020
hendrikmuhs pushed a commit that referenced this issue Aug 5, 2020
)

disable optimizations when using scripts in group_by, when scripts using scripts we can not predict
the outcome and we have no query counterpart. Other optimizations for other group_by's are not
affected.

fixes #57332
hendrikmuhs pushed a commit that referenced this issue Aug 5, 2020
)

disable optimizations when using scripts in group_by, when scripts using scripts we can not predict
the outcome and we have no query counterpart. Other optimizations for other group_by's are not
affected.

fixes #57332
hendrikmuhs pushed a commit to hendrikmuhs/elasticsearch that referenced this issue Aug 6, 2020
leftover from a previous solution of elastic#57332, which has been discarded

relates elastic#60724
fixes elastic#60794
hendrikmuhs pushed a commit that referenced this issue Aug 6, 2020
remove test, scripts are excluded in the change collector, the test is a leftover from a previous
solution of #57332, which has been discarded

relates #60724
fixes #60794
hendrikmuhs pushed a commit that referenced this issue Aug 6, 2020
remove test, scripts are excluded in the change collector, the test is a leftover from a previous
solution of #57332, which has been discarded

relates #60724
fixes #60794
@hendrikmuhs
Copy link
Author

FWIW: I tested the possibility to backport this change to 7.9, but unfortunately this turned into a lot of merge problems as the PR depends on other changes.

Without leaking any information about release dates, the next patch release 7.9.2 and the next minor 7.10 are not far away and there is not much time in-between them.

I therefore dropped the idea of a backport.

hendrikmuhs pushed a commit that referenced this issue Sep 17, 2020
…62524)

the outcome and we have no query counterpart. Other optimizations for other group_by's are not
affected.

relates #57332
backport #60724
@hendrikmuhs
Copy link
Author

After revisiting the backport again, I manually backported the important bits to 7.9.2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants