[ML] CI ForecastIT::testOverflowToDisk can fail due to index timing issues #31173
Comments
Pinging @elastic/ml-core
Another alternative would be to call the |
In fact, since forecasting requires admin rights, the UI could use the same solution: if the job will not be closed after a forecast completes because it was already open, flush it instead.
👍 That's actually better as this is a regression test which should do 2 forecasts on the same running job/process, |
flush ml job to ensure all results have been written fixes #31173
The failure has not been seen on the official CI yet, but on a private one:
This is caused by an internal timing problem:
A document that marks the end of the forecast is indexed last, but because of near real-time behavior, sharding and concurrency, it can happen that not all forecast data points are written (meaning searchable) when the status document is written.
The issue was introduced in #30969. Previously the test closed the job before running its assertions, which triggers an index refresh; after that change, closing the job was moved to the end of the test. The test fix is therefore as simple as closing the job before the assertions again and reopening it for the second test.
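To illustrate the timing problem described above, here is a minimal toy model in Python. It is only a sketch and does not use the Elasticsearch API: `Shard`, `ToyIndex` and the document types `data_point` and `status` are hypothetical names. It assumes each shard makes buffered writes searchable only on its own refresh, and it models closing (or flushing) the job as a refresh of all shards.

```python
class Shard:
    """Toy shard: writes are buffered and become searchable only after refresh()."""

    def __init__(self):
        self._buffer = []
        self._searchable = []

    def index(self, doc):
        # The write is accepted but not yet visible to searches.
        self._buffer.append(doc)

    def refresh(self):
        # Make all buffered writes visible to searches.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, doc_type):
        return [d for d in self._searchable if d["type"] == doc_type]


class ToyIndex:
    """Toy index made of independently refreshed shards (an assumption of this sketch)."""

    def __init__(self, shard_count=2):
        self.shards = [Shard() for _ in range(shard_count)]

    def index(self, doc, shard):
        self.shards[shard].index(doc)

    def refresh(self):
        for shard in self.shards:
            shard.refresh()

    def search(self, doc_type):
        return [d for shard in self.shards for d in shard.search(doc_type)]


index = ToyIndex()

# Forecast data points land on shard 0; the status document is indexed last, on shard 1.
for value in range(5):
    index.index({"type": "data_point", "value": value}, shard=0)
index.index({"type": "status", "state": "finished"}, shard=1)

# Only shard 1 has refreshed so far: the status document is searchable,
# but the data points are not, so assertions run at this moment would fail.
index.shards[1].refresh()
status_visible = len(index.search("status")) == 1
points_before = len(index.search("data_point"))

# Modelled effect of closing or flushing the job: every shard is refreshed.
index.refresh()
points_after = len(index.search("data_point"))
```

In this model `status_visible` is true while `points_before` is still 0, which mirrors the failing test; only after the full refresh does `points_after` reach 5.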
Besides the test issue, this is still a problem. The UI only closes the job if it was closed before the forecast was run. Open jobs are kept open and can potentially hit the problem, although this is rather unlikely because the UI side is slowed down by network round trips. Furthermore, aggregations are used for charting, so missing a couple of data points is unlikely to produce a visual artifact.