[CI] GeoIpDownloaderIT testStartWithNoDatabases failing #79074
Comments
Pinging @elastic/es-data-management (Team:Data Management)
Can we mute this please? It fails one of my PRs with a pretty high frequency.
Muted with 896bb85. Sorry about that. I hadn't seen any other failures, and it didn't reproduce locally.
Sorry about the noise. I recently added this test in a change that went only in master. I will fix this.
Might this other test failure in GeoIpDownloaderIT be related? https://gradle-enterprise.elastic.co/s/neooukkmmjfdo
@danhermann I think so. Can you mute this test?
Yes, muting in #79089
thanks!
I was able to reproduce this failure. Another test adds test databases to the config directory of all nodes. The test cluster is reused, but these databases were not cleaned up by that test, which broke the assumption of
In the case a database couldn't be loaded, the geoip processor factory checks whether any databases are available and then returns a processor implementation that tags documents with the fact that the required database wasn't available. The GeoIpProcessor itself also loads the database, but if a database can't be loaded then it always fails with a resource missing exception. The GeoIpProcessor is modified in this change to also check whether any database is available and in that case tag documents instead of failing.
GeoIpDownloaderIT improvements:
* The `testUseGeoIpProcessorWithDownloadedDBs()` test was adding databases to config dirs, but not cleaning them up. This broke assumptions in other tests in this suite, because the test cluster is reused.
* Use the geoip stats api after each test to wait for a clean state, which means waiting for the database downloader to be disabled and all database files to be removed on all ingest nodes.
* Don't use `IngestDocument#getFieldValue(...)` in test code surrounded by `assertBusy(...)`. If a field isn't there an illegal state exception is thrown, which isn't caught by `assertBusy(...)`. Only assertion errors are handled.
Closes elastic#79074
…g. (#79131) In the case a database couldn't be loaded, the geoip processor factory checks whether any databases are available and then returns a processor implementation that tags documents with the fact that the required database wasn't available. The GeoIpProcessor itself also loads the database, but if a database can't be loaded then it always fails with a resource missing exception. The GeoIpProcessor is modified in this change to also check whether any database is available and in that case tag documents instead of failing.
GeoIpDownloaderIT improvements:
* The `testUseGeoIpProcessorWithDownloadedDBs()` test was adding databases to config dirs, but not cleaning them up. This broke assumptions in other tests in this suite, because the test cluster is reused.
* Use the geoip stats api after each test to wait for a clean state, which means waiting for the database downloader to be disabled and all database files to be removed on all ingest nodes.
* Don't use `IngestDocument#getFieldValue(...)` in test code surrounded by `assertBusy(...)`. If a field isn't there an illegal state exception is thrown, which isn't caught by `assertBusy(...)`. Only assertion errors are handled.
Closes #79074
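To illustrate the last point of the commit message above, here is a minimal sketch of a retry helper that, like the behavior described for `assertBusy(...)`, retries only on `AssertionError`. This is not the Elasticsearch test framework code: `BusyAssertSketch`, `retryOnAssertionError`, and the two demo methods are hypothetical names, and a plain `Map` stands in for an ingest document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for a retrying assertion helper (hypothetical, for illustration only).
final class BusyAssertSketch {

    // Runs `check` until it passes or attempts run out.
    // Only AssertionError triggers a retry; any other exception propagates immediately.
    static void retryOnAssertionError(Runnable check, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        AssertionError last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                check.run();
                return; // check passed
            } catch (AssertionError e) {
                last = e; // remember the failure and try again
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
            }
        }
        throw last;
    }

    // A missing field reported as an AssertionError is retried until the field shows up.
    static int demoRetried() {
        AtomicInteger calls = new AtomicInteger();
        Map<String, Object> doc = new HashMap<>();
        retryOnAssertionError(() -> {
            if (calls.incrementAndGet() == 3) {
                doc.put("geoip", "present"); // the field appears on the third attempt
            }
            if (!doc.containsKey("geoip")) {
                throw new AssertionError("field not there yet");
            }
        }, 10, 1);
        return calls.get();
    }

    // A missing field reported as an IllegalStateException escapes on the first attempt.
    static int demoNotRetried() {
        AtomicInteger calls = new AtomicInteger();
        try {
            retryOnAssertionError(() -> {
                calls.incrementAndGet();
                throw new IllegalStateException("field missing");
            }, 10, 1);
        } catch (IllegalStateException expected) {
            // propagated without any retry
        }
        return calls.get();
    }
}
```

In test code this suggests checking for the field's presence with an assertion first, so that a not-yet-populated document leads to a retry rather than an immediate test failure.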
I'm not sure if the test is fixed, I've just got a failure in https://gradle-enterprise.elastic.co/s/6fisos7fmdfc4
I'm reopening this since it seems like the issue still persists
I'm looking into this new failure. (I was pretty sure the test was fixed... I guess I was wrong!)
If a builtin database can't be loaded then assume it will be available soon via the geoip database downloading mechanism. So instead of returning a config error, return a geoip processor impl that tags documents with the fact that a builtin db isn't yet available. Relates to elastic#79074
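The tag-instead-of-fail behavior described in these commit messages could look roughly like the sketch below. It is not the actual `GeoIpProcessor` code: the class, the `process` method, the `tags` field handling, and the tag prefix are all assumptions made for illustration, and a plain `Map` stands in for an ingest document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "tag the document instead of failing" when a database is missing.
final class TaggingFallbackSketch {

    // Assumed tag format, chosen for illustration only.
    static final String TAG_PREFIX = "_geoip_database_unavailable_";

    // Returns a copy of `doc`, either enriched (database loaded) or tagged (database missing).
    static Map<String, Object> process(Map<String, Object> doc, String databaseFile, Set<String> loadedDatabases) {
        Map<String, Object> result = new HashMap<>(doc);
        if (loadedDatabases.contains(databaseFile)) {
            // Placeholder for the real lookup; only records which database was used.
            result.put("geoip", Map.of("database", databaseFile));
        } else {
            // Database not (yet) available: keep existing tags and append a marker tag.
            List<String> tags = new ArrayList<>();
            Object existing = doc.get("tags");
            if (existing instanceof List<?>) {
                for (Object tag : (List<?>) existing) {
                    tags.add(String.valueOf(tag));
                }
            }
            tags.add(TAG_PREFIX + databaseFile);
            result.put("tags", tags);
        }
        return result;
    }
}
```

The point of this design is that documents keep flowing while the downloader is still fetching databases, and the tag makes it possible to find and reprocess the affected documents later.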
* Try to clean config databases after each test (instead of 1).
* Improve assertions in testUseGeoIpProcessorWithDownloadedDBs()
* Wait for managed databases to be deleted in testUseGeoIpProcessorWithDownloadedDBs().
Relates to #79074
I had a failure of
I'm unable to reproduce this failure locally. I'm going to re-enable this test with geoip debug logging enabled. When it fails there should be more data to investigate why the pipeline with geoip processor isn't reloaded after the databases are available. If this test fails without me noticing then please mute this test and share the Gradle link.
Failed for me on a PR build here: https://gradle-enterprise.elastic.co/s/miosturet4dp4
I looked into this failure and in this instance the simulate pipeline api was repeatedly executed on a coordinating only node. The databases are only loaded on nodes with the ingest role, so the geoip lookup never happens. The simulate pipeline api always executes on the first node it hits. I think we should change that (redirect to a node with the ingest role instead, or wait until we drop the ingest role...). For now, I will adjust the test to not hit a coordinating only node. Note that this isn't an issue for the bulk/index api (if an ingest pipeline is used then these APIs always redirect to a node with an ingest role).
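The test adjustment mentioned above amounts to picking a node that has the ingest role rather than whichever node the request happens to hit first. A minimal sketch of that selection follows. `Node`, its fields, and `pickIngestNode` are hypothetical and are not Elasticsearch classes, and the node names are made up.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: choose a node with the ingest role for a simulate request.
final class IngestNodeSelectionSketch {

    // Minimal stand-in for a cluster node: a name plus its roles.
    static final class Node {
        final String name;
        final Set<String> roles;

        Node(String name, Set<String> roles) {
            this.name = name;
            this.roles = roles;
        }
    }

    // Returns the first node with the "ingest" role, skipping coordinating only nodes.
    static Optional<Node> pickIngestNode(List<Node> nodes) {
        return nodes.stream()
            .filter(node -> node.roles.contains("ingest"))
            .findFirst();
    }

    public static void main(String[] args) {
        List<Node> cluster = List.of(
            new Node("node-0", Set.of()),                 // coordinating only: no roles
            new Node("node-1", Set.of("data", "ingest"))  // databases are loaded here
        );
        System.out.println(pickIngestNode(cluster).map(node -> node.name).orElse("none"));
    }
}
```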
Muting this test now. My latest tweak to the test apparently made things worse.
The test hasn't failed since 4db5fc5 was pushed. Closing it for now. If this test fails again then this issue can be re-opened or a new issue can be opened.
Build scan:
https://gradle-enterprise.elastic.co/s/uhhnoantetlfq/tests/:modules:ingest-geoip:internalClusterTest/org.elasticsearch.ingest.geoip.GeoIpDownloaderIT/testStartWithNoDatabases
Reproduction line:
./gradlew ':modules:ingest-geoip:internalClusterTest' --tests "org.elasticsearch.ingest.geoip.GeoIpDownloaderIT.testStartWithNoDatabases" -Dtests.seed=976EEDC6532D0E7E -Dtests.locale=es-HN -Dtests.timezone=America/Thunder_Bay -Druntime.java=11
Applicable branches:
master
Reproduces locally?:
No
Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.ingest.geoip.GeoIpDownloaderIT&tests.test=testStartWithNoDatabases
Failure excerpt: