Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bubble-up exceptions from scheduler #38441

Merged

Conversation

henningandersen
Copy link
Contributor

@henningandersen henningandersen commented Feb 5, 2019

Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.

Fixed NPE in GlobalCheckPointListners that caused CCR tests
(IndexFollowingIT and likely others) to fail.

This is a continuation of #38014

Backports #38317

Instead of logging warnings we now rethrow exceptions thrown inside
scheduled/submitted tasks. This will still log them as warnings in
production but has the added benefit that if they are thrown during
unit/integration test runs, the test will be flagged as an error.

Fixed NPE in GlobalCheckPointListners that caused CCR tests
(IndexFollowingIT and likely others) to fail.

This is a continuation of elastic#38014
@henningandersen henningandersen added >bug :Core/Infra/Core Core issues without another label v6.7.0 labels Feb 5, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@henningandersen henningandersen merged commit d0ac828 into elastic:6.x Feb 5, 2019
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 6, 2019
* 6.x: (25 commits)
  Backport of types removal for Put/Get index templates (elastic#38465)
  Add support for API keys to access Elasticsearch (elastic#38291) (elastic#38399)
  Deprecate support for internal versioning for concurrency control (elastic#38451)
  Deprecate types in rollover index API (elastic#38389) (elastic#38458)
  Add typless client side GetIndexRequest calls and response class (elastic#38422)
  [ML] Report index unavailable instead of waiting for lazy node (elastic#38444)
  await fix CurtIT#testIndex until elastic#38451 is merged (elastic#38466)
  Update ilm-api.asciidoc, point to REMOVE policy (elastic#38235) (elastic#38464)
  SQL: Fix esType for DATETIME/DATE and INTERVALS (elastic#38179)
  Clean up duplicate follow config parameter code (elastic#37688) (elastic#38443)
  Deprecation check for No Master Block setting (elastic#38383)
  Bubble-up exceptions from scheduler (elastic#38441)
  Lift retention lease expiration to index shard (elastic#38391)
  Deprecate maxRetryTimeout in RestClient and increase default value (elastic#38425)
  Update Rollup Caps to allow unknown fields (elastic#38446)
  Backport of elastic#38411: `if_seq_no` and `if_primary_term` parameters aren't wired correctly in REST Client's CRUD API
  Support unknown fields in ingest pipeline map configuration (elastic#38429)
  SQL: Implement CURRENT_DATE (elastic#38175)
  Backport changes to the release notes script. (elastic#38346)
  Fix ILM explain response to allow unknown fields (elastic#38363)
  ...
@dansimpson
Copy link

dansimpson commented Jun 6, 2019

@henningandersen I recently ran into an issue related to this change. I have worked around it, but I figure it's worth mentioning.

I have a plugin which performs periodic tasks. I was using the threadPool.scheduler().scheduleAtFixedRate(...) call to execute said periodic tasks. Because the executor wraps the runnable in a ScheduledFuture, the afterExecute call would block, locking up the scheduler. Digging deeper, the EsExecutors.rethrowErrors checks if the runnable is a RunnableFuture, which it is, then performs a 'get', but the ScheduledFuture will block there.

Anyways, the scheduleAtFixedRate method will lock up the threadPool.scheduler() instance on the first execution.

henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Jun 7, 2019
Though not in use in elasticsearch currently, it seems surprising that
ThreadPool.scheduler().scheduleAtFixedRate would hang. A recurring
scheduled task is never completed (except on failure) and we test for
exceptions using RunnableFuture.get(), which hangs for periodic tasks.
Fixed by checking that task is done before calling .get().

Related to elastic#38441
@henningandersen
Copy link
Contributor Author

@dansimpson, thanks for reporting this issue. It should be fixed by #42993 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport >bug :Core/Infra/Core Core issues without another label v6.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants