fix ts-scripts opensearch timeout #3363

Closed
godber opened this issue Jun 15, 2023 · 7 comments

godber commented Jun 15, 2023

About 1 in 10 runs of the opensearch container by ts-scripts service runner results in a 2min timeout error like this:

✔  success   elasticsearch@6.8.6 is running at http://10.1.0.100:49200/, took 30.65sec

✖  fatal     TSError: Opensearch service (***10.1.0.100:49210) timeout after 2min
    at Timeout.<anonymous> (/home/runner/work/elasticsearch-assets/elasticsearch-assets/node_modules/@terascope/utils/dist/src/promises.js:166:24)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Caused by: Opensearch service (***10.1.0.100:49210) timeout after 2min

error Command failed with exit code 1.

This is a retry-able error but a big nuisance.
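
For context, an error of this shape usually comes from a readiness loop that polls the service until it answers or a deadline passes. The following is a minimal sketch of that pattern, not the actual ts-scripts implementation; `waitForService`, `Probe`, and `WaitOptions` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical readiness probe: resolves true once the service answers.
type Probe = () => Promise<boolean>;

interface WaitOptions {
    timeoutMs: number; // overall deadline, e.g. 120000 for 2min
    intervalMs: number; // pause between probe attempts
}

// Polls `probe` until it returns true or the deadline passes.
// Resolves with the number of attempts that were needed.
async function waitForService(name: string, probe: Probe, opts: WaitOptions): Promise<number> {
    const start = Date.now();
    let attempts = 0;
    while (Date.now() - start < opts.timeoutMs) {
        attempts += 1;
        try {
            if (await probe()) return attempts;
        } catch {
            // Connection refused or an empty reply: treat as "not ready yet".
        }
        await new Promise((resolve) => setTimeout(resolve, opts.intervalMs));
    }
    throw new Error(`${name} service timeout after ${opts.timeoutMs}ms`);
}
```

With a loop like this, a container that starts but never serves HTTP is indistinguishable from a slow one until the full deadline has elapsed, which is why each failure costs the whole two minutes.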

godber commented Jun 15, 2023

This is reproducible locally. When it happens, the state of things is as follows:

The container starts:

docker ps -a
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS          PORTS                                                              NAMES
513876417efe   opensearchproject/opensearch:1.3.0   "./opensearch-docker…"   38 seconds ago   Up 37 seconds   9200/tcp, 9300/tcp, 9600/tcp, 9650/tcp, 0.0.0.0:49210->49210/tcp   ts_test_opensearch
1fd0e14e3882   blacktop/elasticsearch:6.8.6         "/elastic-entrypoint…"   38 seconds ago   Up 37 seconds   9200/tcp, 9300/tcp, 0.0.0.0:49200->49200/tcp                       ts_test_elasticsearch

There is no response on the HTTP endpoint:

curl http://admin:admin@192.168.1.198:49210
curl: (52) Empty reply from server

The logs on the docker container show a problem starting up:

docker logs -f 513876417efe
Disabling execution of install_demo_configuration.sh for OpenSearch Security Plugin
Disabling OpenSearch Security Plugin
Killing opensearch process 10
Killing performance analyzer process 11
Killing performance analyzer process 11
OpenSearch exited with code 143
Performance analyzer exited with code 143

The last two lines were printed only at shutdown.

godber commented Jun 16, 2023

Here's Peter seeing one of the issues I'm researching here:

jestjs/jest#9659 (comment)

Here's a link to an endless Jest bug on noise ... jestjs/jest#13576

godber changed the title from "ts-scripts opensearch timeout" to "fix ts-scripts opensearch timeout" on Jul 10, 2023

godber commented Jul 21, 2023

Here is a similar Opensearch issue:

opensearch-project/opensearch-build#2143

godber commented Jul 21, 2023

The following environment variable would disable the performance analyzer in Opensearch 2:

DISABLE_PERFORMANCE_ANALYZER_AGENT_CLI="true"

There's really no argument for keeping it on in CI.

godber commented Jul 21, 2023

It looks like in Opensearch 1.3.10 they have added the same env variable DISABLE_PERFORMANCE_ANALYZER_AGENT_CLI="true"

https://github.com/opensearch-project/opensearch-build/blob/opensearch-1.3.10/docker/release/config/opensearch/opensearch-docker-entrypoint.sh#L54-L66

We should probably switch to this version of Opensearch 1 and disable this ...
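
As a rough illustration of that suggestion, the sketch below builds `docker run` arguments for an Opensearch 1.3.10 container with the performance analyzer disabled. `buildOpensearchRunArgs` and its options type are hypothetical and are not taken from ts-scripts. The container name and port come from the `docker ps` output earlier in this issue; the `discovery.type`, `http.port`, and `DISABLE_SECURITY_PLUGIN` settings are assumptions about how the test container is configured.

```typescript
// Hypothetical options for illustration only.
interface OpensearchContainerOptions {
    version: string;
    port: number;
    name: string;
}

// Builds the argument list for `docker run`.
function buildOpensearchRunArgs(opts: OpensearchContainerOptions): string[] {
    return [
        'run', '--detach', '--rm',
        '--name', opts.name,
        '--publish', `${opts.port}:${opts.port}`,
        // Assumed settings: single-node cluster listening on the published port.
        '--env', 'discovery.type=single-node',
        '--env', `http.port=${opts.port}`,
        // Assumed: the logs above show the security plugin being disabled.
        '--env', 'DISABLE_SECURITY_PLUGIN=true',
        // The variable discussed in this issue, available in the 1.3.10 entrypoint.
        '--env', 'DISABLE_PERFORMANCE_ANALYZER_AGENT_CLI=true',
        `opensearchproject/opensearch:${opts.version}`,
    ];
}

const args = buildOpensearchRunArgs({ version: '1.3.10', port: 49210, name: 'ts_test_opensearch' });
console.log(['docker', ...args].join(' '));
```

Returning an array rather than a single string keeps each argument intact when it is handed to a process spawner, so no shell quoting is needed.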

sotojn commented Nov 2, 2023

> It looks like in Opensearch 1.3.10 they have added the same env variable DISABLE_PERFORMANCE_ANALYZER_AGENT_CLI="true"
>
> https://github.com/opensearch-project/opensearch-build/blob/opensearch-1.3.10/docker/release/config/opensearch/opensearch-docker-entrypoint.sh#L54-L66
>
> We should probably switch to this version of Opensearch 1 and disable this ...

I have made a pull request with this solution: #3453

godber pushed a commit that referenced this issue Nov 9, 2023
I have followed the instructions and guidance of issue #3363 to
hopefully fix the opensearch timeout problem.

- Updated the default opensearch version to be `v1.3.10`
> > This is to have access to an environment variable that can turn off
the performance analyzer.

- Set DISABLE_PERFORMANCE_ANALYZER_AGENT_CLI to 'true' for opensearch
container
> > This will ideally stop opensearch from randomly failing to start up

References to this solution
[here](#3363 (comment))

godber commented Nov 9, 2023

Hopefully the PR above resolved this problem. If anyone sees this 2m timeout on Opensearch containers again let us know.

godber closed this as completed on Nov 9, 2023