[BUG] opensearch_node container does not restart on error #2143

Closed
tuotempo opened this issue May 18, 2022 · 7 comments
Labels
bug Something isn't working docker

Comments

@tuotempo

Describe the bug
I am running an OpenSearch Docker container that sometimes exits with code 0 even though the logs show errors.

To Reproduce
I was not able to reproduce it systematically.

Expected behavior
If the container exits because of an error, it should restart automatically.

Plugins
discovery-ec2
opensearch-alerting
opensearch-anomaly-detection
opensearch-asynchronous-search
opensearch-cross-cluster-replication
opensearch-index-management
opensearch-job-scheduler
opensearch-knn
opensearch-observability
opensearch-performance-analyzer
opensearch-reports-scheduler
opensearch-security
opensearch-sql
prometheus-exporter
repository-s3

Screenshots

Host/Environment:

  • OS: Ubuntu
  • Version 18.04.4 LTS

Additional context
Here is the output of docker ps -a --no-trunc:

CONTAINER ID                                                       IMAGE                                           COMMAND                               CREATED        STATUS                         PORTS                 NAMES
9ac96c1a80544e9927bd570499ca69d3a7c44921973c682b670d6884409ff74a   opensearchproject/opensearch:1.2.3              "./opensearch-docker-entrypoint.sh"   3 months ago   Exited (0) 11 minutes ago                            opensearch_node

Looking at the log files, I see there was an out-of-memory error:

java.lang.OutOfMemoryError: Java heap space
[2022-05-09T12:06:33,484][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [host] fatal error in thread [opensearch[host][scheduler][T#1]], exiting
java.lang.OutOfMemoryError: Java heap space
Killing performance analyzer process 12
OpenSearch exited with code 127
Performance analyzer exited with code 143
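
For reference, the two codes in the last log lines have conventional shell meanings: 143 is 128 + 15, the status a shell reports for a child ended by SIGTERM, and 127 is what `wait` returns for a PID the shell does not know as one of its children. Whether this is exactly what happened inside the entrypoint script is an assumption; the sketch below only reproduces the two statuses in isolation, and the PID 999999 is an arbitrary placeholder.

```shell
# Status of a child ended by SIGTERM: 128 + 15 (SIGTERM) = 143.
sleep 30 &
child=$!
kill -TERM "$child"
term_status=0
wait "$child" || term_status=$?
echo "terminated child: $term_status"

# Status of waiting on a PID that is not a child of this shell: 127.
unknown_status=0
wait 999999 2>/dev/null || unknown_status=$?
echo "unknown child: $unknown_status"
```

Under that reading, neither number is an exit code chosen by OpenSearch itself, which would explain why the real failure is not visible in the container status.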

To handle these cases, I set the container's restart policy to on-failure, but since the exit code is 0, the container remains stopped.
I suspect the problem is in the terminateProcesses function, but I'm not 100% sure.
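
A general shell behaviour is consistent with this suspicion: a script exits with the status of the last command it runs, so a wrapper that ends with an `echo` reports 0 no matter how its child ended. The sketch below is not the actual entrypoint script; `fake_server`, `run_lossy`, and `run_propagating` are hypothetical names used only to illustrate the difference between losing and propagating a child's status.

```shell
# Hypothetical stand-in for a server process that ends with a fatal error.
fake_server() { return 3; }

# Wrapper that loses the child's status: its last command is `echo`,
# which succeeds, so the wrapper itself exits with 0.
run_lossy() (
    fake_server &
    wait "$!"
    echo "server exited with code $?"
)

# Wrapper that saves the child's status and exits with it, so a
# restart policy such as on-failure can see the failure.
run_propagating() (
    fake_server &
    wait "$!"
    status=$?
    echo "server exited with code $status"
    exit "$status"
)
```

Both wrappers print the same message; only the second one hands the non-zero status back to whatever started it.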

Thanks for your support.

@tuotempo tuotempo added bug Something isn't working untriaged Issues that have not yet been triaged labels May 18, 2022
@adnapibar
Contributor

Hey @CEHENKLE, can you please transfer this issue to the opensearch-build repo?

@CEHENKLE
Member

Done!

@CEHENKLE CEHENKLE transferred this issue from opensearch-project/OpenSearch May 24, 2022
@bbarani
Member

bbarani commented Jun 1, 2022

@tuotempo Thanks for submitting the issue. We have made many improvements to the Docker image since the 1.3.0 release. Can you try the latest OpenSearch images (1.3.2 / 2.0) and let us know if you are still facing this issue?

@bbarani bbarani added docker and removed untriaged Issues that have not yet been triaged labels Jun 1, 2022
@bbarani
Member

bbarani commented Jun 13, 2022

@tuotempo Are you able to validate the fix in the latest version (1.3.3) of the Docker images?

@bbarani
Member

bbarani commented Jul 18, 2022

Closing this issue for now. Please feel free to re-open if required.

@bbarani bbarani closed this as completed Jul 18, 2022
@tuotempo
Author

tuotempo commented Jul 26, 2022

Hello,
I have tested version 1.3.3 of the images, but the behavior is the same. Here is some additional info (if needed):

output of docker ps -a --no-trunc:
CONTAINER ID                                                       IMAGE                                COMMAND                                          CREATED         STATUS                     PORTS   NAMES
5460ce72093cadb1c49331f54c7cc7c0d041174f668aeda157efc199e6938fef   opensearchproject/opensearch:1.3.3   "./opensearch-docker-entrypoint.sh opensearch"   7 minutes ago   Exited (0) 6 minutes ago

container's logs:
[2022-07-26T13:31:22,818][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [lb-logs-test] fatal error in thread [main], exiting
java.lang.OutOfMemoryError: Java heap space
    at java.util.jar.JarFile.lambda$entries$0(JarFile.java:531) ~[?:?]
    at java.util.jar.JarFile$$Lambda$218/0x00000001001bb840.apply(Unknown Source) ~[?:?]
    at java.util.zip.ZipFile.getZipEntry(ZipFile.java:676) ~[?:?]
    at java.util.zip.ZipFile$ZipEntryIterator.next(ZipFile.java:531) ~[?:?]
    at java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:519) ~[?:?]
    at java.util.zip.ZipFile$ZipEntryIterator.nextElement(ZipFile.java:495) ~[?:?]
    at org.opensearch.bootstrap.JarHell.checkJarHell(JarHell.java:208) ~[opensearch-core-1.3.3.jar:1.3.3]
    at org.opensearch.plugins.PluginsService.checkBundleJarHell(PluginsService.java:676) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.plugins.PluginsService.loadBundles(PluginsService.java:528) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.plugins.PluginsService.<init>(PluginsService.java:193) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.node.Node.<init>(Node.java:396) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.node.Node.<init>(Node.java:319) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:242) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:412) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:178) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:169) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:100) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:138) ~[opensearch-cli-1.3.3.jar:1.3.3]
    at org.opensearch.cli.Command.main(Command.java:101) ~[opensearch-cli-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:135) ~[opensearch-1.3.3.jar:1.3.3]
    at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:101) ~[opensearch-1.3.3.jar:1.3.3]
Killing performance analyzer process 11
OpenSearch exited with code 127
Performance analyzer exited with code 143

To reproduce the issue, I created the container with the following Java options in Env:
"OPENSEARCH_JAVA_OPTS=-Xms8m -Xmx8m"

so that the java.lang.OutOfMemoryError: Java heap space is triggered almost immediately after the container starts.
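
The reproduction described above could look roughly like this with the plain docker CLI. This is a sketch under assumptions: the report does not say how the container was created, the container name `opensearch_oom_test` and the function names are hypothetical, and no other settings from the original environment are included.

```shell
# Hypothetical container name, used only for this sketch.
CONTAINER_NAME="opensearch_oom_test"

# Start a container whose heap is too small to finish booting,
# with the on-failure restart policy from the report.
start_oom_container() {
    docker run -d \
        --name "$CONTAINER_NAME" \
        --restart on-failure \
        -e "OPENSEARCH_JAVA_OPTS=-Xms8m -Xmx8m" \
        opensearchproject/opensearch:1.3.3
}

# Print the exit code Docker recorded and the number of restarts.
show_exit_state() {
    docker inspect \
        --format 'exit={{.State.ExitCode}} restarts={{.RestartCount}}' \
        "$CONTAINER_NAME"
}
```

Run `start_oom_container`, wait a few seconds, then run `show_exit_state`. If the behavior reported here is present, the recorded exit code would be 0 and the restart count would stay at 0.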

Regards.

@tuotempo
Author

tuotempo commented Aug 8, 2022

Hi @bbarani, can you please check our response? We are unable to reopen this issue ourselves to share the results of our tests.
