Storage_server json inhibits ert from plotting after error encountered #8452

Closed
andreas-el opened this issue Aug 13, 2024 · 3 comments · Fixed by #9110 · May be fixed by #8939

Comments

@andreas-el
Contributor

andreas-el commented Aug 13, 2024

Whenever Ert encounters an error during plotting, it can no longer plot storage content afterwards.
Ert just hangs.


To reproduce:
First run poly-example to have something to plot.

Then introduce an error, for example by adding raise ValueError() at the bottom of observations_for_key in plot_api.py:

                all_observations = pd.concat(
                    [all_observations, pd.DataFrame(data_struct)]
                )
        raise ValueError()
        return all_observations.T

Run ert and trigger the error by trying to plot.
Revert the code change and observe that you still cannot plot anything.
Ert will just hang.

I diffed the output and found this:

diff --color -Nau lol/poly_example/storage/storage_server.json test-data/poly_example/storage/storage_server.json
--- lol/poly_example/storage/storage_server.json	1970-01-01 01:00:00
+++ test-data/poly_example/storage/storage_server.json	2024-08-13 13:49:49
@@ -0,0 +1,8 @@
+{
+    "urls": [
+        "http://127.0.0.1:51834",
+        "http://AC-WR4GDXT41X:51834",
+        "http://AC-WR4GDXT41X:51834"
+    ],
+    "authtoken": "<token>"
+}
\ No newline at end of file

Removing that json file seems to unclog plotting again.

@sondreso
Collaborator

I believe what's happening here is that the first exception takes down ERT before it can clean up its files properly, leaving storage_server.json behind. The next time ERT starts, it tries to pick up the session from this file instead of starting a new one (the presence of the file indicates a running session that you can connect to), but that session isn't valid anymore.
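One way to guard against a leftover file like this is to check that something actually answers on the listed URLs before reusing the session. The following is only a minimal sketch of that idea, not ERT's implementation: the function names server_is_alive and usable_connection_info are hypothetical, and probing the URL with a plain HTTP GET is an assumption about what counts as "alive". The only thing taken from the issue is the "urls" key shown in the diff above.

```python
import json
import urllib.error
import urllib.request
from pathlib import Path
from typing import Optional


def server_is_alive(url: str, timeout: float = 1.0) -> bool:
    """Return True if an HTTP GET to `url` gets a 2xx response (hypothetical probe)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def usable_connection_info(path: Path) -> Optional[dict]:
    """Return the parsed file if any listed URL responds; otherwise delete it.

    Hypothetical helper: a file whose URLs are all dead is treated as stale.
    """
    if not path.exists():
        return None
    try:
        info = json.loads(path.read_text())
    except json.JSONDecodeError:
        info = None
    urls = info.get("urls", []) if isinstance(info, dict) else []
    if any(server_is_alive(u) for u in urls):
        return info
    path.unlink()  # stale: nothing is listening on any listed URL
    return None
```

With a check like this, a caller would start a fresh server whenever usable_connection_info returns None, rather than trusting the file's presence alone.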

@sondreso sondreso moved this to Todo in SCOUT Aug 26, 2024
@JHolba JHolba self-assigned this Sep 25, 2024
@JHolba
Contributor

JHolba commented Oct 11, 2024

The JSON file does not seem to be used any more. Removing all reads/checks of it does not impact our tests. I also tested the plotter with the snake oil case.

@xjules xjules moved this from Reviewed to In Progress in SCOUT Oct 11, 2024
@JHolba
Contributor

JHolba commented Oct 14, 2024

This issue seems to be caused by ert not closing base_services on Mac when ert goes down unexpectedly. Dark storage will continue running and responding to base_url/healthcheck, causing ert to think that it already has a running dark storage when starting up.
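The general pattern for avoiding an orphaned child service is to tie its lifetime to a try/finally block, so that it is stopped even when the parent raises. The sketch below illustrates that pattern with the standard library only; it is not how ert manages its services, and managed_service and demo are hypothetical names. It also assumes the parent exits by unwinding Python exceptions; a hard kill of the parent would skip the finally clause and still leave the child running.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_service(args):
    """Start a child process and stop it when the block exits (hypothetical helper)."""
    proc = subprocess.Popen(args)
    try:
        yield proc
    finally:
        # Runs on normal exit and when an exception propagates out of the block.
        proc.terminate()
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            # The child ignored the polite request; force it down.
            proc.kill()
            proc.wait()


def demo() -> bool:
    """Simulate an error inside the block and report whether the child was stopped."""
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    try:
        with managed_service(sleeper) as svc:
            raise ValueError("simulated plotting error")
    except ValueError:
        pass
    # poll() returns None only while the process is still running.
    return svc.poll() is not None
```

Calling demo() starts a long-running child, raises inside the block, and then checks that the child has exited, which is the behaviour the comment above describes as missing.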

@JHolba JHolba assigned jonathan-eq and unassigned JHolba Oct 14, 2024
@jonathan-eq jonathan-eq moved this from Todo to Ready for Review in SCOUT Oct 31, 2024
@jonathan-eq jonathan-eq moved this from Reviewed to In Progress in SCOUT Dec 2, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in SCOUT Dec 2, 2024