data store: remove threading #574

oliver-sanders · 2024-03-28T14:14:48Z

Closes #194

Run subscribers via asyncio rather than the ThreadPoolExecutor.

I think the only non-blocking operations in the code being called were:

* time.sleep (has an async variant)
* asyncio.sleep (async)
* self.socket.recv_multipart (async)

If so, I don't think we need to use threading here, so I've refactored the code so that the underlying async functions can be called via asyncio.

This cuts out the need for the ThreadPoolExecutor, removing the workflow limit.
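
As a rough illustration of the idea (not the code in this PR), the sketch below runs one subscriber coroutine per workflow on a single event loop, with `asyncio.sleep` where a thread would have used `time.sleep`. The `subscriber` function, its arguments and the workflow IDs are hypothetical.

```python
import asyncio


async def subscriber(w_id: str, n_messages: int, received: list) -> None:
    """Hypothetical stand-in for a per-workflow subscription loop."""
    for i in range(n_messages):
        # asyncio.sleep yields to the event loop, so the other subscribers
        # keep running; time.sleep here would block all of them.
        await asyncio.sleep(0.01)
        received.append((w_id, i))


async def main() -> list:
    received: list = []
    # One task per workflow: there is no thread pool, so no pool size
    # to cap the number of workflows.
    tasks = [
        asyncio.create_task(subscriber(f'~user/flow{n}/run1', 3, received))
        for n in range(5)
    ]
    await asyncio.gather(*tasks)
    return received


received = asyncio.run(main())
```

All five subscribers make progress concurrently on one thread, because each one only ever waits at an `await`.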

Check List

I have read CONTRIBUTING.md and added my name as a Code Contributor.
Contains logically grouped changes (else tidy your branch by rebase).
Does not contain off-topic changes (use other PRs for other changes).
Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
Tests are included (or explain why tests are not needed).
CHANGES.md entry included if this is a change that can affect users
Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

hjoliver · 2024-04-02T22:15:20Z

Nice, hope we can get this working 🚀

oliver-sanders · 2024-04-03T09:09:44Z

Ping @dwsutherland, I may well have overlooked something, does this make sense to you?

(note, tests are unhappy for async reasons, but it should work for real usage)

dwsutherland · 2024-04-05T03:52:53Z

> Ping @dwsutherland, I may well have overlooked something, does this make sense to you?
>
> (note, tests are unhappy for async reasons, but it should work for real usage)

Will have a look. I think I wanted to do this when I was first building it, but didn't manage to figure out how to have non-blocking subscription receive loops or something. Can't remember exactly why (it wasn't as simple as the sleeps), but well done if you've figured it out 👏
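
For reference, a non-blocking receive loop can be sketched as below. This is a minimal sketch, not the UI Server code: `FakeSocket` is a hypothetical stand-in for an async socket whose `recv_multipart` is awaitable (as on a `zmq.asyncio` socket), and `receive_loop` and the message contents are made up for illustration.

```python
import asyncio


class FakeSocket:
    """Hypothetical stand-in for an async socket with recv_multipart."""

    def __init__(self, frames):
        self._queue = asyncio.Queue()
        for frame in frames:
            self._queue.put_nowait(frame)

    async def recv_multipart(self):
        # Awaiting suspends only this coroutine until a message arrives.
        return await self._queue.get()


async def receive_loop(socket, handle, stop: asyncio.Event) -> None:
    while not stop.is_set():
        try:
            msg = await asyncio.wait_for(socket.recv_multipart(), timeout=0.05)
        except asyncio.TimeoutError:
            continue  # no message yet; go round and re-check the stop flag
        handle(msg)


async def main() -> list:
    socket = FakeSocket([[b'topic', b'delta-1'], [b'topic', b'delta-2']])
    stop = asyncio.Event()
    seen: list = []
    task = asyncio.create_task(receive_loop(socket, seen.append, stop))
    await asyncio.sleep(0.1)  # other work carries on while the loop waits
    stop.set()
    await task
    return seen


seen = asyncio.run(main())
```

The timeout is only there so the loop notices the stop flag; cancelling the task would be another way to end it.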

dwsutherland · 2024-04-09T09:43:24Z

The changes look quite simple. I had a play and noticed something I couldn't reproduce on master.
When stopping a workflow in the WUI (latest master cylc-ui):

[I 2024-04-09 21:37:20.380 CylcUIServer] $ cylc play --color=never --mode live five/run1
[I 2024-04-09 21:37:20.901 CylcHubApp log:191] 200 POST /user/sutherlander/cylc/graphql (sutherlander@::ffff:127.0.0.1) 530.46ms
[I 2024-04-09 21:37:20.906 CylcUIServer] [data-store] connect_workflow('~sutherlander/five/run1', <dict>)
[I 2024-04-09 21:37:30.551 CylcHubApp log:191] 200 POST /user/sutherlander/cylc/graphql (sutherlander@::ffff:127.0.0.1) 99.47ms
[I 2024-04-09 21:37:30.943 CylcUIServer] [data-store] disconnect_workflow('~sutherlander/holder/run1')
21:37:31.535 [ConfigProxy] error: 503 GET /user/sutherlander/cylc/subscriptions connect ECONNREFUSED 127.0.0.1:33077
[I 2024-04-09T21:37:31.541 JupyterHub log:191] 200 GET /hub/error/503?url=%2Fuser%2Fsutherlander%2Fcylc%2Fsubscriptions (@127.0.0.1) 5.35ms
21:37:31.894 [ConfigProxy] error: 503 GET /user/sutherlander/cylc/subscriptions connect ECONNREFUSED 127.0.0.1:33077
.
.
.
[I 2024-04-09T21:37:34.955 JupyterHub log:191] 200 GET /hub/error/503?url=%2Fuser%2Fsutherlander%2Fcylc%2Fsubscriptions (@127.0.0.1) 1.33ms
21:37:37.405 [ConfigProxy] error: 503 GET /user/sutherlander/cylc/subscriptions connect ECONNREFUSED 127.0.0.1:33077
[I 2024-04-09T21:37:37.407 JupyterHub log:191] 200 GET /hub/error/503?url=%2Fuser%2Fsutherlander%2Fcylc%2Fsubscriptions (@127.0.0.1) 1.35ms
[W 2024-04-09T21:37:39.146 JupyterHub base:1154] User sutherlander server stopped, with exit code: 0
[I 2024-04-09T21:37:39.146 JupyterHub proxy:356] Removing user sutherlander from proxy (/user/sutherlander/)

Not sure why; it could be something unrelated. Maybe this branch needs rebasing.

Looks like it might be trying to reconnect to the disconnected workflow?

dwsutherland · 2024-04-09T12:12:25Z

Ah, I think I might know why: the subscriber resolver (cylc-flow) might use that `w_subs` variable, which you changed to `subscribers` (`workflow_subscribers` would have been a better name, I think).

oliver-sanders · 2024-04-22T15:19:10Z

cylc/uiserver/data_store_mgr.py

+        except WorkflowStopped:
+            self.disconnect_workflow(w_id)
+        except Exception as exc:
+            self.log.error(f'Failed to connect to {w_id}: {exc}')
            self.disconnect_workflow(w_id)


Moved error handling up a level from the `_start_subscription` and `_entire_workflow_update` methods; there are no more threads to interfere with error reporting.
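
A minimal sketch of why this works without threads: an exception raised inside an awaited coroutine propagates straight to the caller, so one `try`/`except` in the caller covers it. The `WorkflowStopped` class, `start_subscription`, and the `log` and `disconnected` lists are hypothetical stand-ins, not the real methods.

```python
import asyncio


class WorkflowStopped(Exception):
    """Hypothetical stand-in for the exception named in the diff."""


async def start_subscription(w_id: str) -> None:
    # Hypothetical coroutine: it raises rather than handling errors itself.
    await asyncio.sleep(0)
    if w_id.endswith('stopped'):
        raise WorkflowStopped(w_id)
    if w_id.endswith('broken'):
        raise RuntimeError('connection refused')


async def connect_workflow(w_id: str, log: list, disconnected: list) -> None:
    # All error handling lives here, one level above the subscription.
    try:
        await start_subscription(w_id)
    except WorkflowStopped:
        disconnected.append(w_id)
    except Exception as exc:
        log.append(f'Failed to connect to {w_id}: {exc}')
        disconnected.append(w_id)


async def main():
    log, disconnected = [], []
    for w_id in ('a/ok', 'b/stopped', 'c/broken'):
        await connect_workflow(w_id, log, disconnected)
    return log, disconnected


log, disconnected = asyncio.run(main())
```

With a thread pool the exception would instead be held on the future until something asked for its result, which is where error reporting could get lost.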

* Closes cylc#194
* Run subscribers via asyncio rather than the ThreadPoolExecutor.
* The only non-blocking operations in the code being called were:
  - time.sleep (has an async variant)
  - asyncio.sleep (async)
  - self.socket.recv_multipart (async)
* Refactored the code so that the underlying async functions could be called via asyncio.

* Move error handling up a level into `connect_workflow` from the `_start_subscription` and `_entire_workflow_update` methods.
* Simplify tests (all exceptions are now caught in the same way).
* Remove the multi-workflow handling ability of `_entire_workflow_update`; this is unused and can now be achieved more easily via `asyncio.gather` as the threading has been removed.
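
The `asyncio.gather` route mentioned in the last point might look roughly like this. It is a sketch under assumptions: `entire_workflow_update` here is a hypothetical single-workflow coroutine, not the real method, and the workflow IDs are invented.

```python
import asyncio


async def entire_workflow_update(w_id: str) -> str:
    """Hypothetical single-workflow update; returns the ID it refreshed."""
    await asyncio.sleep(0.01)  # stands in for an async request to the workflow
    if w_id == 'bad':
        raise ConnectionError(f'cannot reach {w_id}')
    return w_id


async def update_all(w_ids: list) -> dict:
    # return_exceptions=True collects failures as results instead of
    # raising the first one to the caller.
    results = await asyncio.gather(
        *(entire_workflow_update(w_id) for w_id in w_ids),
        return_exceptions=True,
    )
    # gather preserves input order, so results line up with w_ids.
    return dict(zip(w_ids, results))


results = asyncio.run(update_all(['one', 'bad', 'two']))
```

The single-workflow coroutine stays simple, and the caller decides how many workflows to update together.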

dwsutherland

I'm still getting the same thing happening when stopping a workflow (from UI or CLI):

When using hubless, the UIS just disconnects and the process stops:

.
.
.
[I 2024-05-06 19:14:36.301 CylcUIServer] $ cylc play --color=never --mode live five/run1
[I 2024-05-06 19:14:36.834 CylcUIServer] [data-store] connect_workflow('~sutherlander/five/run1', <dict>)
[I 2024-05-06 19:14:49.404 CylcUIServer] [data-store] disconnect_workflow('~sutherlander/five/run1')
(uiserver) sutherlander@cortex-vbox:cylc-uiserver$

So it appears the UIS stops or crashes when a workflow is stopped/disconnected.

dwsutherland · 2024-06-10T06:02:09Z

Does it make sense to deal with this one:
#584
here, at the same time?

oliver-sanders · 2024-06-11T08:43:09Z

We can move away from the Tornado async interfaces in due course (these are already translated into asyncio calls by the tornado library). Jupyter Server hasn't moved over yet, so we may need to wait a while before changing ourselves.

oliver-sanders marked this pull request as draft March 28, 2024 14:16

oliver-sanders mentioned this pull request Mar 28, 2024

Revisit use of multiple subscription sockets/threads #194

Open

oliver-sanders force-pushed the 194 branch from 2a0a092 to df0f1b6 Compare April 22, 2024 15:17

oliver-sanders added 2 commits April 22, 2024 16:23

oliver-sanders force-pushed the 194 branch from 0d893d1 to 583b71d Compare April 22, 2024 15:24

oliver-sanders marked this pull request as ready for review April 22, 2024 15:24

oliver-sanders requested a review from dwsutherland April 22, 2024 15:24

oliver-sanders self-assigned this Apr 22, 2024

oliver-sanders added this to the 1.5.0 milestone Apr 22, 2024

changelog

9fec5aa

oliver-sanders requested a review from wxtim April 23, 2024 11:31

wxtim reviewed Apr 24, 2024

wxtim approved these changes Apr 24, 2024

dwsutherland requested changes May 6, 2024

oliver-sanders marked this pull request as draft May 10, 2024 10:03

oliver-sanders modified the milestones: 1.5.0, 1.6.0 May 10, 2024

oliver-sanders modified the milestones: 1.6.0, 1.7.0 Dec 3, 2024
