DM-38279: Do all spawning work in start #14
Conversation
The original idea of having start return immediately once the lab start has been initiated doesn't work with JupyterHub's assumptions about spawners. Its timeouts and error handling expect all of the work to happen in the start method, and progress must not raise exceptions, or JupyterHub reports uncaught exceptions and breaks its UI and API.

Follow the design of KubeSpawner: have the start method hold a copy of its task and do all the progress monitoring and event creation. The progress method then just waits for that task to complete, reporting all the events it generates as an iterator.

Do a bit of refactoring to move spawn events to a separate class, and add the severity to the start of the message, similar to what KubeSpawner does.
Add tests to the controller mock to ensure that the spawner sends the correct tokens to the different routes.
This temporarily removes the wrapper around httpx errors because I was hoping the stringification of the httpx exception would be enough. It wasn't, so that comes back in an upcoming PR.
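To make the design concrete, here is a minimal sketch of that split, assuming JupyterHub's usual `start`/`progress` spawner interface. The `SketchSpawner`, `Event`, and `_start_future` names are illustrative, not the real RSPRestSpawner code, though `_start` and `_events` do appear in the diff below.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Event:
    """Illustrative stand-in for the spawn event class added in this PR."""

    progress: int
    message: str


class SketchSpawner:
    """Not the real RSPRestSpawner; just the shape of the start/progress split."""

    def __init__(self) -> None:
        self._events: list[Event] = []
        self._start_future: asyncio.Task | None = None

    async def start(self) -> str:
        # All spawning work happens here, inside JupyterHub's start timeout.
        # Keep a reference to the running task so progress can tell when
        # spawning has finished.
        self._events = []
        self._start_future = asyncio.current_task()
        return await self._start()

    async def _start(self) -> str:
        # The real method would call the lab controller and watch its event
        # stream; this just records one event and returns a URL.
        self._events.append(Event(progress=50, message="[info] Lab requested"))
        await asyncio.sleep(0)
        return "http://lab.example.com/"

    async def progress(self) -> AsyncIterator[dict[str, int | str]]:
        # Report events as they accumulate and stop once start is done.
        # This must never raise, or JupyterHub breaks its UI and API.
        seen = 0
        while True:
            for event in self._events[seen:]:
                yield {"progress": event.progress, "message": event.message}
            seen = len(self._events)
            if self._start_future is None or self._start_future.done():
                return
            await asyncio.sleep(1)
```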
Test failures during spawn, delays in returning the progress messages, and multiple event watchers.
Be kind to our future selves and return more details in unusual spawn situations via the event stream. Every bit of debugging information we can see is helpful. Separate the severity from the message for spawn events for clearer construction. Flesh out the docstrings and document which exceptions may be raised.
Some of the comments in the progress method were out of date or incomplete. Flesh them out.
The actual work is done in `_start`. This is a tiny wrapper to do
bookkeeping on the event stream and record the running task so that
`progress` can notice when the task is complete and return.
Ah, ok, taken from KubeSpawner. I'd really hoped we could leave more of the KubeSpawner grottiness behind with the RESTSpawner but alas.
except Exception:
    # We see no end of problems caused by stranded half-created pods,
    # so whenever anything goes wrong, try to delete anything we may
    # have left behind before raising the fatal exception.
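The same pattern in isolation, as a hedged sketch: the `create` and `delete` callables stand in for whatever the spawner actually uses to create and delete labs, and are not names from this PR.

```python
import logging
from collections.abc import Awaitable, Callable

logger = logging.getLogger(__name__)


async def start_with_cleanup(
    create: Callable[[], Awaitable[str]],
    delete: Callable[[], Awaitable[None]],
) -> str:
    """Run lab creation, deleting any half-created lab if it fails."""
    try:
        return await create()
    except Exception:
        # Whatever went wrong, try to remove anything left behind before
        # re-raising, so a failed spawn doesn't strand a half-created pod.
        try:
            await delete()
        except Exception:
            logger.exception("Cleanup after failed spawn also failed")
        raise
```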
I like this design choice.
----------
spawner
    Another copy of the spawner (not used). It's not clear why
    JupyterHub passes this into this method.
I once convinced myself it was something to do with functools and partial application, but I wouldn't swear I was right.
tests/spawner_test.py
Outdated
from rsp_restspawner.spawner import LabStatus, RSPRestSpawner

from .support.controller import MockLabController


async def gather_progress(
I find the overloading of "gather" a little confusing (yes, even though this is only ever used in a gather() call). I would have used collect_progress, I think.
Good idea. Done.
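For readers outside the diff, a guess at what the renamed helper might look like; the real test helper's body isn't shown here, so its signature and return type are assumptions.

```python
from rsp_restspawner.spawner import RSPRestSpawner


async def collect_progress(spawner: RSPRestSpawner) -> list[dict[str, object]]:
    """Drain the spawner's progress iterator into a list for assertions."""
    return [event async for event in spawner.progress()]
```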
            self._events.append(event)
            return await self._get_internal_url()
        else:
            r.raise_for_status()
I always forget raise_for_status() exists.
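For reference, raise_for_status() is the stock httpx way to turn any 4xx or 5xx response into an httpx.HTTPStatusError. The URL and wrapper function below are made up for illustration.

```python
import httpx


async def fetch_lab_status(url: str) -> dict[str, object]:
    async with httpx.AsyncClient() as client:
        r = await client.get(url)
        # Raises httpx.HTTPStatusError for any 4xx or 5xx response.
        r.raise_for_status()
        return r.json()
```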
@@ -35,6 +39,63 @@ class LabStatus(str, Enum):
    FAILED = "failed"


@dataclass(frozen=True, slots=True)
Much better.
            return cls(
                progress=90, message=sse.data, severity="info", complete=True
            )
        elif sse.event == "info":
I would probably have done something silly with functools.partial and then updated the instance object thus created by changing progress or severity if needed, and adding complete/failed if needed, but this is clearer.
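A hedged reconstruction of the event class being discussed, pieced together from the visible fragments: the field names (progress, message, severity, complete) come from the diff, but the class name, the SSEMessage stand-in, the failed field, and the from_sse and to_dict helpers are assumptions rather than the code as merged.

```python
from dataclasses import dataclass


@dataclass
class SSEMessage:
    """Stand-in for a server-sent event, with only the fields used here."""

    event: str
    data: str


@dataclass(frozen=True, slots=True)
class SpawnEvent:
    """One spawn progress event, with severity kept separate from the message."""

    progress: int
    message: str
    severity: str
    complete: bool = False
    failed: bool = False

    @classmethod
    def from_sse(cls, sse: SSEMessage, progress: int) -> "SpawnEvent":
        """Classify a lab controller event by its SSE event type."""
        if sse.event == "complete":
            return cls(
                progress=90, message=sse.data, severity="info", complete=True
            )
        elif sse.event == "info":
            return cls(progress=progress, message=sse.data, severity="info")
        else:
            return cls(progress=progress, message=sse.data, severity=sse.event)

    def to_dict(self) -> dict[str, int | str]:
        """Format for JupyterHub, prefixing the severity onto the message."""
        return {
            "progress": self.progress,
            "message": f"[{self.severity}] {self.message}",
        }
```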
Looks good.
Avoid confusion with asyncio.gather.