Skip to content
This repository has been archived by the owner on Jan 4, 2024. It is now read-only.

DM-38279: Do all spawning work in start #14

Merged
merged 6 commits into from
Mar 29, 2023
Merged

DM-38279: Do all spawning work in start #14

merged 6 commits into from
Mar 29, 2023

Conversation

rra
Copy link
Member

@rra rra commented Mar 29, 2023

The original idea to have start return immediately once the lab start has been initiated doesn't work with JupyterHub's assumptions about spawners. Its timeouts and error handling expect all of the work to happen in the start method, and progress must not raise exceptions or JupyterHub reports uncaught exceptions and breaks its UI and API.

Follow the design of KubeSpawner and have the start method hold a copy of its task and do all the progress monitoring and event creation. The progress method then just waits for that task to complete, reporting all the events it generates as an iterator.

Do a bit of refactoring to move spawn events to a separate class, and add severity to the start of the message similar to what KubeSpawner does.

rra added 2 commits March 29, 2023 09:03
The original idea to have start return immediately once the lab
start has been initiated doesn't work with JupyterHub's assumptions
about spawners. Its timeouts and error handling expect all of the
work to happen in the start method, and progress must not raise
exceptions or JupyterHub reports uncaught exceptions and breaks its
UI and API.

Follow the design of KubeSpawner and have the start method hold a
copy of its task and do all the progress monitoring and event
creation. The progress method then just waits for that task to
complete, reporting all the events it generates as an iterator.

Do a bit of refactoring to move spawn events to a separate class,
and add severity to the start of the message similar to what
KubeSpawner does.
Add tests to the controller mock to ensure that the spawner sends
the correct tokens to the different routes.
@rra rra requested a review from athornton March 29, 2023 16:04
@rra
Copy link
Member Author

rra commented Mar 29, 2023

This temporarily removes the wrapper around httpx errors because I was hoping the stringification of the httpx exception would be enough. It wasn't, so that comes back in an upcoming PR.

Test failures during spawn, delays in returning the progress
messages, and multiple event watchers.
@rra rra force-pushed the tickets/DM-38279 branch from dd3051c to 0702be4 Compare March 29, 2023 18:58
rra added 2 commits March 29, 2023 12:12
Be kind to our future selves and return more details in unusual
spawn situations via the event stream. Every bit of debugging
information we can see is helpful. Separate the severity from the
message for spawn events for clearer construction.

Flesh out the docstrings and document which exceptions may be
raised.
Some of the comments in the progress method were out of date or
incomplete. Flesh them out.
The actual work is done in `_start`. This is a tiny wrapper to do
bookkeeping on the event stream and record the running task so that
`progress` can notice when the task is complete and return.

Copy link
Member

@athornton athornton Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, taken from KubeSpawner. I'd really hoped we could leave more of the KubeSpawner grottiness behind with the RESTSpawner but alas.

except Exception:
# We see no end of problems caused by stranded half-created pods,
# so whenever anything goes wrong, try to delete anything we may
# have left behind before raising the fatal exception.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this design choice.

----------
spawner
Another copy of the spawner (not used). It's not clear why
JupyterHub passes this into this method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I once convinced myself it was something to do with functools and partial application, but I wouldn't swear I was right.

from rsp_restspawner.spawner import LabStatus, RSPRestSpawner

from .support.controller import MockLabController


async def gather_progress(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the overloading of "gather" a little confusing (yes, even though this is only ever used in a gather() call). I would have used collect_progress I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Done.

self._events.append(event)
return await self._get_internal_url()
else:
r.raise_for_status()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always forget raise_for_status() exists

@@ -35,6 +39,63 @@ class LabStatus(str, Enum):
FAILED = "failed"


@dataclass(frozen=True, slots=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better.

return cls(
progress=90, message=sse.data, severity="info", complete=True
)
elif sse.event == "info":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably have done something silly with functools.partial and then updated the instance object thus created by changing progress or severity if needed, and adding complete/failed if needed, but this is clearer.

Copy link
Member

@athornton athornton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Avoid confusion with asyncio.gather.
@rra rra merged commit 7b9a2dc into main Mar 29, 2023
@rra rra deleted the tickets/DM-38279 branch March 29, 2023 20:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants