-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add manager for tracking and terminating running server proxies #395
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mahendrapaipuri this looks great to me overall!!! ❤️ 🎉 🌻
I see a clear value in this feature, it also seems like a well thought out design and implementation of the design! I've not been able to review the JS parts thoroughly as I'm not experienced with that, but from what I can tell it looks great as well.
I appreciate a lot about this PR but I'd like to highlight that your PR description was amazing to help me review it, I also greatly appreciate that it included references to related issues.
Required work towards merge
Before doing anything to avoid merge conflicts etc, please rebase on the current main branch to acquire #393 and #396 that changes things in docs and testing infra.
- Update docs to reflect that the labextension does more than just add launcher buttons, but that it also provides these new UI to manage servers
- Add entry in the changelog under the unreleased title that the api endpoint has been renamed
- Consider if we should print some log messages someplace.
Optional bonus work
I note that we don't have existing tests of for example the servers-info
endpoint. Since its become a far bigger project to setup tests from scratch than to mimic existing tests, I don't think we should block merging of this PR as it is but instead open an issue to represent the need to develop tests of our backend API separately going onwards.
- Add tests of introduced backend API handlers
- GET
api/server-proxy
(to list) - GET
api/server-proxy/<proxy name>
, failure situation + success situation - DELETE
api/server-proxy/<proxy name>
, failure situation + success situation
- GET
- Add browser / acceptance test of this feature
jupyter_server_proxy/__init__.py
Outdated
icon_handlers.append( | ||
( | ||
ujoin(base_url, f"server-proxy/icon/{sp.name}"), | ||
IconHandler, | ||
{"path": sp.launcher_entry.icon_path}, | ||
) | ||
) | ||
|
||
nbapp.web_app.add_handlers( | ||
".*", | ||
[ | ||
( | ||
ujoin(base_url, "server-proxy/servers-info"), | ||
ujoin(base_url, "api/server-proxy/servers-info"), | ||
ServersInfoHandler, | ||
{"server_processes": server_processes}, | ||
), | ||
(ujoin(base_url, "server-proxy/icon/(.*)"), IconHandler, {"icons": icons}), | ||
], | ||
(ujoin(base_url, r"api/server-proxy"), ListServersAPIHandler), | ||
(ujoin(base_url, r"api/server-proxy/(?P<name>.*)"), ServersAPIHandler), | ||
] | ||
+ icon_handlers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reflected a bit about the API paths we are locking into providing going onwards, and if we should adjust them or not in some way.
Specifically, reflection I reflected about:
- Multiple handlers register to handle certain routes now, for example
servers-info
could be handled by ServersInfoHandler and ListServersAPIHandler, andicon/<existing-server-process>
could be a ServersAPIHandler if its named quirky.
I think this can be fine, but I'm not fully confident. Should we work to ensure there isn't overlapping here?
I've not gained much experience on REST API design, I've mostly understood there can be quite a bit of things to consider. @bollwyvl @manics do you think we should go for the proposed API endpoints, or do you have suggestions on alternative endpoints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Welp, declaring (and testing against) an OpenAPI spec would be a good place to start, so that one doesn't have to read the code to figure out what's going on. Prior art:
- https://github.com/jupyter-server/jupyter_server/blob/main/jupyter_server/services/api/api.yaml
- https://github.com/jupyterlab/jupyterlab_server/blob/main/jupyterlab_server/rest-api.yml
This can generate nice static docs, as well as help cleaning up some of the design decisions: if you can't express a pattern as a URL param or in a schema, maybe it's not a great idea.
It also then becomes possible to re-use that schema declaration to generate lightweight python and TS types, even if one doesn't do fully generated clients.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with OpenAPI suggestion. That would make it more cleaner. Actually, servers-info
endpoint is only handling "launcher" specific metadata whereas the new endpoints are handling server specific metadata. So, I think it is logical to keep them as separate. However, I think we need to change the name of servers-info
endpoint as it is bit misleading? ServersAPIHandler
and ListServersAPIHandler
can be merged into one using query parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bollwyvl Cheers for the refs
jupyter_server_proxy/manager.py
Outdated
# Create a default manager to keep track of server proxy apps. | ||
manager = ServerProxyAppManager() | ||
|
||
# Create a Periodic call back function to check the status of processes | ||
pc = PeriodicCallback(monitor_server_proxy_procs, 1e4) | ||
pc.start() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be reasonable, but I'm not confident on it. Is it acceptable that we start this up automatically as soon as this file is loaded?
I'm not confident on the alternate options, but I figure this is a singleton instance of the class defined above, created when this python file is loaded. I figure options involve delaying this initialization.
I'm also thinking a bit about if monitor_server_proxy_procs
should or shouldn't be part of the class itself. It is a function tied to a specific object of the class anyhow via the manager
variable that could be self
as well.
If not someone has a concrete change idea here to persue or generally considers this to be blocking, I don't think this comment should be blocking a merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Importing this file shouldn't have side-effects like starting/changing processes or even things on the event loop, as there's no telling it will actually be there.
Ideally, as much business logic/exception handling as possible would move into the manager class, and out of the handlers and __init__.py
. The server extension would then become a very small layer that just hands down the parent application into the manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'd recommend new-start async stuff to use stdlib asyncio
or anyio
primitives instead of tornado
-specific ones. And indeed, if we could get out of the polling business, that would be ideal: the underlying processes probably do (or should) provide callbacks/events that can be used to directly update the manager state.
Assuming we do have to poll, for some reason, and as I don't actually see a drop-in for PeriodicCallback
, this could be a ServerProxyManager.monitor()
which fires off an infinite loop:
import traitlets
import asyncio
class ServerProxyManager(ServerProxy):
monitor_interval = traitlets.Int(10, help="Proxy polling interval in seconds").tag(config=True)
monitoring = traitlets.Bool(False, help="Whether the manager should poll for proxy status")
# this doesn't need to be a trait
_monitor_task: asyncio.Task
@traitlets.observe("monitoring")
def on_monitoring_changed(self, change: traitlets.Bunch):
if self._monitor_task:
self._monitoring_task.cancel()
if change.new:
# keep a handle so it doesn't get GCd
self._monitoring_task = asyncio.create_task(self._monitor_loop())
async def _monitor_loop(self):
while self.monitoring:
# do the thing
await asyncio.sleep(self.monitor_interval)
When this gets invoked from __init__.py
:
def _load_jupyter_server_extension(nbapp: ServerApp):
manager = ServerProxyManager(parent=nbapp)
manager.monitor()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user gracefully terminates a server proxy manually outside of lab interface, that process will not be restarted as we are not using always_restart
feature of simpervisor. So somehow we need to update manager to remove this server proxy app and that is why I included this polling. Simpervisor uses exit handlers internally but I think they are not configurable. So, I dont know how we can fire a callback when process exits gracefully.
This is a reasonable starting point, but would benefit from aligning with the configuration and async patterns used in the upstream I'd say a lack of any python tests for the new features is probably a hard blocker to this getting merged, but really, with the more complex lab stuff, this would need to be tested by robot tests as well to be seriously considered for merging. As noted, in general when adding a new REST APIs, starting with the schema and generating the python and TS types is a really good way to keep the system manageable, over time, and whether built with OpenAPI or not, can make the documentation process also much smoother. |
@consideRatio @bollwyvl Thanks for the feedback. I will work on the changes. Yes, I agree that we need to add tests for these features and I am working on them already. Next week I will be at KubeCon so, I might not find lot of time to work on this. Are you aiming to make this extension compatible with Notebook 7? Currently the frontend extension fails as it is loading JupyterLab specific components. Straightforward solution is to load UI components conditionally, i.e., making |
It keeps simpervisor proc objects in internal state to be able to teminate a process by calling terminate method on proc object. It also checks the status of active processes by running a periodic callback. If a process is killed outside of jsp, it removes it from running apps list.
Existing servers-info endpoint is moved with api sub directory to have a consistent end point scheme.
We also check if proxy is "running" and if it is not, we remove proc object to restart the proxy. This way proxy will be restarted only when user does it via lancher.
This manager will be added to running sessions widget so that user can view and terminate running server proxy apps. Proxy app metadata will be shown when we hover over proxy app name.
for more information, see https://pre-commit.ci
@consideRatio @bollwyvl @manics I have update the PR with most of the comments addressed. Please check the updated PR description for more details. Update: Cleaned up my mess and git history is clean now. Sorry about that!! |
Do not instantiate manager in the file anymore. This will be done during extension loading Rework monitoring for proxy apps natively using asyncio. This will be added as callback to ServerApp IO loop during extension loading Manager spits out logs in debug mode when there is proxy app is added/removed from it. Added uni_socket to server proxy app dataclass Remove unnecessary async methods Properly handle the get_server_proxy_{app,proc} methods when server proxy is not found in the manager. We return a default ServerProxyApp tuple in this case with empty values
An OpenAPI spec file is added to keep track of APIs for better maintainability All APIs are moved into /server-proxy/ subdirectory. Seems like root / is used by jupyter_server and /lab/ is used by jupyterlab_server. Moving all APIs under /server-proxy will future incompatibilities Initialise ServersAPIHandler with manager instead of importing it Use a function to setup API handlers which can be called directly during extension loading
We add manager as a traitlet to ServerApp and add monitor as a callback to ServerApp IO lopp Simplify API handlers setup
Move RunningServerProxyApp to a separate file Use lab transalation capabilities for complex strings Use URLExt package to manipulate URLs
Use lumino polling class for polling server proxy apps Add schema to be able to configure refreshInterval
for more information, see https://pre-commit.ci
713fadd
to
bb68e2b
Compare
@consideRatio I have reworked this PR based on the comments and added tests. Please check the updated PR description when you got time. |
This is amazing work @mahendrapaipuri!!! I'm truly sorry we've not been able to review your work further yet @mahendrapaipuri. I'll do a small push now but I can't review this myself to completion. As this PR is very large, avoid force pushing to the PR branch to help reviewers keep track of changes since last review better. I'll go through a few comments and link to the resolving commit to help other reviewers. I'd like to get the open review comments down to a minimum to reduce complexity of further review. |
Thanks a lot @consideRatio for reviewing the changes. It is totally fine, we all are busy with a lot of stuff and I understand things can take time. I guess there are still few small things to address and I will do it next week. Cheers!! |
Avoid duplicating codebase
This adds great additional features to |
This PR addresses part of enhancements proposed in #383
Backend
Here we are adding a manager on the backend to keep track of running server proxy apps with assumptions:
command
to proxy to an already started process (unmanaged process) #339 will not be added to manager as they cannot be controlled via JupyterLab frontend.__init__.py
and added toServerApp
as a traitlet. This is inspired from thejupyter_lsp
project. This singleton object is passed to API handlers andSupervisedProxyHandler
.SIGTERM
signal to running server proxy app by callingterminate
method on simpervisor process.ensure_process
method, we are verifying the state of the process usingproc.running
. When the running server proxy app is either terminated via JupyterLab frontend or if user terminates the app by sending a signal to PID manuallyproc.running
will be evaluated toFalse
. In these cases, we would like to ideally restart the process again and hence, we removeproc
object from internal state and restart the process. _restart_process_if_needed will make sure simpervisor will restart the process in other cases.Frontend
Manager for the frontend follows a similar logic used to implement terminals/kernels sessions in JupyterLab.
shutdown
option in the running sessions. Proxy apps that are shutdown can be restarted afresh from the launcher.Tests
Docs
Note about the Running Server Proxy Apps is included in GUI launchers in docs
Illustration
or all server proxy apps which will terminate by sending
SIGTERM
as explained above.Other changes
servers-info
endpoint is moved fromserver-proxy/servers-info
toserver-proxy/api/servers-info
to be consistentjupyter_server
is using root/
andjupyterlab_server
is using/lab/
directory for API endpoints. So, all API end points are moved to/server-proxy/
sub directory to avoid future incompatibility issuesStaticFileHandler
is used forIconHandler
to simplify codeThis extension can be ported to Notebook 7 as well with minimum changes.
This PR partially addresses #343, #264, #213, #188 (EDIT: marked as fixed, motivated below)
@consideRatio @ryanlovett @yuvipanda @bollwyvl Feel free to review and let me know your feedback!!
Edits by Erik
Fixes Kill proxy process #188
The UI and API allows for management to terminate a proxied process
Fixes Option to allow starting-up again #213
I believe this PR enables that.
Fixes Allow users to restart a service #264
I think this does the trick as well
Fixes Looking up dynamically allocated port numbers to support testing of proxied services #343
I understand that this PR provides an API to look this up.