[BUG] - Notebook Jobs failing because environment is not found #2277
Comments
I am getting an error.
This is the error I see on the JATIC deployment when I try to run it now.
@nkaretnikov do you have any ideas? I know you made some major changes to https://github.com/nebari-dev/argo-jupyter-scheduler last month. I'm not sure it's related to those changes, though. We used the latest version.
@marcelovilla That repo has no references to the hardcoded papermill environment. argo-jupyter-scheduler just needs to have papermill available.
Checking the logs in the user pod, I'm seeing the following error:
I checked another deployment with the latest version of Nebari. I'll need to investigate this further.
I think I understand the issue I was running into: when creating a notebook, but before manually saving it, the kernelspec metadata is not there, hence the kernel cannot be found. Manually saving the notebook solves the problem. I think this is an upstream issue, or at the very least quite a misleading error message. Server log details below:
In the beginning the notebook is created as an empty shell on disk:

```json
{
  "cells": [],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 5
}
```

After the kernel connects, it provides metadata; that metadata is then saved when the user saves the notebook for the first time:

```json
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "id": "27a3458b-29d2-4b29-9695-868f409cd12f",
      "metadata": {},
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3 (ipykernel)",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.13"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 5
}
```
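One way to guard against this empty-shell state is to check for the kernelspec before submitting a job. A minimal sketch using only the stdlib; the helper name is hypothetical, not part of Nebari or argo-jupyter-scheduler:

```python
import json

def kernel_name_or_none(notebook_path):
    """Return the kernelspec name stored in a notebook file, or None if
    the notebook is still the empty shell shown above (i.e. it was never
    saved with kernel metadata)."""
    with open(notebook_path) as f:
        nb = json.load(f)
    # the kernelspec only appears under metadata after the first save
    return nb.get("metadata", {}).get("kernelspec", {}).get("name")
```

A scheduler front-end could refuse to submit the job (with a clear message) when this returns `None`, instead of failing later with "environment not found".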
I am able to reproduce the issue. Since neither that nor the behaviour of not finding environments before the notebook is saved (reported in my previous comment) occurs in the previous stable Nebari version, I would think this is caused by a recent change.
Last week I mentioned that it is hitting a code path it has no right to hit.
The conclusion is that, in the test environment with the current pre-release, the traitlet which should be setting this is not taking effect (lines 48 to 53 in 5463e8d).
Ok, so #2251 by myself is to blame. The very head of the log contains:
This is exactly the case I was making in that issue. Another question is whether an error during configuration readout should prevent the Jupyter server from starting up in the first place. I would say yes, it should, because the configuration may contain additional opt-in security measures, and if those are not active the security guarantees may not be met.
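The fail-fast behaviour argued for above can be sketched as follows: raise on any config-readout error instead of logging and continuing with defaults. This is an illustrative stand-in, not the actual jupyter-server loading code:

```python
import json

def load_config_or_die(path):
    """Read a JSON config file, aborting startup on any error instead of
    silently falling back to defaults (the config may carry opt-in
    security settings that must not be dropped)."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError) as exc:
        # refuse to start rather than run with weaker-than-configured settings
        raise SystemExit(f"refusing to start: bad config {path!r}: {exc}")
```

The design choice is deliberate: a crash at startup is visible and debuggable, whereas a server that quietly ignores its security configuration is not.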
Describe the bug
When trying to submit a Jupyter Notebook job, the job immediately stops and shows the following error:
Here is an image of the notebook jobs tab with more information:
Expected behavior
I would expect the Notebook Job to successfully start and execute.
OS and architecture in which you are running Nebari
AWS
How to Reproduce the problem?
I am deploying Nebari from a1e5fd1 to AWS, using the following config file:
Command output
No response
Versions and dependencies used.
No response
Compute environment
AWS
Integrations
Argo
Anything else?
The environment I am using to submit the Notebook Job has papermill on it, and Argo is enabled. This also happens with other environments that have papermill on them.
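Since the bug hinges on whether the target environment actually provides papermill, a quick importability check can rule that out when triaging. A small stdlib-only sketch; the helper name is hypothetical:

```python
import importlib.util

def has_module(name):
    """Check whether a package is importable in the active environment
    without actually importing it (no side effects)."""
    return importlib.util.find_spec(name) is not None

# example: run this inside the environment used to submit the job
# print(has_module("papermill"))
```

If this prints `False` in the kernel's environment, the job failure is expected; if it prints `True` and the job still fails, the problem lies in how the scheduler resolves the environment, as discussed above.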