Syncing local python project files with remote Jupyter server #1601

Open
brunocous opened this issue May 25, 2020 · 19 comments
Labels
feature-request Request for new features or functionality notebook-remote Applies to remote Jupyter Servers

@brunocous

Feature: Notebook Editor, Interactive Window, Python Editor cells

Description

Microsoft Data Science for VS Code Engineering Team: @rchiodo, @IanMatthewHuff, @DavidKutu, @DonJayamanne, @greazer

Context

VSCode allows users to connect with a running remote Jupyter server (https://code.visualstudio.com/docs/python/jupyter-support#_connect-to-a-remote-jupyter-server). Using the Jupyter API it is able to start a kernel and execute cells of a locally saved notebook remotely.

Use case/problem

Suppose the notebook (local-notebook.ipynb) needs to call a function defined in another Python file (foo.py). The remote kernel cannot access local files; it can only access files stored on the remote notebook server. Because local (Python) files are not synced to the remote Jupyter server instance, the remote Python interpreter (kernel) cannot import them.

Existing solutions

The standard way of achieving this is rsync over SSH. However, this requires managing an SSH connection and SSH keys (which large enterprise servers do not necessarily allow).
There are workarounds (manually uploading files through the notebook UI, or using git), but these slow down development and iteration.
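For reference, the rsync-over-SSH approach mentioned above could be scripted roughly as follows. This is a minimal sketch, not a recommended setup: the function names, the host `user@gpu-box`, and the paths are hypothetical, and it assumes `rsync` and `ssh` are installed and that key-based SSH access already works.

```python
import subprocess
from typing import List


def build_rsync_command(local_dir: str, remote: str, remote_dir: str) -> List[str]:
    """Build an rsync-over-SSH command that mirrors local_dir into remote_dir.

    `remote` is an SSH destination such as "user@gpu-box" (hypothetical).
    """
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    source = local_dir.rstrip("/") + "/"
    return [
        "rsync",
        "-az",                        # archive mode, compress in transit
        "--delete",                   # remove remote files deleted locally
        "--exclude", "__pycache__/",  # skip bytecode caches
        "-e", "ssh",                  # use SSH as the transport
        source,
        f"{remote}:{remote_dir}",
    ]


def sync(local_dir: str, remote: str, remote_dir: str) -> None:
    """Run the sync; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(local_dir, remote, remote_dir), check=True)
```

Calling `sync("project", "user@gpu-box", "/home/user/project")` before running a cell would push the latest local files, which is exactly the manual step this issue asks to eliminate.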

Proposal

Extend the VSCode Python extension to allow users to sync files with a remote running Jupyter notebook server. Under the hood, the Jupyter contents API can be used for this:

Authentication and authorization is handled through the API token that you need anyway to connect.

No SSH, git or manual hassle required.

Additionally, you can execute your local code (by calling it through the notebook) remotely without having to manage a remote Python SSH interpreter, or docker images. All you need is a running jupyter notebook.
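To illustrate the proposal, here is a minimal sketch of uploading one local file through the Jupyter contents API (`PUT /api/contents/<path>`), authenticated with the same token used to connect. The helper names, server URL, and paths are hypothetical; this is not how the extension works today, only what a sync step could look like.

```python
import base64
import json
import urllib.request
from pathlib import Path
from urllib.parse import quote


def build_upload_request(server_url: str, token: str,
                         local_path: str, remote_path: str) -> urllib.request.Request:
    """Build a PUT request that saves local_path as remote_path on the server."""
    content = Path(local_path).read_bytes()
    # The contents API accepts a JSON "model"; base64 works for any file type.
    body = {
        "type": "file",
        "format": "base64",
        "content": base64.b64encode(content).decode("ascii"),
    }
    url = f"{server_url.rstrip('/')}/api/contents/{quote(remote_path)}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"token {token}",  # same token used to connect
            "Content-Type": "application/json",
        },
    )


def upload_file(server_url: str, token: str,
                local_path: str, remote_path: str) -> dict:
    """Send the upload and return the server's JSON description of the file."""
    request = build_upload_request(server_url, token, local_path, remote_path)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A full sync would walk the local project and call `upload_file` for each changed file. As far as I know, the remote parent directory has to exist before a file can be saved into it, so a real implementation would also need to create directories first.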

@brunocous brunocous changed the title Syncing local project files with remote Jupyter server Syncing local python project files with remote Jupyter server May 25, 2020
@IanMatthewHuff
Member

@brunocous Thanks for the detailed suggestion. I do feel that this would be an interesting suggestion to consider. We'll discuss it at our triage meeting.

@brunocous
Author

Any news on this, or how can I help?

@brunocous
Author

Btw, there are other workarounds to this when you use a cloud provider anyway in your project (S3, data storage, blob,...).

For example, with AWS S3 you could run aws s3 sync to sync local files to an S3 bucket, and then use a Jupyter plugin (like https://github.com/uktrade/jupyters3) to sync that S3 bucket with Jupyter. Something similar probably exists for other cloud providers.
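The local half of that S3 workaround could be scripted along these lines. This is only a sketch: the bucket name and prefix are hypothetical, and it assumes the AWS CLI is installed and already configured with credentials.

```python
import subprocess
from typing import List


def build_s3_sync_command(local_dir: str, bucket: str, prefix: str = "") -> List[str]:
    """Build an `aws s3 sync` command that mirrors local_dir into the bucket."""
    # With an empty prefix this yields "s3://bucket" (no trailing slash).
    destination = f"s3://{bucket}/{prefix}".rstrip("/")
    return [
        "aws", "s3", "sync",
        local_dir,
        destination,
        "--exclude", "*.pyc",  # skip compiled bytecode
        "--delete",            # remove objects whose local file is gone
    ]


def sync_to_s3(local_dir: str, bucket: str, prefix: str = "") -> None:
    """Run the sync; raises CalledProcessError if the CLI fails."""
    subprocess.run(build_s3_sync_command(local_dir, bucket, prefix), check=True)
```

The Jupyter side (exposing the bucket to the kernel) is handled by the plugin and is not covered here.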

@CmdQ

CmdQ commented Sep 11, 2020

I'd find this feature highly useful. The other ways are always cumbersome.

@fra-luc

fra-luc commented Sep 16, 2020

I'd like this very much as well.

@GF-Huang

GF-Huang commented Nov 1, 2020

Any progress?

@rchiodo
Contributor

rchiodo commented Nov 2, 2020

This is something we're investigating. Not sure if or when we'll release it though.

@DonJayamanne DonJayamanne transferred this issue from microsoft/vscode-python Nov 13, 2020
@205g0

205g0 commented Dec 11, 2020

Just found this issue and indeed, this feature would make the entire experience more well-rounded.

At an abstract level, the remote Jupyter server is just a dumb number cruncher that I use because:

  • I want a clear separation between my local dev machine and an external TPU machine. The latter is rarely used for development, just for model creation and for having a reproducible production system. Besides, the two machines couldn't be more different: the TPU machine is just about TFLOPS, a solid server, and a production Ubuntu, while my local dev machine is about HiDPI support, mobility, window management, etc.
  • For reproducibility, I also want all the external imports on that remote machine
  • But everything that comes from me, from my current dev efforts, should live locally on the dev machine I use to write code

The current separation is jarring (one file local but rest remote) and breaks an unmatched feature of VS Code.

Would love to see an update on this

@brunocous
Author

brunocous commented Dec 18, 2020

I'm already pleased that the VS Code team is even considering this feature. I filed the same feature request for PyCharm some time ago, but got not a single response or action (https://youtrack.jetbrains.com/issue/PY-42649). So good job, VS Code team!
Even if this feature gets a green light, it will take some non-trivial dev effort. For now, there are some suboptimal workarounds that give you the same result (rsync, scripts that do something with git, syncing with any cloud blob or object store, etc).

@greazer greazer added the notebook-remote Applies to remote Jupyter Servers label Aug 5, 2021
@tsuga

tsuga commented Aug 20, 2021

Is there any update for this long-wanted feature request? Or is it dead in the water?

I find @brunocous's proposal in https://youtrack.jetbrains.com/issue/PY-42649 (copied below for your convenience) promising.

Proposal
Extend the Pycharm Jupyter extension to allow users to sync files with a remote running Jupyter notebook server. Under the hood, the Jupyter contents API can be used for this:

Authentication and authorization is handled through the API token that you need anyway to connect.

No SSH, git or manual hassle required.

Additionally, you can execute your local code (by calling it through the notebook) remotely without having to manage a remote Python SSH interpreter, or docker images. All you need is a running jupyter notebook.

@rchiodo
Contributor

rchiodo commented Aug 20, 2021

@tsuga you can track our iteration plans here:
#7008

Every month our plans will show up as a pinned item at the top of our issues.

Additionally things we plan on working on in the next month or so will have a milestone appended.

This item is on neither, so it's not on the radar at the moment. It would likely need more upvotes to push up the queue of stuff we're looking at.

@tsuga

tsuga commented Aug 21, 2021

@rchiodo Thank you for your follow up! Where can we upvote this?

@rchiodo
Contributor

rchiodo commented Aug 30, 2021

> @rchiodo Thank you for your follow up! Where can we upvote this?

At the top. The upvotes under the main description are tracked as 'votes' for an item.

@alfredodeza

This feature is critical for considering an Azure remote compute instance as a viable option. I understand that it is not in any current plans, but I would love to see this one move along.

In the meantime, would it be possible to have some sort of recommendation on how to sync local files to a remote instance? That would help alleviate the problem of not having something built-in. I'm happy to contribute documentation on it if that needs to happen.

@rchiodo
Contributor

rchiodo commented Jan 5, 2022

@alfredodeza thanks for the upvote. There's no recommended way to sync files other than for you to put them in the same folder where you started the remote server (that will make the relative paths work correctly).
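As a rough illustration of that advice: if foo.py sits in the folder the remote server was started in, and the kernel's working directory is that same folder (an assumption based on the comment above, not something verified here), a notebook cell can import it directly. The helper below is a hypothetical sketch that makes the lookup explicit.

```python
import importlib
import os
import sys
from typing import Optional


def import_from_server_folder(module_name: str, folder: Optional[str] = None):
    """Import module_name from `folder` (default: the kernel's working directory)."""
    folder = os.path.abspath(folder or os.getcwd())
    # Make sure the folder is searched for modules.
    if folder not in sys.path:
        sys.path.insert(0, folder)
    # Pick up files that appeared after the interpreter started.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

In a notebook cell this would be used as `foo = import_from_server_folder("foo")`. It only helps once foo.py is already on the server, so the files still have to get there by some other means.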

@jcnelson30

Does anyone have a workaround showing how they set up rsync to get around this issue?

I enjoy the data interaction of Jupyter notebooks, but not being able to import some of my shared Python code makes development extremely tedious.

I have a lot of floating "old-function-versions" due to having to paste each function directly into the Jupyter notebook to execute my long-running tasks on my server w/ a beefy GPU.

@nttoan26

This feature would be very helpful.

@DonJayamanne
Contributor

DonJayamanne commented Sep 29, 2023

@nttoan26 would it be useful if you could just edit the remote files?
I.e. assume we displayed a file explorer showing all of the remote files, and you could edit them in VS Code.
However, this means that a local file would not get uploaded/synced.
Instead, you would create the file directly in the remote file explorer.

Would that work?

Would this address your needs? #1366

@AngeValli

I think this idea would work. The need here is for the file explorer to be in line with the remote Jupyter kernel used in VSCode. For the moment, two separate connections are required: an SSH connection to see the remote server's files in the file explorer, and a connection to the remote Jupyter kernel, for example using JupyterHub’s REST API token. Using the API token to access remote files in the file explorer would solve the issue.
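For illustration, listing a remote directory with only the API token could look roughly like this, using the Jupyter contents API (`GET /api/contents/<path>`). The helper names and URLs are hypothetical; behind JupyterHub the server URL would presumably be the user's single-user server address.

```python
import json
import urllib.request
from typing import List, Tuple
from urllib.parse import quote


def build_list_request(server_url: str, token: str,
                       remote_dir: str = "") -> urllib.request.Request:
    """Build a GET request for the contents of remote_dir ("" is the root)."""
    url = f"{server_url.rstrip('/')}/api/contents/{quote(remote_dir)}"
    return urllib.request.Request(url, headers={"Authorization": f"token {token}"})


def entries_from_model(model: dict) -> List[Tuple[str, str]]:
    """Turn a directory model into sorted (type, path) pairs."""
    if model.get("type") != "directory":
        raise ValueError("expected a directory model")
    # Each child carries its own "type" ("file", "directory", "notebook") and "path".
    return sorted((item["type"], item["path"]) for item in model["content"])


def list_remote(server_url: str, token: str,
                remote_dir: str = "") -> List[Tuple[str, str]]:
    """Fetch and parse one directory listing from the remote server."""
    with urllib.request.urlopen(build_list_request(server_url, token, remote_dir)) as response:
        return entries_from_model(json.load(response))
```

A file explorer would call `list_remote` lazily, one directory at a time, as the user expands folders.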

@rebornix rebornix added notebook-kernel-remote and removed notebook-remote Applies to remote Jupyter Servers labels Dec 6, 2023
@rebornix rebornix assigned DonJayamanne and unassigned rebornix Dec 6, 2023
@amunger amunger added notebook-remote Applies to remote Jupyter Servers and removed notebook-kernel-remote labels Dec 14, 2023