R session background connections #440

Open
andycraig opened this issue Nov 6, 2020 · 10 comments
Labels
engineering Maintenance, style, development process feature-request

Comments

@andycraig
Collaborator

In #394, notebook.R runs a socket connection that the notebook communicates with. Would it be worth extracting part of this out so that a socket connection is available for each R session started by the user? This would make the following possible:

  • Sending VS Code's R commands to the session invisibly, rather than having to send them as text to the console
  • Querying the session, for example to get the names of data frames
  • For Run Selection/Line, potentially using R instead of TypeScript to choose which lines to send (allowing use of R's parse functions)
  • Inline evaluation (sending user code to the session invisibly, and displaying the results in the editor) - this one's not really necessary but some other languages/IDEs have it and it's kind of cool

@renkun-ken You wrote notebook.R and the session watcher, which has some overlap with this proposal, so you're definitely the best person to comment. Does this sound practical? Let me know if I need to clarify anything.

@ManuelHentschel
Member

ManuelHentschel commented Nov 12, 2020

I've tried adding similar functionality to the debugger in ManuelHentschel/VSCode-R-Debugger#94.

To use it, start an R session (any terminal works), create some variables etc., and then run vscDebugger::.vsc.listen(). To connect vscode to that session, launch the R debugger with the following launch config:

    {
      "type": "R-Debugger",
      "request": "attach",
      "name": "Attach to R process",
      "splitOverwrittenOutput": true
    }

This will allow you to view the current workspace and the call stack (if you called .vsc.listen while inside a function), and to evaluate arbitrary code in the debug console (autocomplete should work here, but seems to be a bit unstable). If the R session was launched inside a VSCode terminal, it will also allow setting breakpoints and using the flow control buttons to a limited extent.

(Edit: the strike through parts are mostly not true...)
One of the biggest challenges with this approach is that the R process will be idle while showing the input prompt. During that time it won't be able to handle any commands from vscode. AFAIK the session watcher uses addTaskCallback() to update the session info after each command. This could of course also be used to listen/write to a socket, but we would still have the problem that R only listens to either the user or vscode at any given time.

I'm not sure if it would be possible to spawn a child process that continuously listens on a socket and still has access to the workspace of the parent process, but this does sound rather difficult if not impossible.


My opinion regarding the features you suggested:

  • Sending VS Code's R commands to the session invisibly, rather than having to send them as text to the console
  • Inline evaluation (sending user code to the session invisibly, and displaying the results in the editor) - this one's not really necessary but some other languages/IDEs have it and it's kind of cool

This would probably require the R session to be actively listening on the socket, appearing unresponsive to the user during that time.

  • Querying the session, for example to get the names of data frames

This would be possible using addTaskCallback(), but I think that functionality could also be implemented by the current session watcher.
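As a rough illustration of the task-callback route, here is a minimal sketch in base R. It is not the session watcher's actual code: the helper names list_data_frames and report_data_frames and the output file location are hypothetical, chosen only for this example. The callback runs after each top-level command completes and writes the names of the data frames in the global environment to a file that an editor could watch.

```r
# Return the names of all data frames bound in `envir`.
# (`list_data_frames` is a hypothetical helper name.)
list_data_frames <- function(envir = globalenv()) {
  nms <- ls(envir)
  is_df <- vapply(
    nms,
    function(nm) is.data.frame(get(nm, envir = envir)),
    logical(1)
  )
  nms[is_df]
}

# Hypothetical file that the editor side would watch for changes.
out_file <- file.path(tempdir(), "data_frames.txt")

# Task callbacks receive the expression, its value, and two status flags.
report_data_frames <- function(expr, value, ok, visible) {
  writeLines(list_data_frames(globalenv()), out_file)
  TRUE  # returning TRUE keeps the callback registered
}

# Register the callback; it fires after each top-level command at the prompt.
callback_id <- addTaskCallback(report_data_frames, name = "report-data-frames")
```

The callback can be removed again with removeTaskCallback(callback_id). Note that this is one-way: the session pushes information after each command, but the editor cannot ask for anything while the session sits at the prompt.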

  • For Run Selection/Line, potentially using R instead of TypeScript to choose which lines to send (allowing use of R's parse functions)

Since this is independent of each R session, we could achieve this by spawning a separate helper process, similar to the languageserver.

@andycraig
Collaborator Author

Hi @ManuelHentschel, thank you for your comment, and also thank you more generally for all your contributions to the VS Code R space recently!

As a proof of concept, I put together a small demo: https://github.com/andycraig/r-websocket-repl-demo/tree/main/demo

It uses the httpuv and websocket libraries to have an R REPL that the user can use, while also having a server running in the background that a client can communicate with. Using this approach, R would be able to listen to both the user and VS Code at the same time. (This is the approach being used in the work-in-progress sess library: https://github.com/randy3k/sess )

My example of querying data frame names wasn't the best since it already is implemented by the session watcher.
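For readers who want a feel for the approach without opening the demo, here is a minimal sketch of a background WebSocket server inside an interactive R session. It is not the demo's actual code. The helper name handle_request and the port 8765 are hypothetical; the sketch assumes the httpuv package is installed and uses its startServer() function with an onWSOpen handler. As far as I understand, httpuv services incoming messages while the console is idle at the prompt, which is what lets the user keep typing in the REPL.

```r
# Evaluate a piece of R code sent by the editor and return the printed
# result as a single string. (`handle_request` is a hypothetical helper.)
handle_request <- function(message, envir = globalenv()) {
  result <- tryCatch(
    eval(parse(text = message), envir = envir),
    error = function(e) paste("Error:", conditionMessage(e))
  )
  paste(utils::capture.output(print(result)), collapse = "\n")
}

# Start the server only in an interactive session with httpuv available.
if (interactive() && requireNamespace("httpuv", quietly = TRUE)) {
  server <- httpuv::startServer(
    host = "127.0.0.1",  # listen on localhost only
    port = 8765,         # hypothetical port
    app = list(
      onWSOpen = function(ws) {
        # Reply to each incoming text message with the evaluation result.
        ws$onMessage(function(binary, message) {
          ws$send(handle_request(message))
        })
      }
    )
  )
  # To shut down later: httpuv::stopServer(server)
}
```

Evaluating whatever arrives on the socket is only acceptable for a local experiment; a real implementation would need to restrict who can connect and what can be run.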

@renkun-ken
Member

Thanks @andycraig and @ManuelHentschel for sharing your thoughts on this. I have some thoughts on the pros and cons of this in practice and will post later today.

@ManuelHentschel
Member

> As a proof of concept, I put together a small demo: https://github.com/andycraig/r-websocket-repl-demo/tree/main/demo

The demo works quite nicely; I wasn't aware of that package/functionality. I guess most of what I said above is not accurate then.

@ManuelHentschel
Member

Thanks @andycraig for your proof of concept! I tried integrating this functionality in the debugger, using svSocket instead of a websocket.

This required some changes to the svSocket package, which I committed here, but it seems to work OK on Windows and WSL. To try these out, install the branch vscDebugger/websocket and the modified version of svSocket. Then start an R session, call vscDebugger::.vsc.startWebsocket(), and launch the debugger with the example config Attach to R process.

This mode is still a bit unstable, but you should be able to view the current workspace in the variable window and run commands in the debug console (autocomplete seems to be broken here for some reason), without blocking the user from using the normal terminal input.

This approach is of course somewhat limited by the fact that it can only be accessed through the debugger. But I think it would e.g. be possible to write a custom DAP-client that sends only the requests/commands you are interested in and displays the info in a custom interface.

@renkun-ken
Member

renkun-ken commented Nov 17, 2020

In the implementation of R session watcher, I deliberately chose not to use socket-connection based communication between editor and R session because I had poor experience working with RStudio Server in the following scenarios:

  1. When the session is busy, editor completion stops responding.
  2. When the session forks into multiple child processes, the editor as a whole can easily become unresponsive too.

Using the task callback is therefore a simple way to extract some information from the R session while keeping the session "safe" from connections that could make it unstable, especially under parallel computing or high-load scenarios, and while keeping dependencies minimal.

However, the price is obvious too: querying data on demand and two-way communication would not be easy.

@andycraig's demo starting a WebSocket server is just like what I did in #394 implementing the server side of R notebook but in an interactive session so that it could handle WebSocket connections in the background while the main event loop is idle.

@ManuelHentschel's implementation is similar to the R notebook session in that both the debugger and notebook sessions are fully managed and the user does not directly work with the session but through a protocol.

One notable difference is that the R notebook session uses callr::r_session to create a new R process to do actual evaluation of notebook cells so that the main process could send meta-commands like eval and interrupt (i.e. Ctrl + C) and capture the output.

In vscode-R's case, personally I need a REPL that could communicate with the editor but should be kept

  1. barebone (not wrapped in a long chain of environments or call stacks). radian is good, while a WebView-based simulated REPL is not, in terms of performance, responsiveness, stability, etc.
  2. persistent. The session should be interactive and still reachable if its connection to the editor is lost. I think an advantage of using R in vscode so far is that the user can make the R session persistent with a tmux window. If vscode crashes, is corrupted, or is even removed, the R session is still there.
  3. parallel-safe. When the R session becomes parallel in some way (e.g. fork via mclapply or multisession via future), there are many problems with handling the standard output or other connections of the processes. In RStudio, the R console does not show anything that forked processes print, while in plain R or radian, the stdout and stderr of forked processes are shown properly and the behavior is much more stable.

Therefore, if we are to enhance the REPL experience, the best choice here might be to implement a WebSocket backend of session watcher and let the user's interactive session start a background server to communicate.

@renkun-ken
Member

renkun-ken commented Nov 17, 2020

> Sending VS Code's R commands to the session invisibly, rather than having to send them as text to the console

It would be very helpful in resolving a number of issues with sending code to the terminal:

  1. When there's something left in the terminal input, or the terminal is not in the correct mode (e.g. selection mode in tmux), sending code will mostly result in an error.
  2. A plain R terminal has some issues dealing with a large chunk of code.

But it will produce new issues: if the session is busy, it cannot handle messages in a timely manner, so it looks like we would need to handle some queuing problem. Not sure about this.
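One way to picture the queuing concern is a small FIFO that collects requests while the session is busy and evaluates them in order once it is free again. The sketch below is only an illustration in base R; make_request_queue and its push/drain/size interface are hypothetical names, and nothing here is taken from vscode-R.

```r
# A tiny FIFO queue for code strings sent by the editor.
# (`make_request_queue` is a hypothetical name used for illustration.)
make_request_queue <- function() {
  pending <- list()

  # Append one request (a string of R code) to the end of the queue.
  push <- function(code) {
    pending <<- c(pending, list(code))
    invisible(length(pending))
  }

  # Evaluate all queued requests in arrival order and return the results.
  # Errors are captured as condition objects so one bad request does not
  # block the requests behind it.
  drain <- function(envir = globalenv()) {
    results <- list()
    while (length(pending) > 0L) {
      code <- pending[[1L]]
      pending <<- pending[-1L]
      value <- tryCatch(
        eval(parse(text = code), envir = envir),
        error = function(e) e
      )
      results <- c(results, list(value))
    }
    results
  }

  size <- function() length(pending)

  list(push = push, drain = drain, size = size)
}
```

In practice drain() would be called whenever the session becomes idle, for example from a task callback or from the background server's message handler. Whether stale requests should be dropped rather than replayed is a separate design question that this sketch does not address.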

> For Run Selection/Line, potentially using R instead of TypeScript to choose which lines to send (allowing use of R's parse functions)

If we could use parse(), it might still take some work to send the "correct" code chunk to the R session. We would need to find the minimal parseable expression around the cursor. This seems to require either recursive parsing, like I did in REditorSupport/languageserver#209, or finding the expression to run via parse data, which assumes that the code is at least syntactically correct and thus parseable.
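To make the parse-based variant concrete, here is a minimal sketch in base R. It is a simplification, not the languageserver approach: it only finds the top-level expression that spans the cursor line (not the smallest nested one), and it fails with a parse error if the document is not syntactically valid. The function name find_top_level_expr is hypothetical.

```r
# Find the top-level expression that contains `cursor_line`.
# `code_lines` is the document as a character vector, one element per line.
# (`find_top_level_expr` is a hypothetical helper name.)
find_top_level_expr <- function(code_lines, cursor_line) {
  # keep.source = TRUE attaches source references to the parsed expressions.
  exprs <- parse(text = code_lines, keep.source = TRUE)
  for (ref in attr(exprs, "srcref")) {
    bounds <- as.integer(ref)
    first <- bounds[1L]  # first line of the expression
    last <- bounds[3L]   # last line of the expression
    if (cursor_line >= first && cursor_line <= last) {
      return(list(
        first_line = first,
        last_line = last,
        text = paste(code_lines[first:last], collapse = "\n")
      ))
    }
  }
  NULL  # cursor is on a blank or comment-only line between expressions
}
```

For example, with the cursor on the body line of a three-line function definition, the whole definition is returned:

```r
code <- c("x <- 1", "f <- function(a) {", "  a + 1", "}", "y <- f(x)")
find_top_level_expr(code, 3)$text
```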

> Inline evaluation (sending user code to the session invisibly, and displaying the results in the editor) - this one's not really necessary but some other languages/IDEs have it and it's kind of cool

This is actually cool if not fancy as a non-REPL experience. But I'm not sure if it is useful when working with complex code and large datasets where I need to do a lot of interactive, back-and-forth evaluation.

@ManuelHentschel
Member

ManuelHentschel commented Nov 17, 2020

> @ManuelHentschel's implementation is similar to the R notebook session in that both the debugger and notebook sessions are fully managed and the user does not directly work with the session but through a protocol.

In the branch linked above (vscDebugger/websocket), the debugger attaches to a TCP port that is served in the background (in a separate Tcl thread apparently, not entirely sure myself...). After attaching, you need to enter any command in the R console to view the Global Workspace, but after that the session is surprisingly robust when dealing with parallel execution etc.

E.g. while running

    > f <- function(x){Sys.sleep(x); print(x); x}
    > l <- parallel::mclapply(1:10, f)

the debug console is responsive and it's even possible to change the value of variables in the global env in the variables window.

Same seems to apply e.g. here:

    > x <- 1
    > Sys.sleep(10); print(x)
    # in the debug console: x <- 99
    [1] 99

Using the debugger in this mode seems to be semi-barebone (commands entered in the R terminal itself are barebone; those entered in the debug console definitely are not), persistent (the R process can be started in an external terminal independent of vscode), and at least somewhat parallel-safe (see examples above?).

@andycraig
Collaborator Author

@renkun-ken Thank you for your thoughts on this!

> In vscode-R's case, personally I need a REPL that could communicate with the editor but should be kept

I agree 100% with these requirements. For me one of the key features of vscode-R is that it gives us access to a 'raw' REPL with all the benefits you mentioned.

> Therefore, if we are to enhance the REPL experience, the best choice here might be to implement a WebSocket backend of session watcher and let the user's interactive session start a background server to communicate.

Sounds good to me. As you've noted there are a number of technical details that will need some consideration (queuing if session is busy etc.). If I start working on an experimental implementation I'll post here.

@github-actions

This issue is stale because it has been open for 365 days with no activity.

@github-actions github-actions bot added the stale label May 26, 2022
@ElianHugh ElianHugh added engineering Maintenance, style, development process and removed stale labels Jun 2, 2022
4 participants