
Multiple sources of LSP messages on frontend and backend #184

Open
bollwyvl opened this issue Jan 28, 2020 · 9 comments

@bollwyvl
Collaborator

Elevator Pitch

Enable chains other than...

Feature ⇄ LSPConnection ⇄ WebSocket ⇄ Handler ⇄ Session ⇄ Stdio ⇄ Language Server 

...to complete LSP requests.

Motivation

We've done a lot of work to make the first roundtrip work. However, there are a number of sources of messages that would provide more flexibility and robustness, and in some cases end-user simplicity.

Design Ideas

the frontend language server

Feature ⇄ FrontendLSPConnection
  • what: many of the node-based language servers can actually be webpacked (with appropriate shims)
  • why: would let us minimize the install pain for a large swath of servers/users
  • how: create a FrontendLSPConnection extends ILSPConnection
    • each server would then be a packages/jupyterlab-language-server-xyz, with enough boilerplate to wrap/patch its guts (e.g. fs over contents REST API)

the tcp language server

Feature ⇄ LSPConnection ⇄ WebSocket ⇄ Handler ⇄ Session ⇄ Tcp ⇄ Language Server 
  • what: in playing with live development of a language server in-loop with pygls, i really wanted to just pop open a tcp port and add it to my language servers... but we only support stdio
  • why: some things are better over tcp. will probably need this eventually anyway for sourcegraph, docker-hosted things, etc.
  • how: create a TcpReader and Writer, add mode to the spec, allow a {{ port }} pass-through in the args
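A hypothetical spec entry under that scheme might look like the following. Note the `mode` key and the `{{ port }}` placeholder are proposals from this issue, not part of the current schema, and the server name and `-tcp` argument are purely illustrative (loosely modeled on the Wolfram server discussed below):

```json
{
  "wolfram-language-server": {
    "argv": ["wolfram-lsp", "-tcp={{ port }}"],
    "languages": ["wolfram"],
    "mode": "tcp",
    "port": null
  }
}
```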

the kernel language server

Feature ⇄ KernelLSPConnection ⇄ WebSocket ⇄ Kernel 
  • what: similar to the above, and as demonstrated over on lintotype, we can define a comm and let kernels stuff LSP into it
  • why: the kernels know a lot about... things. like crazy % syntax, and where real files are (except for the notebook you're working on 🤣 )
  • how: add a KernelLSPConnection implements ILSPConnection

the static language server

Feature ⇄ LSIFConnection ← DocRegistry ← REST ← ContentsManager
  • what: LSIF defines a spec for a JSON-based dump of a language server in a workspace
  • why: something like jyve or a lab-forward nbconvert would let you keep all the LSP features (hover, symbols, etc) in a static published setting
  • how: add an LSIFConnection implements ILSPConnection. add a button for "save workspace", and be able to open an LSIF as a new language server
@bollwyvl
Collaborator Author

Another thing here: a motivating case for this would be including (at least one) webpack-compiled language server by default, for the language that we all have to deal with: markdown. It should be wired up for markdown cells... and perhaps, optionally, in code block comments (though python folk are still rewarded for rst, but that looks like a bear to package).

Markdown will really stress the nested language (#191) work, as well... for example, de-facto Jupyter markdown includes support for LaTeX math. It would be interesting to see if there is a reasonable-to-embed language server for that sub-dialect... again, many of the options look hard to package, but perhaps one could theoretically compile digestif and lua to WASM (buh).

@The3DWizard

tcp language server: I tried to get jupyterlab-lsp running with the wolfram language server from @kenkangxgwe. Right now this wolfram language server only supports tcp connections and no stdio connection, although the author seems to be trying to support stdio connections in future releases. So +1 for the tcp language server support.

@bollwyvl
Collaborator Author

bollwyvl commented May 8, 2020 via email

Thanks for the heads-up on that. The big thing I want tcp for is being able to launch a pygls (kind of a DIY language server toolkit) server inside a running ipython kernel, to interactively develop new language servers. It's really not that big of a change; we basically need to:
  • subclass the Reader and Writer classes
  • pick the appropriate client (presumably there's already one)
  • handle any asynchrony
  • add a config flag
  • add some tests
  • release!
I don't imagine we'll test with the Wolfram one, but a number of open source servers support tcp, for sure. If we did get it working, we could at least add the detection logic for it!

@FlyingSamson

FlyingSamson commented Feb 11, 2021

Hey there, I am currently looking into this (for the Wolfram LS, actually). It might even be as easy as using socket.makefile() and reusing the existing stdio reader and writer.

However, could you be a little bit more specific about this part:

allow a {{ port }} pass-through in the args

I already added the mode=stdio/tcp flag in the spec, but I'm not sure what would be the correct way of specifying the port (and potentially the host). I mean for instance the Wolfram LS requires the argument -tcp=, so one could pick any free port for this and hard-code it into the args in the specs file. I presume you mean something like a placeholder which is then set by the extension during process execution/initialization? So for instance we would reserve the tag {{port}} and in the LanguageServerSession.init_process() we would use re.sub() to replace it with the appropriate port?

Of course we also need to pass the port into the extension such that it can open the TCP connection when creating the reader and writer (and potentially fill in the placeholder in the args), and this is where I'm stuck. Where would I put this and how would I access it? Also in the specs? Or should there be some magic which will search for a free port and use that one instead? What if the LS does not allow changing the port (don't know if that would be the case for any LS)?

@bollwyvl
Copy link
Collaborator Author

For my specific use case of interactively developing a language server locally

I'd want this in-loop with pygls on ipython. The process would already be running, and would already have selected a port in my spec. This is fine, but probably only a very specific testing case would have such a hard-coded port, or the port would be selected and hard-coded at test start time.

For the case of a local TCP LS which has its entire lifecycle managed by the server extension

  • if it does have a hard-coded port, you'd just put that in argv and pray... but we're unlikely to autodetect or test anything that works like that
  • if it can accept a port (either as an arg or environment variable), it would need a placeholder
    • picking a random open port is not a huge deal (we already have it in our acceptance tests someplace) and is more reliable than hardcoding something
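That random-open-port pick can be sketched with the standard bind-to-port-0 trick; the helper name here is illustrative, not the project's actual test utility:

```python
# Illustrative port-picking helper: bind to port 0 so the OS assigns an
# unused ephemeral port, then release it for the language server to claim.
import socket

def get_unused_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]
```

There is a small race between releasing the port and the server binding it, but in practice this is far more reliable than hard-coding a port in the spec.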

For a remote connection... well, that's a whole other kettle of wax. I don't know how you'd get away with anything other than a hard-coded host/port.

mode=stdio/tcp flag in the spec

as in: changed the schema? we'd need to be very careful here, as that's the contract between the frontend and backend. Further, it should be possible to add this to the v2 schema, as it's only adding some properties, so they can't be required. I'm imagining it would be something like:

properties:
  mode:
    description: the connection technique. if absent, assumes stdio
    type: string
    enum:
      - stdio
      - tcp
  port:
    description: the port for tcp mode connections. a null value will select a random, unused port
    default: null
    oneOf:
      - type: "null"
      - type: integer
  host:
    description: the host for tcp mode connections. a null value will assume 127.0.0.1
    oneOf:
      - type: "null"
      - type: string

in some other places in the stack, unix sockets are also supported, but i don't know if any language servers support that.

socket.makefile

Perhaps be wary of going down this road, as the docs say "The socket must be in blocking mode." Figuring out how to do it with a non-blocking socket would be the high road, and more likely to work on more platforms. The anyio upstream, now a dependency of jupyter_server, would probably be the right place to look. As long as we're in-loop with the rest of the server, and the socket might take... basically any time at all, there is a very real danger of blocking everything (kernels, contents, etc.), which is a very unsatisfying experience.

re.sub() to replace it with the appropriate port?

Probably use either .format, or jinja2 (looks like what i was thinking), since it's lying around. But yes, an individual shell token would have some content replaced. For prior art, see jupyter-server-proxy, which we used to rely on... but it has some other characteristics we didn't want.
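As a sketch of the str.format route (the helper is illustrative, not an existing API; note .format understands single-brace `{port}` placeholders, while the `{{ port }}` spelling above would instead suit jinja2):

```python
# Illustrative helper: expand placeholders in a spec's argv with str.format.
# Each shell token has its {token} placeholders replaced before launch.
def expand_argv(argv, **tokens):
    return [token.format(**tokens) for token in argv]

# expand_argv(["wolfram-lsp", "-tcp={port}"], port=8888)
# -> ["wolfram-lsp", "-tcp=8888"]
```

One caveat of .format: any literal braces in a real argv entry would need escaping (`{{`/`}}`), and a placeholder with no supplied value raises KeyError.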

@FlyingSamson

FlyingSamson commented Feb 12, 2021

Thank you very much for your pointers, they are greatly appreciated!

as in: changed the schema? we'd need to be very careful here, as that's the contract between the frontend and backend. Further, it should be possible to add this to the v2 schema, as it's only adding some properties, so they can't be required. I'm imagining it would be something like:

Sorry, I don't really understand what schema v2 vs. v1 is. Currently I added

"mode": {
  "description": "connection mode used, e.g. stdio, tcp",
  "title": "Mode",
  "type": "string"
},

inside `partial-language-server-spec`'s `properties` within `python_packages/jupyter_lsp/jupyter_lsp/schema/schema.json` on the master branch. Should I use a different branch as a base, containing such a .yaml-based schema file?

Perhaps be wary of going down this road, as the docs say "The socket must be in blocking mode." Figuring out how to do it with a non-blocking socket would be the high road, and more likely to work on more platforms. The anyio upstream, now a dependency of jupyter_server, would probably be the right place to look. As long as we're in-loop with the rest of the server, and the socket might take... basically any time at all, there is a very real danger of blocking everything (kernels, contents, etc.), which is a very unsatisfying experience.

I see. Too bad, it would have been too easy this way ;-) I already had a look at the TCP connection methods within asyncio before I found the makefile routine. I will see whether I can get a connection up and running with that, then.
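A minimal sketch of what that asyncio route might look like, assuming stdlib asyncio streams and standard LSP Content-Length framing (the function names are illustrative, not the extension's actual Reader/Writer classes):

```python
# Illustrative only: LSP Content-Length framing over non-blocking
# asyncio streams, avoiding the blocking-socket caveat of socket.makefile().
import asyncio
import json

def frame(payload: dict) -> bytes:
    """Encode a JSON-RPC payload with the LSP Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

async def read_message(reader: asyncio.StreamReader) -> dict:
    """Read one Content-Length-framed message from the stream."""
    length = 0
    while True:
        line = await reader.readline()
        if line in (b"", b"\r\n"):  # blank line ends the headers
            break
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    return json.loads(await reader.readexactly(length))

async def connect_tcp(host: str, port: int):
    """Open a non-blocking (reader, writer) pair to a TCP language server."""
    return await asyncio.open_connection(host, port)
```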

Probably use either .format, or jinja2 (looks like what i was thinking), since it's lying around. But yes, an individual shell token would have some content replaced. For prior art, see jupyter-server-proxy, which we used to rely on... but it has some other characteristics we didn't want.

Thanks for clarifying. I will have a look at jinja2 and the way free port detection was done in the acceptance test.

@bollwyvl
Collaborator Author

containing such a .yaml-based schema file?

That .json is the correct file; it's just easier for me to freehand valid yaml... but we don't want to wade into the yaml dependency mess for one file. Maybe we'll all use TOML some day or something, if that makes it into every standard library (and grows some better tooling support)... Julia, Python and Rust can't all be wrong, surely? Until then, JSON it is, for all its warts.

The version property somewhere in there lets us ensure that if we change something drastic about the spec then the spec will not load, as starting a language server incorrectly is probably worse than it just not starting (but generating copious logs). We bumped to version 2 when we gave up on language being the main key to identify language servers, and instead do everything based on the implementation name.

As long as all of our specs, and any open source ones we know of in the wild (countable on one hand after a trip to the ER on an exciting insert national holiday here 🎆), still validate, it can stay 2. If mode, port, host, etc. change the game a lot, then it will need to be a 3, i guess. If proprietary forks/languages we don't know about can't update that one file, then their users will have to pin, but we really try to keep that part backwards compatible.

TCP connection methods within asyncio

Ah, anyio is a different beast than asyncio... asyncio, while in the standard lib (and very close to the tornado code one sees in many places across the jupyter landscape), is only one of several loop implementations. anyio is a loop-independent wrapper and has its own socket wrapper. More importantly, and the reason it's in jupyter_server, is that it has robust async file I/O... indeed, it could well be that we should move our existing subprocess stuff to their wrapper, as that has proven notoriously tricky to make work, especially on windows... language servers are not the most stable/well-behaved/spec-conforming things around.

look at jinja2

well, i guess jinja has the danger of folk getting too fancy... if what needs to be done can be done with .format, that's probably sufficient (and i believe what jupyter-server-proxy does).

port detection was done in the acceptance test.

sorry, I was a bit strapped for time earlier. It's implemented here and is pretty straightforward. It would just live in some util file or something, as I don't think it's part of the non-test portion of jupyter_server.

@FlyingSamson

If mode, port, host, etc. change the game a lot, then it will need to be a 3, i guess. If proprietary forks/languages we don't know about can't update that one file, then their users will have to pin, but we really try to keep that part backwards compatible.

All right. I guess this shouldn't be too difficult: just make stdio the default value for mode, as suggested in your yaml snippet above, and ignore port and host if mode != tcp.

indeed, it could well be we should move our existing subprocess stuff to their wrapper, as that has proven notoriously tricky to make work, especially on windows... language servers are not the most stable/well-behaved/spec-conforming things around.

Ok, then I will see whether I can move the process creation stuff to anyio first. After a first glance at that method, it seems there is no env parameter as there is for Popen, so I guess we would have to manipulate os.environ directly before calling open_process(). On unix the opened process seems to inherit the new environment; I hope this will also be true for other OSes.
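A sketch of that os.environ manipulation, assuming (per the above) that a child spawned inside the block inherits the interpreter's environment; the context manager is illustrative, not an existing helper:

```python
# Illustrative sketch: apply environment overrides around a process launch
# (e.g. around anyio.open_process, said above to lack Popen's env=),
# restoring the original values afterwards so the server process is unaffected.
import os
from contextlib import contextmanager

@contextmanager
def temporary_env(**updates):
    saved = {key: os.environ.get(key) for key in updates}
    os.environ.update(updates)
    try:
        yield  # spawn the language server here; it inherits os.environ
    finally:
        for key, old in saved.items():
            if old is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old
```

Note this mutates process-global state, so it would need care (e.g. a lock) if two language servers could be launched concurrently.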

@bollwyvl
Collaborator Author

Wanted to point out that https://github.com/jtpio/jupyterlite is looking like the most compelling way forward for static hosting of a lab experience. Having LSP features in a static (set of) document(s) would be killer.

There's an issue (https://github.com/jtpio/jupyterlite/issues/40) for offering bespoke routes on the "server", comms and files (e.g. LSIF).

I'm kind of inclined to favor targeting (ha!) the comms (rather than faking our home-rolled websockets), as then the browser kernels could offer both the jupyter messages as well as LSP, which would get around them having to coordinate their language versions, filesystems, etc.
