Create new package for protocol-level concerns #101

Closed
bollwyvl opened this issue Jan 28, 2020 · 9 comments
Labels
discussion For discussing design, approaches, etc.

Comments

@bollwyvl
Contributor

bollwyvl commented Jan 28, 2020

Moved here from #93 (comment)

In a nutshell:

  • complete the derivation of a JSON Schema for the Language Server Protocol
  • use the schema to generate a new, standalone package, e.g. language-server-protocol
    • with good typing, e.g. dataclasses, typeddict or namedtuple
    • either
      • offer multiple versions of the library, e.g. language-server-protocol==3.14.x (only one per env)
      • include multiple versions of the spec e.g. currently lsp.v3_14 (would allow multiple servers with different versions)
    • offer optionally pluggable serialization (e.g. json, orjson, ujson) and runtime validation (e.g. jsonschema, fastjsonschema)
  • use language-server-protocol in pygls
    • remove most of the constants and classes of pygls.types and pygls.protocol
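
As a rough illustration of the "good typing" and "pluggable serialization" bullets above, a generated package might expose plain dataclasses and accept any object that provides `dumps`/`loads`. This is only a sketch: `Serializer`, `StdlibJSON` and `encode` are hypothetical names, not an existing pygls or language-server-protocol API, and only `Position` and `Range` mirror real LSP structures.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Protocol


@dataclass(frozen=True)
class Position:
    line: int
    character: int


@dataclass(frozen=True)
class Range:
    start: Position
    end: Position


class Serializer(Protocol):
    """Anything with dumps/loads can be plugged in (hypothetical interface)."""

    def dumps(self, obj: Any) -> str: ...
    def loads(self, data: str) -> Any: ...


class StdlibJSON:
    """Default serializer backed by the standard library."""

    def dumps(self, obj: Any) -> str:
        return json.dumps(obj, separators=(",", ":"))

    def loads(self, data: str) -> Any:
        return json.loads(data)


def encode(value: Any, serializer: Serializer = StdlibJSON()) -> str:
    # asdict() recurses into nested dataclasses, so a Range becomes plain dicts.
    return serializer.dumps(asdict(value))


rng = Range(Position(0, 0), Position(0, 5))
print(encode(rng))
```

A wrapper around orjson or ujson would only need to provide the same two methods to be swapped in.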

For a language server implementer, the default would look the same:

# setup.py
    install_requires = ["pygls"],

# server.py
import pygls
server = pygls.LanguageServer()

@server.feature(server.lsp.SELECTION_RANGE)
def foo():
    return None  # would fail because SELECTION_RANGE not defined

This default does no validation, uses the stdlib json module, and targets the latest stable LSP.

While a fancier implementation might choose:

# setup.py
    install_requires = ["pygls", "orjson", "fastjsonschema"],

# server.py
import language_server_protocol
from pygls import LanguageServer

# Configure the protocol once, then hand it to the server.
lsp = language_server_protocol.V3_15(
    json=language_server_protocol.io.OrJSON(),
    validator=language_server_protocol.schema.FastJSONSchema(),
)

server: LanguageServer[language_server_protocol.V3_15] = LanguageServer(lsp=lsp)

@server.feature(lsp.SELECTION_RANGE)
async def bar():
    return await some_untyped_function()  # would fail at runtime if `range` not defined in result

This variant uses the bleeding-edge protocol, a hot-rod Rust-based JSON library, and runtime checking with a code-generating validator.
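
The runtime check described above can be approximated without any third-party library. The sketch below is a deliberately tiny stand-in for a real validator such as jsonschema or fastjsonschema: it only checks `required` keys on a single object, and the schema fragment, `check_required` and `validated` are made up for illustration.

```python
import asyncio
from functools import wraps

# Hypothetical schema fragment: in LSP a SelectionRange must carry a `range`.
SELECTION_RANGE_SCHEMA = {"type": "object", "required": ["range"]}


def check_required(instance, schema):
    """Raise ValueError if a key listed in schema["required"] is missing."""
    if not isinstance(instance, dict):
        raise ValueError(f"expected an object, got {type(instance).__name__}")
    missing = [key for key in schema.get("required", []) if key not in instance]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return instance


def validated(schema):
    """Decorator that checks the awaited result of a handler against schema."""
    def decorator(handler):
        @wraps(handler)
        async def wrapper(*args, **kwargs):
            return check_required(await handler(*args, **kwargs), schema)
        return wrapper
    return decorator


@validated(SELECTION_RANGE_SCHEMA)
async def good():
    return {"range": {"start": {"line": 0, "character": 0},
                      "end": {"line": 0, "character": 3}}}


@validated(SELECTION_RANGE_SCHEMA)
async def bad():
    return {"parent": None}  # no `range`, so validation fails at runtime


print(asyncio.run(good())["range"]["end"])
try:
    asyncio.run(bad())
except ValueError as err:
    print("rejected:", err)
```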

danixeee added the discussion label Jan 29, 2020
@bollwyvl
Contributor Author

Discussed off-line: before starting we should resolve some plumbing:

  • creating another organisation/repository
  • creating a separate project inside the same repository with pygls (as Microsoft did with their nodejs LSP implementation)?

I'm going to assume the latter means: "in-tree", e.g. another src. Having a long-running orphan branch is the fastest to start, but has most of the disadvantages of both.

A new repo is more work, and more things to watch. +1 internal.

Because there isn't a machine-readable spec, it presently has to be guessed at from the spec repo and the reference implementation. This requires a fair amount of tooling: while the markdown parsing is done in python, the typescript parsing pretty much must be done in typescript. pygls is nicely pure python at present, and it would be a shame to have to bring in the whole stack for a task that should be pretty static. +1 external.
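
To give a feel for the Python half of that tooling, here is a minimal sketch of the markdown-scraping step. It assumes the TypeScript definitions sit in fenced blocks tagged `typescript` (or `ts`); the sample text and the `typescript_blocks` helper are invented for this example, and parsing the extracted TypeScript would still need a separate tool.

```python
import re
from typing import List

FENCE = "`" * 3  # built programmatically so this example can itself be fenced

# Invented sample; the real spec is much longer and less regular.
SAMPLE = f"""
#### Position

{FENCE}typescript
interface Position {{
    line: number;
    character: number;
}}
{FENCE}

Some prose in between.

{FENCE}json
{{"not": "typescript"}}
{FENCE}
"""

# An opening fence tagged typescript/ts, a lazy body, then a bare closing fence.
BLOCK_RE = re.compile(
    rf"^{FENCE}(?:typescript|ts)[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def typescript_blocks(markdown: str) -> List[str]:
    """Return the bodies of all typescript-tagged fenced blocks."""
    return [match.group(1) for match in BLOCK_RE.finditer(markdown)]


blocks = typescript_blocks(SAMPLE)
print(len(blocks))
print(blocks[0])
```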

Testing new changes is always easier with a single repo... but only for the reference implementation. Selfishly, I want to be able to use this new subpackage in jupyter-lsp. So +1 internal (for pygls devs; -1 for everyone else who wants to contribute to the spec work but isn't necessarily interested in pygls).

While the release cadence of multiple packages in the same repo can be different, adding new release-ables to an existing repo increases complexity: you have to start being more descriptive with tags. +1 for external. The spec, once it works, should be released far more slowly.

While pygls does currently contain the typings, they aren't shown in the documentation. This is good: a high-level convenience toolkit may well not want a page with 120+ headings covering all the types, but that would be totally appropriate for a dedicated spec docs package. It's probably possible to get RTD to generate multiple sites from the same repo, but it's still more complex. +1 external.

@brettcannon
Contributor

Is there still interest in doing this? I had a similar thought yesterday about auto-generating the Pydantic classes for the protocols from the official Markdown files.

@bollwyvl
Contributor Author

Thanks for warming this back up!

I don't want to make too much trouble regarding pydantic (said horse being vigorously flogged elsewhere) but my bet is still on an at-rest JSON Schema intermediate. With this, a library can either use it directly with a validator at runtime, or generate classes in whatever language.
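
A small sketch of the "generate classes" half of that idea, using only the standard library. The schema fragment is hand-written for illustration (it is not taken from any published LSP schema), and `dataclass_from_schema` is a hypothetical helper that handles only flat objects with primitive properties, with no `$ref` or nesting.

```python
from dataclasses import asdict, field, make_dataclass
from typing import Any, Optional

# Hand-written schema fragment for illustration only.
POSITION_SCHEMA = {
    "title": "Position",
    "type": "object",
    "properties": {
        "line": {"type": "integer"},
        "character": {"type": "integer"},
    },
    "required": ["line", "character"],
}

# Only the JSON Schema primitive types this sketch understands.
PRIMITIVES = {"integer": int, "number": float, "string": str, "boolean": bool}


def dataclass_from_schema(schema: dict) -> type:
    """Build a dataclass from a flat object schema (no $ref, no nesting)."""
    required = set(schema.get("required", []))
    required_fields, optional_fields = [], []
    for name, prop in schema["properties"].items():
        py_type = PRIMITIVES.get(prop.get("type"), Any)
        if name in required:
            required_fields.append((name, py_type))
        else:
            # Optional properties default to None and must come last.
            optional_fields.append((name, Optional[py_type], field(default=None)))
    return make_dataclass(schema["title"], required_fields + optional_fields)


Position = dataclass_from_schema(POSITION_SCHEMA)
pos = Position(line=3, character=7)
print(pos)
print(asdict(pos))
```

The same schema dict could instead be handed unchanged to a runtime validator, which is the appeal of keeping it as an at-rest intermediate.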

My reasoning is this is the likeliest to get support/adoption by other communities, even if the tooling to build it is in a mishmash of typescript (since the reference implementation) and python (since we're talking about it here).

I did some work to this end over on expectorate... I got hung up on exhaustive testing, and didn't pick it up again, but the code is mostly serviceable as a build chain. I think with some of the recent improvements to hypothesis-jsonschema it would work better now... and wiring up TS in-process with dukpy for the oracle seems quite plausible.

@brettcannon
Contributor

Going the route of a JSON intermediate sounds like more of a request for the LSP/VS Code team themselves than something for us to take on as a cross-language definition. Making something that works for various types of Python class definitions (e.g. dataclasses) might make sense, but I personally wouldn't want to prematurely optimize for other language communities coming on board when I think that's more of an upstream issue (for which I've opened microsoft/language-server-protocol#1248).

@danixeee
Contributor

I like the idea with JSON schema, but I don't have time to work on it right now. Is there any benefit in writing a tool to generate classes from official markdown files? It should be pretty straightforward to look at recent changes in nodejs protocol implementation and apply them here, until (and if) they solve microsoft/language-server-protocol#67.

@brettcannon
Contributor

I think it's more of a question of whether we can try to minimize custom work from the Python community compared to sharing it across all the language communities. So while I don't think there's anything fundamentally wrong with just scraping the Markdown spec file, I'm personally waiting to see if upstream is willing to at least provide guidance on what they might expect so it can be managed in one place and done only once.

But a key benefit of getting a JSON schema to work with is the work to convert it to Pydantic models has been done already 😁 https://github.com/koxudaxi/datamodel-code-generator

@bollwyvl
Contributor Author

The markdown spec, as written, currently leaves some things under-specified, which need to be filled in by the reference server implementation... I don't remember exactly what (probably some enums or something), but I probably wouldn't have gone to the trouble of unifying the two if I could get it from the spec. For some of them, it doesn't really matter what the spec says, if the 99% focus is on what the reference server and client speak. Luckily I haven't had to scrape more of that repo...

Conversely, the broader Method(Params?) -> Optional[Response] relationship is missing (or rather, buried in function overloads with positional arguments) from the reference implementation, encoded only as emoji in the markdown (seriously, can't make this stuff up). I see having that knowledge as a critical piece to being able to confidently validate generated code... not just types, but the stubs for the actual methods (suitably adapted for e.g. the leading ls: LanguageServer in all of the pygls signatures).
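
One way to picture that missing relationship is a small machine-readable table of methods from which handler stubs can be rendered. The sketch below is an assumption-laden illustration: `MethodSpec`, `METHODS` and `stub_for` are hypothetical names, and the three entries are hand-copied with type names kept as plain strings, so they may not match any particular spec version.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MethodSpec:
    method: str
    params: Optional[str]   # name of the params type, or None if there are none
    result: Optional[str]   # name of the result type; None for notifications
    is_notification: bool = False


# Three hand-copied entries, type names given as strings for illustration only.
METHODS: Dict[str, MethodSpec] = {
    spec.method: spec
    for spec in [
        MethodSpec("textDocument/hover", "HoverParams", "Hover | null"),
        MethodSpec("shutdown", None, "null"),
        MethodSpec("initialized", "InitializedParams", None, is_notification=True),
    ]
}


def stub_for(spec: MethodSpec) -> str:
    """Render a pygls-style handler stub with the leading `ls` argument."""
    name = spec.method.replace("/", "_").replace("$", "")
    args = ["ls: LanguageServer"]
    if spec.params is not None:
        args.append(f"params: {spec.params}")
    returns = "None" if spec.is_notification else f'"{spec.result}"'
    return f"def {name}({', '.join(args)}) -> {returns}: ..."


for spec in METHODS.values():
    print(stub_for(spec))
```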

Perhaps opening a discussion on https://github.com/langserver/langserver.github.io would be more productive... there's at least the presented image of it being community-driven.

@brettcannon
Contributor

I think discussing stuff on microsoft/language-server-protocol#67 makes sense.

@tombh
Collaborator

tombh commented Dec 3, 2022

Pygls has now officially migrated to lsprotocol 🎉

Happy to re-open or open new discussions for any remaining issues.

tombh closed this as completed Dec 3, 2022
4 participants