An attempt at migrating to lsprotocol
#264
Conversation
Force-pushed f58ce61 to 5d74798 (compare)
```python
if params_type is None:
    params_type = dict_to_object
elif params_type.__name__ == ExecuteCommandParams.__name__:
    params = deserialize_command(params)
```
As of this PR, `deserialize_command` is a dead function... not calling it seems to make the type signature on this method actually reflect reality.
Anyone know why it was being used before?
So you're wondering if that was a workaround for something that still needs to be worked around, despite `deserialize_command` no longer existing?
Yeah... it must've been added for a reason... but at the moment things appear more consistent without it
@alcarney Many thanks for all of your work! Not really sure if this is precisely what you're asking, but having had to battle it out with this piece of code a while ago, here is my rough understanding:
This seems like a temporary solution in order to reconcile:
- the desire to provide structured arguments rather than raw dictionaries to the various handlers for workspace/executeCommand
- with the inability to implement proper validation without customizing the deserialization logic on a per-command basis (each method is expecting different kinds of arguments).

As a compromise, dictionaries inside the arguments are recursively translated into named tuples so as to at least support attribute access, even if the objects do not (properly) match the corresponding type hints.
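The recursive named-tuple conversion described above can be sketched as follows (a minimal illustration of the idea, not pygls' actual `dict_to_object` implementation):

```python
from collections import namedtuple

def dict_to_object(**fields):
    # Recurse into nested dicts first, so attribute access works at
    # every level of the resulting structure.
    converted = {
        key: dict_to_object(**value) if isinstance(value, dict) else value
        for key, value in fields.items()
    }
    # Build a throwaway namedtuple type from the keys and instantiate it.
    return namedtuple("Object", converted.keys())(**converted)
```

Handlers can then write `params.range.start.line` instead of `params["range"]["start"]["line"]`, even though the result is not an instance of the real lsprotocol type.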
For my project needs, I ended up implementing an alternative solution using pydantic.validate_arguments, which can wrap a function to perform validation on its inputs based on the corresponding type hints. The argument list containing the raw data can then be passed directly without going through the whole named-tuple conversion (I disable it by monkey-patching deserialize_command). In my case I am also unpacking it for extra usability, for example in order to define a method with three parameters:
```python
@server.custom_command("navigate_ast")
async def navigate_ast(
    server: PythonVoiceCodingPluginLanguageServer,
    command: StandardCommand,
    doc_uri: str,
    sel: Union[Range, Sequence[Range]] = [],
):
    ...
```
where `StandardCommand` is a pydantic model defined in my project. My code looks roughly like this:
```python
def custom_command(
    self, command_name: str
) -> Callable[[F], Callable[["PythonVoiceCodingPluginLanguageServer", Any], Any]]:
    def wrapper(f: F):
        f = validate_arguments(config=dict(arbitrary_types_allowed=True))(f)

        async def function(server: PythonVoiceCodingPluginLanguageServer, args):
            return await f(server, *args)

        self.lsp.fm.command(command_name)(function)
        return f

    return wrapper
```
The whole thing needs to be polished, refined and rewritten for cattrs, but I would be willing to help if there is interest in going down that route. An approach based on type hints might also go nicely with #222?
I certainly like the sound of de-serializing the arguments based on type hints!
@tombh since it sounds like this PR will likely land on a staging branch, would it make sense to remove the existing named tuple solution and have a follow on PR do something clever with type hints?
I think so yes!
So you're saying to remove the existing tuple solution and move to the type-hint solution in one PR? I mean, there's no intermediate step? As in: remove the tuple solution now, which would then require a replacement before merging into main.
An argument against this, though, is that we shouldn't put too much in this release candidate. It's better to focus on a minimum in order to get an RC public, rather than have too many new features that might block a minimal viable working lsprotocolised pygls.
```diff
@@ -78,6 +80,7 @@ def test_capabilities(client_server):


 @ConfiguredLS.decorate()
+@pytest.mark.skip
```
I haven't looked into it yet, but the changes in this PR seem to break the tests skipped in this file in such a way that they hang the entire test suite. The only error message I was able to get was something going wrong in the `exceptiongroup` package...
```
=================================== FAILURES ===================================
__________ test_signature_help_return_signature_help[ConfiguredLS] ___________

exc = <class 'pygls.exceptions.JsonRpcInvalidParams'>, value = JsonRpcInvalidParams('Invalid Params'), tb = <traceback object at 0x7f708a6fee00>, limit = None, chain = True

    def format_exception(exc, /, value=_sentinel, tb=_sentinel, limit=None, \
                         chain=True):
        """Format a stack trace and the exception information.
        The arguments have the same meaning as the corresponding arguments
        to print_exception(). The return value is a list of strings, each
        ending in a newline and some containing internal newlines. When
        these lines are concatenated and printed, exactly the same text is
        printed as does print_exception().
        """
        value, tb = _parse_value_tb(exc, value, tb)
        te = TracebackException(type(value), value, tb, limit=limit, compact=True)
>       return list(te.format(chain=chain))

/usr/lib64/python3.10/traceback.py:136:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.env/lib64/python3.10/site-packages/exceptiongroup/_formatting.py:233: in traceback_exception_format
    yield from exc.exceptions[i].format(chain=chain, _ctx=_ctx)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <traceback.TracebackException object at 0x7f708a8ea020>

    def traceback_exception_format(self, *, chain=True, _ctx=None):
        if _ctx is None:
            _ctx = _ExceptionPrintContext()
        output = []
        exc = self
        if chain:
            while exc:
>               if exc.__cause__ is not None:
E               AttributeError: 'TracebackException' object has no attribute '__cause__'

.env/lib64/python3.10/site-packages/exceptiongroup/_formatting.py:169: AttributeError
```
How do you know that these tests weren't originally skipped precisely because of the hanging? I mean, maybe your changes aren't actually affecting any behaviour here, right?
How do you know that these tests weren't originally skipped precisely because of the hanging?
Because I added the skip :)
😆
```diff
@@ -73,7 +73,7 @@ def test_selection_range_return_list(client_server):
     response = client.lsp.send_request(
         SELECTION_RANGE,
         SelectionRangeParams(
-            query="query",
+            # query="query",
```
Perhaps I just missed it, but I couldn't find a type definition in `lsprotocol` that had this field, nor could I find it in the spec.
Does anyone know why this is here?
I guess this is something that might be answered by brave implementers of the release candidate?
I think we can generate
Looks like a bug in code generator: microsoft/lsprotocol#52
Looks like a bug in code generator: microsoft/lsprotocol#53
Yes. We can do that. microsoft/lsprotocol#54
I can look into this. We have a list of properties where we require them to be preserved, we could add this one there. The "results" has some added behavior to it in the spec. I will get back to you on this.
We can add
@alcarney I published a new version of
Do let me know if the new one addresses the issue with the
Force-pushed a9d034e to 4ee8edb (compare)
@karthiknadig awesome, thanks. I've aligned this PR to the new version, but it appears that the issue with the
@alcarney! What an epic undertaking 🤓 I feel bad that I've taken so long to absorb the enormity of what you've done. I'm still only starting to understand everything involved, but the highlight is that this definitely introduces a breaking change. Which I think will be a first for pygls? So let's "go to town on it" (as they say in the UK anyway)! Meaning: if we're going to be bumping the project by a whole number, what other breaking changes could be useful to introduce (this would be for another issue thread of course)? Maybe add a heated swimming pool in the back garden? 🤣

Thank you so much for the isolated and descriptive commits, they really ease the cognitive load of getting to grips with everything. From my initial reading I think you've taken sensible and practical decisions, and so I don't see any problems. So I see 3 things to address at the moment:
What we should be aiming for now is a justifiably messy release candidate. So, somewhat counter-intuitively, I think the standards for "merging" this PR (it will be merged into a

Would you say that, for those new to this PR, the
@tombh I will be looking into the missing
@alcarney The issue with preserving

```python
lsp_types._SPECIAL_CLASSES.append(JsonRPCResponseMessage)
lsp_types._SPECIAL_PROPERTIES.append("JsonRPCResponseMessage.result")
```

In the long term, the de-serialization should not depend on
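The underlying problem is a serializer that omits fields whose value equals the default. Here is a stdlib-only sketch of that behaviour and of the "special properties" escape hatch (the names and logic here are illustrative, not lsprotocol's actual code):

```python
# Defaults for a hypothetical response message type.
DEFAULTS = {"jsonrpc": "2.0", "result": None}

def unstructure(message, preserve=()):
    """Serialize a message dict, dropping default-valued fields unless
    they are explicitly listed in `preserve` (the _SPECIAL_PROPERTIES idea)."""
    output = {}
    for key, value in message.items():
        if key in DEFAULTS and value == DEFAULTS[key] and key not in preserve:
            continue  # omitted, in the spirit of omit-if-default serialization
        output[key] = value
    return output
```

Without `preserve`, a null `result` disappears from the wire format, which is invalid for a JSON-RPC response; listing it in `preserve` keeps `"result": null`.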
My modification was this: I save the response type along with the future, and get it as needed. I am using the test suite here to catch any missed converter hooks.

```python
def deserialize_message(data, get_response_type, get_params_type=get_method_params_type):
    """Function used to deserialize data received from client."""
    if 'jsonrpc' in data:
        try:
            deserialize_params(data, get_params_type)
        except ValueError:
            raise JsonRpcInvalidParams()

        if 'id' in data:
            if 'method' in data:
                return METHOD_TO_TYPES[data['method']][0](**data)
            elif 'error' in data:
                return converter.structure(data, ResponseErrorMessage)
            else:
                return converter.structure(data, get_response_type(data['id']))
        else:
            return METHOD_TO_TYPES[data['method']][0](**data)

    return data
```
Yes, that commit will probably give the best impression of the kind of changes someone consuming

From karthiknadig's comment above it seems like the

Personally, I think there is a nice generic JSON RPC implementation buried in

I ask because I think a "simple" fix to the
I haven't been deep into the client/server JSON communication, so my naive understanding is that it's currently a comparatively ad hoc implementation. Ad hoc in the sense that it caters only to the specific requirements of Pygls; it can't easily be extended to provide extra features beyond LSP, such as esbonio might like. It's also ad hoc in the sense that it doesn't formally adhere to the LSP standard as now defined in

So from my understanding, I think you're saying that you have a tension between, on the one hand, wanting to invest more in the JSON RPC to allow it to be more easily extended and, on the other hand, understanding that the most straightforward approach is to just formally adhere to

If my understanding is right, then I think an example of esbonio's usecase for extending the JSON RPC would be good.
From what I've seen,
The issue in the

```json
{"jsonrpc": "2.0", "id": 1, "result": {"hello": "world"}}
```
So ideally, the two perspectives need to be aligned and I can think of three possibilities
Having the ability to define custom JSON RPC based protocols would be very useful for
It's actually possible today to swap the types

```python
from functools import partial

import pygls.protocol
from pydantic import BaseModel
from pygls.lsp import get_method_params_type
from pygls.lsp import get_method_registration_options_type
from pygls.lsp import get_method_return_type
from pygls.protocol import JsonRPCProtocol
from pygls.server import Server


class ExampleResult(BaseModel):
    hello: str


MY_METHODS_MAP = {"example/method": (None, None, ExampleResult)}

# Override the default method definitions
pygls.protocol.get_method_return_type = partial(
    get_method_return_type, lsp_methods_map=MY_METHODS_MAP
)
pygls.protocol.get_method_params_type = partial(
    get_method_params_type, lsp_methods_map=MY_METHODS_MAP
)
pygls.protocol.get_method_registration_options_type = partial(
    get_method_registration_options_type, lsp_methods_map=MY_METHODS_MAP
)

server = Server(protocol_cls=JsonRPCProtocol)


@server.lsp.fm.feature("example/method")
def example_method(ls: Server, params):
    return ExampleResult(hello="world")
```
This is a great explanation, thank you.

So my first thought, and it's just a thought, perhaps somewhat academic or philosophical: to what extent should the official LSP standard support custom client-server communication? I think the short answer is it shouldn't. Or at the very least, I'm most certainly not saying the answer to this issue is upstream at

So, where I think this gets interesting is when thinking about Pygls' role in all this. Maybe Pygls is the place to provide a more formal bridge between the static standard and the ever-changing boundaries of innovation. Being one step removed from
Superficially one might think that such an approach was a half-way house lacking in commitment, that we'll someday find a better solution for. But I don't think that's the case. I think it's a good opportunity to define Pygls' role and identity. Namely, that it's critical for the LSP ecosystem that innovation is supported and welcomed.
I made an attempt to switch entirely to
I think this is starting to come together, looks like the test suite passes now, though I have at the very least some linting issues to clear up.

Thanks to @karthiknadig for the alcarney#1 PR, it was a big help in figuring out what to do next.

Most of the (important) new changes are in 04875c5, which replaces the old

Happy to talk through the changes in more detail later, but since it's quite late I'll leave you with just the highlights on changes made to the
Awesome. As soon as you feel ready, let's merge this into an RC branch. I'm happy to approve the changes as soon as you're ready.
The `lsprotocol.types` module is re-exported through the `pygls.lsp.types` module, hopefully minimising the number of broken imports. This also drops pygls' `LSP_METHODS_MAP` in favour of the `METHOD_TO_TYPES` map provided by `lsprotocol`. The `get_method_xxx_type` functions have been adjusted to use the new mapping. As far as I can tell `lsprotocol` doesn't provide generic JSON RPC message types (except for `ResponseErrorMessage`), so the old `JsonRPCNotification`, `JsonRPCResponseMessage` and `JsonRPCRequestMessage` types have been preserved and converted to `attrs`.
The machine readable version of the LSP spec (and therefore `lsprotocol`) provides a mapping from an LSP method's name to its `RegistrationOptions` type, which is an extension of the method's `Options` type used when computing a server's capabilities. This means the `RegistrationOptions` type includes additional fields that are not valid within the `ServerCapabilities` response. This commit introduces a new `get_method_options_type` function that returns the correct `Options` type for a given method, automatically deriving the type name from the result of the existing `get_method_registration_options_type` function when appropriate.
This simplifies much of the (de)serialization code by relying on the converter provided by `lsprotocol`. We use the `METHOD_TO_TYPES` mapping to determine which type definition to use for any given message. If a method is not known (as in the case of custom lsp commands) we fall back to pygls's existing generic RPC message classes. The following changes to the base `JsonRPCProtocol` class have also been made - server and client futures have been unified into a single `_request_futures` dict. - upon sending a request, the corresponding result type is looked up and stored in an internal `_result_types` dict. - (de)serialization code has been moved to a method on the `JsonRPCProtocol` class itself so that it has access to the required internal state. - subclasses (such as the `LanguageServerProtocol` class) are now required to implement the `get_message_type` and `get_result_type` methods to provide the type definitions corresponding with the given RPC method name.
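The subclass contract described in this commit might look roughly like the following sketch (simplified and illustrative; the real pygls classes and the lsprotocol tables differ, and the placeholder strings stand in for actual message classes):

```python
# Simplified method-to-types table; lsprotocol's METHOD_TO_TYPES maps
# method names to the real request/response classes.
METHOD_TO_TYPES = {
    "initialize": ("InitializeRequest", "InitializeResponse"),
}

class JsonRPCProtocol:
    def get_message_type(self, method):
        # The base protocol knows nothing about LSP; a None here means
        # callers fall back to the generic RPC message classes.
        return None

    def get_result_type(self, method):
        return None

class LanguageServerProtocol(JsonRPCProtocol):
    def get_message_type(self, method):
        entry = METHOD_TO_TYPES.get(method)
        return entry[0] if entry else None

    def get_result_type(self, method):
        entry = METHOD_TO_TYPES.get(method)
        return entry[1] if entry else None
```

The base class stays a generic JSON RPC implementation, while the LSP-specific knowledge lives entirely in the subclass.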
The timeouts can get in the way when trying to debug the code under test. This commit makes it possible to disable the timeout by running the testsuite with the `DISABLE_TIMEOUT` environment variable set, e.g.

```
$ DISABLE_TIMEOUT=1 pytest -x tests/
```
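A sketch of how such a switch might be honoured in a test helper (hypothetical helper names; the actual pygls test utilities differ):

```python
import os
from concurrent.futures import Future

# With DISABLE_TIMEOUT set, pass timeout=None so .result() blocks
# indefinitely - handy when paused at a breakpoint in the server.
TIMEOUT = None if os.environ.get("DISABLE_TIMEOUT") else 10

def wait_for(future: Future):
    return future.result(timeout=TIMEOUT)
```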
Nothing too interesting in this one, just updating imports, class names etc to align `pygls` to the definitions in `lsprotocol`
It's now possible to select which browser is used to run the testsuite by setting the `BROWSER` environment variable, e.g.

```
$ BROWSER=firefox python pyodide_testrunner/run.py
```

If no variable is found, the script will default to using Chrome.
As far as I can tell, there are no tests that depend on any of the values contained within `ClientCapabilities`. This commit adds some tests around the construction of the `TextDocumentSyncOptions` field for the server's capabilities. It also fixes a bug that was introduced in the previous commit.
I think this is now in a place where it can be merged to a staging branch so people can start testing it - I'm sure there will be a few issues to find still!
Awesome! I've published it to PyPI (as 1.0.0a) and made a dedicated pre-release PR: #273. The new branch is
Not sure sorry... I don't see that commit anywhere - have you pushed it?
It's not the end of the world though if we don't, the main aim of this PR was only to move the conversation forward :)
I see it now and have included it in this branch - though I'm not sure if that will help at all as the two branches are now identical - in theory there's nothing to merge?
Yeah you're right, now it says:
Ah well, not to worry. Your code isn't going to disappear 😊
Description

To help drive the conversation in #257 forward, here is an attempt at migrating to using `lsprotocol` for all our type definitions.

Note: This PR was done with the mindset of "what's the minimum number of changes I can make to get something that works?" So it's very likely better solutions can be found than what I have here currently.

As of now I have something that mostly works, in that most (but not all) of the current test suite passes, but I wouldn't be surprised if I managed to introduce a few bugs here and there.

There is a fair amount to digest but I've done my best to split it into separate commits. What follows is a brain dump of everything I've thought about/noticed while working on this - hopefully it's not too overwhelming! 😬 See the commits and review comments for the fine details.
Breaking Changes

Ideally, I would've wanted to not introduce any breaking changes by migrating to `lsprotocol`, but now I'm not so sure that will be possible. Here is a list of the breaking changes this PR introduces that I am aware of so far.

- Initially I tried re-exporting the `lsprotocol` types via the original `pygls.lsp.types` module to try and avoid breaking existing imports. However, there are enough breakages even with that approach that I now think it's better to have a clean break and switch to just importing everything from `lsprotocol.types` directly. I've kept that in a separate commit for now though (4ee8edb) to make it easy to drop, in case people prefer to keep a `pygls.lsp.types` module around.
- `Position` and `Range` are no longer iterable.
This commit I'm least happy with is eb92fb7 which attempts to integrate lsprotocol's
converter
into the serialization/de-serialization setup inpygls
. However, the two libraries seem to take a slightly different approach which I think complicates thingspygls
tries to hide most of the details surrounding JSON RPC from the user, asking them to only provide values for message fields such asparams
, andresult
. This means itsLSP_METHODS_MAP
only returns types representing theparams
/results
fields of protocol messagesThe
METHOD_TO_TYPES
map inlsprotocol
on the other hand simply returns types representing the full JSON RPC message body.For the most part I think I've managed to resolve the two perspectives without having to change too much of pygls' internals, but I wonder if a cleaner solution could be found if we opted to change pygls' approach to align more closely with
lsprotocol
Anyway I'd be interested to hear people's thoughts on this.
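To make the contrast between the two mappings concrete, here is a rough sketch of the two shapes (entries are placeholder strings purely for illustration; both libraries actually map to class objects, and the exact tuple layouts may differ):

```python
# pygls-style: only the payload types for a method
# (options, params, result), never the envelope.
LSP_METHODS_MAP = {
    "textDocument/hover": ("HoverOptions", "HoverParams", "Hover"),
}

# lsprotocol-style: the full request/response message types as well.
METHOD_TO_TYPES = {
    "textDocument/hover": (
        "HoverRequest",
        "HoverResponse",
        "HoverParams",
        "HoverRegistrationOptions",
    ),
}
```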
XXXOptions vs XXXRegistrationOptions

Edit: After some more investigation, it turns out the `XXXRegistrationOptions` types extend `XXXOptions` to include additional fields required for dynamic registration. 557f942 adjusts how the type to validate against is chosen; see the commit message for more details.

An interesting difference to note is that the `METHOD_TO_TYPES` map in `lsprotocol` uses the `XXXRegistrationOptions` type for a method, which as far as I can tell is meant to be used with the register/unregister capability part of the spec, since it includes the `DocumentSelector` field. However, the current `LSP_METHODS_MAP` is set up to provide the `XXXOptions` type for a method, as it uses the options provided via the `@server.feature()` decorator to populate its `ServerCapabilities`. This means migrating to the new mapping will break any code currently in use, as the type checking done in the `@server.feature()` decorator will fail.

Note I don't think either approach is necessarily wrong, but I'd be interested to hear people's thoughts on resolving the two perspectives (even if we just ultimately declare it to be a breaking change).

Questions for the lsprotocol team

Here are some thoughts/questions I had while working through this.

- `SemanticTokensOptions` seems to be missing support for `{full: {delta: True}}` (LSP Spec)
- The `ServerCapabilities.workspace` field appears to be missing (LSP Spec)
- the `pygls.lsp.methods` module
- The `"result": None` field is being omitted from the serialized JSON - do you know how we can preserve it?
- `lsprotocol` to add additional methods?

In some cases the existing type definitions define a few `__dunder__` methods or helpers that add a few quality-of-life improvements. I did briefly try the following
`lsprotocol.types` module, then the original definition would be used when de-serialising a class with `converter.structure(...)`. (And I assume adding helper methods like these is not in the scope of `lsprotocol`?)

cc @tombh @dgreisen @karthiknadig
Code review checklist (for code reviewer to complete)