Skip to content

Conversation

@bnbarham
Copy link

There's a long standing issue with DispatchIO.read on Windows that causes a continual spin. DispatchIO is more complex than what we need here, so switch to FileHandle.readabilityHandler for reading and FileHandle.write for writing.

Given there's been issues with FileHandle.availableData and we can't read to EOF, the reads are instead broken up into chunks - reading one byte at a time for the header and then the entire contents after we have the content length. The header read could be a little more optimized, but reading a single byte allows for better error handling in the case of an invalid header.

Using the FileHandles directly also allows us to simplify a lot of the surrounding code, as there's no need for a DispatchGroup or sendQueue.

A further change could switch JSONRPCConnection to return an AsyncStream for the received messages rather than being passed a MessageHandler (skipping for now as this change is already large enough and all existing clients already use MessageHandler).

Many thanks to @roman-bcny for starting this change.

// `readabilityHandler` is only ever called sequentially and all accesses to `parser` happen within its callback
// synchronously.
nonisolated(unsafe) let parser = JSONMessageParser(decoder: decodeJSONRPCMessage)
self.receiveFD.readabilityHandler = { fileHandle in
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also just have our own queue rather than using readabilityHandler

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there is some potential to avoid the need to set readabilityHandler to nil to break retain cycles and make the entire behavior more explicit. Wouldn’t do it in this PR though so that change can get its own dedicated review.

guard let header,
let length = header.contentLength
else {
logger.error("Ignoring message due to invalid header")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the spec says we should error here (ie. if there's no content length field). Open to doing that if we prefer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be a good idea. It’s very unlikely that we’ll be able to recover from an invalid header. Doesn’t have to be this PR though.

And also to clarify: Where in the spec does it say so and what does to error mean in this context?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could have sworn I read it, but apparently it only states:

The length of the content part in bytes. This header is required.

So I guess there's some interpretation as to what "This header is required" means 😅

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this! Excited to not see the Windows CPU spinning issue anymore.

A few comments but nothing that’s blocking for now. It might even be easier to address the comments in follow-up PRs.

A couple things I think would be great to explore after this change has landed:

  • As you mentioned, investigate changing this to yield values to an AsyncStream
  • I have a feeling that with an AsyncStream we could get rid of the dispatch queue and instead make JSONRPCConnection an actor, which would probably make the code more readable and make some/all of the nonisolated(unsafe) obsolete.
  • Allow reading more than 1 byte during header parsing (shouldn’t be too hard to do and to review if it’s in its own PR)

// `readabilityHandler` is only ever called sequentially and all accesses to `parser` happen within its callback
// synchronously.
nonisolated(unsafe) let parser = JSONMessageParser(decoder: decodeJSONRPCMessage)
self.receiveFD.readabilityHandler = { fileHandle in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there is some potential to avoid the need to set readabilityHandler to nil to break retain cycles and make the entire behavior more explicit. Wouldn’t do it in this PR though so that change can get its own dedicated review.

/// If an unrecoverable error occurred on the channel's file descriptor, the connection gets closed.
///
/// - Important: Must be called on `queue`
private func send(data dispatchData: DispatchData) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use Data here instead of DispatchData now that we’re no longer using DispatchIO for stdin/stdout writing. Would do it in a follow-up PR to keep this one as minimal as possible though.

}

func parseHeader(data: Data) {
requestBuffer.append(contentsOf: data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my suggestion in parseMessage, should we have a precondition check here that verifies data.count == 1. Otherwise I think it might be very tricky to debug if we ever read past the \r\n\r\n boundary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh... see my test comment 😅

Allow reading more than 1 byte during header parsing (shouldn’t be too hard to do and to review if it’s in its own PR)

Personally I'm not sure this is worth the complexity it adds since the header is fairly small, but 🤷. IMO it's better to always read the content at once if we can (to avoid that copy), so this would mean:

  1. Reading an initial 16 bytes (as you mentioned)
  2. Reading 4 bytes from there until we see a \r or \n
  3. Reading 1 byte until we see either \r\n\r\n or neither of those characters (in which case we can read 4 bytes again)

Reading 16 bytes also doesn't allow for recovering from an invalid header, though if we error then that's less of a concern.

Did you have a better idea there?

Copy link
Member

@ahoppen ahoppen Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, recovering from the invalid header is really a good point. I would like to be able to parse an invalid header so we can error with a descriptive error message instead of loosing the stream’s structure to help people writing LSP clients.

My proposal would be something like the following (though I’d do this in a follow-up PR)

if requestBuffer.suffix(4) == JSONRPCMessageHeader.headerSeparator {
  // parse header
} else if requestBuffer.last == "\n" {
  // read two bytes. This will get us to \r\n\r\n or allow us to detect that there’s another header field.
} else if requestBuffer.last == "\r" {
  // read one byte, \r needs to be followed by a \n in the header for both header field separators and the end of header separator  
} else {
  // read 4 bytes since we know that we need to read at least `\r\n\r\n` to reach the content
}

Should probably reduce the number of reads by a factor of ~3.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point on 2 bytes, nice. And sure, will do after this PR (though my guess is it'll be readabilityHandler and thus there'll be a bunch of changes to do anyway... if so, will probably just include it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer it to be a follow-up PR because every time I read through this PR, it takes me like an hour to reason about all the cases. So having this in would be a nice first step.

There's a long standing issue with `DispatchIO.read` on Windows that
causes a continual spin. `DispatchIO` is more complex than what we need
here, so switch to `FileHandle.readabilityHandler` for reading and
`FileHandle.write` for writing.

Given there's been issues with `FileHandle.availableData` and we can't
read to EOF, the reads are instead broken up into chunks - reading one
byte at a time for the header and then the entire contents after we have
the content length. The header read could be a little more optimized,
but reading a single byte allows for better error handling in the case
of an invalid header.

Using the `FileHandle`s directly also allows us to simplify a lot of the
surrounding code, as there's no need for a `DispatchGroup` or
`sendQueue`.

A further change could switch `JSONRPCConnection` to return an
`AsyncStream` for the received messages rather than being passed a
`MessageHandler` (skipping for now as this change is already large
enough and all existing clients already use `MessageHandler`).

Many thanks to @roman-bcny for starting this change.
@bnbarham bnbarham force-pushed the use-readability-handler branch from ede261c to 074fedd Compare November 22, 2025 22:53
@bnbarham
Copy link
Author

Looks like I have some Linux and Windows debugging to do :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants