Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TlsChannel does not indicate channel as ready for I/ O operation in case less reads than available are made #233

Closed
kortemik opened this issue May 27, 2024 · 8 comments

Comments

@kortemik
Copy link

kortemik commented May 27, 2024

I use a scenario where I receive data from a client connection as follows

int nReady = selector.select(this.readTimeout);
        if (nReady == 0) {
            throw new TimeoutException("read timed out");
        }
Set<SelectionKey> polledEvents = this.selector.selectedKeys();
Iterator<SelectionKey> eventIter = polledEvents.iterator();
while (eventIter.hasNext()) {
            SelectionKey currentKey = eventIter.next();
            System.err.println("currentKey <" + currentKey + ">");
            // tlsChannel needs to know about both
            if (currentKey.isReadable() || currentKey.isWritable()) {
                try {
                    readBytes = tlsChannel.read(byteBuffer);
}
} catch (Exception e) {
// something
}
eventIter.remove();
}

If server has sent a with multiple calls to write or with an array of buffers, does not differ:

                for (ByteBuffer byteBuffer : byteBufferList) {
                    long bytesWritten = establishedContext.socket().write(new ByteBuffer[] {
                            byteBuffer
                    });
                    LOGGER.info("wrote <{}>", bytesWritten);
                }

then tlsChannel is read only once and it does not trigger ready operation in the next invocation to selector.select. the data is unavailable and lost, however if the client code is changed so that:

while (true) {
readBytes = tlsChannel.read(byteBuffer);
if (readBytes <= 0) {
break;
}
}

all data is available and read, however expected behaviour is that selector would be ready immediately as there is still data not read. this change does not comform my needs because client buffers may be smaller than send buffers, however in the where we see the abnormal behviour they are larger.

@kortemik
Copy link
Author

version 0.7.0

@marianobarrios
Copy link
Owner

Hi Mikko,

Thanks for the interest in the library and sending the report. To be honest, I see a few issues with your code (like handling tasks, for example). Did you see the example here?

@kortemik
Copy link
Author

kortemik commented May 28, 2024

Hi Mariano,

please find the full server code here: https://github.com/kortemik/rlp_03/tree/tx-frame-fragments . here are serverside changes that make the client "work" https://github.com/kortemik/rlp_03/tree/extra-bad . branch is named as extra-bad is because I've been testing workarounds and they are not going to be merged. please compare the changes in extra-bad branch 0ae049b7fcc6e0cedd251b4a5ed4f52e2cd2cd17 (client stuck) and 28b190d9c8d01da737dd413fea61497c01c63607 (client works) . test case for the tls-client (com.teragrep.rlp_03.TlsClientTest) will fail in 0ae049b7fcc6e0cedd251b4a5ed4f52e2cd2cd17.

client code is here:
https://github.com/teragrep/rlp_01/blob/master/src/main/java/com/teragrep/rlp_01/RelpClientTlsSocket.java#L199

I thought I wouldn't need to use the tasks because the threads reading/writing could do it? rlp_01 is designed to be threadless so it does not have any pools or so. we have client code in rlp_03 which unrelated to this cas. however it is capable of multithreading and offloads processing to a pool, however it offloads the whole connection processing not just the tls part so it doesn't use tasks either.

@kortemik
Copy link
Author

I think the reason I see this bug is because I trust the selector to be immediately ready in the client code if I didn't read all the data before entering the select loop. This works just fine for the SocketChannel.

kortemik added a commit to kortemik/rlp_01 that referenced this issue May 28, 2024
kortemik added a commit to teragrep/rlp_01 that referenced this issue May 28, 2024
@marianobarrios
Copy link
Owner

Hi Mikko,

Sorry for the delay in the response.

If I understood your situation correctly, then the library is working as intended. Channels are just meant to be read in a loop, until they return 0, -1, or, in the case of TlsChannel, throw one of the flow-control exceptions. If you read careful the documentation from ReadableByteChannel.read(), you will note that it never says that all the bytes available in the buffer are returned (that said, I think the documentation could be clearer).

Of course, it may just work in the standard SocketChannel. But in TlsChannel, we have to actually decrypt encrypted data, so if there is something already decrypted, it is returned immediately. Otherwise we would be holding more data than necessary, using CPU, and introducing a potential latency issue in the client. Please also note that the selector is selecting on the raw channel; if the bytes were already read and are buffered for decryption, the selector will not see them any more.

And yes, the examples in the repository are wrong, they need to be fixed.

Does this make sense to you?

Best regards.

@kortemik
Copy link
Author

Hi Mariano,

No worries, I am quite busy as well.

ReadableByteChannel.read() is clear, there can not really be a guarantee anyway to return all of the data from the channel especially in non-blocking mode. I was expecting it to behave like you describe.

I think we are now getting here on the interesting ground, as you write "the selector is selecting on the raw channel". When using SocketChannel directly the selector will notice that there are still bytes in the channel (or space considering writes) and return immediately as ready. Could selector be made to select on TlsChannel instead of the raw channel? I mean TlsChannel channel could this way indicate to the selector that there is still data that can be read() without need for interestOps changes. This is what I was expecting, however if it is not possible then I think this should be documented quite clearly that there is a such difference between SocketChannel and TlsChannel.

@marianobarrios
Copy link
Owner

The idea of the "selector selecting on the raw channel" is taken from OpenSSL (and I assume it's also done by others). I never really considered other design, to be honest. I guess it may be possible in theory, but I am not sure (Java selectors are not a great design, and have surprises).

That said, I think the issue of the selector being ready again, so to speak, is actually orthogonal. The question can be reduced to: what does it mean that a selector is ready? Is it that (a) bytes to read appeared, or that (b) bytes to read are present?

My guess is that exactly the same happens with a plan socket. The selector becomes ready, a partial read is done (can be forced with a small buffer). Is the selector ready again immediately?

@kortemik
Copy link
Author

With a plain socket, selector is ready immediately when there are bytes available in the channel, as the channel is ready for the read operation. No bytes are discarded in case everything is not read. I read the documentation of Selector and it says "channels ready to perform an operation" and took it perhaps too literaly that I must use only one operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants