Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCP support #52

Merged
merged 49 commits into from
Nov 2, 2020
Merged

TCP support #52

merged 49 commits into from
Nov 2, 2020

Conversation

daveyarwood
Copy link
Collaborator

Background

The project I'm working on requires OSC messages to be sent and received over TCP, with a client written in Go and a server written in Java. Neither go-osc nor JavaOSC currently support TCP, so as a labor of love, I've implemented TCP support in both libraries.

This implements TCP support as discussed in #12. This branch builds on top of #47, which I'm about to close because this PR starts with the commits on that branch.

The public API is the same as what we have on master today, except that now, users can set the network protocol to TCP like this:

// receiver
new OSCPortInBuilder()
  .setLocalPort(12345)
  .setNetworkProtocol(NetworkProtocol.TCP)
  .build();

// sender
new OSCPortOutBuilder()
  .setRemotePort(12345)
  .setNetworkProtocol(NetworkProtocol.TCP)
  .build();

I did a bunch of work on the tests to enable thorough testing, reusing the existing UDP tests to test using TCP. I added some additional TCP-only tests that involve sending very large payloads, which is something you can't do over UDP because there is a packet size limit (typically about 65k bytes, depending on your OS).

I also wrote simple DebugServer and DebugClient classes to facilitate testing how this branch of JavaOSC interacts with my TCP branch of go-osc, as well as testing how this branch of JavaOSC works in my project.

I've tested all the permutations of TCP vs. UDP, Go client vs. Java client, and Go server vs. Java server.

After almost two months of working on TCP support in these two libraries, I'm happy to say that everything is looking great, as far as I can tell. I haven't done any legit benchmarks, but subjectively, performance seems to be about the same as it is on master, and I don't think I've made any changes that would adversely affect performance, at least as far as I'm aware. (But if anything looks suspect, please point it out!)

Summary of changes

There are a ton of changes here, so bear with me...

Tests

  • I did a bunch of refactoring in OSCPortTest to make it possible to reuse the existing tests for TCP.

    • I cleaned up the setUp method a bit, making it more generic and allowing the network protocol to be specified.
  • I fixed some timing issues involving the receivers for previous test cases inadvertently hanging around occupying the port and causing the current test case to fail.

    • This is especially problematic for TCP, where you can't have multiple receivers binding the same port.
    • I did two things to fix this:
      • I made it so that each TCP test case uses a random available port instead of reusing ports across test cases.

      • Instead of sleeping for an arbitrary duration, I wrote some utility methods like retryUntilTrue and assertEventuallyTrue that repeatedly check a condition until the desired state is reached, failing if the desired state isn't reached within a timeout period.

        I then used retryUntilTrue to implement waitForSocketClose, and made that part of the tear-down code that runs after each test case, thus ensuring that the receiver gets closed before the next test case begins.

  • I fixed similar timing issues with respect to the test cases that set up listeners and then test that a message is received.

Logging

  • I noticed that log output was being silenced because there is no log4j config file. The log output was useful for me as I was running tests because if a receiver thread threw an exception, I could see the stacktrace instead of the thread just dying silently.

  • I think we probably don't want to include a log4j.properties file in the project, because I think it might interfere with the logging configuration for people using this project.

  • As a solution, I added a log4j-dev.properties file that does not get picked up by default, but you can opt into using it by including the -Dlog4j.configuration=log4j-dev.properties option when you use mvn.

Module organization, transport

  • BREAKING Move higher-level transport.udp.OSCPort* classes up a level to transport.OSCPort*.

    • Most of the code in these classes has nothing to do with UDP transport specifically, and is more about the transport of OSC messages at a higher level.
  • Added a Transport interface that captures the low level implementation details.

  • Moved the current UDP transport implementation into UDPTransport, which implements Transport.

  • Implemented TCPTransport.

  • Adjusted the toString implementations of OSCPort et al to include information about the Transport.

  • OSCParseExceptions now include the data we attempted to parse

    • see e2b9daf
    • Allows the higher level OSCPortIn class to handle the data without needing to reach down into the transport implementation.

OSCPortInBuilder, OSCPortOutBuilder

  • Added a setNetworkProtocol method to OSCPortInBuilder in order to opt into using TCP.

  • We didn't have an OSCPortOutBuilder yet, so I created one that has the same options as OSCPortInBuilder, including setNetworkProtocol.

Serializer

  • Because there is no upper bound on the size of an OSC packet sent over TCP, it isn't feasible to have a fixed-size byte array or ByteBuffer and write to it, potentially overflowing if the packet is too big. That approach works for UDP because there is a packet size limit, but I had to make some adjustments to get it working with TCP without there being an arbitrary limit on the size of the packet to be serialized.

  • See Multi-protocol support (UDP, TCP, ...) #12 for context. I initially stated that the serializer should be stream-oriented, but after spending more time in the code, I realized that a cleaner approach would be to refactor the methods of OSCSerializer to be pure functions that return byte arrays. The public API is a method called byte[] serialize(OSCPacket). A byte array is the simplest thing that works with both the UDP and TCP transports.

  • Along the same lines, the contract of ArgumentHandler.serialize is now byte[] serialize(T value) instead of void serialize(ByteBuffer output, T value)


I realize this PR is huge. Let me know if there's anything I can do to make reviewing it easier!

There is a bunch of transport-agnostic logic in osc.transport.udp that should be
shared with the upcoming TCP implementation.

The plan after this is to move UDP-specific things back down into
osc.transport.udp. Then, implement tho corresponding TCP-specific bits in
transport.tcp.
The data (a ByteBuffer) is an implementation detail of the transport. The higher
level API is only concerned with sending and receiving OSCPackets, and doesn't
need to worry about the underlying data unless something goes wrong.
I decided to remove some of the getter methods from OSCPort/OSCPortIn/OSCPortOut
that concern themselves with transport-level implementation details. This
resulted in compiler errors in the testPorts test case, as getRemoteAddress no
longer exists.

I thought about this, and I think that removing getRemoteAddress makes sense in
the context of real world usage. I don't think it makes sense to keep it just to
facilitate having these tests, which are only testing that the default values we
specified are the values you get when you don't provide a value.

I'm open to feedback on this - happy to add this back if needed!
The issues I'm addressing here are:

* The details of the sender and receiver ports aren't clear/explicit for each
  test.

* Code maintainers need to realize and remember that there is a @before
  method and keep in mind the type of setup created by the @before method.

* A bunch of tests close the initial sender and receiver ports created by the
  @before method and then create new ones. In some cases (e.g.
  testReceivingLoopbackIPv6 when !supportsIPv6()), we don't actually want to set
  up sender and receiver ports at all. It's preferable that each test case sets
  up exactly what it needs and nothing more.
* Un-ignore TCP tests.

* For TCP tests, use a different port for each test.
  * The port for each TCP test case is determined by opening a ServerSocket on
    "port 0," which finds an available port.
  * The TCP tests are sensitive to timing issues where a previous test was
    binding the same port and so that port isn't available for the current test.
  * The UDP tests do not appear to have that problem, I think because UDP is a
    connection-less protocol.

* Add "wait for socket to close/open" logic, which is more robust than waiting a
  certain number of milliseconds and hoping that the socket is closed/open. I've
  implemented this a loop with retries and a timeout of 5 seconds.
  * TCP only.
  * For UDP, we continue to simply wait for 30 ms because that seems to be
    working fine, and I don't think you can really test whether the socket is
    open or closed, anyway.

* Remove the TCP versions of the following test cases:
  * testSocketAutoClose: This test case involves creating sender and receiver
    ports and then neglecting to close them, instead setting `sender` and
    `receiver` to null and invoking `System.gc()` to let the garbage collector
    clean them up.  This works OK for UDP because it's a connection-less
    protocol, but if you do that with TCP sender and receiver ports, the socket
    remains bound, which would be undesirable, so I don't think it's a real use
    case that should be codified as a test case.
    * TODO: I think I should actually just fix the null pointer issue by
      removing `waitforSocketClose` and go back to parameterizing the protocol
      and testing both UDP and TCP.
  * testReceivingBroadcast: I think broadcast is UDP-specific. There is a
    `SO_BROADCAST` flag that we're setting on the UDP channels we create, and it
    doesn't look like that's an option that we can set on TCP sockets.
    * Worth noting also that the UDP version of this test case fails on my
      machine.
  * All test cases involving "connected out," "connected in" or "connected
    both." We're making TCP connections on the fly when `send` or `receive` is
    invoked, and explicitly connecting via `connect` is a no-op, so there's
    nothing worth testing.
    * I also don't understand the semantics of OSCPortIn and OSCPortOut having
      both `local` and `remote` ports, and something about setting them to
      certain values causes the tests to fail. I don't know what that's about,
      since I think (at least for TCP) we're only using `sender.remote` and
      `receiver.local`, and it's only going to work if they're the same.
@daveyarwood
Copy link
Collaborator Author

One thing that would make this PR smaller is if we re-opened #47 (consists of the earliest commits on this branch) and had it reviewed and merged.

I can think of a bunch of other things in this big PR that I could make smaller PRs out of, if that would help:

  • Make serializer be byte array-oriented instead of ByteBuffer-oriented.
  • Misc. improvements to the tests (use random ports, fix timing issues, etc.)
  • Debug tools (Debug, DebugClient and DebugServer classes)
  • Dev-only logging config
  • Add OSCPortOutBuilder

@hoijui
Copy link
Owner

hoijui commented Mar 25, 2020

ufff sorry dave!
just saw this now.. will have a look at it these days (not today anymore though).

@hoijui
Copy link
Owner

hoijui commented Mar 25, 2020

WOW! :O
(... could't help reading ;-) )
this sounds phenomenal! thank you dave!! :-)
it all sounds super good, except the serializer part of moving to byte-array, but ...
I imagine this is only because I did the opposite move back then. ;-)
I am sure your choice was wise, as by now you are surely more into the code-base as I am (not having been into it since quite some time).
I'll have a look at the code these days!

Copy link
Owner

@hoijui hoijui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the serializer change (ByteBuffer -> byte[]) is the only mayor concern I have, the rest looks really geat!
about serializer ...
The issue I see there would be memory usage, allocation and garbage collection. I would guess all those increased now a few times (int) then before.
... could you elaborate on why a stream-based attempt, or a custom interface (using a ByteBuffer or LinkedList<byte[]> or something else in the background) was not a good idea?
was it just because of the bad data handler/exception?

@@ -107,7 +108,7 @@ public OSCParser(
public static void align(final ByteBuffer input) {
final int mod = input.position() % ALIGNMENT_BYTES;
final int padding = (ALIGNMENT_BYTES - mod) % ALIGNMENT_BYTES;
input.position(input.position() + padding);
((Buffer)input).position(input.position() + padding);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might make sense to document the reason for this directly in the code.
Not sure how to best do that with a short comment (wherever it gets used),
and still being clear enough so that one dos not have to follow a link each time one sees it.
Maybe something like:
// NOTE Strange cast fixes JDK 8 vs 9 issue; see: https://github.com/eclipse/jetty.project/issues/3244

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that there is a single place where we could put that explanation. If you'd like, I can go find all the places where I did this cast and add a comment, although I think it would be repetitive and kind of messy.

We should consider what problem we are trying to solve. If we are trying to make sure that people will continue to add this type of cast going forward, then we could solve that problem by adding openjdk11 to the Travis config.

@daveyarwood
Copy link
Collaborator Author

The problem is that TCP is stream-oriented, and there is no limit to the size of the OSC packet we can serialize when the protocol is TCP. When you're working with ByteBuffers, there has to be a limit.

In the serializer, we are basically concatenating a bunch of byte arrays together. In the current implementation, we have a single ByteBuffer of a fixed size (i.e. the maximum size of an OSC packet intended to be sent over UDP where there is a packet size limit), and we work our way through the objects we're serializing and add them into the ByteBuffer as we go.

That strategy will not work for TCP, because we have to pick an upper limit on the ByteBuffer when we allocate it, but we can't possibly know how large the packet is going to be; it's essentially infinite, because TCP doesn't impose a limit to the number of bytes you can send.

My initial thought was that we could use a ByteArrayOutputStream (as some of the answers to this SO question mention) as the common denominator between stream-oriented TCP and ByteBuffer-oriented UDP. But then I ran into the problem where we couldn't provide context for the bad data event.

@daveyarwood
Copy link
Collaborator Author

Re: the interface idea, I thought about that a little bit, but I wasn't able to figure out what kind of interface would work as a common denominator. It seemed to me like the method signatures would have to be too different, and it wasn't clear to me how to unify what we would want to do for TCP vs. UDP.

@hoijui
Copy link
Owner

hoijui commented Mar 27, 2020

...what kind of interface would work as a common denominator ...

What about taking the ByteBuffer interface, and removing everything that is not used in the serializer?
... I did that yesterday evening (had a flow moment). It is a very rough draft with some hacky edges, but already works, and I see nothing unsolvable. The basic idea is just as described above, and the main files are:

What do you think about that?
Even if this is a bad solution for some reason, my main thinking is: I'd rather have some high-level abstraction that allows for performant, low level implementations, then to use a simple interface that makes performance improvements impossible.
But.. as said, you are into this more then me right now, so I might not see some things.

@hoijui
Copy link
Owner

hoijui commented Mar 27, 2020

.. now thinking about it... I don't know whether my solution makes sense for the TCP scenario. ;-)

@daveyarwood
Copy link
Collaborator Author

I don't think that would work, because for TCP you can't "back" what you're doing with a data structure that has a limit to the number of bytes you can hold in it, like either a ByteBuffer or a byte array.

I revisited my byte array-based serializer setup on this branch just now, and I think part of the problem might be that we sometimes need to serialize part of the output (e.g. a blob argument), count the bytes of that serialized part, and then serialize that number and put it before the thing we serialized. I couldn't see a way to do that with a stream-based implementation, so working with a bunch of pure functions that return byte arrays ended up being a lot easier to work with.

On another note, I'd be curious to have a better picture of how my implementation compares to the ByteBuffer-oriented one on the master branch. In what ways is it actually less efficient / performant? I wouldn't be surprised if it is less performant, but I don't really have any idea how to measure it objectively.

@hoijui
Copy link
Owner

hoijui commented Mar 28, 2020

I don't think that would work, because for TCP you can't "back" what you're doing with a data structure that has a limit to the number of bytes you can hold in it, like either a ByteBuffer or a byte array.

As you can see here, the BytesReceiver implementation meant to be used by TCP is not backed by a fixed-size structure.
It is also mainly byte[] based, it just saves on the number of allocations/concat(enation) operations of byte[]s.

On another note, I'd be curious to have a better picture of how my implementation compares to the ByteBuffer-oriented one on the master branch. In what ways is it actually less efficient / performant? I wouldn't be surprised if it is less performant, but I don't really have any idea how to measure it objectively.

To measure the performance.. easiest thing would probably be, to use a re-parser unit test methods code, or maybe a few of them, run them many times in a performance measurement tool, and concentrate mainly on the memory usage there.
an other way could be, to abstract away all the low level memory access (or at least allocation and copying of byte[]s and ByteBuffers) into our own utility functions, and log how often they are called with what size of arguments.

As I see it, your code does more allocations and copying/concatenating of the bare memory, and if it is already hard for us to figure out how much exactly, users of the library would have no chance.
In the old way, and I think in this solution, the bare memory usage of these low level structures is pretty much the actual OSC packets size, as it will be, going over the network.
as it is right now, it will be double that size when the BytesReceiver.toByteArray() function is called. but at least that function requires only a single (or none) allocation of a byte[], and GC call afterwards could free the "other half" of the memory.

I revisited my byte array-based serializer setup on this branch just now, and I think part of the problem might be that we sometimes need to serialize part of the output (e.g. a blob argument), count the bytes of that serialized part, and then serialize that number and put it before the thing we serialized. I couldn't see a way to do that with a stream-based implementation, so working with a bunch of pure functions that return byte arrays ended up being a lot easier to work with.

Yes! I also saw this problem.
I solved it through the PlaceHolder, the interface being in BytesReciever (an interface and a function below it), and each implementation of BytesReciever having their own implementation of PlaceHolder.

Now thinking of it ...
In addition to the current implementations of BytesReciever, we could also have:

  • (kind of) reusable ones
  • one backed by a single byte[], which would be automatically replaced by a bigger one, if needed
  • one backed by a single byte[], which would throw an exception if it turns out to be too small
  • same like the above two, but with ByteBuffer as the back-storage instead of byte[]
  • ... maybe more?

... and if this proves to be an overall good idea, we might even introduce a BytesSupplier to be used in the parser part later on. ;-)

@daveyarwood
Copy link
Collaborator Author

Ah, I think I may have misunderstood the overall point that you're trying to make. Is the idea that (because we suspect changing it will have an unwelcome impact on performance) we want to keep the current implementation for UDP (using a ByteBuffer of fixed size) and create a new, potentially less performant one for TCP?

I like that idea because, since JavaOSC currently does not support TCP at all, introducing it with worse performance than UDP would not be a regression! 😄

I can't tell yet if the interface you're proposing will work or not for TCP (with an implementation of the interface that uses a List<byte[]>, but I think it probably will. Unfortunately, I think it's going to be a while before I can try it because the COVID-19 situation is leaving me with not much free time. If you'd like to take a stab at it, be my guest! Otherwise, I'll get to it when I can (potentially months from now)

@hoijui
Copy link
Owner

hoijui commented Mar 30, 2020

mmmmm ok :/
same here.. much less time then before.
but yeah... both solutions are fine with me! you or me doing it.

hoijui added 6 commits May 12, 2020 15:35
# Conflicts:
#	modules/core/src/main/java/com/illposed/osc/transport/OSCPort.java
#	modules/core/src/test/java/com/illposed/osc/transport/udp/OSCPortTest.java
* adding SPDX-License-Identifier
* adjusting white-spaces
* JavaDoc fix
* adding `@Override`s
This is required for TCP, for example.
See mainly the new interface `BytesReceiver`.
It is basically a stripped down ByteBuffer interface,
which has one implementation backed by a ByteBuffer (`BufferBytesReceiver`) for UDP targeted serialization,
and one backed by a `List<byte[]>` (`BytearrayListBytesReceiver`) for TCP targeted serialization.
# Conflicts:
#	modules/core/src/main/java/com/illposed/osc/OSCPacketDispatcher.java
#	modules/core/src/main/java/com/illposed/osc/OSCSerializer.java
#	modules/core/src/main/java/com/illposed/osc/OSCSerializerAndParserBuilder.java
#	modules/core/src/main/java/com/illposed/osc/argument/ArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/AwtColorArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/BlobArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/BooleanFalseArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/BooleanTrueArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/ByteArrayBlobArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/CharArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/ColorArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/DateTimeStampArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/DoubleArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/FloatArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/ImpulseArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/IntegerArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/LongArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/MidiMessageArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/NullArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/StringArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/SymbolArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/TimeTag64ArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/argument/handler/UnsignedIntegerArgumentHandler.java
#	modules/core/src/main/java/com/illposed/osc/transport/channel/OSCDatagramChannel.java
#	modules/core/src/test/java/com/illposed/osc/OSCMessageTest.java
#	modules/core/src/test/java/com/illposed/osc/OSCReparserTest.java
#	modules/core/src/test/java/com/illposed/osc/OSCSerializerTest.java
#	modules/core/src/test/java/com/illposed/osc/argument/handler/ColorArgumentHandlerTest.java
@hoijui
Copy link
Owner

hoijui commented May 13, 2020

I merged byte-buffer-abstraction into your tcp-support branch after merging latest master,
and the result is now in this repos tcp-support branch.

Interesting for you @daveyarwood, is probably only the merge commit.

@daveyarwood
Copy link
Collaborator Author

daveyarwood commented May 16, 2020

@hoijui Thanks so much for digging into this. It looks great, as far as I can tell! I did a build of origin/tcp-support with mvn install and tried it out with my project (which uses JavaOSC over TCP), and it works like a charm!

I'm going ahead and merging origin/tcp-support into this branch, and I think it'll be ready to merge, unless there are any outstanding issues?

EDIT: Merged origin/tcp-support into this branch.

@hoijui
Copy link
Owner

hoijui commented May 24, 2020

thank you @daveyarwood for the testing and your great work! :-)

@daveyarwood
Copy link
Collaborator Author

@hoijui Are there any blockers to getting this PR merged at this point?

@hoijui hoijui merged commit 9ac4357 into hoijui:master Nov 2, 2020
@hoijui
Copy link
Owner

hoijui commented Nov 2, 2020

Thanks @daveyarwood for all this work, and for reminding me again!
... and sorry for the delay.

@daveyarwood
Copy link
Collaborator Author

No worries! Thanks for your help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants