-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle audio packets received out of order #619
Comments
Interesting findings @softins. Good idea to add sequence numbers to the audio packets for better buffers' handling. Being jamulus custom UDP packets, I imagine that they even go to lower priority in the routing QoS rules of providers. Saying so, I wouldn't disregard the use of RTP for the audio path, it maybe worthy to analyze pros and cons (ie low overhead, out of order & jitter mgmt & report, solid existing libraries, opus integration -RFC 7587-, ...) versus increase complexity in the jamulus protocol to achieve the same goals. |
Who does this? That does not make any sense to me to split UDP packets of a continuous stream. Maybe Virgin is developing their own "Virginulus Realtime Jam Software" and want to turn Jamulus down ;-).
If it is just the connection from client to server which makes problems, maybe the UDP packets are sent in a bursted manner which leads to the routing issues. @hselasky had implemented some "traffic shaping" for the client since he found out that especially under Windows, the sound card callback does not have a good timing leading to UDP bursts. But I do not find that code in his repo anymore. Maybe he has removed it. What you could try is to use Linux instead (or the Jamulus OS on an USB drive). Usually you have a better timing under Linux. If the problem does not occur under Linux, we know that the bursted UDP transmission causes the issue. You could also try to increase the Buffer Delay in the Jamulus settings. Then the network packets are larger and may not suffer from the bad routing. If the small UDP packets are delayed by Virgin, you will get exactly the same overall latency by using larger buffers.
Wow, 3 bytes? I usually run Jamulus with a Jitter buffer size of 2 or 3. For that small size a packet counter would gain nothing. I consider a Jitter buffer size of 8 as large. So a 3 bit counter would be sufficient for that size. |
The tests on Monday demonstrating out-of-order delivery of UDP packets in a single stream were performed without Jamulus, using a custom test program. I didn't conduct the tests (@sthenos did), but I watched the conversation. They demonstrated a general problem which would certainly result in degraded audio performance on Jamulus, with audio packets arriving out of order. |
I can comment on the tests we performed. @softins , very kindly provided some scripts: https://github.com/softins/jamulus-wireshark, in order to perform tcpdumps whilst virgin and non-virgin (sounds dirty ;) ) clients were connected to the jamulus server running on our dedicated OVH server. What we saw was that packets were arriving at the server in bursts every 5ms or so instead of a single packet every 3ms or so as most other clients were (depending on bufffer size, audio quality, etc). This was a consistent pattern for all the virgin clients we tested (we tested two). This led us initially to conclude that this queuing and flushing of packets could have been the issue. However @softins had an AWS VM and ran the same tests for the virgin clients (where there is no audio burbling for Virgin Jamulus clients) and could see the same behaviour of clumping of packets arriving every 5ms. I then wrote a very simple java client/server app that streams 128 byte udp datagrams every 2ms( as jamulus would) and wrote a sequence number into the first 4 bytes. The server received the packets and printed out the sequence number. We could see clearly that for all virgin clients the packets were consistently inconsistently arriving at the OVH dedicated server. I then ran the same test on a cloud VM at OVH, wondering if it was a problem with the baremetal server or the subnet it was in and the same behaviour occurred both for the UDP client/server app as well as when the users used Jamulus. This behaviour has led us to conclude that packets are most likely being load balanced and split randomly by an ECMP router in the path, either within OVH or Virgin. My view is that it only affects jamulus instances at OVH, therefore the mutli-path routing must be occurring within OVH, but that it's only occuring for Virgin traffic, not any other ISP, therefore we are wondering if something is occurring on the header of the packets to make the ECMP router in OVH not be able to determine packets are from the same source and headed for the same destination thus should be kept on the same path. Still all theories at this stage, and my next step is to test the java UDP client/server app with virgin clients to a non OVH server (in AWS for instance). |
Would this also explain other strange behaviours? I'm on VM and have never been able to see more than about 15 servers per central server (while --showallservers shows a large number) |
@sthenos Any updates on the network issues with Virgin? Have you found a solution/workaround which works for you? |
For the last couple of worldjams they have just used a different server. I don't know the current status with Virgin. Although the last week has been busy, I still have plans to try a proof of concept with sequence numbering. Even if you are running on a short jitter buffer, sequence numbering allows a late, out-of-order packet to be discarded instead of being inserted into the jitter buffer in the wrong place, which should cause less aberration in the audio. And the affected client could always increase their jitter buffer setting to see if it improves things. |
In my experience, queuing and reassembling sequenced audio packets in a low latency environment is a very bad idea . We implemented a scheme to do that (in a teleconference system) years ago and it resulted in large jitter buffer sizes and noticeable audio delays. We found that if you're going to add a sequence number to packets, it should probably only be used to detect missing or out of sequence packets. These might be covered by initializing the jitter buffer to silence and playing the correctly sequenced streaming buffers when they arrive within the jitter window. Since Jamulus uses Opus audio compression, I'm not sure what effect that will have, but redesigning a working protocol based on one or two bad network path's is not a great idea in my opinion. |
@interogativ I largely agree with you, but I do think it is always wrong to process audio packets received out of order in the incorrect order that they were received, even if that situation is fairly rare. They should at least be detected and discarded, and if the user's jitter buffer size is such that the decoder hasn't yet needed the data from a late packet when it arrives, there is no harm in placing the data in the correct part of the buffer, and some benefit in doing so. |
Well, from the client's point of view, lets say it receives packet 10, then packet 12. Soon packet 11 might be arriving out of order, but because this is UDP, it might never arrive at all. I agree that packets should be placed into the correct order in the jitter buffer before they are played. I actually think we're in agreement here. |
Yes, I think we agree. Here is my plan to prove the concept (and I am taking account of forward/backward compatibility):
The JB will be sized exactly as at present, according to the client settings. I am not proposing to make the JB artifically larger. If a compatible server receives sequenced audio from a compatible client, it will also sequence its outgoing audio to that client in the same way. |
Its a pretty ambitious plan. Some thoughts and potential flies in the ointment:
As I said, this is pretty ambitious. |
All excellent points, and mostly ones that I've considered and taken into account. I think for a low-latency application like Jamulus, a 2-second wrap around for the sequence number is more than enough. The jitter buffer would never be more than a hundred or two milliseconds at most, usually much less, and any UDP packet that didn't arrive in well under a second is probably lost rather than delayed. We are looking to reorder or discard packets delayed by only a few or few tens of milliseconds. I still have a bit more research to do. My main obstacle has been the use of vectors, firstly understanding how they were used, and then thinking about how to avoid unnecessary copies. C-style buffers and pointers are so much easier to handle efficiently! C++ spans might be a reasonable way, but they are only in C++20. |
What are you trying to achive? Using an audio packet for decoding which is 43 hours old? What latency is that? It will really hard to play with such a latency ;-). What you are trying to do here is to solve a Virgin-specific problem. 99.5 % of the Jamulus users are happy with the current implementation. If a provider does nasty stuff with the network packet routing, I don't think you will get any good jam performance even if you correct the incorrect packet order.
Ok, go ahead. It does not hurt to do investigations on a private fork. Merging such a code to the official repo is a completely different story... |
As mentioned before, I think we're acting without having a sense on how prevalent is this case. So my take would first insert sequence numbers and measure how many packets are lost, how many arrive in wrong order and get some stats to gauge if the development effort will be worthy (or at least have another measure of link quality to display besides the delay as a troubleshooting tool in case of bad audio quality). In any case, I think the comment from @interogativ (#619 (comment)) on actual experience with streamed audio quality is worthy, measure it, insert silence and drop the out of order packet provides better perceived audio overall. |
I agree that if it ain't broke, don't fix it. However some investigation of how prevalent the "problem" is probably warranted. |
Hi, We have one bit free in each TX'ed packet, which can be used w/o breaking backwards compatibility. The problem is that that current code expects a certain frame size on the client side and will drop packets having wrong length :-( It might not solve the problem, but in my #539 branch I've added such a sequence number (transmitted one bit at a time) which actually measure the number of packets TX'ed and RX'ed. You may want to give it a spin on your network on client and server, to get some more details about what is going on. Don't forget to add --showanalyzerconsole to the command line. --HPS |
I was one of @sthenos testers. I am on VM cable. For me, Jamulus on OVH is unusable. I am filled with fear if I see one of my Worldjam songs scheduled on an OVH server. I have ordered ADSL broadband but it will be a few more days before it is activated. I will happily test again using Mac OSX and Windows 10 from this location. |
Hi @njc235 I'm guessing from the handle that you might be Nick Cox. Cool! I'm partway through implementing this, and the client will have a checkbox to turn sequence numbering on and off so we can compare on the fly. I'm hoping to have something ready this coming week that you can try. I will need Simon @sthenos to update the server on OVH with my code too (it will be fully compatible with existing clients too). Would be great if we can get this tested before you terminate VM. |
@softins I can test too as I'm on VM cable as well. |
I'd be willing to test as well because I'm having the same issues on the OVH servers (ISP: Vodafone Cable Germany) |
This is now ready for testing, and I am pleased with the result. I tested on a local server using a proxy to introduce some packet disordering. The proxy can be seen at https://github.com/softins/udpjumble if interested. My branch of Jamulus with this code is at https://github.com/softins/jamulus/tree/sequence-audio The test clients for Mac and PC are available in the repo https://github.com/softins/jamulus-seq-test Please report results in this thread. Thanks to @njc235, @gilgongo and @dingodoppelt for offers to test. @sthenos will be updating the OVH server with this code soon. |
Sounds good. For the testers: If you had issues with the regular Jamulus version since your ISP makes some strange packet routings, does the new version improves the experience? Also in that case:
If the feedback is positive, of course, it makes sense to integrate this new feature in the regular Jamulus version. |
installed test client. |
Nick, did you have your jitter buffer on auto or manual? You need to have it on auto for it to settle at the required length, which might be a bit longer than you normally have. |
One of our WorldJam users, Rich Williams, tested the new client last night on Virgin talking to the OVH server running the new code, and recording the output of Jamulus. He's not on GitHub, so I'm posting the clips (using Dropbox, as GitHub doesn't allow audio attachments).
Regarding the questions above:
Edit: attached files here as a Zip: RW-tests.zip |
@njc235 Nick, I've just listened to your audio clips. I think there was something going on there separate from Jamulus. I listened first to VirginTest.mp3. With sequence numbering off, I could clearly hear the classic bubbling sound, which is caused by packets arriving in the wrong order, but Jamulus processing them in the order received. When you turned on sequencing, the bubbling stopped. Intermittently in the whole recording there was some kind of buzzing or crackling distortion that I've never heard before. Your audio level was also a lot lower than Rich's. I then listened to BTTest.mp3 and it also seemed very low level with the same crackling distortion, but it didn't have any bubbling, with sequence numbering either on or off, which is to be expected, since we haven't observed packet mis-ordering with BT. I would be interested in a comparative test between the test client and a release client, without you changing anything else, to see if the crackling distortion is the same, or unique to the test client. |
@softins my bad, I had not switched jitter buffers to auto, hence the distortion. I have tested again and the results are conclusively in favour or your fix. The classic bubbling is ever present with virgin and numbering off. With it on, the bulling has gone. As you rightly say, on BT there is no difference as expected. The question then is, do I cancel my BT circuit while I can :). I will test again after normal work with some sample audio and record the result again. |
Yes, it makes sense. |
@softins I finished a first experimental version of the sequence number support on my branch: https://github.com/corrados/jamulus/tree/feature_sequence_numbers. I am not yet satisfied with the results and there are still a lot of places in the code which have to be cleaned up. But at least something is working. It would be great if you could test this version on my branch with your "packet disordering proxy" to find out if my solution solves the problem. I.e. I get a feedback that I go into the right direction. Of course, since you are the jitter buffer expert, you are welcome to review/comment my commits on my branch if you like. |
Sure. No time today, but will try it out Sunday or Monday. Thanks! |
@corrados Just looking through the code before trying it out. I see you have appended the sequence number byte to the audio frame. My question, with my Wireshark hat on: is there enough information within a single packet for the packet dissector to know for sure whether there is a sequence number or not? Without having to be aware of preceding context? In my own scheme, using 0xFF followed by a sequence number at the beginning of the packet, it was very easy to tell. |
I tell the server in the PROTMESSID_NETW_TRANSPORT_PROPS message whether the sequence counter is used or not.
No, it isn't. You would have to parse the network transport properties protocol message to find out if it is active or not. Edit: Since the sequence counter will be enabled by default for all versions >= 3.6.0, in the future you will always get a sequence counter. |
Hmm, that's a great pity. It means a disssector displaying traffic between different versions will not be able to show whether a sequence number is present of not. It cannot refer to the earlier presence and contents of a Transport Properties message. With the scheme I used, having an unambiguous flag byte followed by a sequence number (which can be a single byte if it is counting packets), it was easy to display the presence and value of the sequence number. It also made it easy for the receiving end to know whether a sequence number was present or not, and to handle either type of packet without having to be told in advance.
But if a 3.6.0 client is talking to an earlier server, it will not be sending a sequence number, will it? What do you see as the disadvantage of my approach to the audio protocol? To me, it seems more robust, self-contained and backward compatible. |
Well, it's a waste of bandwidth actually. The Jamulus client/server does not need it for operation. Why to send an unused byte for each audio packet? The numbers in client.h "code rate in bps" are fixed. You can simply use these numbers for detecting the sequence number. Check the size of the packet. If it matches one of these values, you know that the sequence number is not used. In all other cases the sequence number is used. |
For low quality mono on 64 byte buffer, it adds only 1.47% to the bandwidth, for medium stereo on 128, it adds just 0.88%, and for high quality stereo on 256 it adds a mere 0.3%. It's not an "unused byte", it is making the protocol more understandable to the receiver and a dissector.
Well it's rather kludgy to do that. And there are two ambiguties:
|
@corrados ok, I have now compiled the 3.6.0git-seqtest version on my local CentOS 7 PC (hp3) as a server ("headless nosound"), and on my Raspberry Pi 4 as a full client. So they are on the same LAN. The server is running on port 22124. I then added the disordering proxy as So I can connect to
So it looks like your code is achieving a similar result to mine. Finally, a note on the auto jitter buffer settings I saw, for local/server, between 3.6.0 client (Rpi) and server (PC) on the same LAN, using packet size of 128, 5.33ms:
Please feel free to download and use |
I agree that not being the packets atomic in nature (ie carrying all information needed to properly decode) may bring some unintended side issues (ie mixed old-new pairs of clients-server, PROTMESSID_NETW_TRANSPORT_PROPS packet losses, ...) |
Thank you for testing. Then I'll clean up my code soon and merge it to master. Then it will not take very long to get the next release version out.
There is no loss of stability.
Yes, you have some ambiguities. But when analyzing sequence numbers it does not make sense to look at just one packet but you will take a look at a sequence of packets. If you extract the last byte of these packets even if you have just 3 packets available you will easily see if they are random numbers or if the count up. Also, if you investigate one single connection, you will be sure that if the sequence number is used, it is always used. If you know the version number of the client/server, then if they are >= 3.6.0, they will use the sequence numbers. |
You could also use multiple different UDP port numbers, to implement the 16 or so sequence numbers you need. |
@sthenos The new code is ready to be merged. It would be good if you could setup a Jamulus test server based on the latest commit of the branch https://github.com/corrados/jamulus/tree/feature_sequence_numbers on your server hardware which shows the network routing issues so that we can start testing the new code. |
@corrados Simon hadn't seen your request above, so I have alerted him to it. I have also put it on my two servers jamminthejazz and jamminthejazz2. Have you built clients for Windows or Mac with this test version? If not, I can build for Mac and know someone else who could build for Windows. |
Thanks Tony. No, I have not yet created setups for these versions.
Thanks for the offer. Yes, that would be great. |
Ok i've got a version now running on the WorldJam central servers called Will need a compiled windows/mac version tho to test. @softins - let me know when we have a windows client we can test with |
Thanks. I just created a Windows setup from the latest git master code. You can download it from here: |
The Mac build is available at https://github.com/softins/jamulus-seq-test |
Tested tonight with the 3.6.0 client on Mac to both an existing release OVH server and to @softins test server with the new code. Results are pretty conclusive. Well done! |
Just to clarify Nick's comment above. His tests with the new client were done against two different server versions on the same actual machine at OVH run by @sthenos: Waiting Rm2 (3.5.10) showing the issue, and Latest SEQ TEST (3.6.0git-seqtest) showing the solution. |
I did the same test and it works like it's supposed to. |
Thanks for your testing. I'll schedule the next release for the upcoming weekend (see #77 (comment)). |
@corrados With the release of 3.6, would we consider dropping support for 3.4? |
No, this is not planned. 3.6 will still be compatible with 3.4 servers. |
Please note that I am not at home right now so I do not have access to my Windows and Mac computer. So, tomorrow I'll just set the release label in Git and create a Github release tag. The 3.6.0 binaries on Sourceforge will be added with a short delay (mid next week). I'll wait with the update of the Central Server so that the clients will not show the "Software Upgrade Available" until the binaries are ready. Anyway, since the release tag in Git and the "latest" tag will be available for the 3.6.0 version, the Linux servers can already be updated. And if someone needs the new version earlier, he can use the beta versions provided in this Issue. |
Since the implementation is done, I'll close this issue now. |
I have started working on this, but I wanted to post it as an issue so it could be discussed too.
In the WorldJam community, there has recently been a significant problem with audio quality between clients using Virgin Media as an ISP and the WorldJam servers at OVH. The audio sent from client to server is severely burbled, although the client hears everyone else clearly.
Some tests were conducted on Monday by @sthenos and others using a test program to send streams of UDP packets containing serial numbers from the VM client to the OVH host. These tests demonstrated that the UDP packets were arriving out of order. Our theory is that there is some kind of multipath load-balancing routing happening, where one of the paths is slower than the other.
It is likely to be difficult to get either Virgin or OVH to investigate this deeply for us.
My own view is that out-of-order packet delivery is part and parcel of the unreliable nature of UDP.
Currently, the Jamulus jitter buffer is a simple ring buffer, and expects all audio packets to arrive in the order sent. In the protocol, there is no sequence numbering of audio packets to enable out-of-order packets to be re-ordered correctly.
My plan is to enhance the jitter buffer to re-order out of order packets instead of assuming they are all in the correct order. Protocols like RTP (used with SIP for VoIP calls) do indeed include sequence numbering for exactly that reason. I'm not thinking of actually using RTP, but something much simpler. Just a 3-byte header with 0xFF flag and 2-byte sequence number (counting in bytes) prepended to the existing audio data. The existing Jamulus data never has 0xFF as the first byte, so the server would only treat specially any packets starting with 0xFF. All other traffic, e.g. from older clients, would be processed exactly as at present. If the server sees the 0xFF, it would get the sequence number and use that to determine where in the jitter buffer to put the remaining audio data in the packet (and if it's too old, discard it). Also, once the server had seen the 0xFF flag from a client, it would know that it could send data back to that client in the same way. Otherwise (for older clients) it would send audio unsequenced as at present.
If the sequence number in a received audio packet was higher than expected, that would indicate missing data. We would need to fill the skipped space with silence, in case the missing data never arrived in time. If it did subsequently arrive, it could be placed in the JB at the correct place, overwriting the silence.
The text was updated successfully, but these errors were encountered: