Latency, synchronization and zerocopy improvements #240
Comments
Not sure if this is related, but I've noticed a significant memcpy operation that has been popping up in Time Analyzer (~9% when only viewing local video on demo.openwebrtc.io). Looking at the code, it is the following line in
Making the buffer writable obviously creates a new buffer and copies the contents into it, as the documentation says, but it seems like this call (and similar ones) could perhaps be avoided with better buffer management?
Slightly related :) Making the buffer writable also only copies the data if the memory inside the buffer is not shareable. Usually it will only create a shallow copy of the buffer (so you can modify the metadata), not a deep copy including all the memory behind the buffer.
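To make the shallow-versus-deep distinction concrete, here is a toy Python model of the behaviour described in the comment above. It is not GStreamer code: `Memory`, `Buffer`, `make_writable`, and the `shareable` flag are hypothetical stand-ins, and the reference counting is simplified on the assumption that a buffer with a single owner is already writable.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Memory:
    data: bytearray
    shareable: bool = True  # hypothetical flag: may this memory be shared?


@dataclass
class Buffer:
    memories: List[Memory]
    pts: Optional[int] = None  # example of per-buffer metadata
    refcount: int = 1


def make_writable(buf: Buffer) -> Buffer:
    """Return a buffer whose metadata can be modified safely."""
    if buf.refcount == 1:
        return buf  # sole owner: no copy of any kind
    buf.refcount -= 1  # the caller gives up its reference to the original
    # Shallow copy: shareable memory is reused, only unshareable memory
    # is duplicated byte for byte (this is where a memcpy would show up).
    memories = [
        m if m.shareable else Memory(bytearray(m.data), m.shareable)
        for m in buf.memories
    ]
    return Buffer(memories, buf.pts, refcount=1)


shared = Buffer([Memory(bytearray(b"frame"))], pts=0, refcount=2)
shallow = make_writable(shared)

pinned = Buffer([Memory(bytearray(b"frame"), shareable=False)], pts=0, refcount=2)
deep = make_writable(pinned)
```

In this model `shallow` reuses the same `Memory` object as `shared`, while `deep` gets its own copy of the bytes, which is the case that would show up as memcpy time in a profiler.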
Note that this does not give us zerocopy in Owr yet. There's still the "tee problem": https://bugzilla.gnome.org/show_bug.cgi?id=730758
I think this is now ready for wider testing, review and possibly merging.
What kind of improvements are we talking about?
Even if it does not bring about immediate improvements on all platforms due to a bit more work being needed in platform-specific elements, the changes enable latency improvements. I'll do some comparisons on each platform next week.
For test-send-receive on Linux, the difference was from noticeable latency to unnoticeable latency. I didn't measure it, but I would guess it went from about 200 ms to less than 75 ms.
I had an idea for how we could relatively simply improve latency in owr, and
also make it easier for zerocopy to work properly. As a side effect, it
would also give better synchronization between different streams.
So the main idea is the following:
All Owr pipelines explicitly use the system clock as their clock, and
explicitly have their base time set to 0
The connections between the different pipelines are done by
inter-style elements, but not as we have now. Currently the inter
elements only work with raw video or audio, and produce a constant
framerate live stream. In the future we would have something like Arun's
inter(app)src/sink elements. The source would then only output a buffer
whenever a buffer arrives on the sink, basically acting like a queue.
They would also pass through timestamps, segments, and other events, as
well as the ALLOCATION and LATENCY queries.
This would mean that all elements would use the same clocks and
synchronization "configurations", meaning that timestamps from one
pipeline are also meaningful in other pipelines.
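A small sketch of why a shared clock and a base time of 0 make timestamps portable between pipelines. It relies only on the usual definition that running time is clock time minus base time; the `Pipeline` class and `to_other_pipeline` helper are hypothetical names for illustration, not Owr or GStreamer API.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    name: str
    base_time: int  # nanoseconds, in the same units as the shared clock

    def running_time(self, clock_time: int) -> int:
        # Running time is the clock time minus the pipeline's base time.
        return clock_time - self.base_time


def to_other_pipeline(running_time: int, src: Pipeline, dst: Pipeline) -> int:
    """Translate a running time from src's timeline into dst's timeline."""
    return running_time + src.base_time - dst.base_time


# Both pipelines follow the proposal: same clock, base time 0.
capture = Pipeline("capture", base_time=0)
render = Pipeline("render", base_time=0)

# A pipeline whose base time was taken when it started, 2000 ns later.
late = Pipeline("late", base_time=2_000)

same = to_other_pipeline(5_000, capture, render)   # no translation needed
shifted = to_other_pipeline(5_000, capture, late)  # offset by the base time
```

With both base times at 0 the translation is the identity, so a timestamp produced in one pipeline can be used unchanged in another. With differing base times every buffer crossing a pipeline boundary would need the offset applied.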
This would also mean that the inter elements would not introduce any
additional latency and just work as a queue. And latency would be passed
through, and only later used by synchronizing sinks (audio/video sinks,
RTP udpsinks). As a side effect, audio and video streams would now also
be definitely synchronized correctly together as long as all display
sink pipelines have the same latency configured, which we can make
happen.
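The synchronization claim can be illustrated with a simplified model of when a synchronizing sink presents a buffer. The assumption here is that the sink waits until the clock reaches base time plus the buffer's running time plus the configured latency, and that buffer timestamps equal running times. `render_clock_time` and `av_skew` are hypothetical helpers, and the latency values are made up for the example.

```python
MSECOND = 1_000_000  # nanoseconds per millisecond


def render_clock_time(running_time: int, latency: int, base_time: int = 0) -> int:
    """Clock time at which a synchronizing sink presents the buffer."""
    return base_time + running_time + latency


def av_skew(running_time: int, audio_latency: int, video_latency: int) -> int:
    """Difference between audio and video presentation times, in ns."""
    return (render_clock_time(running_time, audio_latency)
            - render_clock_time(running_time, video_latency))


# Audio and video captured at the same instant carry the same timestamp.
captured_at = 1_000 * MSECOND

in_sync = av_skew(captured_at, 40 * MSECOND, 40 * MSECOND)
out_of_sync = av_skew(captured_at, 40 * MSECOND, 75 * MSECOND)
```

When both sink pipelines use the same latency the skew is zero. When they differ, the skew is exactly the difference in latency, which is why the proposal wants one latency configured across all display sink pipelines.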
And finally this also means that we can support compressed sources and
zerocopy. Nothing would stand in the way anymore, as the current inter
elements do. We would have to insert max-rate videorates for raw video
streams again though.