Low-latency HLS Streaming #1
gj on the initiative! some initial notes:
@shacharz Thanks for the feedback!
Here are some additional comments. Nice work, John! Overall, I don't see any trouble spots with implementing this.
I'm not sure how a media server can always guarantee that the duration of the prefetch segment is <= TARGETDURATION. The target duration in real-world live cases is typically the largest segment that has been seen up to that point. For example, if the media server is not transcoding and only repackaging on GOP boundaries of the incoming encoded stream, the incoming stream may end up with a prefetch segment longer, based on GOP boundaries, than any it has previously seen. This may be an edge case, but it would be fairly common with Wowza's server.
6b. I think rebuffering and playback rate are client-specific. This spec does LHLS really well, so IMO focus on that and leave these up to specific client implementations. Different use cases may call for different approaches (gradual catch-up to live versus immediately jumping to live in cases of drift, for example).
6c. I think CMAF is the logical companion of LHLS for many reasons, but no reason to restrict TS chunk delivery that I can see.
Hey @ScottKell! Thanks for the in-depth review. I've been a bit tied up with the next Hls.js release but I'll address your feedback shortly after.
On point 3, Prefetch Media Segments: for LHLS + chunked CMAF segments, if we want to align the logic with DASH (where the AvailabilityTimeOffset parameter handles this case), the client shall not be told to request a segment until the first CMAF chunk is available on the origin (as it is the smallest logical unit). If we reference a segment for prefetch one segment ahead, it will open multiple CDN connections to the origin (hopefully not too many, if the CDN correctly collapses requests), and assuming the origin doesn't reject those connection requests, the gain will just be the network connection opening time, which is negligible compared to the segment duration/load time.

For LHLS + TS segments I guess there are fewer constraints, so starting after a few uploaded bytes (equivalent to the TS headers?) might work fine. The additional problem is that the origin can add some latency if it's buffering the data coming from the packager, so an AvailabilityTimeOffset defined at the packager level won't be totally accurate - the packager will need to add the origin-generated latency to define the precise time when the segment can be advertised for prefetching in the playlist.
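To make that timing rule concrete, here's a minimal sketch; the function and parameter names are hypothetical, not from the RFC or any packager API:

```ts
// Hypothetical packager-side check for when a segment may be advertised as
// a prefetch segment. All names here are illustrative.
function earliestAdvertiseTime(
  firstChunkAtPackagerMs: number,  // wallclock ms when the first CMAF chunk left the packager
  originBufferingLatencyMs: number // measured extra delay before the origin can serve bytes
): number {
  // The playlist should only reference the segment once its first chunk is
  // actually servable from the origin, per the discussion above.
  return firstChunkAtPackagerMs + originBufferingLatencyMs;
}

// Usage sketch: advertise once the current time passes the computed instant.
const canAdvertise = Date.now() >= earliestAdvertiseTime(1_700_000_000_000, 150);
```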
Good point, I'll change language to reflect this.
This is Apple's language but I think I can do a better job of simplifying it.
Sounds reasonable, but I'm wondering if this needs to be a requirement. Is it fundamentally impossible (or ill-advised) to advertise before the first byte is available, or is this specific to Wowza?
Hmm, it seems like errors are allowed by RFC8216:
The client should be able to handle bad status codes (and will have to), so I think it's better to imply that the client must be able to handle these cases.
A prefetch tag is repeated if it's found in two manifests, e.g. it remains after refreshing. But now that I'm reading this again it doesn't really make sense. Will mark as resolved.
True, it's a bit of a can of worms to offer guidelines on this. I was thinking more from the perspective of an event publisher, who wants to guarantee that their viewers are as close to the live edge as possible, regardless of which client they're using. If no catch-up is required it's hard to guarantee without knowing how the client is implemented. But I agree, I think it should be left up to the client.
That's what I've been hearing as well. Will mark as resolved. Again, my thanks to you and the Wowza team for the feedback! 👍
It'd be pretty difficult to make an AvailabilityTimeOffset analogue in HLS.
We're limiting the maximum number of prefetch segments to 2 to deal with load. We wanted to allow just one, but some encoders have setups where two segments can be transcoding at once (the example I was given was that the next segment begins while the B-frames of the last segment are being completed).
I think this is generally a good practice, but it's up to the server. Putting stuff in the spec has its own danger - trying to remove/deprecate features becomes more difficult because devs may be relying on it.
Yeah, this will be a problem with manifest refreshing too - the actual refresh time should be the duration of a segment plus whatever latency is incurred on the encoding/delivery side. As alluded to before, I believe this can be accomplished with the ETag, where refreshOffset is added to the manifest refresh time (usually equal to the duration of the playlist). Just some napkin math, but I believe that's the general idea. This is Will Law's idea; I'm going to follow up with him to see if this is correct. I'm not sure yet how necessary this will be for the success of LHLS, so it's not in the spec yet; it may be a "wait and see" thing. Thanks for the feedback!
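A hedged sketch of the idea as described: the ETag carries how far the server is into the current segment, and the client derives refreshOffset from it. The header format, state names, and math are assumptions for illustration only:

```ts
import { createServer } from "http";

// --- Server side (hypothetical packager state) ---
let playlistText = "#EXTM3U\n";
let lastSegmentAddedAt = Date.now();

const server = createServer((_req, res) => {
  // Tell the client how far into the current segment the server is, so the
  // client can phase-align its refresh timer. The ETag is used as the
  // carrier here; a custom header would work just as well.
  const elapsedMs = Date.now() - lastSegmentAddedAt;
  res.setHeader("ETag", `"elapsed=${elapsedMs}"`);
  res.setHeader("Content-Type", "application/vnd.apple.mpegurl");
  res.end(playlistText);
});
server.listen(8080);

// --- Client side ---
// refreshOffset shifts the usual "reload after one target duration" timer so
// the next request lands just after the server publishes the next segment.
function nextRefreshDelay(targetDurationMs: number, etag: string): number {
  const match = /elapsed=(\d+)/.exec(etag);
  const elapsedMs = match ? Number(match[1]) : 0;
  return Math.max(0, targetDurationMs - elapsedMs);
}
```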
I left a few suggestions, mostly to fix some of the language.
Some more comments:
- I would also like to see documentation about which use cases are out-of-scope (ULL / real-time communication).
- The guide-level explanation specifically mentions Hls.js a number of times, even though the section describes a pretty generic client implementation (MSE and Fetch API excluded). I think it might be valuable to focus less on the Hls.js implementation in that section.
- In general, I think the spec should not depend on the client-side implementation being written in JavaScript or using the MSE/Fetch APIs. These are used in the concrete Hls.js implementation, but implementation of a client that does not have access to these APIs (for example in an FFmpeg-based client solution) should be possible using this RFC. However, the Hls.js implementation can be used to illustrate how a client could be implemented.
## Media Segment Tags

The server must not precede any prefetch segment with metadata other than those specified in this document, with the specified constraints.
I think this constraint breaks extensibility. Since you explicitly list the `#EXT-X-` tags that are not allowed in this section, this statement can be considered redundant.
* Transform a prefetch segment to a complete segment. ([Prefetch Transformation](#prefetch-transformation))

To each prefetch segment response, the server must append the `Transfer-Encoding: chunked` header. The server must maintain the persistent HTTP connection long enough for a client to receive the entire segment - no less than the time from when the segment was first advertised until it completes.
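As a rough illustration of this delivery rule, here's a minimal Node.js sketch that holds the response open and streams the segment as it's produced; the event source is a hypothetical stand-in for the packager:

```ts
import { createServer } from "http";
import { EventEmitter } from "events";

const segmentEmitter = new EventEmitter(); // hypothetical packager output

const server = createServer((req, res) => {
  // With no Content-Length set, Node uses chunked transfer encoding on HTTP/1.1.
  res.writeHead(200, { "Content-Type": "video/mp2t" });

  const onData = (chunk: Buffer) => res.write(chunk); // push bytes as they arrive
  const onComplete = () => res.end();                 // close once the segment is done
  segmentEmitter.on("data", onData);
  segmentEmitter.once("complete", onComplete);

  // Clean up if the client goes away before the segment completes.
  req.on("close", () => {
    segmentEmitter.off("data", onData);
    segmentEmitter.off("complete", onComplete);
  });
});
server.listen(8080);
```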
This limits communication to HTTP/1.1, because HTTP/2 does not support this mechanism (as described in RFC 7540, Section 8.1).
Furthermore, Chunked Transfer Encoding requires that we follow a specific protocol (described in RFC 7230, Section 4.1) which is not referred to here.
I recommend making a distinction between HTTP versions and referring to the specs that describe streaming (chunked) data transfer.
👍 Good idea, I've gotten similar feedback
## What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

- Alternative connection protocols (WebRTC, Websockets, etc.)
- Manifestless mode
It might be worth linking to a resource that describes this concept.
Yeah, we can link out to some DASH docs for now; there isn't anything specified for HLS atm.
proposals/0001-lhls.md
#EXT-X-PREFETCH:https://foo.com/bar/7.ts
```

`5.ts`, `6.ts`, and `7.ts` all have a Discontinuity Sequence Number of 1. Note how the `PREFETCH-DISCONTINUITY` tag was transformed into the conventional `EXT-X-DISCONTINUITY` tag, and how that tag still applies to prefetch segments.
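To make the transformation concrete, here's a hypothetical playlist consistent with the description above; the tag values and surrounding segments are illustrative, not from the RFC:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:2
#EXT-X-MEDIA-SEQUENCE:4
#EXT-X-DISCONTINUITY-SEQUENCE:0
#EXTINF:2.000
https://foo.com/bar/4.ts
#EXT-X-DISCONTINUITY
#EXTINF:2.000
https://foo.com/bar/5.ts
#EXTINF:2.000
https://foo.com/bar/6.ts
#EXT-X-PREFETCH:https://foo.com/bar/7.ts
```

Here `4.ts` has Discontinuity Sequence Number 0, while everything after the `EXT-X-DISCONTINUITY` tag (including the prefetch segment `7.ts`) has Discontinuity Sequence Number 1.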
A new `#EXT-X-PROGRAM-DATE-TIME` tag was also introduced. I cannot find this behavior described in earlier sections.
It was in an old draft, I'll add it back
@johnBartos Thanks for this work! Please find my comment below. I still don't understand why we would need 2 prefetch segments. Yes, I agree that the encoder might still be working on the B-frames of the last segment while working on the I-frame and/or P/B-frames of the current segment. This is expected behavior when the encoder is multi-threaded based on frame parallelism. But I would still expect the encoder to output and upload the frames in monotonically increasing DTS order. In that case, the upload of the current segment would start only after the upload of the previous segment is complete.

One practical example is the x264 encoder + ffmpeg. x264 is frame-based multithreaded and hence could work on frames across segments at the same time, but it still always outputs frames in monotonically increasing DTS order. Hence I would suggest limiting the number of prefetch segments to 1. Please feel free to correct me if my understanding is incomplete.
@johnBartos I concur with @jkarthic-akamai: while it makes sense to prefetch the N+1 segment as soon as the first bytes or CMAF chunk are available on the origin, prefetching the N+2 segment will require supporting long polling over a duration longer than the segment duration. It's just gonna open sockets with no data to transmit, and generate timeouts and false-positive errors in the logs (like in DASH with the wallclock misalignment problems). Honestly I don't see CDNs or origins supporting N+2 prefetch anytime soon; the benefit of it is absolutely not proven, and very long polling comes with a lot of security risks for the CDNs/origins. It's just not realistic to require N+2 prefetch support across the chain. I also suggest limiting the number of prefetch segments to 1.
@jkarthic-akamai @nicoweilelemental I agree; thanks for the breakdown. I'll amend the spec for N+1 only.
Thanks @johnBartos - looking forward to the hls.js implementation!
proposals/0001-lhls.md
# Summary
[summary]: #summary

Low-latency streaming is becoming an increasingly desired feature for live events, and is typically defined as a delay of two seconds or less from point of capture to playback (glass-to-glass). However, the current HLS specification precludes this possibility - within the HLS guidelines, the best attempts have achieved about four seconds glass-to-glass, with average implementations typically beyond thirty seconds. This RFC proposes modifications to the HLS specification ("HTTP Live Streaming 2nd Edition", IETF RFC 8216, draft 03) which aim to reduce the glass-to-glass latency of a live HLS stream to two seconds or below. The scope of these changes is centered around a new "prefetch" segment: its advertising, delivery, and interpretation within the client.
This is kind of a moot point because I think 2 seconds is a good goal for this project, but 2 seconds is probably the most aggressive definition of "low latency" I've seen. Wowza pegs it at 1-5s and @wilaw has it at 4-10s, with 2 seconds being closer to "Ultra low latency". Not sure what I expect you to do with that info but thought it was worth pointing out in case there's opportunity for industry consistency.
Agreed. I've been working with Will on the new latency-range definitions that you have seen in his Demuxed presentation. We defined low latency as what you can achieve with 1s and 2s segments of regular HLS/DASH (meaning 4 to 10 seconds of latency), and ultra low latency as what you can achieve with chunked CMAF (meaning between 1 and 4 seconds). We used this technology criterion because the previous latency-range definition by Wowza & Streaming Media was mixing technology and use-case requirement criteria.
So maybe we should say here: 'Low-latency and Ultra low-latency streaming are becoming increasingly desired features for live events, and are typically defined as a delay of 4 to 10 seconds for low latency and 1 to 4 seconds for ultra low latency, from point of capture to playback (glass-to-glass)'.
Yeah agreed. Thanks for keeping me up to date with Will's work - I want to be aligned with whatever he's doing where possible. Has he been factoring the Streamline project into his definitions? The LHLS fork of Exoplayer is getting 1.1s latency which is pretty crazy.
'Low-latency and Ultra low-latency streaming are becoming increasingly desired features for live events, and are typically defined as a delay of 4 to 10 seconds for low latency and 1 to 4 seconds for ultra low latency, from point of capture to playback (glass-to-glass)'
Sounds good to me!
As of writing this, Hls.js by default will have between one and two segments' duration of client-side latency. The plan is to start the stream at the last complete segment and begin buffering prefetch from there. I'm not sure what server-side latency looks like, but it should put us at the lower end of the "Low latency" definition.
We defined low latency as what you can achieve with 1s and 2s segments of regular HLS/DASH (meaning 4 to 10 seconds of latency), and ultra low latency as what you can achieve with chunked CMAF (meaning between 1 and 4 seconds).
So by this definition we're building an ultra low latency player.
By default yes, but the player configuration shall allow a higher target latency. Which leads me to another closely related consideration: in DASH we are heading towards setting the target latency at the manifest level, and not at the player-configuration level. Would it be interesting for LHLS to discuss a similar approach, like #EXT-X-TARGETLATENCY: 2500 (if we measure in milliseconds)?
@nicoweilelemental That's an interesting idea! There are a lot of advantages to being able to configure the playout latency in the manifest itself. If we're going ahead with this, I propose a small modification though. Instead of setting a target latency, I suggest we set a target buffer size. Since the encoder's latency is not in the player's control, a latency target could be a little misleading. Instead, we could ask the player to maintain a specific target buffer size, with the condition that it continuously loads data to be as close to the live edge as possible. Something like #EXT-X-TARGETBUFFERSIZE: 2500.
You're right @jkarthic-akamai, it's hard for the player to determine what the actual E2E latency is. The only way would be to make at least one #EXT-X-PROGRAM-DATE-TIME insertion mandatory per child playlist. In DASH we recommend putting the Producer Reference Time in the prft mp4 box, which the player can parse to get the actual timecode of a segment; it could replace #EXT-X-PROGRAM-DATE-TIME for latency measurement. Setting #EXT-X-TARGETBUFFERSIZE instead would roughly fill the same purpose while relaxing the dependency on absolute time.
|
||
The client may opt into an LHLS stream. If so, the client must choose a prefetch Media Segment to play first from the Media Playlist when playback starts. The client must choose prefetch Media Segments for playback in the order in which they appear in the Playlist; however, the client may open connections to as many prefetch segments as desired. If data from a newer prefetch Media Segment is received before an older one, the client should not append this data to the SourceBuffer; doing so may stall playback. If the client opts out of LHLS, it must ignore all prefetch Media Segments, and any additional constraints outlined in this specification. | ||
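As an illustration of the ordering rule in the paragraph above, here's a sketch of a client-side guard. The names are illustrative, and real MSE code would also have to serialize appends on `updateend` (elided here):

```ts
// Hold back bytes from a newer prefetch segment until the older one has
// been fully appended, to avoid the playback stall described above.
class OrderedAppender {
  private nextToAppend = 0; // index of the segment we may append, in playlist order
  private pending = new Map<number, Uint8Array[]>();

  constructor(private sourceBuffer: SourceBuffer) {}

  onChunk(segmentIndex: number, chunk: Uint8Array): void {
    if (segmentIndex === this.nextToAppend) {
      this.sourceBuffer.appendBuffer(chunk); // in order: append immediately
      return;
    }
    // Data from a newer prefetch segment arrived early - buffer it rather
    // than appending it out of order.
    const queue = this.pending.get(segmentIndex) ?? [];
    queue.push(chunk);
    this.pending.set(segmentIndex, queue);
  }

  onSegmentComplete(segmentIndex: number): void {
    if (segmentIndex !== this.nextToAppend) return;
    this.nextToAppend++;
    // Flush whatever was buffered for the segment that is now next in order.
    for (const chunk of this.pending.get(this.nextToAppend) ?? []) {
      this.sourceBuffer.appendBuffer(chunk);
    }
    this.pending.delete(this.nextToAppend);
  }
}
```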
The client may set a minimum amount of buffer to begin and maintain playback. The client should not impose a minimum buffered amount greater than one target duration; doing so may introduce undue latency.
This feels like somewhere we might want to provide a little more guidance, smart defaults, or open the door to configure the target min buffer somehow. With ultra low latency there will be a fine balance between lower latency and more rebuffering, which will likely be audience-dependent. A lot of players don't give you any buffer config options today.
Yeah I need to rework this section. It was trying to be an analogue of this section in the HLS spec but it didn't really come out right.
This feels like somewhere we might want to provide a little more guidance, smart defaults, or open the door to configure the target min buffer somehow.
We're trying to be as hands-off as possible with recommendations - clients should be able to build whatever experience best suits their use case. Maybe we can strike a balance with language ("should" instead of "must"), but it may be more productive to put any kind of guidance in an ancillary doc where we don't have to worry about spec compliance. Hls.js will be a reference implementation, too.
clients should be able to build whatever experience best suits their use case.
Just to clarify: they should build whatever low-latency use case best suits them. The original intent behind this section was to ensure that they were operating like a low-latency player (and not just ignoring prefetch segments or having like 30s of buffer or whatever), but I don't think it came out right. I'll take another stab at it.
@nicoweilelemental's other comment about providing the target latency in the manifest would actually solve my concerns here. I like that idea a lot. Assuming a player respects that tag, it wouldn't have to expose much else.
+1 to target latency in the manifest. There definitely needs to be a knob to turn for different use cases as they walk the line between latency and rebuffering
I agree with Will and his summary. The client will play the LHLS stream as best as it can given the current conditions and configuration. The problem with a target is that it's not actionable - if the client is behind because the network is slow, there's nothing it can do to get ahead. I was checking to see if DASH had something similar but couldn't find anything (but I didn't have the chance to look very hard).
But even then, manifest updates and network delays can cause us to stall if we try to play at the live edge. So we will actually delay where we will play at so we don't stall. The time we adjust by can be specified in the manifest using the MPD@suggestedPresentationDelay attribute. This specifies the delay to give the live stream to allow for smooth playback. If it isn't specified, we will give a reasonable default value.
`suggestedPresentationDelay` looks interesting (it seems to specify a minimum latency), but I'm not sure how useful it is for LHLS. The encoder must update the manifest with new segments on an interval equal to the average length of a segment.
I think the manifest should add information to allow the latency to be estimated, and I would support the spec saying that the packager MUST add #EXT-X-PROGRAM-DATE-TIME to the media playlists.
I had #EXT-X-PROGRAM-DATE-TIME in an original draft just for this purpose, but deleted it - I couldn't come up with an acceptable definition that all encoders could follow. Is the timestamp the time when the segment begins transcoding, or when it has finished? Or is it something like when the manifest was created in memory (or between whatever stages in a transcoding pipeline)? Maybe it doesn't matter too much. Input appreciated here; I'd like to add it back.
There are indeed many potential reference points for where to hang the definition of #EXT-X-PROGRAM-DATE-TIME. MPEG has defined 6, if I remember correctly, for the equivalent in DASH. There is no need for that complexity here. All you need is a COMMON reference point that all clients can access. They can then achieve synchronization (per @jkarthic-akamai's comment above) by targeting a fixed delta from that point. A practical point would be the wallclock time at which that frame of media entered the encoder. Assuming very small camera delay, the delta between that value and the wallclock time when the media frame is displayed by the client then represents the end-to-end latency. That's for lab confirmation. In the real production world, there is an unknown production delay upstream of the encoder (camera, OB truck, satellite contribution, broadcast profanity delay, etc.), so it's very difficult for the end client to calculate the true e2e latency. Luckily, we don't need to know the true e2e latency even for sync; we just need a consistent reference point between clients.
I can get behind the philosophy of the manifest just reporting availability. But defining the target latency somewhere is critical in these use cases, so it brings me back to the original comment about needing more in the spec to get clients to expose configuration - i.e. if iOS Safari implements LHLS with a set latency target and no option to configure it, that's not good.
A practical point would be the wallclock time at which that frame of media entered the encoder.
I'm not sure about the other options, but that seems the most sensible. However, with a UGC platform, streamers can use anything that streams RTMP (e.g. OBS) to the central service, and in that case I don't believe the central service has access to when the media frame entered the original encoder. For this use case I think I'd be fine with just using the time the central service received the media frame in the stream. It ignores the time prior to that, but assuming I can configure all my players' target latencies, I can adjust for that.
I had #EXT-X-PROGRAM-DATE-TIME in an original draft just for this purpose, but deleted it - I couldn't come up with an acceptable definition that all encoders could follow. Is the timestamp the time when the segment begins transcoding, or when it has finished? Or is it something like when the manifest was created in memory (or between whatever stages in a transcoding pipeline)? Maybe it doesn't matter too much. Input appreciated here; I'd like to add it back.
I agree that there is no need for us to define #EXT-X-PROGRAM-DATE-TIME strictly. We could just stick with the original definition from the official HLS spec: https://tools.ietf.org/html/draft-pantos-http-live-streaming-23#page-17. We just suggest making it a mandatory (MUST) parameter, so that all clients have a common reference point to sync.
FWIW, we are relying on #EXT-X-PROGRAM-DATE-TIME to decide which segment to start playback at in order to reach our target latency. This approach has been successful for us.
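A sketch of how a client might do this, assuming each segment's absolute time has been derived from `#EXT-X-PROGRAM-DATE-TIME` (and `EXTINF` offsets); all names are illustrative:

```ts
interface SegmentInfo {
  url: string;             // media URI from the playlist
  programDateTime: number; // ms since epoch, derived from EXT-X-PROGRAM-DATE-TIME
}

function pickStartSegment(segments: SegmentInfo[], targetLatencyMs: number): SegmentInfo {
  const now = Date.now();
  // Segments are in playlist order (oldest first), so latency shrinks as we
  // walk forward; start at the first segment already within the target.
  for (const segment of segments) {
    if (now - segment.programDateTime <= targetLatencyMs) return segment;
  }
  // Nothing is within target (e.g. a stale playlist): fall back to the newest.
  return segments[segments.length - 1];
}
```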
Some feedback coming from our experience at Periscope and Twitter, after managing a large-scale LHLS deployment for more than 2 years: having 2 prefetch segments is an important feature for us to constrain the total end-to-end latency. It allows the client to start receiving data immediately for the next segment after the current prefetch segment has ended. Without this, the client would need to eagerly request a new playlist immediately after the server closes the current prefetch segment. This means that:
That was the conclusion @TBoshoven and I came to - I hadn't considered the impact of refresh misses when we were having the above discussion. I'm going to amend the spec to be able to handle multiple prefetch segments. If your system can support a dozen prefetch segments, the spec shouldn't stop you; if you want to support just one, that's fine too. But I believe that two will be the natural default. We're trying to avoid being prescriptive wherever possible. Thanks for sharing your experience! If there's anything else you see that could be improved I'd be happy to hear it. (Edit: removed the requirement for at least two. There may be clients/servers who negotiate some other refresh scheme which allows for accurate refreshes with just one.)
@biglittlebigben makes a very good point. I faced the exact same issue he mentioned when I tried to add LHLS support to Exoplayer with just 1 prefetch segment. Definitely, 2 prefetch segments in the playlist would be useful to reduce latency significantly. But for practical systems we could add a condition that the client should request the 2nd PREFETCH segment only after the 1st PREFETCH segment has loaded completely. That way we could achieve low latency without imposing strict conditions around CDNs and origins supporting long polling. Also, I don't understand the reason behind more than 2 prefetch segments. Can we limit the maximum number of prefetch segments to 2? Is there a practical use-case for publishing 3 or more prefetch segments?
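A sketch of that sequential-fetch policy using the Fetch API's streaming body; the URLs and chunk handler are placeholders, not part of the spec:

```ts
// Request prefetch segment N+2 only after N+1 has downloaded completely,
// so the origin never sees a long-polled request for a segment with no data.
async function loadPrefetchSequentially(urls: string[]): Promise<void> {
  for (const url of urls) {
    const response = await fetch(url);
    const reader = response.body!.getReader();
    for (;;) {
      const { done, value } = await reader.read();
      if (done || !value) break; // the server ended the chunked response
      handleChunk(value);        // e.g. feed the demuxer / SourceBuffer
    }
    // Only now fall through to the next prefetch URL.
  }
}

function handleChunk(chunk: Uint8Array): void {
  // Application-specific: push into the transmuxer or append to MSE.
}
```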
This requirement of waiting would delay arrival of data for the 2nd prefetch segment by a network round-trip time after the client is done downloading the 1st one. Any such added jitter will require the client to keep a longer buffer (by the round-trip duration here). Maybe that's an acceptable trade-off if there is a consensus that supporting long requests is a challenge for CDNs, but again, we have had no issue with our vendor.
I would say that there is a tradeoff between the number of prefetch segments and how often the client needs to refresh the playlist. This is particularly true for segments before a discontinuity, which can be shorter than the target duration.
[discontinuities]: #ext-x-discontinuity

A prefetch segment must not be advertised with an `EXT-X-DISCONTINUITY` tag. To insert a discontinuity just for prefetch segments, the server must insert the `EXT-X-PREFETCH-DISCONTINUITY` tag before the newest `EXT-X-PREFETCH` tag of the new discontinuous range.
In our use case, discontinuities are caused by a gap in timestamps in the stream sent by the broadcaster (because of a networking issue or a camera flip for instance). This means that we do not know if there is a discontinuity at the time we advertise the prefetch segment. We only know of the discontinuity at the point when we write the first data into the prefetch segment.
This means that we would add a EXT-X-PREFETCH-DISCONTINUITY tag to a prefetch segment after it has first been advertised, and that a player may not see the EXT-X-PREFETCH-DISCONTINUITY tag by the time it starts playing back media from that prefetch segment.
This behavior doesn't seem to be explicitly forbidden by the spec, but it does require the player to have some heuristic to detect discontinuities that were not advertised in time.
This means that we would add a EXT-X-PREFETCH-DISCONTINUITY tag to a prefetch segment after it has first been advertised, and that a player may not see the EXT-X-PREFETCH-DISCONTINUITY tag by the time it starts playing back media from that prefetch segment.
I can foresee this being a problem if there is a large gap between the PTS values. Hls.js feeds a discontinuity tag into the muxer so that it can modify PTS values post-discontinuity to ensure that there is no gap. Without the tag we'd be inserting a gap into the SourceBuffer equal to the PTS gap in seconds. Depending on how much forward buffer exists we can jump it without a stall, but we'd still need to do a seek in userland which would cause a momentary disruption.
How would you imagine the interaction between EXT-X-GAP and prefetch segments should work? Should it be allowed to add EXT-X-GAP to an already advertised prefetch segment? That would be another way to address the use case above I believe.
This behavior doesn't seem to be explicitly forbidden by the spec
I'll add language which addresses this case. It doesn't make sense to add it as a prefetch discontinuity after the fact, but it must appear when transformed to a complete segment. Players should be able to handle gaps, so I don't believe that this is a problem.
Maybe the prefetch discontinuity isn't even needed anymore - I need to think a bit harder about SSAI workflows. You can probably just stop adding prefetch segments and then insert a regular discontinuity before the ads.
How would you imagine the interaction between EXT-X-GAP and prefetch segments should work? Should it be allowed to add EXT-X-GAP to an already advertised prefetch segment? That would be another way to address the use case above I believe.
I suppose a gap would a) cause the client to ignore the segment or b) abort the connection if it has already been requested, but data has not yet been received. Not sure if I understand how to fix gaps with this - would you mind giving me an example?
In general, the blocker for post-advertisement insertion of prefetch tags is that clients may not refresh the playlist in time to know that the segment has a new tag. I can see adding tags post-advertisement in the case that there are multiple prefetches and the tag is on N+1 or greater - the client should refresh the manifest before the encoder begins sending segments. But I think there will be some tricky race conditions with this kind of solution.
We currently detect the gap in timestamps in the backend. If it is over a threshold, we start a new segment and insert a discontinuity. This does mean that players watching over LHLS will not hear about the discontinuity in time from the playlist. Our player has heuristics to detect such gaps.
So, I don't believe this makes much of a change to that workflow. The prefetch segment, up until the point the packager detects the gap in the backend, would continue to push data normally like a prefetch. Upon detecting the gap, it would complete the segment, close the connection, update the manifest locally with that completed segment (now advertising its length with an `EXTINF`), and push that manifest to the origin with the next prefetch having the discontinuity applied to it. Thoughts?
That's what we would do indeed. Is this allowed by the spec though? (inserting a discontinuity for a segment that didn't have a prefetch discontinuity?)
@biglittlebigben Yes, and I believe it should also be mandatory
That would work for our use case then. Any feedback from player developers about having to detect and handle timestamp discontinuities?
Hi all, we've been working hard on a demo in order to prove out some of the ideas we've introduced here. You can find it here: http://demo.jwplayer.com/lhls/. A beta build of Hls.js will be made available in the near future, which you can play around with yourself. In this demo, you can use your own file. The results have been very positive so far: we're able to reach ~1s of latency under ideal conditions, but 4s is typically a comfortable amount. Our tests have currently been done with LL-CMAF. The demo has surfaced some new requirements, which are critical to a functional LHLS implementation:
Requirement 1 requires […]. Requirement 2 is a bit trickier: synchronization of updates with the server is critically important if you're using one prefetch tag, since a refresh miss can mean the client runs through whatever small forward buffer it has. Our current implementation estimates the next manifest refresh using a combination of […].

What requirement 2 also exposes is how the current refresh-miss logic is insufficient for smooth playback at lower latencies. It states that the interval should be halved on a miss - we have found this to be much too slow in practice. In the case of a miss, the next reload needs to be made much sooner.

In general, the demo has underscored the challenge of implementing LHLS. I'd like to continue a discussion on a companion guide which outlines best practices/patterns for accomplishing common functionality, such as playback-rate manipulation for latency targeting. I know I've been a bit slack in integrating the current round of feedback, but I'll be getting around to that soon. Having a functioning codebase will be critical for testing out solutions to the more challenging requirements.
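For the refresh-miss point, here's a sketch of the kind of policy described - reload quickly on a miss instead of halving the interval. The retry delay is an assumed value for illustration, not from the spec:

```ts
// Decide when to reload the playlist next, given whether the last reload
// actually picked up a change ("hit") or not ("miss").
function nextReloadDelay(
  targetDurationMs: number,
  playlistChanged: boolean,
  retryDelayMs = 250 // assumption: much sooner than targetDuration / 2 on a miss
): number {
  if (playlistChanged) {
    // Hit: wait roughly one target duration, ideally phase-aligned with the
    // server (see the ETag discussion earlier in the thread).
    return targetDurationMs;
  }
  // Miss: the next segment should appear at any moment - poll quickly.
  return retryDelayMs;
}
```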
I agree with you on the issues with timing the manifest refresh properly, and it is a potential bottleneck for latency improvement. Based on this experience, should we modify the spec to mandate at least two PREFETCH tags for LHLS? When we mandate two PREFETCH URLs, I would like the spec to clearly mention that the fetching of the first prefetch should be COMPLETE before the request for the second prefetch segment is made. Such wording in the spec will relax the long-polling support requirements for the CDN or the origin HTTP server. Better yet would be to specify a mandatory delay between download completion of the nth segment and the download start of the (n+1)th segment. The encoder can set this delay equal to one frame duration, so that the existing set of HTTP servers and CDNs can support LHLS with two PREFETCH tags seamlessly.
Apple announced Low-Latency HLS at WWDC 2019.
It would seem to me the preliminary LHLS spec by Apple is vastly more complicated than what has been discussed here.
Was wondering about thoughts on community LHLS versus Apple Low-Latency HLS.
I think there is 0% chance that Apple takes what the community has developed. They have decided on their path and are headed in that direction. Not sure about hls.js, but I know other player vendors, including Wowza, are working on implementing the Apple LHLS spec. ScottK
Hi all, It's time to kill LHLS. From what I can tell, Apple's LLHLS is the choice of the community - therefore, there's no point in continuing on here. Apple has done a great job working with video-dev to address our concerns, and I feel that the best way forward is to continue to work together and make LLHLS the best it can be. It's my hope that we can continue to have a voice within Apple to continue advocating for low-latency and HLS beyond.
Thanks John. And good on yer!
Thanks John! Great job stewarding this forward.
Thanks John!
Please leave all feedback in this PR. Nothing is set in stone; if you think something won't work, or there's a better way, please open a discussion so we can make it better. I'm still doing editing & cleanup but grammatical improvements are also very appreciated.
Also note that this PR isn't just for Hls.js - other clients are welcome and encouraged to participate. It'd be great to see more implementations in the wild.
Please see https://github.com/video-dev/hlsjs-rfcs/pull/1/files#diff-585ec6c4984f979b2e85a1ee0280b804R191 for the list of open questions. Everyone is welcome to help resolve them!