Low-latency HLS Streaming #1
gj on the initiative! some initial notes:
@shacharz Thanks for the feedback!
Here are some additional comments. Nice work, John! Overall, I don't see any trouble spots with implementing this.
I'm not sure how a media server can always guarantee that the duration of the prefetch segment is <= TARGETDURATION. The target duration in real-world live cases is typically the largest segment that has been seen up to that point. For example, if the media server is not transcoding and only repackaging on GOP boundaries of the incoming encoded stream, the incoming stream may end up with a prefetch segment longer, based on GOP boundaries, than any it has previously seen. This may be an edge case, but it would be fairly common with Wowza's server.
6b. I think rebuffering and playback rate are client-specific. This spec does LHLS really well, so IMO focus on that and leave these up to specific client implementations. Different use cases may call for different approaches (gradual catch-up to live versus immediately jumping to live in cases of drift, for example).
6c. I think CMAF is the logical companion of LHLS for many reasons, but no reason to restrict TS chunk delivery that I can see.
Hey @ScottKell! Thanks for the in-depth review. I've been a bit tied up with the next Hls.js release but I'll address your feedback shortly after.
On point 3, Prefetch Media Segments: for LHLS + chunked CMAF segments, if we want to align the logic with DASH (where the AvailabilityTimeOffset parameter handles this case), the client shall not be told to request a segment until the first CMAF chunk is available on the origin (as it is the smallest logical unit). If we reference a segment for prefetch one segment ahead, it will open multiple CDN connections to the origin (hopefully not too many, if the CDN correctly collapses requests), and assuming the origin doesn't reject those connection requests, the gain will just be the network connection opening time, which is negligible compared to the segment duration/load time.

For LHLS + TS segments I guess there are fewer constraints, so starting after a few uploaded bytes (equivalent to the TS headers?) might work fine. The additional problem is that the origin can add some latency if it's buffering the data coming from the packager, so an AvailabilityTimeOffset defined at the packager level won't be totally accurate - the packager will need to add the origin-generated latency to define the precise time when the segment can be advertised for prefetching in the playlist.
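To make that timing rule concrete, here's a minimal sketch; the function and parameter names are hypothetical, not from the RFC or any packager API:

```ts
// Hypothetical packager-side check for when a segment may be advertised as
// a prefetch segment. All names here are illustrative.
function earliestAdvertiseTime(
  firstChunkAtPackagerMs: number,  // wallclock ms when the first CMAF chunk left the packager
  originBufferingLatencyMs: number // measured extra delay before the origin can serve bytes
): number {
  // The playlist should only reference the segment once its first chunk is
  // actually servable from the origin, per the discussion above.
  return firstChunkAtPackagerMs + originBufferingLatencyMs;
}

// Usage sketch: advertise once the current time passes the computed instant.
const canAdvertise = Date.now() >= earliestAdvertiseTime(1_700_000_000_000, 150);
```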
Good point, I'll change language to reflect this.
This is Apple's language but I think I can do a better job of simplifying it.
Sounds reasonable, but I'm wondering if this needs to be a requirement. Is it fundamentally impossible (or ill-advised) to advertise before the first byte is available, or is this specific to Wowza?
Hmm, it seems like errors are allowed by RFC8216:
The client should be able to handle bad status codes (and will have to), so I think it's better to imply that the client must be able to handle these cases.
A prefetch tag is repeated if it's found in two manifests, e.g. it remains after refreshing. But now that I'm reading this again it doesn't really make sense. Will mark as resolved.
True, it's a bit of a can of worms to offer guidelines on this. I was thinking more from the perspective of an event publisher, who wants to guarantee that their viewers are as close to the live edge as possible, regardless of which client they're using. If no catch-up is required it's hard to guarantee without knowing how the client is implemented. But I agree, I think it should be left up to the client.
That's what I've been hearing as well. Will mark as resolved. Again, my thanks to you and the Wowza team for the feedback! 👍
It'd be pretty difficult to make an AvailabilityTimeOffset analogue in HLS.
We're limiting the maximum number of prefetch segments to 2 to deal with load. We wanted to allow just one, but some encoders have setups where two segments can be transcoding at once (the example I was given was that the next segment begins while the B-frames of the last segment are being completed).
I think this is generally a good practice, but it's up to the server. Putting stuff in the spec has its own danger - trying to remove/deprecate features becomes more difficult because devs may be relying on it.
Yeah, this will be a problem with manifest refreshing too - the actual refresh time should be the duration of a segment plus whatever latency is incurred on the encoding/delivery side. As alluded to before, I believe this can be accomplished with the ETag, where refreshOffset is added to the manifest refresh time (usually equal to the duration of the playlist). Just some napkin math, but I believe that's the general idea. This is Will Law's idea; I'm going to follow up with him to see if this is correct. I'm not sure yet how necessary this will be for the success of LHLS, so it's not in the spec yet; it may be a "wait and see" thing. Thanks for the feedback!
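A hedged sketch of the idea as described: the ETag carries how far the server is into the current segment, and the client derives refreshOffset from it. The header format, state names, and math are assumptions for illustration only:

```ts
import { createServer } from "http";

// --- Server side (hypothetical packager state) ---
let playlistText = "#EXTM3U\n";
let lastSegmentAddedAt = Date.now();

const server = createServer((_req, res) => {
  // Tell the client how far into the current segment the server is, so the
  // client can phase-align its refresh timer. The ETag is used as the
  // carrier here; a custom header would work just as well.
  const elapsedMs = Date.now() - lastSegmentAddedAt;
  res.setHeader("ETag", `"elapsed=${elapsedMs}"`);
  res.setHeader("Content-Type", "application/vnd.apple.mpegurl");
  res.end(playlistText);
});
server.listen(8080);

// --- Client side ---
// refreshOffset shifts the usual "reload after one target duration" timer so
// the next request lands just after the server publishes the next segment.
function nextRefreshDelay(targetDurationMs: number, etag: string): number {
  const match = /elapsed=(\d+)/.exec(etag);
  const elapsedMs = match ? Number(match[1]) : 0;
  return Math.max(0, targetDurationMs - elapsedMs);
}
```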
I left a few suggestions, mostly to fix some of the language.
Some more comments:
- I would also like to see documentation about which use cases are out-of-scope (ULL / real-time communication).
- The guide-level explanation specifically mentions Hls.js a number of times, even though the section describes a pretty generic client implementation (MSE and Fetch API excluded). I think it might be valuable to focus less on the Hls.js implementation in that section.
- In general, I think the spec should not depend on the client-side implementation being written in JavaScript or using the MSE/Fetch APIs. These are used in the concrete Hls.js implementation, but implementation of a client that does not have access to these APIs (for example in an FFmpeg-based client solution) should be possible using this RFC. However, the Hls.js implementation can be used to illustrate how a client could be implemented.
## Media Segment Tags

The server must not precede any prefetch segment with metadata other than those specified in this document, with the specified constraints.
I think this constraint breaks extensibility. Since you explicitly list the `#EXT-X-` tags that are not allowed in this section, this statement can be considered redundant.
* Transform a prefetch segment to a complete segment. ([Prefetch Transformation](#prefetch-transformation))

To each prefetch segment response, the server must append the `Transfer-Encoding: chunked` header. The server must maintain the persistent HTTP connection long enough for a client to receive the entire segment - no less than the time from when the segment was first advertised until it completes.
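As a rough illustration of this delivery rule, here's a minimal Node.js sketch that holds the response open and streams the segment as it's produced; the event source is a hypothetical stand-in for the packager:

```ts
import { createServer } from "http";
import { EventEmitter } from "events";

const segmentEmitter = new EventEmitter(); // hypothetical packager output

const server = createServer((req, res) => {
  // With no Content-Length set, Node uses chunked transfer encoding on HTTP/1.1.
  res.writeHead(200, { "Content-Type": "video/mp2t" });

  const onData = (chunk: Buffer) => res.write(chunk); // push bytes as they arrive
  const onComplete = () => res.end();                 // close once the segment is done
  segmentEmitter.on("data", onData);
  segmentEmitter.once("complete", onComplete);

  // Clean up if the client goes away before the segment completes.
  req.on("close", () => {
    segmentEmitter.off("data", onData);
    segmentEmitter.off("complete", onComplete);
  });
});
server.listen(8080);
```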
This limits communication to HTTP/1.1, because HTTP/2 does not support this mechanism (as described in RFC 7540, Section 8.1).
Furthermore, Chunked Transfer Encoding requires that we follow a specific protocol (described in RFC 7230, Section 4.1) which is not referred to here.
I recommend making a distinction between HTTP versions and referring to the specs that describe streaming (chunked) data transfer.
👍 Good idea, I've gotten similar feedback
## What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC?

- Alternative connection protocols (WebRTC, Websockets, etc.)
- Manifestless mode
It might be worth linking to a resource that describes this concept.
Yeah, we can link out to some DASH docs for now; there isn't anything specified for HLS atm.
proposals/0001-lhls.md
#EXT-X-PREFETCH:https://foo.com/bar/7.ts
```

`5.ts`, `6.ts`, and `7.ts` all have a Discontinuity Sequence Number of 1. Note how the `PREFETCH-DISCONTINUITY` tag was transformed into the conventional `EXT-X-DISCONTINUITY` tag, and how that tag still applies to prefetch segments.
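To make the transformation concrete, here's a hypothetical playlist consistent with the description above; the tag values and surrounding segments are illustrative, not from the RFC:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:2
#EXT-X-MEDIA-SEQUENCE:4
#EXT-X-DISCONTINUITY-SEQUENCE:0
#EXTINF:2.000
https://foo.com/bar/4.ts
#EXT-X-DISCONTINUITY
#EXTINF:2.000
https://foo.com/bar/5.ts
#EXTINF:2.000
https://foo.com/bar/6.ts
#EXT-X-PREFETCH:https://foo.com/bar/7.ts
```

Here `4.ts` has Discontinuity Sequence Number 0, while everything after the `EXT-X-DISCONTINUITY` tag (including the prefetch segment `7.ts`) has Discontinuity Sequence Number 1.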
A new `#EXT-X-PROGRAM-DATE-TIME` tag was also introduced. I cannot find this behavior described in earlier sections.
It was in an old draft, I'll add it back
@johnBartos Thanks for this work! Please find my comment below. I still don't understand why we would need 2 prefetch segments. Yes, I agree that the encoder might still be working on the B-frames of the last segment while working on the I-frame and/or P/B-frames of the current segment. This is expected behavior when the encoder is multi-threaded based on frame parallelism. But I would still expect the encoder to output and upload the frames in monotonically increasing DTS order. In that case, the upload of the current segment would start only after the upload of the previous segment is complete.

One practical example is the x264 encoder + ffmpeg. x264 is frame-based multithreaded and hence could work on frames across segments at the same time, but it still always outputs frames in monotonically increasing DTS order. Hence I would suggest limiting the number of prefetch segments to 1. Please feel free to correct me if my understanding is incomplete.
@johnBartos I concur with @jkarthic-akamai: while it makes sense to prefetch the N+1 segment as soon as the first bytes or CMAF chunk are available on the origin, prefetching the N+2 segment will require supporting long polling over a duration longer than the segment duration. It's just gonna open sockets with no data to transmit, and generate timeouts and false-positive errors in the logs (like in DASH with the wallclock misalignment problems). Honestly I don't see CDNs or origins supporting N+2 prefetch anytime soon; the benefit of it is absolutely not proven, and very long polling comes with a lot of security risks for the CDNs/origins. It's just not realistic to require N+2 prefetch support across the chain. I also suggest limiting the number of prefetch segments to 1.
@jkarthic-akamai @nicoweilelemental I agree; thanks for the breakdown. I'll amend the spec for N+1 only.
Thanks @johnBartos - looking forward to the hls.js implementation!
proposals/0001-lhls.md
# Summary
[summary]: #summary

Low-latency streaming is becoming an increasingly desired feature for live events, and is typically defined as a delay of two seconds or less from point of capture to playback (glass-to-glass). However, the current HLS specification precludes this possibility - within the HLS guidelines, the best attempts have achieved about four seconds glass-to-glass, with average implementations typically beyond thirty seconds. This RFC proposes modifications to the HLS specification ("HTTP Live Streaming 2nd Edition", IETF RFC 8216, draft 03) which aim to reduce the glass-to-glass latency of a live HLS stream to two seconds or below. The scope of these changes is centered around a new "prefetch" segment: its advertising, delivery, and interpretation within the client.
This is kind of a moot point because I think 2 seconds is a good goal for this project, but 2 seconds is probably the most aggressive definition of "low latency" I've seen. Wowza pegs it at 1-5s and @wilaw has it at 4-10s, with 2 seconds being closer to "Ultra low latency". Not sure what I expect you to do with that info but thought it was worth pointing out in case there's opportunity for industry consistency.
Agreed. I've been working with Will on the new latency-range definitions that you have seen in his Demuxed presentation. We defined low latency as what you can achieve with 1s and 2s segments of regular HLS/DASH (meaning 4 to 10 seconds of latency), and ultra low latency as what you can achieve with chunked CMAF (meaning between 1 and 4 seconds). We used this technology criterion because the previous latency-range definition by Wowza & Streaming Media was mixing technology and use-case requirement criteria.
So maybe we should say here: 'Low-latency and Ultra low-latency streaming are becoming increasingly desired features for live events, and are typically defined as a delay of 4 to 10 seconds for low latency and 1 to 4 seconds for ultra low latency, from point of capture to playback (glass-to-glass)'.
Yeah agreed. Thanks for keeping me up to date with Will's work - I want to be aligned with whatever he's doing where possible. Has he been factoring the Streamline project into his definitions? The LHLS fork of Exoplayer is getting 1.1s latency which is pretty crazy.
'Low-latency and Ultra low-latency streaming are becoming increasingly desired features for live events, and are typically defined as a delay of 4 to 10 seconds for low latency and 1 to 4 seconds for ultra low latency, from point of capture to playback (glass-to-glass)'
Sounds good to me!
As of writing this, Hls.js by default will have between one and two segments' duration of client-side latency. The plan is to start the stream at the last complete segment and begin buffering prefetch from there. I'm not sure what server-side latency looks like, but it should put us at the lower end of the "Low latency" definition.
We defined low latency as what you can achieve with 1s and 2s segments of regular HLS/DASH (meaning 4 to 10 seconds of latency), and ultra low latency as what you can achieve with chunked CMAF (meaning between 1 and 4 seconds).
So by this definition we're building an ultra low latency player.
By default yes, but the player configuration shall allow a higher target latency. Which leads me to another closely related consideration: in DASH we are heading towards setting the target latency at the manifest level, and not at the player-configuration level. Would it be interesting for LHLS to discuss a similar approach, like #EXT-X-TARGETLATENCY: 2500 (if we measure in milliseconds)?
@nicoweilelemental That's an interesting idea! There are a lot of advantages to being able to configure the playout latency in the manifest itself. If we're going ahead with this, I propose a small modification though. Instead of setting a target latency, I suggest we set a target buffer size. Since the encoder's latency is not in the player's control, a latency target could be a little misleading. Instead, we could ask the player to maintain a specific target buffer size, with the condition that it continuously loads data to be as close to the live edge as possible. Something like #EXT-X-TARGETBUFFERSIZE: 2500.
You're right @jkarthic-akamai, it's hard for the player to determine what the actual E2E latency is. The only way would be to make at least one #EXT-X-PROGRAM-DATE-TIME insertion mandatory per child playlist. In DASH we recommend putting the Producer Reference Time in the prft mp4 box, which the player can parse to get the actual timecode of a segment; it could replace #EXT-X-PROGRAM-DATE-TIME for latency measurement. Setting #EXT-X-TARGETBUFFERSIZE instead would roughly fill the same purpose while relaxing the dependency on absolute time.
|
||
The client may opt into an LHLS stream. If so, the client must choose a prefetch Media Segment to play first from the Media Playlist when playback starts. The client must choose prefetch Media Segments for playback in the order in which they appear in the Playlist; however, the client may open connections to as many prefetch segments as desired. If data from a newer prefetch Media Segment is received before an older one, the client should not append this data to the SourceBuffer; doing so may stall playback. If the client opts out of LHLS, it must ignore all prefetch Media Segments, and any additional constraints outlined in this specification. | ||
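As an illustration of the ordering rule in the paragraph above, here's a sketch of a client-side guard. The names are illustrative, and real MSE code would also have to serialize appends on `updateend` (elided here):

```ts
// Hold back bytes from a newer prefetch segment until the older one has
// been fully appended, to avoid the playback stall described above.
class OrderedAppender {
  private nextToAppend = 0; // index of the segment we may append, in playlist order
  private pending = new Map<number, Uint8Array[]>();

  constructor(private sourceBuffer: SourceBuffer) {}

  onChunk(segmentIndex: number, chunk: Uint8Array): void {
    if (segmentIndex === this.nextToAppend) {
      this.sourceBuffer.appendBuffer(chunk); // in order: append immediately
      return;
    }
    // Data from a newer prefetch segment arrived early - buffer it rather
    // than appending it out of order.
    const queue = this.pending.get(segmentIndex) ?? [];
    queue.push(chunk);
    this.pending.set(segmentIndex, queue);
  }

  onSegmentComplete(segmentIndex: number): void {
    if (segmentIndex !== this.nextToAppend) return;
    this.nextToAppend++;
    // Flush whatever was buffered for the segment that is now next in order.
    for (const chunk of this.pending.get(this.nextToAppend) ?? []) {
      this.sourceBuffer.appendBuffer(chunk);
    }
    this.pending.delete(this.nextToAppend);
  }
}
```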
The client may set a minimum amount of buffer to begin and maintain playback. The client should not impose a minimum buffered amount greater than one target duration; doing so may introduce undue latency.
This feels like somewhere we might want to provide a little more guidance, smart defaults, or open the door to configure the target min buffer somehow. With ultra low latency there will be a fine balance between lower latency and more rebuffering, which will likely be audience-dependent. A lot of players don't give you any buffer config options today.
Yeah I need to rework this section. It was trying to be an analogue of this section in the HLS spec but it didn't really come out right.
This feels like somewhere we might want to provide a little more guidance, smart defaults, or open the door to configure the target min buffer somehow.
We're trying to be as hands-off as possible with recommendations - clients should be able to build whatever experience best suits their use case. Maybe we can strike a balance with language ("should" instead of "must"), but it may be more productive to put any kind of guidance in an ancillary doc where we don't have to worry about spec compliance. Hls.js will be a reference implementation, too.
clients should be able to build whatever experience best suits their use case.
Just to clarify: they should build whatever low-latency use case best suits them. The original intent behind this section was to ensure that they were operating like a low-latency player (and not just ignoring prefetch segments or having like 30s of buffer or whatever), but I don't think it came out right. I'll take another stab at it.
@nicoweilelemental's other comment about providing the target latency in the manifest would actually solve my concerns here. I like that idea a lot. Assuming a player respects that tag, it wouldn't have to expose much else.
+1 to target latency in the manifest. There definitely needs to be a knob to turn for different use cases as they walk the line between latency and rebuffering
I agree with Will and his summary. The client will play the LHLS stream as best as it can given the current conditions and configuration. The problem with a target is that it's not actionable - if the client is behind because the network is slow, there's nothing it can do to get ahead. I was checking to see if DASH had something similar but couldn't find anything (but I didn't have the chance to look very hard).
But even then, manifest updates and network delays can cause us to stall if we try to play at the live edge. So we will actually delay where we will play at so we don't stall. The time we adjust by can be specified in the manifest using the MPD@suggestedPresentationDelay attribute. This specifies the delay to give the live stream to allow for smooth playback. If it isn't specified, we will give a reasonable default value.
`suggestedPresentationDelay` looks interesting (it seems to specify a minimum latency), but I'm not sure how useful it is for LHLS. The encoder must update the manifest with new segments on an interval equal to the average length of a segment.
I think the manifest should add information to allow the latency to be estimated, and I would support the spec saying that the packager MUST add #EXT-X-PROGRAM-DATE-TIME to the media playlists.
I had #EXT-X-PROGRAM-DATE-TIME in an original draft just for this purpose, but deleted it - I couldn't come up with an acceptable definition that all encoders could follow. Is the timestamp the time when the segment begins transcoding, or when it has finished? Or is it something like when the manifest was created in memory (or between whatever stages in a transcoding pipeline)? Maybe it doesn't matter too much. Input appreciated here; I'd like to add it back.
There are indeed many potential reference points for where to hang the definition of #EXT-X-PROGRAM-DATE-TIME. MPEG has defined 6, if I remember correctly, for the equivalent in DASH. There is no need for that complexity here. All you need is a COMMON reference point that all clients can access. They can then achieve synchronization (per @jkarthic-akamai's comment above) by targeting a fixed delta from that point. A practical point would be the wallclock time at which that frame of media entered the encoder. Assuming very small camera delay, the delta between that value and the wallclock time when the media frame is displayed by the client then represents the end-to-end latency. That's for lab confirmation. In the real production world, there is an unknown production delay upstream of the encoder (camera, OB truck, satellite contribution, broadcast profanity delay, etc.), so it's very difficult for the end client to calculate the true e2e latency. Luckily, we don't need to know the true e2e latency even for sync; we just need a consistent reference point between clients.
I can get behind the philosophy of the manifest just reporting availability. But defining the target latency somewhere is critical in these use cases, so it brings me back to the original comment about needing more in the spec to get clients to expose configuration - i.e. if iOS Safari implements LHLS with a set latency target and no option to configure it, that's not good.
A practical point would be the wallclock time at which that frame of media entered the encoder.
I'm not sure about the other options, but that seems the most sensible. However, with a UGC platform, streamers can use anything that streams RTMP (e.g. OBS) to the central service, and in that case I don't believe the central service has access to when the media frame entered the original encoder. For this use case I think I'd be fine with just using the time the central service received the media frame in the stream. It ignores the time prior to that, but assuming I can configure all my players' target latencies, I can adjust for that.
I had #EXT-X-PROGRAM-DATE-TIME in an original draft just for this purpose, but deleted it - I couldn't come up with an acceptable definition that all encoders could follow. Is the timestamp the time when the segment begins transcoding, or when it has finished? Or is it something like when the manifest was created in memory (or between whatever stages in a transcoding pipeline)? Maybe it doesn't matter too much. Input appreciated here; I'd like to add it back.
I agree that there is no need for us to define #EXT-X-PROGRAM-DATE-TIME strictly. We could just stick with the original definition from the official HLS spec: https://tools.ietf.org/html/draft-pantos-http-live-streaming-23#page-17. We just suggest making it a mandatory (MUST) parameter, so that all clients have a common reference point to sync.
FWIW, we are relying on #EXT-X-PROGRAM-DATE-TIME to decide which segment to start playback at in order to reach our target latency. This approach has been successful for us.
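A sketch of how a client might do this, assuming each segment's absolute time has been derived from `#EXT-X-PROGRAM-DATE-TIME` (and `EXTINF` offsets); all names are illustrative:

```ts
interface SegmentInfo {
  url: string;             // media URI from the playlist
  programDateTime: number; // ms since epoch, derived from EXT-X-PROGRAM-DATE-TIME
}

function pickStartSegment(segments: SegmentInfo[], targetLatencyMs: number): SegmentInfo {
  const now = Date.now();
  // Segments are in playlist order (oldest first), so latency shrinks as we
  // walk forward; start at the first segment already within the target.
  for (const segment of segments) {
    if (now - segment.programDateTime <= targetLatencyMs) return segment;
  }
  // Nothing is within target (e.g. a stale playlist): fall back to the newest.
  return segments[segments.length - 1];
}
```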
Some feedback coming from our experience at Periscope and Twitter, after managing a large-scale LHLS deployment for more than 2 years: having 2 prefetch segments is an important feature for us to constrain the total end-to-end latency. It allows the client to start receiving data immediately for the next segment after the current prefetch segment has ended. Without this, the client would need to eagerly request a new playlist immediately after the server closes the current prefetch segment. This means that:
That was the conclusion @TBoshoven and I came to - I hadn't considered the impact of refresh misses when we were having the above discussion. I'm going to amend the spec to be able to handle multiple prefetch segments. If your system can support a dozen prefetch segments, the spec shouldn't stop you; if you want to support just one, that's fine too. But I believe that two will be the natural default. We're trying to avoid being prescriptive wherever possible. Thanks for sharing your experience! If there's anything else you see that could be improved I'd be happy to hear it. (Edit: removed the requirement for at least two. There may be clients/servers who negotiate some other refresh scheme which allows for accurate refreshes with just one.)
@biglittlebigben makes a very good point. I faced the exact same issue he mentioned when I tried to add LHLS support to Exoplayer with just 1 prefetch segment. Definitely, 2 prefetch segments in the playlist would be useful to reduce latency significantly. But for practical systems we could add a condition that the client should request the 2nd PREFETCH segment only after the 1st PREFETCH segment has loaded completely. That way we could achieve low latency without imposing strict conditions around CDNs and origins supporting long polling. Also, I don't understand the reason behind more than 2 prefetch segments. Can we limit the maximum number of prefetch segments to 2? Is there a practical use-case for publishing 3 or more prefetch segments?
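A sketch of that sequential-fetch policy using the Fetch API's streaming body; the URLs and chunk handler are placeholders, not part of the spec:

```ts
// Request prefetch segment N+2 only after N+1 has downloaded completely,
// so the origin never sees a long-polled request for a segment with no data.
async function loadPrefetchSequentially(urls: string[]): Promise<void> {
  for (const url of urls) {
    const response = await fetch(url);
    const reader = response.body!.getReader();
    for (;;) {
      const { done, value } = await reader.read();
      if (done || !value) break; // the server ended the chunked response
      handleChunk(value);        // e.g. feed the demuxer / SourceBuffer
    }
    // Only now fall through to the next prefetch URL.
  }
}

function handleChunk(chunk: Uint8Array): void {
  // Application-specific: push into the transmuxer or append to MSE.
}
```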
This requirement of waiting would delay arrival of data for the 2nd prefetch segment by a network round-trip time after the client is done downloading the 1st one. Any such added jitter will require the client to keep a longer buffer (by the round-trip duration here). Maybe that's an acceptable trade-off if there is a consensus that supporting long requests is a challenge for CDNs, but again, we have had no issue with our vendor.
I would say that there is a tradeoff between the number of prefetch segments and how often the client needs to refresh the playlist. This is particularly true for segments before a discontinuity, which can be shorter than the target duration.
[discontinuities]: #ext-x-discontinuity

A prefetch segment must not be advertised with an `EXT-X-DISCONTINUITY` tag. To insert a discontinuity just for prefetch segments, the server must insert the `EXT-X-PREFETCH-DISCONTINUITY` tag before the newest `EXT-X-PREFETCH` tag of the new discontinuous range.
In our use case, discontinuities are caused by a gap in timestamps in the stream sent by the broadcaster (because of a networking issue or a camera flip for instance). This means that we do not know if there is a discontinuity at the time we advertise the prefetch segment. We only know of the discontinuity at the point when we write the first data into the prefetch segment.
This means that we would add a EXT-X-PREFETCH-DISCONTINUITY tag to a prefetch segment after it has first been advertised, and that a player may not see the EXT-X-PREFETCH-DISCONTINUITY tag by the time it starts playing back media from that prefetch segment.
This behavior doesn't seem to be explicitly forbidden by the spec, but it does require the player to have some heuristic to detect discontinuities that were not advertised in time.
This means that we would add a EXT-X-PREFETCH-DISCONTINUITY tag to a prefetch segment after it has first been advertised, and that a player may not see the EXT-X-PREFETCH-DISCONTINUITY tag by the time it starts playing back media from that prefetch segment.
I can foresee this being a problem if there is a large gap between the PTS values. Hls.js feeds a discontinuity tag into the muxer so that it can modify PTS values post-discontinuity to ensure that there is no gap. Without the tag we'd be inserting a gap into the SourceBuffer equal to the PTS gap in seconds. Depending on how much forward buffer exists we can jump it without a stall, but we'd still need to do a seek in userland which would cause a momentary disruption.
How would you imagine the interaction between EXT-X-GAP and prefetch segments should work? Should it be allowed to add EXT-X-GAP to an already advertised prefetch segment? That would be another way to address the use case above I believe.
This behavior doesn't seem to be explicitly forbidden by the spec
I'll add language which addresses this case. It doesn't make sense to add it as a prefetch discontinuity after the fact, but it must appear when transformed to a complete segment. Players should be able to handle gaps, so I don't believe that this is a problem.
Maybe the prefetch discontinuity isn't even needed anymore - I need to think a bit harder about SSAI workflows. You can probably just stop adding prefetch segments and then insert a regular discontinuity before the ads.
How would you imagine the interaction between EXT-X-GAP and prefetch segments should work? Should it be allowed to add EXT-X-GAP to an already advertised prefetch segment? That would be another way to address the use case above I believe.
I suppose a gap would a) cause the client to ignore the segment or b) abort the connection if it has already been requested, but data has not yet been received. Not sure if I understand how to fix gaps with this - would you mind giving me an example?
In general, the blocker for post-advertisement insertion of prefetch tags is that clients may not refresh the playlist in time to know that the segment has a new tag. I can see adding tags post-advertisement in the case that there are multiple prefetches and the tag is on N+1 or greater - the client should refresh the manifest before the encoder begins sending segments. But I think there will be some tricky race conditions with this kind of solution.
We currently detect the gap in timestamps in the backend. If it is over a threshold, we start a new segment and insert a discontinuity. This does mean that players watching over LHLS will not hear about the discontinuity in time from the playlist. Our player has heuristics to detect such gaps.
So, I don't believe this makes much of a change to that workflow. The prefetch segment, up until the point the packager detects the gap in the backend, would continue to push data normally like a prefetch. Upon detecting the gap, it would complete the segment, close the connection, update the manifest locally with that completed segment (now advertising its length with an `EXTINF`), and push that manifest to the origin with the next prefetch having the discontinuity applied to it. Thoughts?
That's what we would do indeed. Is this allowed by the spec though? (inserting a discontinuity for a segment that didn't have a prefetch discontinuity?)
@biglittlebigben Yes, and I believe it should also be mandatory
That would work for our use case then. Any feedback from player developers about having to detect and handle timestamp discontinuities?
Hi all, we've been working hard on a demo in order to prove out some of the ideas we've introduced here. You can find it here: http://demo.jwplayer.com/lhls/. A beta build of Hls.js will be made available in the near future, which you can play around with yourself. In this demo, you can use your own file. The results have been very positive so far: we're able to reach ~1s of latency under ideal conditions, but 4s is typically a comfortable amount. Our tests have currently been done with LL-CMAF. The demo has surfaced some new requirements, which are critical to a functional LHLS implementation:
Requirement 1 requires […]. Requirement 2 is a bit trickier: synchronization of updates with the server is critically important if you're using one prefetch tag, since a refresh miss can mean the client runs through whatever small forward buffer it has. Our current implementation estimates the next manifest refresh using a combination of […].

What requirement 2 also exposes is how the current refresh-miss logic is insufficient for smooth playback at lower latencies. It states that the interval should be halved on a miss - we have found this to be much too slow in practice. In the case of a miss, the next reload needs to be made much sooner.

In general, the demo has underscored the challenge of implementing LHLS. I'd like to continue a discussion on a companion guide which outlines best practices/patterns for accomplishing common functionality, such as playback-rate manipulation for latency targeting. I know I've been a bit slack in integrating the current round of feedback, but I'll be getting around to that soon. Having a functioning codebase will be critical for testing out solutions to the more challenging requirements.
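For the refresh-miss point, here's a sketch of the kind of policy described - reload quickly on a miss instead of halving the interval. The retry delay is an assumed value for illustration, not from the spec:

```ts
// Decide when to reload the playlist next, given whether the last reload
// actually picked up a change ("hit") or not ("miss").
function nextReloadDelay(
  targetDurationMs: number,
  playlistChanged: boolean,
  retryDelayMs = 250 // assumption: much sooner than targetDuration / 2 on a miss
): number {
  if (playlistChanged) {
    // Hit: wait roughly one target duration, ideally phase-aligned with the
    // server (see the ETag discussion earlier in the thread).
    return targetDurationMs;
  }
  // Miss: the next segment should appear at any moment - poll quickly.
  return retryDelayMs;
}
```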
I agree with you on the issues with timing the manifest refresh properly, and it is a potential bottleneck for latency improvement. Based on this experience, should we modify the spec to mandate at least two PREFETCH tags for LHLS? When we mandate two PREFETCH URLs, I would like the spec to clearly mention that the fetching of the first prefetch should be COMPLETE before the request for the second prefetch segment is made. Such wording in the spec will relax the long-polling support requirements for the CDN or the origin HTTP server. Better yet would be to specify a mandatory delay between download completion of the nth segment and the download start of the (n+1)th segment. The encoder can set this delay equal to one frame duration, so that the existing set of HTTP servers and CDNs can support LHLS with two PREFETCH tags seamlessly.
Apple announced Low-Latency HLS at WWDC 2019.
It would seem to me the preliminary LHLS spec by Apple is vastly more complicated than what has been discussed here.
Was wondering about thoughts on community LHLS versus Apple Low-Latency HLS.
I think there is 0% chance that Apple takes what the community has developed. They have decided on their path and are headed in that direction. Not sure about hls.js, but I know other player vendors, including Wowza, are working on implementing the Apple LHLS spec. ScottK
Hi all, It's time to kill LHLS. From what I can tell, Apple's LLHLS is the choice of the community - therefore, there's no point in continuing on here. Apple has done a great job working with video-dev to address our concerns, and I feel that the best way forward is to continue to work together and make LLHLS the best it can be. It's my hope that we can continue to have a voice within Apple to continue advocating for low-latency and HLS beyond.
Thanks John. And good on yer!
Thanks John! Great job stewarding this forward.
Thanks John!
Please leave all feedback in this PR. Nothing is set in stone; if you think something won't work, or there's a better way, please open a discussion so we can make it better. I'm still doing editing & cleanup but grammatical improvements are also very appreciated.
Also note that this PR isn't just for Hls.js - other clients are welcome and encouraged to participate. It'd be great to see more implementations in the wild.
Please see https://github.com/video-dev/hlsjs-rfcs/pull/1/files#diff-585ec6c4984f979b2e85a1ee0280b804R191 for the list of open questions. Everyone is welcome to help resolve them!