Commit
Added the CEA-708 support to the open-source project.
Issue: #1807
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=144726542
Showing 3 changed files with 1,299 additions and 0 deletions.
library/src/main/java/com/google/android/exoplayer2/text/cea/Cea708Cue.java (67 additions, 0 deletions)

```java
/*
 * Copyright (C) 2016 The Android Open Source Project
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package com.google.android.exoplayer2.text.cea;

import android.text.Layout.Alignment;
import com.google.android.exoplayer2.text.Cue;

/**
 * A {@link Cue} for CEA-708.
 */
/* package */ final class Cea708Cue extends Cue implements Comparable<Cea708Cue> {

  /**
   * An unset priority.
   */
  public static final int PRIORITY_UNSET = -1;

  /**
   * The priority of the cue box.
   */
  public final int priority;

  /**
   * @param text See {@link #text}.
   * @param textAlignment See {@link #textAlignment}.
   * @param line See {@link #line}.
   * @param lineType See {@link #lineType}.
   * @param lineAnchor See {@link #lineAnchor}.
   * @param position See {@link #position}.
   * @param positionAnchor See {@link #positionAnchor}.
   * @param size See {@link #size}.
   * @param windowColorSet See {@link #windowColorSet}.
   * @param windowColor See {@link #windowColor}.
   * @param priority See {@link #priority}.
   */
  public Cea708Cue(CharSequence text, Alignment textAlignment, float line, @LineType int lineType,
      @AnchorType int lineAnchor, float position, @AnchorType int positionAnchor, float size,
      boolean windowColorSet, int windowColor, int priority) {
    super(text, textAlignment, line, lineType, lineAnchor, position, positionAnchor, size,
        windowColorSet, windowColor);
    this.priority = priority;
  }

  @Override
  public int compareTo(Cea708Cue other) {
    // Orders cues by decreasing priority.
    if (other.priority < priority) {
      return -1;
    } else if (other.priority > priority) {
      return 1;
    }
    return 0;
  }

}
```
---
I have some suggestions for this patch.
As per the CEA specs, both 608 and 708 may be present in ATSC DTV transmissions. So, for example, if ExoPlayer is to be used to render the digital form of cable TV, there cannot be one mime type.
In my opinion, like most players (for example VLC), Exo should identify 4 tracks for captions (CC1, CC2, CC3 & CC4), and if either CC1 or CC2 is selected, render channel 1 or channel 2 of CEA-608; if CC3 or above is selected, CEA-708 captions are rendered. We can map the service numbers of CEA-708 to channels CC3 and above, i.e. primary service number = CC3, secondary service number = CC4, and other service numbers = CCX.
This means that supporting CEA-708 and 608 cannot be exclusive; the parsing code should handle both simultaneously and be able to switch between them at run time based on the selected track.
Please let me know if I am missing something in my understanding.
Here is my proposal:
The main class that is the entry point to decoding captions could be CEADecoder (which is currently an abstract class); internally, it creates two "parsers", CEA608Parser and CEA708Parser. The current CEA608Decoder can be renamed to CEA608Parser and would no longer derive from CEADecoder.
In this way, the CEADecoder decode function first parses the CC type and, based on the type, invokes either of the parsers.
Each parser knows whether a channel (or track) is enabled depending on the track currently selected by the app. If a track is enabled, it generates Cues; otherwise it skips through the data.
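The proposed dispatch could be sketched roughly as follows. All names here (CeaDispatchSketch, parserFor) are illustrative placeholders, not ExoPlayer API; only the four cc_type values come from the CEA-708 cc_data_pkt definition:

```java
// Illustrative sketch of routing cc_data constructs to a 608 or 708 parser
// based on the 2-bit cc_type field; not actual ExoPlayer code.
final class CeaDispatchSketch {

  // cc_type values from the cc_data_pkt defined in CEA-708.
  static final int CC_TYPE_608_FIELD_1 = 0;
  static final int CC_TYPE_608_FIELD_2 = 1;
  static final int CC_TYPE_708_DATA = 2;   // DTVCC_PACKET_DATA
  static final int CC_TYPE_708_START = 3;  // DTVCC_PACKET_START

  /** Returns which parser ("608" or "708") should consume a construct of this type. */
  static String parserFor(int ccType) {
    switch (ccType) {
      case CC_TYPE_608_FIELD_1:
      case CC_TYPE_608_FIELD_2:
        return "608";
      case CC_TYPE_708_DATA:
      case CC_TYPE_708_START:
        return "708";
      default:
        throw new IllegalArgumentException("cc_type is a 2-bit field: " + ccType);
    }
  }
}
```

A real decode loop would call something like this once per construct and hand the two data bytes to whichever parser is returned, letting each parser decide whether its currently selected channel is enabled.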
---
708 support isn't wired up end-to-end yet; this is just a step toward that. I think our intention is to do the track selection at a higher level, though. There's no particular reason to require that a single decoder be capable of decoding all of the tracks simultaneously just because they're muxed into a single stream. As an analogy, we don't require a single decoder to be capable of decoding both audio and video because they're muxed together. They get split out before they reach the decoders, and I think our intention is to do something similar for 608/708 as well (if it's hard to do this without decoding, then all of the data can be delivered, with a flag set on the format of each stream to indicate which track should actually be kept during decode). But yeah, this is all work in progress.
---
I think it is a little bit more complicated than handling 4 channels. 608 differentiates 2 main channels defined by the "line" the stream is coming from; for instance, "line 21" of the analog transmission is one incoming stream (2 bytes in every frame). In this single stream, for every 2 bytes there is a bit serving as a flag for which "sub-channel" the command belongs to. This way "line 21" allows transmitting 2 channels itself, but they share the bandwidth. The analog frame rate was 30, so we have at most 60 useful bytes per second shared between Channel 1 and 2. Even equally shared, without any overhead or commands, we can easily meet scenarios where 30 characters are not enough in a second to correctly show everything; with the positioning, styling and other overhead, the amount of useful bytes is much less. So in practice, we cannot fit 2 languages into Channel 1 and 2, as they must share the 60 bytes/second bandwidth. That is why it became quite standard in the US that Channel 1 (of line 21) is English, called the primary channel, while Spanish subtitles are transmitted in an independent line (Channel 3), also a primary channel. Channel 3 and 4 similarly share a single "line" of transmission, as channels 1 and 2 do. As channels 2 and 4 only use the leftover bandwidth, they usually convey much less data and are rarely used.
These were all defined in the 608 standard. There are other modes also allowed, outside of the scope of our discussion.
But this means that you need to read every single byte to check the flag that differentiates between Channel 1 and 2; you cannot parse them independently. Similarly, channels 3 and 4 are not independent from each other. As you cannot tell which channels are used (what those bit flags are) unless you read through the entire stream, most players immediately show 4 possible subtitle streams for the user to select from. There is no header or metadata stating which streams will be used (as it was originally a continuous stream of shows in analog television, it can change mid-stream at any time). Usually you need to pick Channel 1 for English and 3 for Spanish, but any combination is allowed.
Then came 708 with digital transmission increasing the bandwidth, adding some headers to the new streams as part of MPEG-2, but the original 608 bytes are still transmitted without header and metadata information. Limitations are still present. 608 and 708 bytes are transmitted in a "Closed Caption Data Packet" (cc_data_pkt), that has a type field (2 bits) that differentiates the 2 lines of the 608 stream (value 0 and 1) and defines values 2 and 3 as DTVCC_PACKET_DATA and DTVCC_PACKET_START. So you need to collect the incoming stream into a buffer to have a full 708 Packet to interpret it correctly and you still need to read through the entire stream of all 4 possible 608 channels to figure out if they are really present or not (any one of them can be a stream of continuous 0 values). See LINK
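A minimal sketch of splitting one 3-byte cc_data_pkt construct, assuming the layout described above (a header byte whose low bits carry cc_valid and the 2-bit cc_type, followed by two data bytes); the class name is hypothetical:

```java
// Hypothetical holder for one 3-byte cc_data construct. Bit positions follow
// the cc_data_pkt layout described above: 5 marker bits, cc_valid (1 bit),
// cc_type (2 bits), then two data bytes.
final class CcDataConstruct {
  static final int TYPE_608_FIELD_1 = 0;
  static final int TYPE_608_FIELD_2 = 1;
  static final int TYPE_DTVCC_PACKET_DATA = 2;
  static final int TYPE_DTVCC_PACKET_START = 3;

  final boolean ccValid; // whether the two data bytes carry payload
  final int ccType;      // one of the four TYPE_* values above
  final int data1;
  final int data2;

  CcDataConstruct(int b0, int b1, int b2) {
    ccValid = (b0 & 0x04) != 0; // bit 2 of the header byte
    ccType = b0 & 0x03;         // low 2 bits of the header byte
    data1 = b1 & 0xFF;
    data2 = b2 & 0xFF;
  }
}
```

A decoder would buffer constructs with type DTVCC_PACKET_START/DTVCC_PACKET_DATA until a full 708 packet is assembled, while feeding the type-0/1 constructs straight to the 608 path.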
While interpreting the 708 packets, it has 3 bits for service numbers, but when value 7 is used, additional 6 bits are used for identifying the extended service, so there are 7 main services and 63 extended services. Services are like independent subtitle channels, but in 708 it is a more generic term, it can be weather information, traffic updates, age rating, leftover time of the current show or lots of other similar services. Again, there are no headers or metadata suggesting what services will be used in this stream (for live tv it could change any time), you cannot tell how many of the possible 63 services (subtitle channels/streams) are used.
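The 3-bit service number with its extended-service escape could be read like this; field positions follow the 708 service block header as I understand it (service number in the top 3 bits), so treat this as an illustrative sketch rather than a definitive implementation:

```java
// Sketch of reading a 708 service number: a 3-bit field in the block header,
// where the value 7 escapes to a 6-bit extended_service_number carried in a
// following byte.
final class ServiceNumbers {
  static final int EXTENDED_ESCAPE = 7;

  // headerByte carries service_number in its top 3 bits; extendedByte is only
  // consumed when the escape value 7 is seen.
  static int read(int headerByte, int extendedByte) {
    int serviceNumber = (headerByte >> 5) & 0x07;
    if (serviceNumber == EXTENDED_ESCAPE) {
      return extendedByte & 0x3F; // extended services, up to 63
    }
    return serviceNumber; // non-extended service number
  }
}
```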
So there are 4 possible 608 channels (2 useful and 2 using leftover bandwidth), and 63 possible 708 channels, usually repeating the same streams that are parallel transmitted as 608 streams as well. But there are much more options.
I expect the main goal would be to correctly interpret 2 channels of 608 subtitles (but figuring out which 2 is stream dependent), and probably the 7 main services of the 708 should cover all main use cases for almost any streams.
---
@AquilesCanta - FYI
---
@ojw28
So this means the SeiReader in ExtractorSampleSource will indicate that it supports both CEA-608 and CEA-708 tracks, and depending on what the user selects, the corresponding decoder will be invoked. Is this understanding correct? If so, the current design is good.
---
@zsmatyas nice explanation, very accurate. As a side clarification, the service information for 608 and 708 captions is carried as part of PSIP data in broadcasts via the Caption Service Descriptor in the PMT, or in the EIT where available. This descriptor identifies each available service by service number (CC1-CC4 for 608, 1...64 for 708), its type (608/708) and its language.
---
@arifdi
I didn't know about that. Do you mean that the Caption Service Descriptor must always correctly show which captions are present? It seems not to be optional, so we should be able to show UI selectors for the subtitles based only on the Caption Service Descriptor.
---
It ultimately depends on the TV standard, I guess. The descriptor @arifdi mentions belongs to ATSC (though I am not sure it is mandatory[1]).
The TS Extractor's API will soon support both mechanisms (the code is already available; I am waiting for a blocking commit to go in):
[1]: The CEA-708 spec says that, when carried in NAL units, bandwidth for CC should be preallocated so that captions can be added at a later stage of transmission without requiring remultiplexing of the stream. In that case, I guess someone could add captions for a service that is not already declared by the PMT's descriptors. This is just a guess, though. I am just considering that adding a descriptor to the PMT could require you to split it across multiple packets.