AV1 (AOMedia Video Codec 1.0) evolved from VP9 (Google), Thor (Cisco) and Daala (Mozilla) under the AOM (Alliance for Open Media). It includes a number of enhancements and new tools added to improve coding efficiency. The new tools added so far cover four main areas: prediction, transform, in-loop filtering and entropy coding. This document provides a snapshot of the coding tools in the finalized version (as of March 2018) of the AV1 spec.
According to the AOM web page, AV1 is designed with the following features:
- Royalty-free
- Scales to any modern device at any bandwidth
- For use in both commercial and non-commercial content, including user-generated content
- Developed for the internet and related applications and services, from browsers and streaming to videoconferencing services
- Designed with a low computational footprint and optimized for hardware
- Bringing features like 4k UHD, HDR, and WCG to real-time video
Profiles and levels specify restrictions on the capabilities needed to decode the bitstreams. The profile specifies the bit depths and subsampling formats supported, while the level defines resolution and performance characteristics. At the time of writing, levels are still under discussion and no further details are available.
AV1 supports three named profiles, as listed in Table 1.
Profile | Bit depth | Monochrome support | Chroma subsampling | Name |
---|---|---|---|---|
0 | 8/10 | Yes | 4:2:0 | Main |
1 | 8/10 | No | 4:4:4 | High |
2 | 8/10 | Yes | 4:2:2 | Professional |
2 | 12 | Yes | 4:2:0, 4:2:2, 4:4:4 | Professional |
Table 1. AV1 Profile
AV1 supports a larger superblock size of up to 128x128. Coding blocks range from 128x128 down to 4x4. Each 4x4 luma block can independently select inter or intra mode, its reference mode, and its interpolation filter type. For chroma, a 2x2 block size is allowed, but the 4x4 transform block size is still used.
AV1 supports up to 10 partition types (Table 2). Partition units can be as small as 4x4, and there are 24 block sizes in total (Table 3).
Partition index | Type of partition |
---|---|
0 | PARTITION_NONE |
1 | PARTITION_HORZ |
2 | PARTITION_VERT |
3 | PARTITION_SPLIT |
4 | PARTITION_HORZ_A |
5 | PARTITION_HORZ_B |
6 | PARTITION_VERT_A |
7 | PARTITION_VERT_B |
8 | PARTITION_HORZ_4 |
9 | PARTITION_VERT_4 |
Table 2. Type of Block partition
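The partition types in Table 2 can be illustrated with a short sketch that lists the sub-block dimensions each type produces for a square block. The geometry of the A/B and 4-way types (two squares plus one half-block, or four strips) is taken from the published AV1 design and is not spelled out in this document, so treat it as an assumption. The function name `partition_subblocks` is hypothetical.

```python
def partition_subblocks(partition, size):
    """Return the (width, height) of each sub-block of a size x size block.

    Assumed geometry: the *_A / *_B types split one half into two squares,
    and the *_4 types cut the block into four equal strips.
    """
    half, quarter = size // 2, size // 4
    layouts = {
        "PARTITION_NONE": [(size, size)],
        "PARTITION_HORZ": [(size, half)] * 2,
        "PARTITION_VERT": [(half, size)] * 2,
        "PARTITION_SPLIT": [(half, half)] * 4,
        "PARTITION_HORZ_A": [(half, half), (half, half), (size, half)],
        "PARTITION_HORZ_B": [(size, half), (half, half), (half, half)],
        "PARTITION_VERT_A": [(half, half), (half, half), (half, size)],
        "PARTITION_VERT_B": [(half, size), (half, half), (half, half)],
        "PARTITION_HORZ_4": [(size, quarter)] * 4,
        "PARTITION_VERT_4": [(quarter, size)] * 4,
    }
    return layouts[partition]


if __name__ == "__main__":
    for name in ("PARTITION_HORZ_A", "PARTITION_VERT_4"):
        print(name, partition_subblocks(name, 64))
```

Whatever the type, the sub-blocks always tile the parent block exactly, which is a quick sanity check on the layout table.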
Index | Partition Block size | Index | Partition Block size |
---|---|---|---|
0 | BLOCK_4X4 | 12 | BLOCK_64X64 |
1 | BLOCK_4X8 | 13 | BLOCK_64X128 |
2 | BLOCK_8X4 | 14 | BLOCK_128X64 |
3 | BLOCK_8X8 | 15 | BLOCK_128X128 |
4 | BLOCK_8X16 | 16 | BLOCK_4X16 |
5 | BLOCK_16X8 | 17 | BLOCK_16X4 |
6 | BLOCK_16X16 | 18 | BLOCK_8X32 |
7 | BLOCK_16X32 | 19 | BLOCK_32X8 |
8 | BLOCK_32X16 | 20 | BLOCK_16X64 |
9 | BLOCK_32X32 | 21 | BLOCK_64X16 |
10 | BLOCK_32X64 | 22 | BLOCK_32X128 |
11 | BLOCK_64X32 | 23 | BLOCK_128X32 |
Table 3. Size of Block Partition
Both square and rectangular transform block sizes are supported in AV1. There are 19 transform block sizes in total.
Index | TxSize | Index | TxSize |
---|---|---|---|
0 | TX_4X4 | 10 | TX_32X16 |
1 | TX_8X8 | 11 | TX_32X64 |
2 | TX_16X16 | 12 | TX_64X32 |
3 | TX_32X32 | 13 | TX_4X16 |
4 | TX_64X64 | 14 | TX_16X4 |
5 | TX_4X8 | 15 | TX_8X32 |
6 | TX_8X4 | 16 | TX_32X8 |
7 | TX_8X16 | 17 | TX_16X64 |
8 | TX_16X8 | 18 | TX_64X16 |
9 | TX_16X32 |
Table 4. Size of Transform Block
Intra prediction in AV1 is greatly expanded compared with VP9. Table 5 gives a snapshot of the intra modes.
Index | Intra mode | AV1 | VP9 | Comments |
---|---|---|---|---|
0 | DC_PRED | X | X | |
1 | V_PRED | X | X | AV1 supports 7 angle variants based on this mode |
2 | H_PRED | X | X | AV1 supports 7 angle variants based on this mode |
3 | D45_PRED | X | X | AV1 supports 7 angle variants based on this mode |
4 | D135_PRED | X | X | AV1 supports 7 angle variants based on this mode |
5 | D113_PRED | X | X | AV1 supports 7 angle variants based on this mode |
6 | D157_PRED | X | X | AV1 supports 7 angle variants based on this mode |
7 | D203_PRED | X | X | AV1 supports 7 angle variants based on this mode |
8 | D67_PRED | X | X | AV1 supports 7 angle variants based on this mode |
9 | SMOOTH_PRED | X | | |
10 | SMOOTH_V_PRED | X | | |
11 | SMOOTH_H_PRED | X | | |
12 | TM_PRED(PAETH_PRED) | X | X | AV1 replaces TM_PRED with PAETH_PRED |
13 | Palette Mode | X | | |
Table 5. Summary of Intra Mode between AV1 and VP9
VP9 supports only 8 directional intra prediction modes: D45_PRED, D63_PRED, V_PRED, D117_PRED, D135_PRED, D153_PRED, H_PRED and D207_PRED. These modes correspond to prediction angles of 45, 63, 90, 117, 135, 153, 180 and 207 degrees, respectively.
To improve intra coding efficiency, more prediction angle options are added to AV1. The prediction angle is calculated as follows:
Prediction angle = nominal_angle + (angle_delta * angle_step)
nominal_angle | angle_step | angle_delta | Total number of angles |
---|---|---|---|
45, 63, 90, 117, 135, 153, 180, 207 | 3 | [-3, +3] | 8*7=56 |
Table 6. Finer prediction angles for intra modes
- nominal_angle is determined by the prediction mode, and is the same as in VP9.
- angle_delta is in a predefined range and angle_step is a predefined value. In the current configuration, angle_delta is in the range [-3, +3] and angle_step is 3. These settings were selected experimentally.
- The total number of supported prediction angles is therefore increased from 8 to 8 * 7 = 56.
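As a worked example of the formula above, the following sketch enumerates every (nominal_angle, angle_delta) combination using the values from Table 6. The constant and function names are hypothetical and chosen only for illustration.

```python
# Values taken from Table 6; the names are illustrative only.
NOMINAL_ANGLES = (45, 63, 90, 117, 135, 153, 180, 207)
ANGLE_STEP = 3
MAX_ANGLE_DELTA = 3


def prediction_angles():
    """Map each nominal angle to its 7 derived prediction angles."""
    deltas = range(-MAX_ANGLE_DELTA, MAX_ANGLE_DELTA + 1)
    return {
        nominal: [nominal + delta * ANGLE_STEP for delta in deltas]
        for nominal in NOMINAL_ANGLES
    }


if __name__ == "__main__":
    angles = prediction_angles()
    print(angles[90])
    print(sum(len(v) for v in angles.values()), "mode/delta combinations")
```

For the vertical mode (nominal angle 90), this yields angles from 81 to 99 degrees in steps of 3.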
The smooth modes are non-directional intra prediction modes. VP9 has 2 non-directional intra prediction modes: DC_PRED and TM_PRED. AV1 expands on this by adding 3 new smooth prediction modes: SMOOTH_PRED, SMOOTH_V_PRED and SMOOTH_H_PRED. The new modes work as follows:
Mode | Comments |
---|---|
SMOOTH_PRED | Useful for predicting blocks that have a smooth gradient. It works as follows: estimate the pixels on the rightmost column with the value of the last pixel in top row, and estimate the pixels in the last row of the current block using the last pixel in left column. Then calculate the rest of the pixels by an average of quadratic interpolation in vertical and horizontal directions, based on distance of the pixel from the predicted pixels. |
SMOOTH_V_PRED | Similar to SMOOTH_PRED, but uses quadratic interpolation only in the vertical direction |
SMOOTH_H_PRED | Similar to SMOOTH_PRED, but uses quadratic interpolation only in the horizontal direction |
Table 7. Smooth mode of Intra mode
PAETH_PRED is also a non-directional intra prediction mode. It replaces the existing TM_PRED mode.
TM_PRED: Predictor(TM) = left + top - top_left
PAETH_PRED: Predictor(PAETH) = argmin |x - Predictor(TM)|, where x is one of left, top and top_left
That is, the predictor is whichever of left, top and top_left is closest in value to Predictor(TM).
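The two formulas above translate almost directly into code. This is a minimal per-pixel sketch; the tie-breaking order (left, then top, then top_left) is an assumption, since the text does not say which candidate wins when two are equally close.

```python
def paeth_predictor(left, top, top_left):
    """Pick the neighbor closest in value to the TM predictor."""
    base = left + top - top_left  # Predictor(TM)
    # min() keeps the first candidate on ties, so the order
    # left, top, top_left is the assumed tie-break priority.
    return min((left, top, top_left), key=lambda x: abs(x - base))


if __name__ == "__main__":
    print(paeth_predictor(10, 20, 10))
    print(paeth_predictor(30, 40, 35))
```

Unlike TM_PRED, the result is always one of the three neighboring samples, so it can never fall outside their value range.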
Sometimes a given intra block can be approximated by a block with a small number of unique colors. This is especially true for artificial video such as screen capture and games. For such cases, AV1 introduces a new intra coding mode called palette mode. The predictor for a block is signaled by storing (i) a color palette, with 2 to 8 colors, and (ii) color indices into the palette for all pixels in the block. The residual pixel values of the block are transformed and quantized as usual before being entropy-coded.
Palette mode can be used in both intra-only and inter frames. The number of base colors determines the trade-off between fidelity and compactness. The color indices for pixels are obtained by the nearest neighbor method. The color indices are encoded using a neighborhood-based context to be as compact as possible.
Palette mode is not new. Both palette mode and intra block copy appear in the HEVC SCC (Screen Content Coding) extension.
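The nearest-neighbor index assignment described above can be sketched as follows for a single color plane. This is only an illustration: how the encoder chooses the base colors is not covered here, and the function names are hypothetical.

```python
def palette_indices(block, palette):
    """Assign each pixel the index of the nearest palette color."""
    assert 2 <= len(palette) <= 8, "palette mode uses 2 to 8 colors"
    return [
        [min(range(len(palette)), key=lambda i: abs(palette[i] - px)) for px in row]
        for row in block
    ]


def palette_predict(indices, palette):
    """Rebuild the predictor block from the indices and the palette."""
    return [[palette[i] for i in row] for row in indices]


if __name__ == "__main__":
    block = [[10, 12, 200], [198, 11, 90]]
    palette = [10, 100, 200]
    idx = palette_indices(block, palette)
    pred = palette_predict(idx, palette)
    residual = [[p - q for p, q in zip(r, s)] for r, s in zip(block, pred)]
    print(idx, pred, residual)
```

The residual left after prediction is what would then go through the usual transform and quantization steps.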
AV1 adopts a new mode that filters (interpolates) the reference samples before prediction, which reduces the impact of quantization noise. Table 8 lists the intra filter types.
Index | Filter intra type |
---|---|
0 | INTRA_DC_PRED |
1 | INTRA_V_PRED |
2 | INTRA_H_PRED |
3 | INTRA_D153_PRED |
4 | INTRA_TM_PRED |
Table 8 Type of Intra filter Mode
Intra block copy is very efficient for coding screen content video, because repeated patterns occur frequently within the same picture in content rich in text and graphics. Using a previously reconstructed block with an equal or similar pattern as a predictor can effectively reduce the prediction error and therefore improve coding efficiency.
In AV1, intra block copy is only allowed in intra frames. When it is used, all loop filtering is disabled and only integer offsets are allowed.
Chroma from luma (CfL) prediction is a new and promising chroma-only intra predictor that models chroma pixels as a linear function of the coincident reconstructed luma pixels.
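A minimal sketch of such a linear model is shown below. It assumes the common form "chroma DC prediction plus a scaling factor times the zero-mean luma", with the luma block already subsampled to the chroma resolution. The parameter names `dc_pred` and `alpha`, and the use of floating point, are assumptions made for illustration and are not taken from this document.

```python
def cfl_predict(luma, dc_pred, alpha):
    """Predict chroma as dc_pred + alpha * (luma - mean(luma)).

    luma    : reconstructed luma block at chroma resolution (assumed)
    dc_pred : DC prediction for the chroma block (assumed input)
    alpha   : signaled scaling factor (hypothetical name)
    """
    count = sum(len(row) for row in luma)
    mean = sum(sum(row) for row in luma) / count
    return [[dc_pred + alpha * (v - mean) for v in row] for row in luma]


if __name__ == "__main__":
    luma = [[100, 104], [108, 112]]
    print(cfl_predict(luma, dc_pred=60, alpha=0.5))
```

Removing the luma mean first means `alpha` only has to describe how chroma varies with luma, while the average chroma level comes from the DC prediction.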
Conventional codecs, including VP9, use block motion compensation where motion vectors are translational only. This is not sufficient for real video, which often contains complex motion. For example, motion due to camera shake, panning and zoom might require transformations that support shearing, scaling, rotation and changes in aspect ratio. In AV1, we introduce warped motion compensation, implemented as similarity and affine transformations, to better capture the diversity of motion that exists in real video. There are two kinds of affine/warped motion compensation: global and local.
Affine Motion Compensation | Comments |
---|---|
Global | It is common for videos to contain a global camera motion which is pertinent to an entire inter frame. It is therefore beneficial to transmit a set of motion parameters at the frame level that is applicable to a large number of blocks in the frame. When a frame is encoded, a set of global motion parameters is computed and transmitted between that frame and each reference frame. These parameters may describe a translational, similarity or affine motion model. Subsequently, any block in the frame can signal use of the global motion mode with a given reference to create a suitable predictor. |
Local | Affine motion compensation is also useful to describe complex local object motion. Here, we estimate affine parameters for a single block using the translational motion vectors that are typically conveyed for all inter blocks. Specifically, we estimate an affine or similarity model using the motion vectors from the current block and its causal neighbors which share the same reference frame. |
Table 9 Affine Motion Compensation
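To make the three motion models concrete, the sketch below maps a pixel position through a 6-parameter affine transform. Translation and similarity are treated as constrained special cases of it. The parameter layout `(a, b, c, d, e, f)` and the use of floating point are assumptions for illustration and do not reflect how AV1 stores or rounds these parameters.

```python
import math


def warp_point(params, x, y):
    """Apply x' = a*x + b*y + c, y' = d*x + e*y + f (hypothetical layout)."""
    a, b, c, d, e, f = params
    return (a * x + b * y + c, d * x + e * y + f)


def translation(tx, ty):
    """Translational model: 2 free parameters."""
    return (1.0, 0.0, tx, 0.0, 1.0, ty)


def similarity(scale, angle_rad, tx, ty):
    """Similarity model: rotation + uniform zoom + translation (4 free parameters)."""
    cos_s = scale * math.cos(angle_rad)
    sin_s = scale * math.sin(angle_rad)
    return (cos_s, -sin_s, tx, sin_s, cos_s, ty)


if __name__ == "__main__":
    print(warp_point(translation(3, -2), 5, 5))
    print(warp_point(similarity(2.0, 0.0, 0, 0), 4, 1))
```

A full affine model leaves all six parameters free, which is what allows shearing and changes in aspect ratio.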
Motions assigned to surrounding blocks will contribute to predicting a current block, via a well-defined overlapping scheme appropriately designed for advanced variable block-size partitioning frameworks.
OBMC (overlapped block motion compensation) blends multiple predictors from neighboring blocks. It is not a new concept: it was proposed and implemented back in the era of H.263. OBMC was shown to reduce prediction errors considerably, but it was not adopted by recent codecs because of the extra complexity it adds to hybrid inter/intra variable block-size coding. In AV1, a practical overlapping mechanism based on two-stage 1-D filtering is proposed for the advanced partitioning framework to implement causal overlapped block prediction.
The motion vector used in modern video codecs is allowed to have a fractional position for better prediction quality, so an interpolation filter module is needed to generate the prediction block at a fractional position in the reference frame. The VP9 codec uses a separable interpolation filter to perform inter prediction with 1/8-pixel motion vector precision. Three filter types, SHARP, REGULAR and SMOOTH, in descending order of cutoff frequency, are provided to deal with the various types of noise and distortion that can occur in reference frames/blocks. Given a filter type and a motion vector, interpolation is performed by two one-dimensional filters, one for the horizontal direction and one for the vertical direction.
In the AV1 codec, a dual interpolation filter is introduced on top of the interpolation module inherited from VP9. The dual filter allows each block/frame to use a different interpolation filter type in the horizontal and vertical directions. With three filter types per direction, up to 9 filter combinations can be applied to a block.
This idea is based on the observation that a reference frame/block’s horizontal and vertical signals may have distinct frequency characteristics; therefore, using different filter types may produce a better prediction. As before, both the filter types are transmitted in the bitstream on a per block or per frame basis.
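The count of 9 follows from pairing the three filter types independently per direction. A small sketch, with hypothetical names, makes the search space explicit:

```python
from itertools import product

# The three filter types named in the text.
FILTER_TYPES = ("REGULAR", "SMOOTH", "SHARP")


def dual_filter_combinations():
    """Every (horizontal, vertical) filter pairing a block could select."""
    return list(product(FILTER_TYPES, repeat=2))


if __name__ == "__main__":
    for horizontal, vertical in dual_filter_combinations():
        print(f"horizontal={horizontal:8s} vertical={vertical}")
```

An encoder choosing among these would evaluate the prediction produced by each pair, which is three times as many candidates as with a single shared filter type.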
AV1 also keeps high intermediate precision between the horizontal and vertical filter stages. The same high precision is retained before the predictors are averaged in compound mode.
VP9 has two candidate MVs in the reference list and uses 4 modes (NEARESTMV, NEARMV, NEWMV and ZEROMV). AV1 supports 4 candidate MVs and more modes.
For single reference mode, AV1 is the same as VP9.
For compound mode, VP9 restricts motion vectors for a compound predictor to share one motion vector referencing mode, even though they may use different reference frames. To add more flexibility, on top of existing four combinations (NEAREST_NEARESTMV, NEAR_NEARMV, NEW_NEWMV, ZERO_ZEROMV) in VP9, AV1 supports four more empirically selected combinations: NEAREST_NEWMV, NEW_NEARESTMV, NEAR_NEWMV, and NEW_NEARMV.
Index | Type | Ref Mode |
---|---|---|
0 | NEARESTMV | single ref mode |
1 | NEARMV | single ref mode |
2 | GLOBALMV(ZEROMV) | single ref mode |
3 | NEWMV | single ref mode |
4 | NEAREST_NEARESTMV | compound mode |
5 | NEAR_NEARMV | compound mode |
6 | NEAREST_NEWMV | compound mode |
7 | NEW_NEARESTMV | compound mode |
8 | NEAR_NEWMV | compound mode |
9 | NEW_NEARMV | compound mode |
10 | GLOBAL_GLOBALMV(ZERO_ZEROMV) | compound mode |
11 | NEW_NEWMV | compound mode |
Table 10 MV mode
AV1 compound mode supports two predictors from the same direction, whereas VP9 only supports predictors from different directions (one forward and one backward reference frame). VP9 only supports equal (1/2, 1/2) weights to blend the two predictors, while AV1 supports more flexible weighted blending.
Index | Compound type | Comments |
---|---|---|
0 | COMPOUND_WEDGE | Inter-Inter Wedge mode; Inter-Intra Wedge mode |
1 | COMPOUND_SEG | Inter-Inter Compound Segment mode |
2 | COMPOUND_AVERAGE | (1/2,1/2) weight will be applied to blend the predictors |
3 | COMPOUND_INTRA | Inter-Intra Gradual mode |
4 | COMPOUND_DISTANCE | This process computes weights to be used for blending predictions together based on the expected output times of the reference frames |
Table 11. Compound type
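All of the compound types in Table 11 come down to a per-pixel weighted blend of two predictors. They differ in how the weights are obtained. The sketch below assumes integer weights in the range 0 to 64 with rounding, a convention borrowed from the published AV1 design and not stated in this document. The ramp mask is made up for illustration and is not an actual wedge codebook entry.

```python
MAX_WEIGHT = 64  # assumed weight scale (6-bit precision)


def blend(pred0, pred1, mask):
    """Blend two predictor blocks; mask holds the weight of pred0 (0..64)."""
    return [
        [(m * a + (MAX_WEIGHT - m) * b + MAX_WEIGHT // 2) >> 6
         for a, b, m in zip(row0, row1, row_m)]
        for row0, row1, row_m in zip(pred0, pred1, mask)
    ]


if __name__ == "__main__":
    p0 = [[100, 100, 100, 100]]
    p1 = [[20, 20, 20, 20]]
    average_mask = [[32, 32, 32, 32]]   # COMPOUND_AVERAGE: (1/2, 1/2)
    ramp_mask = [[64, 48, 16, 0]]       # hypothetical soft edge
    print(blend(p0, p1, average_mask))
    print(blend(p0, p1, ramp_mask))
```

With the constant mask every pixel is an equal mix, while the ramp mask hands the block over from one predictor to the other across a soft boundary.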
Here are more details about the compound modes:
- Inter-Inter Compound Segment mode
In many cases, regions in one predictor will contain useful content that is not present in the other. The two inter predictors generally show a larger pixel difference in those regions, and this mode uses that difference to decide how the predictors are blended.
- Inter-Inter Wedge mode
Boundaries of moving objects in a video often separate two regions with distinct motions. Coding these regions with separate motion vector reference combinations should be beneficial; however, finding exact object boundaries is not only difficult, but also expensive to communicate in the bitstream. Our approach is to design a codebook of masks with only a few possible partitioning combinations and to signal the codebook index in the bitstream.
The AV1 wedge codebook contains partition orientations that are either horizontal, vertical or oblique with slopes: 2, -2, 0.5 and -0.5. The wedge prediction mode is used for all square and rectangular blocks, using the 16-ary shape codebooks.
Index | Wedge direction | Comments |
---|---|---|
0 | WEDGE_HORIZONTAL | |
1 | WEDGE_VERTICAL | |
2 | WEDGE_OBLIQUE27 | |
3 | WEDGE_OBLIQUE63 | |
4 | WEDGE_OBLIQUE117 | |
5 | WEDGE_OBLIQUE153 |
Table 12. Wedge direction
- Inter-Intra Gradual mode
The weight of the intra predictor decays gradually with distance from the prediction boundary, and the weight of the inter predictor increases correspondingly. It supports four modes: horizontal, vertical, DC_PRED and SMOOTH_PRED.
- Inter-Intra Wedge mode
Blocks cannot always perfectly partition moving objects. For example, when occlusion occurs in the middle of a block, it is better to apply different prediction techniques to different content. Content that is not occluded in the reference frame will prefer inter prediction, while newly revealed content could benefit more from intra prediction using local references.
Up to 7 reference frames, out of the 8 held in the frame buffer, can be used in inter mode. In AV1, the reference frames may come from the same side or from different sides of the current frame.
LAST_FRAME, LAST2_FRAME and LAST3_FRAME are forward references, with LAST_FRAME being the nearest past frame. BWDREF_FRAME is a backward reference, similar to ALTREF_FRAME.
Table 13 lists the reference frame types.
Index | Ref frame Name |
---|---|
0 | INTRA_FRAME |
1 | LAST_FRAME |
2 | LAST2_FRAME |
3 | LAST3_FRAME |
4 | GOLDEN_FRAME |
5 | BWDREF_FRAME |
6 | ALTREF2_FRAME |
7 | ALTREF_FRAME |
Table 13 Reference frame type
Several in-loop tools in AV1 are employed. De-blocking, CDEF and loop restoration are cascaded.
AV1 supports 4 de-blocking filter levels per frame, whereas VP9 has only one. Two levels are for the luma component (horizontal and vertical levels). The other two levels are for the U and V components separately. In AV1, the filter level is allowed to change from superblock to superblock.
CDEF (Constrained Directional Enhancement Filter) is the combination of CLPF (Constrained Low Pass Filter) and the deringing filter. The main goal of the in-loop CDEF is to filter out coding artifacts and ringing while preserving the detail of the image. It takes into account the direction of edges and patterns in the image. It is similar in purpose to SAO in HEVC.
The CDEF is based on the following observation. The amount of ringing artifacts in a coded image tends to be roughly proportional to the quantization step size. The amount of detail is a property of the input image, but the smallest detail actually retained in the quantized image tends to also be proportional to the quantization step size. For a given quantization step size, the amplitude of the ringing is generally less than the amplitude of the details.
CDEF works in the following steps:
- The frame is divided into filter blocks of 64x64 pixels. Some CDEF parameters are signaled at the frame level, and some may be signaled at the filter block level.
- The direction of the edge or pattern in each filter block is identified.
- The block is adaptively filtered along the identified direction and, to a lesser degree, along directions rotated 45 degrees from the identified direction. The filter strengths are signaled explicitly, which allows a high degree of control over the blurring.
The main reason for identifying the direction is to align the filter taps along that direction to reduce ringing while preserving the directional edges or patterns. CDEF defines a primary tap filter and a secondary tap filter. The primary taps follow the direction, and the secondary taps form a cross oriented 45 degrees off the direction. Both the primary and secondary tap filters have 8 types.
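The relation between primary and secondary taps can be sketched with direction indices. The sketch assumes that the 8 types correspond to 8 directions spaced evenly over 180 degrees (22.5 degrees apart), so a 45-degree rotation is two index steps. That spacing is an assumption based on the published CDEF design and is not stated in the text above.

```python
NUM_DIRECTIONS = 8   # assumed: one filter type per direction
STEPS_FOR_45_DEG = 2  # assumed: 22.5 degrees per direction index


def secondary_directions(primary):
    """Return the two direction indices rotated +45 and -45 degrees."""
    return (
        (primary + STEPS_FOR_45_DEG) % NUM_DIRECTIONS,
        (primary - STEPS_FOR_45_DEG) % NUM_DIRECTIONS,
    )


if __name__ == "__main__":
    for d in range(NUM_DIRECTIONS):
        print(d, secondary_directions(d))
```

Because directions wrap around after 180 degrees, the indices are taken modulo 8.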
AV1 employs a set of in-loop image restoration tools after de-blocking to de-noise the image and enhance the quality of edges. The in-loop restoration scheme has two types of filter to remove blur artifacts due to block processing: the Wiener filter and the dual self-guided filter. These tools are integrated into AV1 within a switchable framework, which can trigger a different tool in each image region.
A multi-symbol adaptive arithmetic coding model is adopted in AV1. Both syntax elements and coefficients are coded with this model.
Most recent video codecs encode information using binary arithmetic coding, such as CABAC in AVC/HEVC, meaning that each symbol can take only two values. The AV1 entropy coder is derived from the Daala range coder and supports up to 16 values per symbol, making it possible to encode fewer symbols. This is equivalent to coding up to four binary values in parallel and reduces serial dependencies, allowing hardware implementations to use lower clock rates, and thus less power.
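The "four binary values" figure is simply the base-2 logarithm of the alphabet size. The sketch below, with a hypothetical function name, counts how many binary decisions a binary coder would need to identify one value out of a given alphabet, compared with a single multi-symbol coding step.

```python
def binary_decisions(alphabet_size):
    """Binary decisions needed to identify one of alphabet_size values."""
    if alphabet_size < 2:
        raise ValueError("an alphabet needs at least two values")
    return (alphabet_size - 1).bit_length()


if __name__ == "__main__":
    for size in (2, 4, 16):
        print(f"{size:2d}-ary symbol: 1 multi-symbol step "
              f"vs {binary_decisions(size)} binary steps")
```

This counts coding steps only. It says nothing about the compressed size, which depends on the symbol probabilities.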
For AV1, there is a richer set of transforms for coding Inter and Intra prediction residues. Inter prediction residues do not have a well-defined structure as in the Intra case, but using a bank of transforms, each adapted to a specific type of residue profile within the block, is generally helpful.
In AV1, four main types of 1-D transform are applied separately in the horizontal and vertical directions, giving a total of 16 different 2-D transforms.
Transform type | Comments |
---|---|
DCT | Inter and Intra modes continue to make use of DCT. |
ADST | Asymmetric Discrete Sine Transform |
Flip ADST | It applies ADST in reverse order |
IDTX | Identity transform. It is particularly useful for coding residue with sharp lines and edges, as found in screen content |
Table 14 The Main Transform Type in each of direction
For each small coded block (4x4 or 8x8), it is possible to choose one of up to 16 different transforms as follows (details in Table 16):
{DCT, ADST, FlipADST, IDTX} horizontal x {DCT, ADST, FlipADST, IDTX} vertical
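The product above can be enumerated directly. This sketch only lists the (horizontal, vertical) pairs. It does not map them to the bitstream names used in Table 16, since that naming convention is not described here.

```python
from itertools import product

# The four 1-D transform types from Table 14.
TRANSFORMS_1D = ("DCT", "ADST", "FlipADST", "IDTX")


def transform_pairs():
    """All (horizontal, vertical) 1-D transform pairings."""
    return list(product(TRANSFORMS_1D, repeat=2))


if __name__ == "__main__":
    pairs = transform_pairs()
    print(len(pairs), "separable 2-D transforms")
    print(pairs[:4])
```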
As block sizes get larger, some of these transforms begin to act similarly. Thus, a reduced set of transforms is used for 16x16, 32x32 and 64x64 block sizes. In the transform selection process for Inter and Intra modes, the encoder does a search over the entire set of transforms and selects the one that produces the best rate-distortion cost. Once a transform is selected, a transform type symbol from the set of types available at that size is used to indicate the actual transform used in the bitstream.
There are 6 types of transform sets in the AV1 spec, which specify the transform type of Intra and Inter blocks. The transform sets determine what subset of transform types can be used, according to the following table.
Inter or not | Set Number | Transform set |
---|---|---|
Don’t care | 0 | TX_SET_DCTONLY |
0 | 1 | TX_SET_INTRA_1 |
0 | 2 | TX_SET_INTRA_2 |
1 | 1 | TX_SET_INTER_1 |
1 | 2 | TX_SET_INTER_2 |
1 | 3 | TX_SET_INTER_3 |
Table 15 Transform Set in the AV1 spec
Transform type | TX_SET_DCTONLY | TX_SET_INTRA_1 | TX_SET_INTRA_2 | TX_SET_INTER_1 | TX_SET_INTER_2 |
---|---|---|---|---|---|
DCT_DCT | X | X | X | X | X |
ADST_DCT | | X | X | X | X |
DCT_ADST | | X | X | X | X |
ADST_ADST | | X | X | X | X |
FLIPADST_DCT | | | | X | X |
DCT_FLIPADST | | | | X | X |
FLIPADST_FLIPADST | | | | X | X |
ADST_FLIPADST | | | | X | X |
FLIPADST_ADST | | | | X | X |
IDTX | | X | X | X | X |
V_DCT | | X | | X | X |
H_DCT | | X | | X | X |
V_ADST | | | | X | |
H_ADST | | | | X | |
V_FLIPADST | | | | X | |
H_FLIPADST | | | | X | |
Table 16 Detailed Transform type supported in each transform set.
Both square and rectangular blocks are used in AV1. The transform block size cannot exceed the partition block size. Transform block sizes are very flexible, ranging from 64x64 down to 4x4. See Table 4 for details.
AV1 supports flexible tiles, with either uniform or non-uniform tile spacing. The tile area is limited to a maximum of 4096x2304. Tiles can be grouped into tile groups, and each group can be decoded independently for error resilience. The loop filter can be enabled or disabled across tiles.
As in VP9, AV1 provides a means of segmenting the image and then applying various adjustments at the segment level. Up to 8 segments may be specified for any given frame. For each of these segments it is possible to specify:
- A quantizer (absolute value or delta).
- A loop filter strength (absolute value or delta).
- A prediction reference frame.
- A block skip mode that implies both the use of a (0,0) motion vector and that no residual will be coded.
AV1 supports temporal and spatial layer coding, with up to 8 temporal layers and up to 3 spatial layers.
Index | Scalability mode | Index | Scalability mode |
---|---|---|---|
0 | SCALABILITY_L1T2 | 8 | SCALABILITY_L2T2h |
1 | SCALABILITY_L1T3 | 9 | SCALABILITY_L2T3h |
2 | SCALABILITY_L2T1 | 10 | SCALABILITY_S2T1h |
3 | SCALABILITY_L2T2 | 11 | SCALABILITY_S2T2h |
4 | SCALABILITY_L2T3 | 12 | SCALABILITY_S2T3h |
5 | SCALABILITY_S2T1 | 13 | SCALABILITY_SS |
6 | SCALABILITY_S2T2 | 14-255 | reserved |
7 | SCALABILITY_S2T3 |
Table 17. Temporal and Spatial Mode
Scalability mode | Spatial Layers | Resolution Ratio | Temporal Layers | Inter-layer-dependency |
---|---|---|---|---|
SCALABILITY_L1T2 | 1 | | 2 | |
SCALABILITY_L1T3 | 1 | | 3 | |
SCALABILITY_L2T1 | 2 | 2:1 | 1 | Yes |
SCALABILITY_L2T2 | 2 | 2:1 | 2 | Yes |
SCALABILITY_L2T3 | 2 | 2:1 | 3 | Yes |
SCALABILITY_S2T1 | 2 | 2:1 | 1 | No |
SCALABILITY_S2T2 | 2 | 2:1 | 2 | No |
SCALABILITY_S2T3 | 2 | 2:1 | 3 | No |
SCALABILITY_L2T2h | 2 | 1.5:1 | 2 | Yes |
SCALABILITY_L2T3h | 2 | 1.5:1 | 3 | Yes |
SCALABILITY_S2T1h | 2 | 1.5:1 | 1 | No |
Table 17. Details in the Temporal and Spatial Mode
AV1 supports 15 sets of quantization matrices (QMs), which are based on contrast sensitivity functions. QMs are applied to a frame based on a selectable scaling of its quantization level: the matrices become flatter as the quantization index increases (and the quality decreases). Inter matrices are slightly flatter than intra matrices.
AV1 allows per-superblock changes to the quantization parameter to support sub-frame rate control. It also supports ROI-level rate control on top of the segment-level parameters.
An AV1 bitstream consists of a number of OBUs (Open Bitstream Units) that are normally held within a container format alongside audio and timing information. The OBU is new in AV1 and is similar to the NAL (Network Abstraction Layer) unit in the AVC/HEVC specs.
The OBU header is similar to the NAL header and is normally 8 bits long. An extra 8-bit extension header is present if the bitstream contains temporal or spatial layers. obu_type is the most important syntax element, as it describes the type of the OBU.
Index | obu_type |
---|---|
0 | Reserved |
1 | OBU_SEQUENCE_HEADER |
2 | OBU_TD |
3 | OBU_FRAME_HEADER |
4 | OBU_TILE_GROUP |
5 | OBU_METADATA |
6 | OBU_FRAME |
7 | OBU_REDUNDANT_FRAME_HEADER |
8-14 | Reserved |
15 | OBU_PADDING |
Table 18. Type of OBU
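A minimal sketch of reading the 8-bit OBU header and its optional 8-bit extension is given below. The bit layout (1 forbidden bit, 4-bit obu_type, extension flag, size-field flag, 1 reserved bit, then a 3-bit temporal ID and 2-bit spatial ID in the extension) follows the published AV1 specification. It is an assumption here, since this document does not list the fields, and the snapshot it describes may differ. The function name and dictionary keys are hypothetical.

```python
def parse_obu_header(data: bytes) -> dict:
    """Decode the first one or two bytes of an OBU (assumed bit layout)."""
    first = data[0]
    header = {
        "forbidden_bit": first >> 7,
        "obu_type": (first >> 3) & 0x0F,       # index into Table 18
        "extension_flag": (first >> 2) & 0x01,
        "has_size_field": (first >> 1) & 0x01,
    }
    if header["extension_flag"]:
        # Extension byte carries the layer identifiers.
        ext = data[1]
        header["temporal_id"] = ext >> 5
        header["spatial_id"] = (ext >> 3) & 0x03
    return header


if __name__ == "__main__":
    print(parse_obu_header(bytes([0x0A])))        # obu_type 1
    print(parse_obu_header(bytes([0x36, 0x48])))  # obu_type 6 with extension
```

The `obu_type` value returned here can be looked up in Table 18. For example, 1 is OBU_SEQUENCE_HEADER and 6 is OBU_FRAME.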