Added progressive refinement example files #131

leo-barnes · 2021-03-10T20:07:15Z

I added some examples of files that contain multiple layers as well as the lsel and a1lx properties. I could not figure out how to generate streams containing multiple operating points.

joedrago · 2021-03-10T20:12:17Z

It might be better to have @cconcolato review these -- I haven't personally attempted any kind of progressive implementation at all yet, so I'd be coming into this somewhat blind.

negge · 2021-03-11T20:07:16Z

Thanks for the files @leo-barnes. I had a look at animals_00_multilayer.avif and it appears the spatial scalability layers are not being set correctly.

Using MP4Box to dump the second item, and passing through dump_obu I get:

$ ~/git/gpac/bin/gcc/MP4Box -dump-item 2 testFiles/Apple/multilayer_examples/animals_00_multilayer.avif
$ ~/git/libaom/aom_build/tools/dump_obu item_id02
Temporal unit 0
  OBU type:        OBU_TEMPORAL_DELIMITER
      extension:   no
      length:      2
  OBU type:        OBU_SEQUENCE_HEADER
      extension:   no
      length:      16
  OBU type:        OBU_FRAME
      extension:   no
      length:      88543
  OBU type:        OBU_FRAME
      extension:   yes
      temporal_id: 0
      spatial_id:  0
      length:      1918772
  TU size: 2007333
  OBU overhead:    5
File total OBU overhead: 5

As said above, this second layer is 2048x1536 and should have spatial_id of 1. Also, is the base layer of 1024x768 missing a temporal_id and spatial_id entry?
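
For reference, here is a minimal sketch of how the `extension`, `temporal_id` and `spatial_id` fields in a dump like the one above can be derived from the OBU header bytes, following the header bit layout in the AV1 bitstream specification. The `ObuHeader` type and the `parse_obu_header` name are made up for illustration and are not part of libaom or GPAC. When the extension flag is 0 there is no extension byte, so no IDs are coded for that OBU at all, which is why the sketch returns `None` for them.

```python
from dataclasses import dataclass
from typing import Optional

# obu_type values as listed in the AV1 bitstream specification.
OBU_TYPE_NAMES = {
    1: "OBU_SEQUENCE_HEADER",
    2: "OBU_TEMPORAL_DELIMITER",
    3: "OBU_FRAME_HEADER",
    4: "OBU_TILE_GROUP",
    5: "OBU_METADATA",
    6: "OBU_FRAME",
    7: "OBU_REDUNDANT_FRAME_HEADER",
    8: "OBU_TILE_LIST",
    15: "OBU_PADDING",
}


@dataclass
class ObuHeader:
    """Hypothetical container for the fields this sketch extracts."""
    obu_type: int
    has_extension: bool
    has_size_field: bool
    temporal_id: Optional[int]  # None when no extension byte is present
    spatial_id: Optional[int]   # None when no extension byte is present


def parse_obu_header(data: bytes) -> ObuHeader:
    """Parse the first one or two bytes of an OBU."""
    first = data[0]
    obu_type = (first >> 3) & 0x0F           # 4 bits after the forbidden bit
    has_extension = bool((first >> 2) & 1)   # obu_extension_flag
    has_size_field = bool((first >> 1) & 1)  # obu_has_size_field
    temporal_id = spatial_id = None
    if has_extension:
        ext = data[1]
        temporal_id = (ext >> 5) & 0x07      # top 3 bits
        spatial_id = (ext >> 3) & 0x03       # next 2 bits
    return ObuHeader(obu_type, has_extension, has_size_field,
                     temporal_id, spatial_id)
```

With this layout, an `OBU_FRAME` carrying `spatial_id` 1 would start with the bytes `0x36 0x08`.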

negge · 2021-03-12T03:07:46Z

> As said above, this second layer is 2048x1536 and should have spatial_id of 1.

This appears to be a bug in dump_obu. I have filed an issue against libaom here: https://bugs.chromium.org/p/aomedia/issues/detail?id=2992

leo-barnes · 2021-03-12T08:46:46Z

> As said above, this second layer is 2048x1536 and should have spatial_id of 1.
>
> This appears to be a bug in dump_obu. I have filed an issue against libaom here: https://bugs.chromium.org/p/aomedia/issues/detail?id=2992

Yeah, there's a bug in libaom as well with regard to what spatial_id you get out when decoding. At the moment you always get the spatial_id of the final layer even though the dimensions clearly match what I would expect of the base layer. @wantehchang confirmed by code inspection that it seems to be a bug and sent me a patch that I'm going to try today.

Here's the issue I filed about it:
https://bugs.chromium.org/p/aomedia/issues/detail?id=2993

leo-barnes · 2021-03-15T10:23:38Z

Turns out libaom was automatically creating multiple operating points when creating the streams. I haven't found any OBU parser that can actually output what's inside the OBUs for me, and oddly enough the libaom decoder doesn't seem to give me any way to query the number of operating points in the stream either (https://bugs.chromium.org/p/aomedia/issues/detail?id=2995).

I can verify that there are multiple operating points by setting AV1D_SET_OPERATING_POINT when decoding, so I'll create some files that use the new operating point selector property as well.

leo-barnes · 2021-03-15T11:05:57Z

I updated libaom to v2.0.2 per Wan-Teh's suggestion (I was using v1.0.something), so the encoded data is now slightly different.

tdaede · 2021-03-15T23:21:07Z

I am unclear about the usage of a1lx here. The a1lx-containing and a1lx-missing files differ in that the a1lx example contains one Item, whereas the a1lx-less contains two (or more) Items. However, I am not sure that is enough for progressive decoding of a MIAF image - I would expect there still to be two or more Items regardless of the presence or absence of a1lx.

leo-barnes · 2021-03-16T08:59:29Z

> I am unclear about the usage of a1lx here. The a1lx-containing and a1lx-missing files differ in that the a1lx example contains one Item, whereas the a1lx-less contains two (or more) Items. However, I am not sure that is enough for progressive decoding of a MIAF image - I would expect there still to be two or more Items regardless of the presence or absence of a1lx.

Good question! We may want to add some more wording to the spec with some examples of how we think files should look. As a summary of the last couple of meetings, the idea is as follows:

1. Use lsel if you want an item to decode a specific layer. The decoder shall output this layer and no other layer. In other words, no progressive refinement is allowed. The use case we had in mind was something like multiple viewpoints sharing a single base layer.
2. If lsel is not specified, decode the highest layer, but intermediate layers may be displayed for progressive refinement.
3. Use a1op to select the operating point. lsel may be combined with a1op to specify both operating point and layer. If lsel is not specified, follow point 2 above.
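
The three rules above can be sketched as a small decision function. This is only an illustration of the summary in this comment, not spec text. `DecodePlan` and `plan_decode` are hypothetical names, and falling back to operating point 0 when a1op is absent is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecodePlan:
    """Hypothetical result type: what a reader would ask the decoder for."""
    operating_point: int
    target_layer: Optional[int]  # None means "the highest layer"
    progressive_allowed: bool    # may intermediate layers be displayed?


def plan_decode(lsel_layer: Optional[int] = None,
                a1op_index: Optional[int] = None) -> DecodePlan:
    # Rule 3: a1op selects the operating point.
    # Assumption: operating point 0 is used when a1op is absent.
    operating_point = a1op_index if a1op_index is not None else 0
    if lsel_layer is not None:
        # Rule 1: output exactly this layer, no progressive refinement.
        return DecodePlan(operating_point, lsel_layer, False)
    # Rule 2: decode the highest layer; intermediate layers may be shown.
    return DecodePlan(operating_point, None, True)
```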

Creating multiple items that select different layers and adding them to an altr group was discussed, but the conclusion was that it ends up being pretty complicated (and altr groups are not very well supported in HEIF as of right now). It's also hard/impossible for a decoder to know if it can reuse state from decoding an earlier item, which makes decoding inefficient. The conclusion was that if you want to do progressive refinement, you should not create multiple items or specify lsel.

Next question is why we have a1lx. The main use case is progressive refinement while downloading. But how do you know when you have enough bits to send to the decoder? You can "poll" by periodically sending some more data to the decoder and hoping you get a frame back, but that is pretty inefficient. The AVIF writer can store the layers in separate extents, but that is ambiguous, since the extents may have been split for some other reason, like interleaving chunks between multiple tiles. So a1lx was added as an explicit way of letting the parser know when it has a full layer that can be sent to the decoder.
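
As a rough sketch of how a downloader could use a1lx instead of polling: compute where each layer ends inside the item data and compare that with the number of bytes received so far. The function names are hypothetical. The sketch assumes that non-zero a1lx entries give the sizes of the leading layers and that the final layer runs to the end of the item data. The numbers in the usage note are taken from dumps posted elsewhere in this thread and are for illustration only.

```python
from typing import List


def layer_end_offsets(layer_sizes: List[int], item_size: int) -> List[int]:
    """Return the byte offset (relative to item start) where each layer ends."""
    ends = []
    offset = 0
    for size in layer_sizes:
        if size == 0:
            # Assumption: a zero entry means no further explicit sizes.
            break
        offset += size
        ends.append(offset)
    if offset < item_size:
        # Assumption: the last layer is whatever remains of the item data.
        ends.append(item_size)
    return ends


def complete_layers(layer_sizes: List[int], item_size: int,
                    bytes_received: int) -> int:
    """Count how many layers are fully available for decoding."""
    return sum(1 for end in layer_end_offsets(layer_sizes, item_size)
               if bytes_received >= end)
```

For example, with layer sizes `[122336, 0, 0]` and an item of 1999503 bytes, `complete_layers` reports one decodable layer as soon as 122336 bytes have arrived and two once the whole item is there.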

There is nothing stopping you from using a1lx with a file containing multiple items with lsel sharing the same data, but it's not really why it was added.

cconcolato · 2021-03-16T19:34:36Z

I had a quick look at the files. They look good to me. In the multi-item image, I was able to extract the items separately, repackage them separately and compare the 2 qualities.

One thing I noted at least in animals_00_singlelayer.avif with MP4Box.js FileReader is that the mdat box starts at offset 295, and its payload starts at offset 303 (4 bytes for the length, 4 bytes for the type), while the extent offset starts at 311. I'm curious what these additional 8 bytes are. Any idea?

> I haven't found any OBU parser that can actually output what's inside the OBUs for me

I usually use MP4Box (command line) as follows:

MP4Box -dump-item 1:path=file.obu file.avif   # 1 is the item id you want to extract
MP4Box -add file.obu file.mp4                 # imports the item data as a track
MP4Box -dnal 1 file.mp4                       # dumps the OBU structure; 1 here is the track id that was created in the previous call

This produces an XML structure that looks like:

<OBUTrack trackID="1" SampleCount="1" TimeScale="25000">
 <OBUConfig>
   <OBU size="16" type="seq_header" header_size="2" has_size_field="1" has_ext="0" temporalID="0" spatialID="0" width="2048" height="1536" bit_depth="8" still_picture="0" OperatingPointIdc="0" color_range="1" color_description_present_flag="0" color_primaries="2" transfer_characteristics="2" matrix_coefficients="2" profile="0" level="12" />
 </OBUConfig>
 <OBUSamples>
  <Sample number="1" DTS="0" CTS="0" size="1999503" RAP="1" >
   <OBU size="16" type="seq_header" header_size="2" has_size_field="1" has_ext="0" temporalID="0" spatialID="0" width="2048" height="1536" bit_depth="8" still_picture="0" OperatingPointIdc="0" color_range="1" color_description_present_flag="0" color_primaries="2" transfer_characteristics="2" matrix_coefficients="2" profile="0" level="12" />
   <OBU size="122320" type="frame" header_size="4" has_size_field="1" has_ext="0" temporalID="0" spatialID="0" uncompressed_header_bytes="29" frame_type="key" refresh_frame_flags="255" show_frame="1" show_existing_frame="0" nb_tiles="1" >
     <Tile number="0" start="33" size="122287"/>
   </OBU>
   <OBU size="1877167" type="frame" header_size="5" has_size_field="1" has_ext="1" temporalID="0" spatialID="1" uncompressed_header_bytes="17" frame_type="inter" refresh_frame_flags="1" show_frame="1" show_existing_frame="0" nb_tiles="1" >
     <Tile number="0" start="22" size="1877145"/>
   </OBU>
  </Sample>
 </OBUSamples>
</OBUTrack>
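
A dump like this is easy to post-process. The sketch below is a hedged example using only Python's standard library, run on a trimmed copy of the XML above with most attributes dropped. The `summarize_sample` helper is hypothetical. It lists each OBU's type, size and spatial ID, which makes it simple to confirm that the OBU sizes add up to the sample size and that the second frame is on spatial layer 1.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the dump above; only the attributes used here are kept.
DUMP = """<OBUTrack trackID="1" SampleCount="1" TimeScale="25000">
 <OBUSamples>
  <Sample number="1" size="1999503">
   <OBU size="16" type="seq_header" has_ext="0" spatialID="0"/>
   <OBU size="122320" type="frame" has_ext="0" spatialID="0"/>
   <OBU size="1877167" type="frame" has_ext="1" spatialID="1"/>
  </Sample>
 </OBUSamples>
</OBUTrack>"""


def summarize_sample(xml_text: str):
    """Return (sample_size, [(obu_type, obu_size, spatial_id), ...])."""
    root = ET.fromstring(xml_text)
    sample = root.find("./OBUSamples/Sample")
    obus = [(obu.get("type"), int(obu.get("size")), int(obu.get("spatialID")))
            for obu in sample.findall("OBU")]
    return int(sample.get("size")), obus


if __name__ == "__main__":
    sample_size, obus = summarize_sample(DUMP)
    for obu_type, obu_size, spatial_id in obus:
        print(f"{obu_type:12s} size={obu_size:8d} spatial_id={spatial_id}")
    print("sizes add up:", sum(size for _, size, _ in obus) == sample_size)
```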

tdaede · 2021-03-16T23:26:19Z

Nit: it looks like all the files start with a temporal delimiter OBU, but those are a SHOULD NOT in the AV1 ISOBMFF spec.

cconcolato · 2021-03-16T23:56:53Z

@tdaede I also thought it would be a Temporal Delimiter but it does not seem to be one. I've tried patching the offsets in the file and extracting the whole thing but MP4Box then fails with:

[AV1] computed OBU size -1 (input value = 0). Skipping.

tdaede · 2021-03-17T01:42:57Z

I was able to see it via:

$ MP4Box -dump-item 1 animals_00_multilayer.avif
$ dump_obu item_id01 
Temporal unit 0
  OBU type:        OBU_TEMPORAL_DELIMITER
      extension:   no
      length:      2
  OBU type:        OBU_SEQUENCE_HEADER
      extension:   no
      length:      16
  OBU type:        OBU_FRAME
      extension:   no
      length:      122320
  TU size: 122338
  OBU overhead:    3
File total OBU overhead: 3

I also double checked that 0x12 0x00 appears in the .avif, in case MP4Box was cleverly adding a TU on export.

cconcolato · 2021-03-17T05:17:36Z

@tdaede I had not realized the TD were there, but in my experiment I was talking about the 8 bytes before the TD. The offset indicated in the extent points to the start of the TD. I still don't know what those bytes before are.

@leo-barnes can you double-check in the tool that generated the container what those bytes are?

leo-barnes · 2021-03-17T08:59:38Z

@cconcolato

> One thing I noted at least in animals_00_singlelayer.avif with MP4Box.js FileReader is that the mdat box starts at offset 295, and its payload starts at offset 303 (4 bytes for the length, 4 bytes for the type), while the extent offset starts at 311. I'm curious what these additional 8 bytes are. Any idea?

These files are using the 64-bit form box size for mdat. In other words:

uint32_t size = 0x00000001
uint32_t 4cc = 'mdat'
uint64_t longSize = real-size
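
To make the arithmetic concrete, here is a minimal sketch of reading a box header that handles the 64-bit form. The `parse_box_header` name is hypothetical, and the special case `size == 0` (box extends to the end of the file) is deliberately not handled. With the 64-bit form the header is 16 bytes rather than 8, so an mdat box starting at offset 295 has its payload at 295 + 16 = 311, which matches the extent offset.

```python
import struct


def parse_box_header(buf: bytes, offset: int = 0):
    """Return (fourcc, box_size, header_length) for the box at `offset`."""
    # 32-bit big-endian size followed by the four-character type.
    size, fourcc = struct.unpack_from(">I4s", buf, offset)
    header_len = 8
    if size == 1:
        # size == 1 signals that a 64-bit size follows the type.
        (size,) = struct.unpack_from(">Q", buf, offset + 8)
        header_len = 16
    return fourcc.decode("ascii"), size, header_len
```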

@tdaede

> I was able to see it via:
>
> $ MP4Box -dump-item 1 animals_00_multilayer.avif
> $ dump_obu item_id01
> Temporal unit 0
>   OBU type:        OBU_TEMPORAL_DELIMITER
>       extension:   no
>       length:      2
>   OBU type:        OBU_SEQUENCE_HEADER
>       extension:   no
>       length:      16
>   OBU type:        OBU_FRAME
>       extension:   no
>       length:      122320
>   TU size: 122338
>   OBU overhead:    3
> File total OBU overhead: 3
>
> I also double checked that 0x12 0x00 appears in the .avif, in case MP4Box was cleverly adding a TU on export.

Any idea how I configure libaom to not output the TU OBU? I also noticed it but couldn't figure out how to get it to stop. I can of course manually strip it, but there should ideally be some way of configuring it I hope.

tdaede · 2021-03-17T15:27:46Z

> Any idea how I configure libaom to not output the TU OBU? I also noticed it but couldn't figure out how to get it to stop. I can of course manually strip it, but there should ideally be some way of configuring it I hope.

Unfortunately no, it has to be manually stripped by the packager.
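
A minimal sketch of what such stripping could look like in a packager, assuming every OBU in the stream has `obu_has_size_field` set (the sketch raises otherwise). The function names are hypothetical. It walks the OBUs using the header layout and leb128 size coding from the AV1 bitstream specification and drops every OBU of type 2 (`OBU_TEMPORAL_DELIMITER`), such as the `0x12 0x00` pair mentioned earlier in this thread.

```python
OBU_TEMPORAL_DELIMITER = 2


def read_leb128(data: bytes, pos: int):
    """Decode a leb128 value at `pos`; return (value, bytes_consumed)."""
    value = 0
    for i in range(8):  # AV1 limits leb128 values to 8 bytes
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def strip_temporal_delimiters(data: bytes) -> bytes:
    """Return `data` with all temporal delimiter OBUs removed."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        header = data[pos]
        obu_type = (header >> 3) & 0x0F
        has_extension = (header >> 2) & 1
        has_size_field = (header >> 1) & 1
        if not has_size_field:
            raise ValueError("OBU without a size field is not handled here")
        header_len = 1 + has_extension  # optional extension byte
        payload_size, leb_len = read_leb128(data, pos + header_len)
        total = header_len + leb_len + payload_size
        if obu_type != OBU_TEMPORAL_DELIMITER:
            out += data[pos:pos + total]
        pos += total
    return bytes(out)
```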

leo-barnes · 2021-03-18T10:49:31Z

I've now removed the TU delimiter from all the files from what I can see.

I have also added two grid examples. One uses lsel to create two grids, and one uses a1lx that should be able to give you progressive refinement. Both have the layers from all tiles interleaved.

I think the files are correct, but I would be very thankful if people could do some sanity checking on them in case I screwed something up in my scripts.

leo-barnes · 2021-04-27T09:03:11Z

I've now updated the a1lx files to the new box structure. animals_00_multilayer_grid_a1lx.avif has been changed so that one of the tiles that is small enough to fit in 64k is using the small size a1lx.

I've also updated animals_00_multilayer.avif to get rid of the TU delimiter.

leo-barnes · 2021-04-27T09:20:19Z

I've tried running the files through the compliance warden and fixed an issue for the singlelayer file. All the multilayer files fail with the tool assertion issue in the linked ComplianceWarden issue above, so I can't tell if anything is wrong with them.

leo-barnes · 2021-04-27T11:23:04Z

Ran through the multilayer files and fixed the av1C in all of them. Encountered two more issues with the warden:
gpac/ComplianceWarden#24
gpac/ComplianceWarden#25

leo-barnes · 2021-05-04T19:14:14Z

@tdaede @negge
If you have some way of sanity checking my files that would be great. Thanks!

tdaede · 2021-05-06T02:46:13Z

I tested these with our most recent stack of patches on GPAC and noticed the following:

animals_00_multilayer_a1op.avif: This seems to still have the fullbox bytes at the beginning of the a1op box, it's 4 bytes larger than it should be.

animals_00_multilayer_a1lx.avif: The a1lx box seems to be encoding the large_size bitfield as 4 bytes rather than 1 byte.

leo-barnes · 2021-05-06T09:13:18Z

@tdaede

> animals_00_multilayer_a1op.avif: This seems to still have the fullbox bytes at the beginning of the a1op box, it's 4 bytes larger than it should be.

Are you sure you're using the latest commit? The boxes are supposed to be 9 bytes in size if they are simple boxes. If I pass it through my parser I see this:

      ('a1op' "Operating Point Selector Box", size = 9, offset = 283) {
        Operating point: 0
      }
      ('a1op' "Operating Point Selector Box", size = 9, offset = 292) {
        Operating point: 1
      }

If I do the same for testFiles/Xiph/quebec_3layer_op2.avif, I see this:

      ('a1op' "Operating Point Selector Box", size = 9, offset = 254) {
        Operating point: 2
      }

I see the same when looking at the actual bytes in a hex viewer.
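
For anyone who wants to check this without a hex viewer, here is a small sketch of a parser for the simple-box form described above: 4 bytes of size, 4 bytes of type and 1 byte of operating point index, 9 bytes in total. `parse_a1op` is a hypothetical helper. A FullBox variant would carry 4 extra version/flags bytes and be 13 bytes long, which this sketch rejects.

```python
import struct


def parse_a1op(box: bytes) -> int:
    """Return the operating point index from a 9-byte simple 'a1op' box."""
    size, fourcc = struct.unpack_from(">I4s", box, 0)
    if fourcc != b"a1op":
        raise ValueError("not an a1op box")
    if size != 9:
        # 13 would suggest the older FullBox layout with version/flags.
        raise ValueError(f"unexpected a1op size {size}, expected 9")
    return box[8]
```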

> animals_00_multilayer_a1lx.avif: The a1lx box seems to be encoding the large_size bitfield as 4 bytes rather than 1 byte.

My parser shows this:

      ('a1lx' "Layered Image Indexing Box", size = 21, offset = 242) {
        large_sizes: true
        Layer sizes: 122336 0 0
      }

I've looked at the bytes in my hex editor and they look correct. And the size is the size I would expect:
4 bytes for size
4 bytes for a1lx
1 byte for element size
4*3 bytes for elements
Total: 21 bytes
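
The same breakdown, written as a sketch of a parser. `parse_a1lx` is a hypothetical helper that follows the layout listed above: one byte whose lowest bit is the large_size flag, followed by three layer sizes that are 32-bit when the flag is set. Treating the sizes as 16-bit when the flag is clear is an assumption based on the "small size" variant mentioned earlier in this thread.

```python
import struct


def parse_a1lx(box: bytes):
    """Return (large_size, [layer_size0, layer_size1, layer_size2])."""
    size, fourcc = struct.unpack_from(">I4s", box, 0)
    if fourcc != b"a1lx":
        raise ValueError("not an a1lx box")
    large_size = bool(box[8] & 0x01)
    # Three big-endian entries: 32-bit if large_size, else 16-bit (assumed).
    fmt = ">3I" if large_size else ">3H"
    expected = 9 + struct.calcsize(fmt)  # 21 or 15 bytes in total
    if size != expected:
        raise ValueError(f"unexpected a1lx size {size}, expected {expected}")
    return large_size, list(struct.unpack_from(fmt, box, 9))
```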

tdaede · 2021-05-06T10:39:26Z

Ah indeed my bad, I failed to set up a tracking branch correctly when I pulled your changes last time. I've updated them and now everything parses correctly here. Sorry about the false alarm!

leo-barnes · 2021-05-06T12:07:12Z

> Ah indeed my bad, I failed to set up a tracking branch correctly when I pulled your changes last time. I've updated them and now everything parses correctly here. Sorry about the false alarm!

No worries, I'm just glad someone could sanity check them!
Always so easy to miss things when you write both the writer and reader. 😊

leo-barnes requested review from joedrago, cconcolato and wantehchang March 10, 2021 20:07

cconcolato reviewed Mar 11, 2021

cconcolato mentioned this pull request Mar 16, 2021

Add AVIF 3-layer progressive images #134

Merged

leo-barnes mentioned this pull request Mar 26, 2021

Last layer_size entry in a1lx is redundant #140

Closed

leo-barnes mentioned this pull request Apr 8, 2021

Optional 'progressive' download frame #102

Closed

leo-barnes mentioned this pull request Apr 20, 2021

Create conformance file with multi layer image #40

Closed

leo-barnes linked an issue Apr 20, 2021 that may be closed by this pull request

Create conformance file with multi layer image #40

Closed

leo-barnes added 5 commits April 27, 2021 10:59

1634315  Added examples that use the new additions to the spec for progressive refinement.
c03014f  Fixed Cyril's comment. Added files that select operating point.
deca238  Add clarification to a1lx .txt
12ee05d  Make a1op/lsel essential
d9b8f70  Removed TU delimiter. Added grid examples.
8acb870  Updated to new a1lx box structure. Also removed TU delimiter from animals_00_multilayer.avif

leo-barnes force-pushed the u/lbarnes/multilayer_examples branch from 596185f to 8acb870 Compare April 27, 2021 09:01

leo-barnes mentioned this pull request Apr 27, 2021

Assertion failed: 0 && "obu_has_size_field shall be set" gpac/ComplianceWarden#23

Closed

1ff4338  Fix av1C in singlelayer file
56fbf37  Fixed av1C box for all multilayer files
c0fac33  Changed a1op to be simple box

tdaede approved these changes May 6, 2021

leo-barnes merged commit 4be01ec into master May 17, 2021

leo-barnes deleted the u/lbarnes/multilayer_examples branch May 17, 2021 16:46
