Supercompression update #103
I will take the absence of comments as approval of these changes. I will wait until Monday morning, JST, before merging to give everyone a final chance.
components of the pre-deflation image as detailed below.

_One-component images_ must have the component value replicated in all 3
components of the _rgb_ slice of the compressed data and have no _alpha_
Given that slice data is generated by the Basis Universal encoder from unknown source data, it looks like the KTX2 spec technically disallows ambiguous source/target combinations. For example:
- A one-component image can be encoded to BasisU with the RGB channels all populated with the same value. Transcoding it to BC1/ETC1: the three RGB slice channels populate the three RGB texture sampler channels with the same value.
- A one-component image can be encoded to BasisU with only the green channel populated and zeros in red and blue. Transcoding it to BC1/ETC1: the three RGB slice channels populate the three RGB texture sampler channels with data only in green.
This restriction (and the subsequent ones) is added to potentially reduce the endpoint footprint and to ensure consistency of transcoded images when the runtime transcoded format is uncompressed or suboptimal (otherwise 1- and 2-component images could always use the green channel), right?
Yes, I want to restrict the possibilities so as to make behavior consistent and predictable and to reduce endpoint footprint. I'm basically following the undocumented behaviour of the BasisU encoder. It works only on 4-component images and uses lodepng to load PNG files. When converting to RGBA, lodepng replicates "grey" to R, G & B for 1- and 2-component images. `libktx` does similar when preparing in-memory RGBA images for the encoder. The BasisU encoder uses those RGB values unchanged. The encoder copies the alpha value to all 3 components of the alpha slice.
Note that the transcoder does not currently have an option to transcode to uncompressed 1- or 2-component images, only to 3- or 4-component ones.
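The replication described above can be sketched per pixel as follows. This is only an illustration of the rule under discussion, not code from the BasisU encoder or `libktx`; the function name `to_slices` and the tuple representation of a texel are assumptions made for the example.

```python
def to_slices(pixel):
    """Expand one 1- to 4-component pixel into (rgb_texel, alpha_texel).

    alpha_texel is None when the image has no alpha slice.
    Hypothetical helper; mirrors the replication rule discussed in this thread.
    """
    n = len(pixel)
    if n == 1:
        # One component: replicate the value into R, G and B; no alpha slice.
        (v,) = pixel
        return (v, v, v), None
    if n == 2:
        # Two components: first is replicated into RGB, second fills the alpha slice.
        v, a = pixel
        return (v, v, v), (a, a, a)
    if n == 3:
        # RGB is used unchanged; no alpha slice.
        return tuple(pixel), None
    if n == 4:
        # RGB unchanged; alpha copied to all 3 components of the alpha slice.
        r, g, b, a = pixel
        return (r, g, b), (a, a, a)
    raise ValueError("expected 1 to 4 components, got %d" % n)
```

For instance, `to_slices((7, 200))` yields an _rgb_ texel of `(7, 7, 7)` and an _alpha_ texel of `(200, 200, 200)`.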
It may be these particular restrictions, based on behaviour inherited by the BasisU encoder from lodepng, are not the right restrictions. In PNG, 1 & 2 component images are labelled grey and grey-alpha so swizzling the "grey" component to r, g & b makes sense. In KTX, OpenGL & Vulkan 1 & 2 component images are known as R & RG. I think the restrictions outlined make sense in this case too but I am open to alternatives.
Another issue is that currently the KTX encoder supports a `separateRGToRGB_A` option. This causes the R & G components of an RGB{,A} input image to be swizzled to RGB and A respectively. This option is copied from BasisU. I am not sure there is any utility to it. Is there any content creation pipeline that creates tangent space normal maps in 3 or 4 component textures? That being the advertised use of the option. Unless there is a real use case I am tempted to drop the option. [Note: if you were to pass a 2 component .png file to the BasisU encoder and set this option, the result would be wrong. The "grey" component would end up in all 3 components of the RGB and alpha slices.]
> In PNG, 1 & 2 component images are labelled grey and grey-alpha so swizzling the "grey" component to r, g & b makes sense. In KTX, OpenGL & Vulkan 1 & 2 component images are known as R & RG.
Legacy versions of OpenGL had Luminance and Luminance-Alpha (1- and 2-component) formats that semantically match PNG. I think that the current approach is correct.
> This causes the R & G components of an RGB{,A} input image to be swizzled to RGB and A. This option is copied from BasisU. I am not sure there is any utility to it.
The original 2-component image may have been authored in JPEG, which supports only 1- or 3-component images. Also, it could be an RG texture saved in PNG verbatim (with zeros in the blue channel). This encoder option gives some additional flexibility, but maybe the right approach is to generalize input data swizzling so this option would translate to something like `--swizzle-inputs rrrg`.
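A generalized swizzle of the kind suggested here might look like the sketch below. The pattern letters follow the `rrrg` example; the `0` and `1` constants and the function name `swizzle` are assumptions added for illustration and are not taken from `toktx` or BasisU.

```python
def swizzle(pixel, pattern):
    """Rearrange the channels of one pixel according to a pattern such as "rrrg".

    Each pattern character selects a source channel (r, g, b, a) or,
    as an assumption of this sketch, a constant: "0" -> 0, "1" -> 255.
    """
    channels = dict(zip("rgba", pixel))   # only the channels the pixel actually has
    constants = {"0": 0, "1": 255}
    out = []
    for c in pattern:
        if c in constants:
            out.append(constants[c])
        elif c in channels:
            out.append(channels[c])
        else:
            raise ValueError("pattern refers to unavailable channel %r" % c)
    return tuple(out)
```

With this, the effect of `separateRGToRGB_A` on an RG texture stored as RGB corresponds to `swizzle((10, 20, 0), "rrrg")`, which gives `(10, 10, 10, 20)`.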
Okay. The option is already implemented by a generalized swizzle. I think exposing that is a good idea. But that is a matter for `libktx` and `toktx`, not the spec. Specwise I'll leave this as is but add some words so that offering a swizzle option doesn't conflict.
ktxspec.adoc
Outdated
_Video_ files have a <<_layercount,`layerCount`>> > 0 and must have
KTXanimation metadata. See <<KTXanimation>>. The value of
<<_layercount,`layerCount`>> is the number of frames in the video,
i.e. layers become the temporal axis.
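As a rough illustration of the rule in this excerpt, a reader could classify a file as shown below. Representing the key/value metadata as a Python dict keyed by name, and the function names themselves, are assumptions of this sketch rather than anything defined by the spec or `libktx`.

```python
def is_video(layer_count, metadata):
    """Return True when the file matches the video definition quoted above.

    layer_count: the header's layerCount field.
    metadata: hypothetical dict of key/value entries, keyed by key name.
    """
    return layer_count > 0 and "KTXanimation" in metadata


def frame_count(layer_count, metadata):
    """Number of video frames; layers form the temporal axis."""
    if not is_video(layer_count, metadata):
        raise ValueError("not a video file")
    return layer_count
```

A file with `layerCount` > 0 but no KTXanimation entry is treated here as an ordinary array texture.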
Since `KTXanimation` metadata could be used without supercompression, should the definition of a video file be put outside of the BasisU section?
I thought about this when writing it. I agree but could not decide on a good place to put it. How about in section 4 "General Comments"?
Seems fine. Probably right after the table with texture types.
Will do.
.Rationale
====
The Basis Universal encoder combines encoding to ETC1S with deflation.
The transcoder combines inflation to ETC1S with transcoding to one
KDFS has a normative definition of ETC1S (as a strict subset of ETC1), while BasisU seems to diverge from it (at least wrt endpoints encoding). In my opinion, KTX2 should not call that custom BasisU encoding just "ETC1S".
The definition in KDFS is supposed to be what Rich calls ETC1S. We did run the definition by Rich when we added it to KDFS. Has something changed since? If so, we need to update KDFS not remove ETC1S from here. There is no other use for ETC1S beyond BasisU.
Both the name and value of the ETC1S flag in globalFlags are taken from the BasisU code.
Back then (before BasisU became open-source), we expected to use ETC1S RDO with some generic lossless step. So the KDFS definition of ETC1S is correct.
The issue is that BasisU is similar to but not exactly ETC1S.
As I pointed out, BasisU still refers to what it is using as ETC1S. We should update KDFS as the purpose of adding ETC1S was to describe what we would be doing in KTX2.
What exactly are the differences?
Currently, BasisU is a software product that uses "ETC1S" in its source code. Specwise, the KTX2 spec should refer to the BasisU bitstream (when it's available).
BasisU stores all endpoints and selectors in global tables; image slices contain only pointers to that data. Moreover, all data blocks are additionally losslessly compressed. To get a GPU-ready ETC1S payload (even without transcoding), the decoder needs to decode the Huffman-coded data and reconstruct the blocks. The last step is not documented anywhere yet.
KDFS ETC1S reference is useful for understanding the compression technique in general but certainly not enough to implement the codec.
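To make the "global tables plus pointers" idea concrete, here is a toy model of the reconstruction step. It is emphatically not the BasisU bitstream: the Huffman decoding is omitted, and the data layout, the names `reconstruct_blocks`, `endpoints`, `selectors` and `slice_refs` are all assumptions invented for this illustration.

```python
def reconstruct_blocks(endpoints, selectors, slice_refs):
    """Rebuild per-block data from shared codebooks (toy model).

    endpoints:  global table of endpoint entries (any values).
    selectors:  global table of selector entries (any values).
    slice_refs: per-block (endpoint_index, selector_index) pairs,
                i.e. the "pointers" a slice would hold after
                the lossless layer has been decoded.
    """
    blocks = []
    for endpoint_index, selector_index in slice_refs:
        # Each block is just a lookup into the two shared tables.
        blocks.append((endpoints[endpoint_index], selectors[selector_index]))
    return blocks
```

Because many blocks can point at the same table entries, the slice itself stays small; the cost is that a decoder cannot produce ETC1S blocks without the tables.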
I think we are at cross purposes here. By referring to ETC1S I am not attempting, or claiming, to provide enough info to implement a codec. I am attempting to provide background as to why there are potentially 2 slices. I know steps are needed to reconstruct the ETC1S blocks and I know we need to refer to the BasisU bitstream, once available. There is already a place for the reference in the table of supercompression schemes.
It sounds like your earlier statement that "BasisU seems to diverge from it (at least wrt endpoints encoding)." is referring to BasisU putting the endpoints in the global codebook. Is that so or is there some other difference?
I will attempt rewording to make it clear that ETC1S is a building block and that further info is required to make a complete codec. I will include another T.B.D. reference to the BasisU bitstream spec.
Please take a look at 385a6b6.
The reworded text LGTM.
Make clear there is more to it than ETC1S and prepare for the future.
Minimum KV length should be 2.
This introduces a change incompatible with previously generated Basis compressed .ktx2 files in order to store the number of components of the original image after Basis supercompression. The information is retained in the DFD.
Fixes #70. Fixes #86. Fixes #89. Fixes #98. Fixes #100