Implement FEC generation and validation #145

chenxiaolong · 2023-09-16T03:55:13Z

Currently, avbroot's AVB logic can only verify and update the hash tree portion of hash tree descriptors. The FEC (forward error correction) data is left untouched. We currently don't modify dm-verity partitions, but if we need to in the future, broken/invalid FEC data could cause the user's device to crash or soft brick in the event of a bit flip that trips dm-verity.

(My original understanding of how things worked)

AVB FEC uses RS(255, 253) (Reed-Solomon codewords with 253 bytes of data and 2 bytes of parity) over GF(256) with the 0x11d primitive polynomial. The data fed into RS is not the raw data, but instead, an interleaving pattern. The interleaving is determined by 3 variables:

FEC block size. This is always 4096.
RS K value. The K component of RS(N, K), ie. the number of data bytes per codeword. AOSP defaults to 253, ~~which is the only value avbroot will support (because the best Rust library for these calculations uses code generation that requires the K value at compile time).~~ EDIT: avbroot will support everything in [231, 253] (everything allowed by dm-verity).
"rounds" value. ~~It's undocumented in AOSP, but is equal to ceil(file size / FEC block size / RS K value).~~

~~Interleaving pattern:~~
* Input offset 0 == output offset 0.
* Each time the input offset is incremented by 1, the output offset is incremented by this value times the block size.
~* Once the input offset cross a multiple of rs_k, the output offset "wraps around" and starts again at the beginning again.`

~~For example, with file_size=2048000 and rs_k=253 (and thus, rounds=2):~~

offset 0   -> offset 0       (= 0 + 0   * 2 * 4096)
offset 1   -> offset 8192    (= 0 + 1   * 2 * 4096)
offset 2   -> offset 16384   (= 0 + 2   * 2 * 4096)
...
offset 252 -> offset 2064384 (= 0 + 252 * 2 * 4096)
# rs_k chunk boundary.
offset 253 -> offset 1       (= 1 + 0   * 2 * 4096)
offset 254 -> offset 8193    (= 1 + 1   * 2 * 4096)
offset 255 -> offset 16385   (= 1 + 2   * 2 * 4096)
...
# Pattern repeats until EOF.

EDIT: Better documented in the implementation:

avbroot/avbroot/src/format/fec.rs

Lines 121 to 144 in f18a681

    
           /// The interleaving access pattern can be visualized by placing the file 
        
           /// offsets in a two-dimensional grid. For example, when reading a 2072576-byte 
        
           /// file for calculating RS(255, 253): 
        
           /// 
        
           /// ```text 
        
           ///      | <-------- Round 0 --------> | <-------- Round 1 --------> | 
        
           ///      |-----------------------------|-----------------------------| 
        
           ///  ^   | 0       1       ... 4095    | 4096    4097    ... 8191    | 
        
           ///  |   | 8192    8192    ... 12287   | 12288   12289   ... 16383   | 
        
           /// rs_k | 16384   16385   ... 20479   | 20480   20481   ... 24575   | 
        
           ///  |   | ....... ....... ... ....... | ....... ....... ... ....... | 
        
           ///  v   | 2064384 2064385 ... 2068479 | 2068480 2068481 ... 2072575 | 
        
           /// ``` 
        
           /// 
        
           /// A regular sequential read of the file is traversing the grid row-by-row, 
        
           /// while an interleaving read is traversing the grid column-by-column. Each 
        
           /// column is always `rs_k` items tall, so each column forms the data portion of 
        
           /// an RS codeword. The number of columns is always a multiple of the FEC block 
        
           /// size. Since RS operates on fixed-size codewords and a file size might not 
        
           /// always fill the grid completely, out-of-bounds offsets are treated as if 
        
           /// they contain a `\0` byte. 
        
           /// 
        
           /// All operations are multithreaded with I/O operations parallelized at the 
        
           /// "round" level and RS operations parallelized at the column level.

When calculating this FEC data, avbroot will probably default to reading everything into memory with an option to use mmap for memory-constrained systems. The access pattern is probably going to cause a whole lot of thrashing with mmap due to it needing to read one byte at a time all over the file. EDIT: avbroot now uses a much smarter way of doing the interleaved reads, inspired by cryptsetup's veritysetup.

Once implemented (already have a working proof of concept), avbroot avb verify and avbroot ota verify will be updated to check the validity of the FEC data.

The text was updated successfully, but these errors were encountered:

chenxiaolong · 2023-09-16T03:56:46Z

Also, once this is implemented, I plan on adding avbroot avb unpack and avbroot avb pack commands to make it possible to edit and re-sign AVB images on the command line (#144).

chenxiaolong · 2023-09-16T05:18:09Z

OK... I might only implement the read-whole-file-to-memory method. That runs in 5 seconds against the Pixel 7 Pro's stock system.img.

The read + seek (well, pread) method with 16 threads took 10m16s, spending 99.3% of the CPU time in the kernel's zfs driver. Reading 1 byte at a time all over the place really is a worst case scenario for zfs 🙂 ~~Mmap wouldn't change that.~~

EDIT: Some benchmarks. mmap is far faster than what I assumed. Even with the page caches cleared out, the first run only takes twice as long as the average time.

Tool	Method	Time
AOSP libfec	Read to memory	4.9s
avbroot	Read to memory	1.4s
avbroot	`pread`	10m16s
avbroot	`mmap`	0.7s

EDIT 2: Reading 1-byte at a time is dumb. avbroot fec is now smarter, taking inspiration from how cryptsetup's veritysetup reads data. #146 (comment)

This commit adds initial support for the FEC (forward error correction) used by AVB hashtree descriptors. AVB FEC uses standard Reed-Solomon in the RS(255, 253) + Gf(256) + 0x11d configuration, but does not read data sequentially. Instead, it uses an interleaving pattern where each RS block covers bytes that span nearly the entire file. Due to the interleaving file access pattern, we're forced to use mmap or read the entire file into memory. Using seek = read when dealing with a large input file results in a ~500x increase in processing time. Reading one byte at a time non-sequentially is about the worst possible access pattern for seek + read. All of the processing is done multithreaded. Although the AOSP libfec implementation is also multithreaded, this implementation runs at ~5x the speed of AOSP libfec. Unfortunately, while the implementation is extremely fast, it required using `unsafe` blocks. Multiple threads need to write to disjoint, non-contiguous indices in a buffer, but there's no way to prove to the compiler that this is safe, so we're stuck with using raw pointers. Issue: #145 Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>

This commit adds initial support for the FEC (forward error correction) used by AVB hashtree descriptors. AVB FEC uses standard Reed-Solomon in the RS(255, 253) + GF(256) + 0x11d configuration, but does not read data sequentially. Instead, it uses an interleaving pattern where each RS block covers bytes that span nearly the entire file. Due to the interleaving file access pattern, we're forced to use mmap or read the entire file into memory. Using seek = read when dealing with a large input file results in a ~500x increase in processing time. Reading one byte at a time non-sequentially is about the worst possible access pattern for seek + read. All of the processing is done multithreaded. Although the AOSP libfec implementation is also multithreaded, this implementation runs at ~5x the speed of AOSP libfec. Unfortunately, while the implementation is extremely fast, it required using `unsafe` blocks. Multiple threads need to write to disjoint, non-contiguous indices in a buffer, but there's no way to prove to the compiler that this is safe, so we're stuck with using raw pointers. Issue: #145 Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>

This commit adds initial support for the FEC (forward error correction) used by dm-verity. The new `avbroot fec` commands operate on FEC data stored in AOSP's standalone FEC file format. It is compatible with AOSP's `fec` tool. Issue: #145 Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>

This also adds additional checks to ensure that there are no gaps between the image data and hash tree and between the hash tree and FEC data. Issue: #145 Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>

This commit adds initial support for the FEC (forward error correction) used by dm-verity. The new `avbroot fec` commands operate on FEC data stored in AOSP's standalone FEC file format. It is compatible with AOSP's `fec` tool. Issue: #145 Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>

This also fixes the wrong error being reported when the hash tree does not match and adds additional checks to ensure that there are no gaps between the image data, hash tree, and FEC data. Issue: #145 Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>

chenxiaolong self-assigned this Sep 16, 2023

chenxiaolong mentioned this issue Sep 17, 2023

Add support for AVB FEC #146

Merged

chenxiaolong closed this as completed Sep 18, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement FEC generation and validation #145

Implement FEC generation and validation #145

chenxiaolong commented Sep 16, 2023 •

edited

Loading

chenxiaolong commented Sep 16, 2023

chenxiaolong commented Sep 16, 2023 •

edited

Loading

Implement FEC generation and validation #145

Implement FEC generation and validation #145

Comments

chenxiaolong commented Sep 16, 2023 • edited Loading

chenxiaolong commented Sep 16, 2023

chenxiaolong commented Sep 16, 2023 • edited Loading

chenxiaolong commented Sep 16, 2023 •

edited

Loading

chenxiaolong commented Sep 16, 2023 •

edited

Loading