Fuzzing Crash: ListViewArray size/offset type mismatch after compression #5322

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-array/src/arrays/listview/array.rs:validate

Error Message:

Failed to crate ListViewArray: size type U16 (max 65535) must fit within offset type U8 (max 255)

Stack Trace:

   3: validate
             at ./vortex-array/src/arrays/listview/array.rs:227:9
   4: new_unchecked
             at ./vortex-array/src/arrays/listview/array.rs:179:13
   5: compress_canonical
             at ./vortex-layout/src/layouts/compact.rs:150:21
   6: compress
             at ./vortex-layout/src/layouts/compact.rs:65:14
   7: compress_canonical
             at ./vortex-layout/src/layouts/compact.rs:136:45
   8: vortex_layout::layouts::compact::CompactCompressor::compress
             at ./vortex-layout/src/layouts/compact.rs:65:14
   9: __libfuzzer_sys_run
             at ./fuzz/fuzz_targets/array_ops.rs:34:26

Root Cause:

The bug occurs in the CompactCompressor::compress_canonical function when compressing a ListViewArray. The code (vortex-layout/src/layouts/compact.rs:140-150) independently narrows the offsets and sizes arrays:

let compressed_offsets =
    self.compress(&listview.offsets().to_primitive().narrow()?.into_array())?;
let compressed_sizes =
    self.compress(&listview.sizes().to_primitive().narrow()?.into_array())?;

unsafe {
    ListViewArray::new_unchecked(
        compressed_elems,
        compressed_offsets,
        compressed_sizes,
        listview.validity().clone(),
    )
}

The narrow() operation reduces each array to its minimum required integer type independently. In this crash:

  • The offsets array narrowed to U8 (max value 255)
  • The sizes array narrowed to U16 (max value 65535)

However, ListViewArray::validate enforces the invariant size_max <= offset_max at line 227, which guards against overflow when computing offset + size. In this crash the size type (U16, max 65535) is wider than the offset type (U8, max 255), so validation rejects the array.
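
The mismatch can be illustrated with a minimal, standalone sketch. It does not use the Vortex API: `UWidth`, `narrowest`, and `size_fits_offset` are hypothetical names standing in for the integer ptype, `narrow()`, and the check in `validate`, under the assumption that narrowing picks the smallest unsigned width that holds the maximum value.

```rust
/// Unsigned integer widths, ordered from narrowest to widest.
/// (Hypothetical type, not part of Vortex.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum UWidth {
    U8,
    U16,
    U32,
    U64,
}

impl UWidth {
    /// Largest value representable at this width.
    fn max_value(self) -> u64 {
        match self {
            UWidth::U8 => u8::MAX as u64,
            UWidth::U16 => u16::MAX as u64,
            UWidth::U32 => u32::MAX as u64,
            UWidth::U64 => u64::MAX,
        }
    }
}

/// Hypothetical stand-in for `narrow()`: the smallest width that holds
/// every value in the slice (an empty slice narrows to U8).
fn narrowest(values: &[u64]) -> UWidth {
    let max = values.iter().copied().max().unwrap_or(0);
    [UWidth::U8, UWidth::U16, UWidth::U32]
        .iter()
        .copied()
        .find(|w| max <= w.max_value())
        .unwrap_or(UWidth::U64)
}

/// The invariant described above: size_max <= offset_max.
fn size_fits_offset(size: UWidth, offset: UWidth) -> bool {
    size.max_value() <= offset.max_value()
}
```

For example, offsets `[0, 3, 200]` narrow to `U8` while sizes `[3, 0, 300]` narrow to `U16`, so `size_fits_offset(narrowest(&sizes), narrowest(&offsets))` is `false` even though both input arrays started at the same width.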

The SAFETY comment at line 145 incorrectly assumes that "compression does not change the logical values of arrays" means all invariants are preserved. However, narrowing the offset and size types independently can violate the size/offset type compatibility invariant.
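
One possible direction, sketched here without the Vortex API and not taken from the project's actual fix, is to choose the two widths together so that the offset width is never narrower than the size width. `min_bits` and `joint_widths` are hypothetical helpers that work on plain bit widths.

```rust
/// Smallest unsigned bit width (8, 16, 32, or 64) that can hold `max`.
/// (Hypothetical helper, not part of Vortex.)
fn min_bits(max: u64) -> u32 {
    if max <= u8::MAX as u64 {
        8
    } else if max <= u16::MAX as u64 {
        16
    } else if max <= u32::MAX as u64 {
        32
    } else {
        64
    }
}

/// Returns `(offset_bits, size_bits)`. Sizes are narrowed as far as their
/// values allow; offsets are narrowed too, but never below the size width,
/// so the size type always fits within the offset type.
fn joint_widths(offsets: &[u64], sizes: &[u64]) -> (u32, u32) {
    let size_bits = min_bits(sizes.iter().copied().max().unwrap_or(0));
    let offset_bits = min_bits(offsets.iter().copied().max().unwrap_or(0));
    (offset_bits.max(size_bits), size_bits)
}
```

With the values from the earlier illustration, `joint_widths(&[0, 3, 200], &[3, 0, 300])` yields 16 bits for both arrays instead of the U8/U16 pair that triggered the crash. This only addresses the type-compatibility check; whether offset + size itself fits the chosen width is a separate question.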

Debug Output
FuzzArrayAction {
    array: ChunkedArray {
        dtype: List(
            List(
                Utf8(
                    NonNullable,
                ),
                NonNullable,
            ),
            Nullable,
        ),
        len: 9,
        chunk_offsets: Buffer<u64> {
            length: 3,
            alignment: Alignment(
                8,
            ),
            as_slice: [0, 9, 9],
        },
        chunks: [
            ListViewArray {
                dtype: List(
                    List(
                        Utf8(
                            NonNullable,
                        ),
                        NonNullable,
                    ),
                    Nullable,
                ),
                elements: ListViewArray {
                    dtype: List(
                        Utf8(
                            NonNullable,
                        ),
                        NonNullable,
                    ),
                    elements: VarBinViewArray {
                        dtype: Utf8(
                            NonNullable,
                        ),
                        buffers: [
                            Buffer<u8> {
                                length: 119,
                                alignment: Alignment(
                                    1,
                                ),
                                as_slice: [39, 2, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, ...],
                            },
                        ],
                        views: Buffer<vortex_vector::binaryview::view::BinaryView> {
                            length: 278,
                            alignment: Alignment(
                                16,
                            ),
                            ...
                        },
                        validity: NonNullable,
                        ...
                    },
                    offsets: PrimitiveArray {
                        dtype: Primitive(
                            U64,
                            NonNullable,
                        ),
                        buffer: Buffer<u8> {
                            length: 24,
                            alignment: Alignment(
                                8,
                            ),
                            as_slice: [0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, ...],
                        },
                        validity: NonNullable,
                        ...
                    },
                    sizes: PrimitiveArray {
                        dtype: Primitive(
                            U64,
                            NonNullable,
                        ),
                        buffer: Buffer<u8> {
                            length: 24,
                            alignment: Alignment(
                                8,
                            ),
                            as_slice: [2, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 0, ...],
                        },
                        validity: NonNullable,
                        ...
                    },
                    is_zero_copy_to_list: true,
                    validity: NonNullable,
                    ...
                },
                offsets: PrimitiveArray {
                    dtype: Primitive(
                        I16,
                        NonNullable,
                    ),
                    buffer: Buffer<u8> {
                        length: 18,
                        alignment: Alignment(
                            2,
                        ),
                        as_slice: [0, 0, 0, 0, 0, 0, 3, 0, 3, 0, 3, 0, 3, 0, 3, 0, ...],
                    },
                    validity: NonNullable,
                    ...
                },
                sizes: PrimitiveArray {
                    dtype: Primitive(
                        I16,
                        NonNullable,
                    ),
                    buffer: Buffer<u8> {
                        length: 18,
                        alignment: Alignment(
                            2,
                        ),
                        as_slice: [0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...],
                    },
                    validity: NonNullable,
                    ...
                },
                is_zero_copy_to_list: true,
                validity: AllValid,
                ...
            },
            ListViewArray {
                dtype: List(
                    List(
                        Utf8(
                            NonNullable,
                        ),
                        NonNullable,
                    ),
                    Nullable,
                ),
                ...
            },
        ],
        ...
    },
)

Reproduction

  1. Download the crash artifact:

  2. Reproduce locally:

# The artifact contains array_ops/crash-64ecb0246a39229c41791e865bcb196334a7cf9f
cargo +nightly fuzz run --sanitizer=none array_ops array_ops/crash-64ecb0246a39229c41791e865bcb196334a7cf9f

  3. Get full backtrace:

RUST_BACKTRACE=full cargo +nightly fuzz run --sanitizer=none array_ops array_ops/crash-64ecb0246a39229c41791e865bcb196334a7cf9f

Auto-created by fuzzing workflow with Claude analysis
