Soundness: Creating BinaryArray from ArrayData does not perform bound checks on reading offsets #773

Closed
jorgecarleitao opened this issue Sep 14, 2021 · 2 comments

Comments

@jorgecarleitao
Member

jorgecarleitao commented Sep 14, 2021

When an ArrayData reports a length inconsistent with the length of the offsets buffer, value_offsets reads out of bounds.

use arrow::array::*;
use arrow::buffer::*;
use arrow::datatypes::*;

fn main() {
    let data = ArrayData::new(
        DataType::Binary,
        1000,
        None,
        None,
        0,
        vec![
            Buffer::from_slice_ref(&[0i32, 1]),
            Buffer::from_slice_ref(&[0u8, 1, 1]),
        ],
        vec![],
    );
    let a = BinaryArray::from(data);
    let b = a.value(10);
}

Running the example under Miri:

cargo miri run --example unsafe

error: Undefined Behavior: memory access failed: pointer must be in-bounds at offset 4004, but is outside bounds of alloc1459 which has size 64
  --> /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:95:14
   |
95 |     unsafe { &*ptr::slice_from_raw_parts(data, len) }
   |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ memory access failed: pointer must be in-bounds at offset 4004, but is outside bounds of alloc1459 which has size 64
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
           
   = note: inside `std::slice::from_raw_parts::<i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/slice/raw.rs:95:14
   = note: inside `arrow::array::GenericBinaryArray::<i32>::value_offsets` at /home/azureuser/projects/arrow-rs/arrow/src/array/array_binary.rs:71:13
   = note: inside `arrow::array::GenericBinaryArray::<i32>::value` at /home/azureuser/projects/arrow-rs/arrow/src/array/array_binary.rs:104:28
note: inside `main` at arrow/examples/unsafe.rs:19:13
  --> arrow/examples/unsafe.rs:19:13
   |
19 |     let b = a.value(10);
   |             ^^^^^^^^^^^
   = note: inside `<fn() as std::ops::FnOnce<()>>::call_once - shim(fn())` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5
   = note: inside `std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125:18
   = note: inside closure at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:63:18
   = note: inside `std::ops::function::impls::<impl std::ops::FnOnce<()> for &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>::call_once` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:259:13
   = note: inside `std::panicking::r#try::do_call::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:401:40
   = note: inside `std::panicking::r#try::<i32, &dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:365:19
   = note: inside `std::panic::catch_unwind::<&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe, i32>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:434:14
   = note: inside closure at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:45:48
   = note: inside `std::panicking::r#try::do_call::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:401:40
   = note: inside `std::panicking::r#try::<isize, [closure@std::rt::lang_start_internal::{closure#2}]>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:365:19
   = note: inside `std::panic::catch_unwind::<[closure@std::rt::lang_start_internal::{closure#2}], isize>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:434:14
   = note: inside `std::rt::lang_start_internal` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:45:20
   = note: inside `std::rt::lang_start::<()>` at /home/azureuser/.rustup/toolchains/nightly-2021-07-04-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/rt.rs:62:5

error: aborting due to previous error; 1 warning emitted

Affects all versions released up to today (Sep 14, 2021).
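
For illustration, the missing check can be sketched in plain Rust without depending on arrow. The array above claims 1000 values, so reading its offsets needs (1000 + 1) * 4 = 4004 bytes, which matches the offset in the Miri error, while the offsets buffer only holds two i32 values. The helper name `check_offsets_len` and its signature are hypothetical and not part of the arrow API; this is a minimal sketch of one possible validation, not the fix adopted by the project.

```rust
use std::mem::size_of;

/// Hypothetical helper (not part of arrow): checks that an offsets buffer of
/// `offsets_bytes` bytes is large enough for a binary array of `len` values
/// starting at `array_offset`, assuming i32 offsets.
fn check_offsets_len(
    len: usize,
    array_offset: usize,
    offsets_bytes: usize,
) -> Result<(), String> {
    // `len` values need `len + 1` offsets; use checked arithmetic so a
    // malicious or corrupt length cannot wrap around.
    let required = len
        .checked_add(array_offset)
        .and_then(|n| n.checked_add(1))
        .and_then(|n| n.checked_mul(size_of::<i32>()))
        .ok_or_else(|| "length overflows usize".to_string())?;
    if required > offsets_bytes {
        return Err(format!(
            "offsets buffer needs {} bytes but has {}",
            required, offsets_bytes
        ));
    }
    Ok(())
}

fn main() {
    // Values from the report: len = 1000, but only two i32 offsets (8 bytes).
    let reported = check_offsets_len(1000, 0, 2 * size_of::<i32>());
    println!("{:?}", reported);

    // A consistent array: one value described by two offsets.
    let consistent = check_offsets_len(1, 0, 2 * size_of::<i32>());
    println!("{:?}", consistent);
}
```

A check of this kind would have to run before the offsets are reinterpreted as a slice, for example when the array is built from ArrayData, so that `value` never sees an inconsistent length.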

@alamb changed the title from "Soundness: BinaryArray does not perform bound checks on reading offsets" to "Soundness: Creating BinaryArray from ArrayData does not perform bound checks on reading offsets" on Sep 29, 2021
@alamb
Contributor

alamb commented Sep 29, 2021

Updating the title of the ticket to make it clear that this affects misuse of the lower-level APIs, as described in https://github.com/apache/arrow-rs/tree/master/arrow#safety

@alamb
Contributor

alamb commented Oct 29, 2021

This is a specific case of the general issue described in #817, so closing this one as a duplicate in favor of the more general ticket.

@alamb alamb closed this as completed Oct 29, 2021