Closed
Description
Describe the bug
In arrow_json, Decoder::decode can panic if it encounters two high surrogates in a row. Since this method returns a Result, panics are not expected, even in error cases.
To Reproduce
The following program reproduces the bug:
use std::io::{BufRead, BufReader};
use std::sync::Arc;

use arrow::datatypes::{DataType, Field};
use arrow_json::ReaderBuilder;

fn main() {
    // Decoder for a single nullable Utf8 field named "test".
    let mut decoder =
        ReaderBuilder::new_with_field(Arc::new(Field::new("test", DataType::Utf8, true)))
            .build_decoder()
            .unwrap();
    // \uD800 and \uD801 are both high surrogates, so this is not a valid pair.
    let s = r#"{"test": "\uD800\uD801"}"#;
    let mut reader = BufReader::new(s.as_bytes());
    let buf = reader.fill_buf().unwrap();
    // Panics here instead of returning Err.
    let _ = decoder.decode(buf);
}
Running this gives:
thread 'main' panicked at /home/user/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/arrow-json-55.1.0/src/reader/tape.rs:708:49:
attempt to subtract with overflow
stack backtrace:
   0: rust_begin_unwind
             at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/std/src/panicking.rs:695:5
   1: core::panicking::panic_fmt
             at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/core/src/panicking.rs:75:14
   2: core::panicking::panic_const::panic_const_sub_overflow
             at /rustc/05f9846f893b09a1be1fc8560e33fc3c815cfecb/library/core/src/panicking.rs:178:21
   3: arrow_json::reader::tape::char_from_surrogate_pair
             at [..]/arrow-json-55.1.0/src/reader/tape.rs:708:49
   4: arrow_json::reader::tape::TapeDecoder::decode
             at [..]/arrow-json-55.1.0/src/reader/tape.rs:514:37
   5: arrow_json::reader::Decoder::decode
             at [..]/arrow-json-55.1.0/src/reader/mod.rs:439:9
   6: arrow_panic::main
             at ./src/main.rs:15:13
   7: core::ops::function::FnOnce::call_once
             at [..]/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
Expected behavior
decode should return an error as the string is invalid, but it should not panic.
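For illustration, here is a minimal sketch of how a surrogate pair could be combined with range checks, so that a second high surrogate produces an error rather than a subtraction overflow. This is not the arrow-json code: the name combine_surrogates and the String error type are hypothetical, and the assumption is that the panic comes from subtracting 0xDC00 from a value below it (0xD801 in the reproduction).

```rust
/// Hypothetical helper, not the arrow-json implementation.
/// Combines a UTF-16 surrogate pair into a `char`, rejecting any
/// out-of-range half before doing arithmetic on it.
fn combine_surrogates(high: u16, low: u16) -> Result<char, String> {
    // High surrogates occupy 0xD800..=0xDBFF.
    if !(0xD800..=0xDBFF).contains(&high) {
        return Err(format!("expected high surrogate, got {high:#06X}"));
    }
    // Low surrogates occupy 0xDC00..=0xDFFF. Without this check,
    // `low - 0xDC00` overflows for a value such as 0xD801.
    if !(0xDC00..=0xDFFF).contains(&low) {
        return Err(format!("expected low surrogate, got {low:#06X}"));
    }
    // Both subtractions are now guaranteed not to underflow.
    let code_point = 0x1_0000 + (((high - 0xD800) as u32) << 10) + (low - 0xDC00) as u32;
    char::from_u32(code_point).ok_or_else(|| format!("invalid code point {code_point:#X}"))
}
```

With this sketch, combine_surrogates(0xD800, 0xD801) returns an Err, while a valid pair such as (0xD83D, 0xDE00) yields U+1F600.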
Additional context