-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Control Characters wrongly allowed #3
Comments
Thanks for spotting this, @indietyp. I corrected that right now. I am curious about your fuzzing procedure: How did you find this bug? |
Yea for sure, I am currently working on a parser that is also able to parse into a #[derive(Debug, Clone, PartialEq, Eq)]
pub enum ValueKind<'source> {
/// Represents a JSON null value
Null,
/// Represents a JSON boolean
Bool(bool),
/// Represents a JSON number, whether integer or floating point
Number(Cow<'source, Number>),
/// Represents a JSON string
String(Cow<'source, str>),
/// Represents a JSON array
Array(Vec<Value<'source>>),
/// Represents a JSON object
Object(HashMap<Cow<'source, str>, Value<'source>>),
} ( Therefore, using test-fuzz, I created a simple little test function - very similar to the following: #[test_fuzz::test_fuzz]
fn assert_eq_serde(input: &str) {
let value_result = parse_complete(&input);
let serde_result = serde_json::from_str::<serde_json::Value>(input);
if let Err(error) = &serde_result {
// serde_json number out of range is normal, because we don't have any precision checks
// (we save it as a string for later)
if error.to_string().contains("number out of range") {
return;
}
}
if let Err(error) = &value_result {
let is_duplicate = /* was the error we received duplicate? check removed for brevities sake */
if is_duplicate {
// we error out on duplicate keys, serde_json does not
return;
}
}
match (value_result, serde_result) {
(Ok(a), Ok(b)) => assert_eq!(a, b),
(Err(_), Err(_)) => {}
(value, serde) => {
panic!("input: {input:?}\nvalue: {value:?}\nserde: {serde:?}");
}
};
}
const INPUT: &str = r#"
[
{
"Name": "Edward the Elder",
"Country": "United Kingdom",
"House": "House of Wessex",
"Reign": "899-925",
"ID": 1
},
{
"Name": "Athelstan",
"Country": "United Kingdom",
"House": "House of Wessex",
"Reign": "925-940",
"ID": 2
},
{
"Name": "Edmund",
"Country": "United Kingdom",
"House": "House of Wessex",
"Reign": "940-946",
"ID": 3
},
{
"Name": "Edred",
"Country": "United Kingdom",
"House": "House of Wessex",
"Reign": "946-955",
"ID": 4
},
{
"Name": "Edwy",
"Country": "United Kingdom",
"House": "House of Wessex",
"Reign": "955-959",
"ID": 5
}
]"#;
#[test]
fn serde_integration() {
// ensure that everything that serde can parse, we can parse as well
assert_eq_serde(INPUT);
assert_eq_serde("[[");
assert_eq_serde("[[[]]");
assert_eq_serde("{}");
assert_eq_serde("{}}");
assert_eq_serde(r#"{12: "12"}"#);
} Then just run: cargo test
cargo test-fuzz assert_eq_serde This then found the control character error. I don't know if something like this could also work in this repository, because of the reliance on a |
That's a pretty nice idea! Thanks for explaining it in such detail. |
During fuzzing, I found that
str_fold
is incorrectly implemented. The RFC 8259 states in Section 7:but
str_fold
incorrectly implements this incorrectly, by only marking everything until0x13
as a control character, not until0x1F
hifijson/src/str.rs
Line 193 in 602cd36
The text was updated successfully, but these errors were encountered: