Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check group id, metric name, and group name uniqueness #382

Merged
merged 13 commits into from
Nov 22, 2024
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ All notable changes to this project will be documented in this file.

What's changed

* Detect duplicate group ids, group names, and metric names. ([#382](https://github.com/open-telemetry/weaver/pull/382) by lquerel).
* Add support for Maps `map[]` to the definition of an `AnyValue`. ([#396](https://github.com/open-telemetry/weaver/pull/396) by @MSNev).

## [0.10.0] - 2024-09-23
Expand Down
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion crates/weaver_diff/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,9 @@ rust-version.workspace = true

[dependencies]
walkdir.workspace = true
serde_json.workspace = true
similar = "2.5.0"


[lints]
workspace = true

Expand Down
4 changes: 3 additions & 1 deletion crates/weaver_diff/allowed-external-types.toml
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,6 @@
# SPDX-License-Identifier: Apache-2.0
# This is used with cargo-check-external-types to reduce the surface area of downstream crates from
# the public API. Ideally this can have a few exceptions as possible.
allowed_external_types = []
allowed_external_types = [
"serde_json::*",
]
316 changes: 316 additions & 0 deletions crates/weaver_diff/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,9 @@

//! This crate provides bare minimum support for colorized string differencing.

use serde_json::Value;
use similar::TextDiff;
use std::cmp::Ordering;
use std::collections::HashSet;
use std::fs;
use std::path::Path;
Expand Down Expand Up @@ -117,9 +119,172 @@
Ok(are_identical)
}

/// Canonicalizes a JSON string by parsing it into a `serde_json::Value` and then canonicalizing it.
///
/// # Returns
/// - The canonicalized and prettyfied JSON string.
pub fn canonicalize_json_string(value: &str) -> Result<String, serde_json::error::Error> {
let json_value: Value = serde_json::from_str(value)?;
let canonicalized = canonicalize_json(json_value);
serde_json::to_string_pretty(&canonicalized)
}

/// Recursively canonicalizes a JSON value by sorting objects and arrays.
/// - Objects: Keys are sorted lexicographically (handled automatically by `BTreeMap`).
/// - Arrays:
/// - Arrays of primitive values are sorted based on their values.
/// - Arrays containing objects or arrays are sorted based on their canonical form.
/// - Primitive Values: Returned as is.
///
/// # Returns
/// - A new `serde_json::Value` that is the canonical form of the input value.
pub fn canonicalize_json(value: Value) -> Value {
match value {
// Primitive types are returned as is.
Value::Null | Value::Bool(_) | Value::Number(_) | Value::String(_) => value,
Value::Array(arr) => {
// Recursively canonicalize each item in the array.
let mut sorted_items: Vec<Value> = arr.into_iter().map(canonicalize_json).collect();
// Sort the array using `compare_values` to ensure consistent ordering.
sorted_items.sort_by(compare_values);
Value::Array(sorted_items)
}
Value::Object(map) => {
// Recursively canonicalize each value in the object.
let sorted_map = map
.into_iter()
.map(|(k, v)| (k, canonicalize_json(v)))
.collect::<serde_json::Map<String, Value>>();
Value::Object(sorted_map)
// No need to sort keys; BTreeMap keeps them sorted.
}
}
}

/// Compares two `serde_json::Value` instances for sorting purposes.
/// Defines a total ordering among JSON values to allow sorting within arrays.
/// The order is defined as:
/// `Null < Bool < Number < String < Array < Object`
///
/// # Returns
/// - An `Ordering` indicating the relative order of `a` and `b`.
fn compare_values(a: &Value, b: &Value) -> Ordering {
match (a, b) {
// Both values are `Null`.
(Value::Null, Value::Null) => Ordering::Equal,

Check warning on line 174 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L174

Added line #L174 was not covered by tests
// `Null` is less than any other type.
(Value::Null, _) => Ordering::Less,

Check warning on line 176 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L176

Added line #L176 was not covered by tests
(_, Value::Null) => Ordering::Greater,

// Both values are booleans.
(Value::Bool(a_bool), Value::Bool(b_bool)) => a_bool.cmp(b_bool),

Check warning on line 180 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L180

Added line #L180 was not covered by tests
// `Bool` is less than `Number`, `String`, `Array`, and `Object`.
(Value::Bool(_), _) => Ordering::Less,
(_, Value::Bool(_)) => Ordering::Greater,

Check warning on line 183 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L183

Added line #L183 was not covered by tests

// Both values are numbers.
// Compare numbers as floating-point values.
(Value::Number(a_num), Value::Number(b_num)) => a_num
.as_f64()
.partial_cmp(&b_num.as_f64())
.unwrap_or(Ordering::Equal), // Handle NaN cases gracefully.
// `Number` is less than `String`, `Array`, and `Object`.
(Value::Number(_), _) => Ordering::Less,
(_, Value::Number(_)) => Ordering::Greater,

Check warning on line 193 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L193

Added line #L193 was not covered by tests

// Both values are strings.
(Value::String(a_str), Value::String(b_str)) => a_str.cmp(b_str),
// `String` is less than `Array` and `Object`.
(Value::String(_), _) => Ordering::Less,
(_, Value::String(_)) => Ordering::Greater,

Check warning on line 199 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L199

Added line #L199 was not covered by tests

// Both values are arrays.
(Value::Array(a_arr), Value::Array(b_arr)) => compare_arrays(a_arr, b_arr),
// `Array` is less than `Object`.
(Value::Array(_), _) => Ordering::Less,
(_, Value::Array(_)) => Ordering::Greater,

Check warning on line 205 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L205

Added line #L205 was not covered by tests

// Both values are objects.
(Value::Object(a_obj), Value::Object(b_obj)) => compare_objects(a_obj, b_obj),
}
}

/// Compares two JSON arrays element-wise for sorting purposes.
/// Arrays are compared based on their elements:
/// - First, by length.
/// - Then, by comparing each pair of elements in order.
///
/// # Returns
/// - An `Ordering` indicating the relative order of `a` and `b`.
fn compare_arrays(a: &[Value], b: &[Value]) -> Ordering {
// Compare lengths first.
let len_cmp = a.len().cmp(&b.len());
if len_cmp != Ordering::Equal {
return len_cmp;

Check warning on line 223 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L223

Added line #L223 was not covered by tests
}

// Compare each pair of elements.
for (a_item, b_item) in a.iter().zip(b.iter()) {
let ord = compare_values(a_item, b_item);
if ord != Ordering::Equal {
return ord;
}
}

// All elements are equal.
Ordering::Equal

Check warning on line 235 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L235

Added line #L235 was not covered by tests
}

/// Compares two JSON objects by their sorted key-value pairs for sorting purposes.
/// Objects are compared based on:
/// - Their number of key-value pairs.
/// - The keys and associated values in lexicographical order.
///
/// # Returns
/// - An `Ordering` indicating the relative order of `a` and `b`.
fn compare_objects(
a: &serde_json::Map<String, Value>,
b: &serde_json::Map<String, Value>,
) -> Ordering {
// Compare number of key-value pairs.
let len_cmp = a.len().cmp(&b.len());
if len_cmp != Ordering::Equal {
return len_cmp;

Check warning on line 252 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L252

Added line #L252 was not covered by tests
}

// Iterate over key-value pairs (keys are sorted in BTreeMap).
let mut a_iter = a.iter();
let mut b_iter = b.iter();

loop {
match (a_iter.next(), b_iter.next()) {
(Some((a_key, a_val)), Some((b_key, b_val))) => {
// Compare keys.
let key_cmp = a_key.cmp(b_key);
if key_cmp != Ordering::Equal {
return key_cmp;
}
// Compare associated values.
let val_cmp = compare_values(a_val, b_val);
if val_cmp != Ordering::Equal {
return val_cmp;
}
}
// Both iterators have reached the end.
(None, None) => break,
// This case should not occur since lengths are equal.
_ => unreachable!("Maps have the same length; this should not happen."),
}
}

// All key-value pairs are equal.
Ordering::Equal

Check warning on line 281 in crates/weaver_diff/src/lib.rs

View check run for this annotation

Codecov / codecov/patch

crates/weaver_diff/src/lib.rs#L281

Added line #L281 was not covered by tests
}

#[cfg(test)]
mod tests {
use super::*;
use serde_json::json;

#[test]
fn test_diff_output() {
Expand Down Expand Up @@ -148,4 +313,155 @@
diff_dir(&expected_dir, &observed_dir).expect("Failed to diff directories");
assert!(!are_identical);
}

#[test]
fn test_canonicalize_primitives() {
let null = Value::Null;
assert_eq!(canonicalize_json(null.clone()), null);

let boolean = Value::Bool(true);
assert_eq!(canonicalize_json(boolean.clone()), boolean);

let number = Value::Number(serde_json::Number::from(42));
assert_eq!(canonicalize_json(number.clone()), number);

let string = Value::String(String::from("hello"));
assert_eq!(canonicalize_json(string.clone()), string);
}

#[test]
fn test_canonicalize_empty_array_and_object() {
let empty_array = json!([]);
assert_eq!(canonicalize_json(empty_array.clone()), empty_array);

let empty_object = json!({});
assert_eq!(canonicalize_json(empty_object.clone()), empty_object);
}

#[test]
fn test_canonicalize_array_of_primitives() {
let array = json!([3, 1, 2]);
let expected = json!([1, 2, 3]);
assert_eq!(canonicalize_json(array), expected);
}

#[test]
fn test_canonicalize_array_of_arrays() {
let array = json!([[3, 1, 2], [6, 5, 4]]);
let expected = json!([[1, 2, 3], [4, 5, 6]]);
assert_eq!(canonicalize_json(array), expected);
}

#[test]
fn test_canonicalize_array_of_objects() {
let array = json!([
{"b": 2, "a": 1},
{"d": 4, "c": 3}
]);
let expected = json!([
{"a": 1, "b": 2},
{"c": 3, "d": 4}
]);
assert_eq!(canonicalize_json(array), expected);
}

#[test]
fn test_canonicalize_nested_structures() {
let json_value = json!({
"b": [3, 1, 2],
"a": {"y": 2, "x": 1},
"c": [{"b": 2}, {"a": 1}]
});
let expected = json!({
"a": {
"x": 1,
"y": 2
},
"b": [
1,
2,
3
],
"c": [
{"a": 1},
{"b": 2}
]
});
assert_eq!(canonicalize_json(json_value), expected);
}

#[test]
fn test_canonicalize_numbers() {
let array = json!([3.1, 2.2, 1.3]);
let expected = json!([1.3, 2.2, 3.1]);
assert_eq!(canonicalize_json(array), expected);
}

#[test]
fn test_canonicalize_mixed_types_in_array() {
let array = json!([null, "string", 2, true, 3]);
let expected = json!([null, true, 2, 3, "string"]);
assert_eq!(canonicalize_json(array.clone()), expected);
}

#[test]
fn test_canonicalize_mixed_array() {
let array = json!([{"a": 2}, [3, 1], "b", 4]);
let expected = json!([4, "b", [1, 3], {"a": 2}]);
assert_eq!(canonicalize_json(array), expected);
}

#[test]
fn test_canonicalize_complex() {
let json_value = json!({
"z": [{"b": [3, 2, 1]}, {"a": [6, 5, 4]}],
"y": {"b": {"d": 4, "c": 3}, "a": {"b": 2, "a": 1}},
"x": [9, 7, 8]
});
let expected = json!({
"x": [7, 8, 9],
"y": {
"a": {"a": 1, "b": 2},
"b": {"c": 3, "d": 4}
},
"z": [
{"a": [4, 5, 6]},
{"b": [1, 2, 3]}
]
});
assert_eq!(canonicalize_json(json_value), expected);
}

#[test]
fn test_compare_values() {
let a = json!(1);
let b = json!(2);
assert_eq!(compare_values(&a, &b), Ordering::Less);

let a = json!("apple");
let b = json!("banana");
assert_eq!(compare_values(&a, &b), Ordering::Less);

let a = json!([1, 2, 3]);
let b = json!([1, 2, 4]);
assert_eq!(compare_values(&a, &b), Ordering::Less);

let a = json!({"a": 1});
let b = json!({"a": 2});
assert_eq!(compare_values(&a, &b), Ordering::Less);
}

#[test]
fn test_handling_of_special_numbers() {
let array = json!([0.0, -0.0, 1.0, -1.0]);
let expected = json!([-1.0, -0.0, 0.0, 1.0]);
assert_eq!(canonicalize_json(array), expected);
}

#[test]
fn test_unicode_strings() {
let array = json!(["éclair", "apple", "Æther", "zebra"]);
let expected = json!(["apple", "zebra", "Æther", "éclair"]);
assert_eq!(canonicalize_json(array), expected);
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
{"Err":{"DuplicateAttributeId":{"group_ids":["registry.messaging_bis","registry.messaging"],"attribute_id":"messaging.batch.message_count"}}}
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Test that duplicate group ids are detected and reported.
This test must fail.
Loading
Loading