Description
Describe the bug
I was migrating tests to insta in PR #15248 and ran into a problem. For the same expected output in a test, the old and new snapshots differ when I use batches_to_sort_string, but they do not differ when I use assert_batches_sorted_eq. I did not encounter this issue while migrating many other tests in /datafusion/physical-plan; this was the first time I ran into it.
Edit: See also the PR comment at #15288 (comment), which describes a problem discovered so far as well.
Previous code (using assert_batches_sorted_eq):
    let expected = [
        "+---+---+---+----+---+---+",
        "| a | b | c | a  | b | c |",
        "+---+---+---+----+---+---+",
        "|   |   |   | 30 | 3 | 6 |",
        "|   |   |   | 40 | 4 | 4 |",
        "| 2 | 7 | 9 | 10 | 2 | 7 |",
        "| 2 | 7 | 9 | 20 | 2 | 5 |",
        "| 0 | 4 | 7 |    |   |   |",
        "| 1 | 5 | 8 |    |   |   |",
        "| 2 | 8 | 1 |    |   |   |",
        "+---+---+---+----+---+---+",
    ];
    assert_batches_sorted_eq!(expected, &batches);
New code (using batches_to_sort_string):
    allow_duplicates! {
        assert_snapshot!(batches_to_sort_string(&batches), @r#"
        +---+---+---+----+---+---+
        | a | b | c | a  | b | c |
        +---+---+---+----+---+---+
        |   |   |   | 30 | 3 | 6 |
        |   |   |   | 40 | 4 | 4 |
        | 2 | 7 | 9 | 10 | 2 | 7 |
        | 2 | 7 | 9 | 20 | 2 | 5 |
        | 0 | 4 | 7 |    |   |   |
        | 1 | 5 | 8 |    |   |   |
        | 2 | 8 | 1 |    |   |   |
        +---+---+---+----+---+---+
        "#)
    }
In both cases, I checked several times that the expected output is the same.
I am getting the following output while using new code:
To Reproduce
In /datafusion/physical-plan/src/joins/hash_join.rs, replace the following part of async fn join_full_with_filter(batch_size: usize) -> Result<()>:
    let expected = [
        "+---+---+---+----+---+---+",
        "| a | b | c | a  | b | c |",
        "+---+---+---+----+---+---+",
        "|   |   |   | 30 | 3 | 6 |",
        "|   |   |   | 40 | 4 | 4 |",
        "| 2 | 7 | 9 | 10 | 2 | 7 |",
        "| 2 | 7 | 9 | 20 | 2 | 5 |",
        "| 0 | 4 | 7 |    |   |   |",
        "| 1 | 5 | 8 |    |   |   |",
        "| 2 | 8 | 1 |    |   |   |",
        "+---+---+---+----+---+---+",
    ];
    assert_batches_sorted_eq!(expected, &batches);
with
    allow_duplicates! {
        assert_snapshot!(batches_to_sort_string(&batches), @r#"
        +---+---+---+----+---+---+
        | a | b | c | a  | b | c |
        +---+---+---+----+---+---+
        |   |   |   | 30 | 3 | 6 |
        |   |   |   | 40 | 4 | 4 |
        | 2 | 7 | 9 | 10 | 2 | 7 |
        | 2 | 7 | 9 | 20 | 2 | 5 |
        | 0 | 4 | 7 |    |   |   |
        | 1 | 5 | 8 |    |   |   |
        | 2 | 8 | 1 |    |   |   |
        +---+---+---+----+---+---+
        "#)
    }
Expected behavior
Both tests should produce the same result.
Additional context
I did not encounter this issue while migrating many other tests in /datafusion/physical-plan; this was the first time I ran into it.
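One possible explanation, offered as a sketch and not a confirmed diagnosis: as far as I can tell, assert_batches_sorted_eq sorts the data rows of both the expected and the actual table before comparing, whereas a snapshot of batches_to_sort_string(&batches) is compared against the inline text exactly as written. If so, an expected table whose rows are not already in sorted order would pass the first check and fail the second. The standalone Rust below imitates the two comparisons using only the standard library. The function names are hypothetical stand-ins and not DataFusion APIs, and the padding inside the rows is reconstructed from the border widths, so it is an assumption.

```rust
/// The expected table from this report. The padding inside each row is
/// reconstructed from the border widths (an assumption).
const EXPECTED: [&str; 11] = [
    "+---+---+---+----+---+---+",
    "| a | b | c | a  | b | c |",
    "+---+---+---+----+---+---+",
    "|   |   |   | 30 | 3 | 6 |",
    "|   |   |   | 40 | 4 | 4 |",
    "| 2 | 7 | 9 | 10 | 2 | 7 |",
    "| 2 | 7 | 9 | 20 | 2 | 5 |",
    "| 0 | 4 | 7 |    |   |   |",
    "| 1 | 5 | 8 |    |   |   |",
    "| 2 | 8 | 1 |    |   |   |",
    "+---+---+---+----+---+---+",
];

/// Sort every line after the first two and before the last one.
/// The border at index 2 starts with '+', which sorts before '|',
/// so it stays directly under the header.
fn sort_data_rows(lines: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = lines.iter().map(|s| s.to_string()).collect();
    let n = out.len();
    if n > 3 {
        out[2..n - 1].sort_unstable();
    }
    out
}

/// Hypothetical stand-in for a check that sorts BOTH sides,
/// so the order in which the expected rows are written does not matter.
fn matches_when_both_sorted(expected: &[&str], actual: &[&str]) -> bool {
    sort_data_rows(expected) == sort_data_rows(actual)
}

/// Hypothetical stand-in for a snapshot check: only the actual output is
/// sorted, and the expected text is compared exactly as written.
fn matches_when_only_actual_sorted(expected: &[&str], actual: &[&str]) -> bool {
    let expected: Vec<String> = expected.iter().map(|s| s.to_string()).collect();
    expected == sort_data_rows(actual)
}
```

Under these assumptions, passing EXPECTED as both arguments succeeds in matches_when_both_sorted but fails in matches_when_only_actual_sorted, because the rows starting with "| 0" and "| 1" sort ahead of the rows starting with "| 2". If that is what happens here, writing the inline snapshot rows in sorted order should make the two tests agree.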