Skip to content

Conversation

@blaginin
Copy link
Collaborator

@blaginin blaginin commented Oct 20, 2025

blaginin and others added 30 commits October 16, 2025 10:58
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Replace old-style expected string arrays with insta snapshot assertions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- smj_join_key_ordering
- reorder_join_keys_to_left_input (inline snapshots with filter)
- parallelization_ignores_limit
- parallelization_prior_to_sort_preserving_merge
- parallelization_sort_preserving_merge_with_union
- parallelization_does_not_benefit

Replace old-style expected string arrays with insta inline snapshot assertions.
For reorder_join_keys_to_left_input, use regex filter to replace all join_type
values with '...' and add separate assertion for top join type.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Migrated the following tests to use inline insta snapshots:
- repartition_sorted_limit
- repartition_sorted_limit_with_filter
- repartition_ignores_limit
- repartition_ignores_union
- repartition_through_sort_preserving_merge
- repartition_ignores_sort_preserving_merge
- repartition_ignores_sort_preserving_merge_with_union
- repartition_does_not_destroy_sort
- repartition_does_not_destroy_sort_more_complex
- repartition_ignores_transitively_with_projection
- repartition_transitively_past_sort_with_projection

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Migrated the following tests to use inline insta snapshots:
- remove_redundant_roundrobins
- remove_unnecessary_spm_after_filter
- preserve_ordering_through_repartition
- no_need_for_sort_after_filter
- do_not_preserve_ordering_through_repartition3
- do_not_add_unnecessary_hash
- do_not_add_unnecessary_hash2
- optimize_away_unnecessary_repartition

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Migrated the following tests to use inline insta snapshots:
- optimize_away_unnecessary_repartition2
- parallelization_ignores_transitively_with_projection_csv

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Migrated the following tests to use inline insta snapshots:
- parallelization_single_partition
- parallelization_multiple_files

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…tputs

Migrated the following tests to use inline insta snapshots:
- repartition_transitively_with_projection
- repartition_transitively_past_sort_with_filter
- repartition_transitively_past_sort_with_projection_and_filter

These tests have different outputs for DISTRIB_DISTRIB_SORT vs SORT_DISTRIB_DISTRIB,
so each gets two separate inline snapshots.

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Migrated test_distribute_sort_parquet to use inline insta snapshots.

This test uses different optimizer configurations:
- Initial plan before optimization
- After Run::Distribution
- After Run::Distribution + Run::Sorting

All three states now use inline snapshots instead of string arrays.

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Migrated test_distribute_sort_memtable to use inline insta snapshot.

This test checks the final optimized plan after creating and executing
a SQL query on a MemTable.

All tests pass without --accept flag.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…nsta snapshots

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This test uses a loop to test different compression types with conditional
logic for expected outputs. Used insta::allow_duplicates! to handle inline
snapshots within the loop.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…to insta snapshots

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@blaginin blaginin changed the title Move enforce distrivution Insta for enforce_distrubution Oct 20, 2025
@github-actions github-actions bot added the core Core DataFusion crate label Oct 20, 2025
let mut settings = Settings::clone_current();
settings.add_filter(&format!("join_type={join_type}"), "join_type=...");

#[rustfmt::skip]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screen.Recording.2025-10-21.at.11.25.18.mov

@alamb
Copy link
Contributor

alamb commented Oct 22, 2025

So close...

@blaginin
Copy link
Collaborator Author

blaginin commented Oct 22, 2025

This happened to be a bigger change than I expected, there are some really big and complicated tests, and if we just compare with a separate snapshot on every case, we'll add +1K more lines (like in the original PR). I'm doing something more complicated from the code point of view, but on the positive side the file size is the same.

I also think we may want to split some tests / and the file itself, but this may be the one to do on top of this PR

@alamb
Copy link
Contributor

alamb commented Oct 22, 2025

I think a sequence of PRs sounds like a good plan to me

blaginin added a commit to blaginin/datafusion that referenced this pull request Oct 23, 2025
@blaginin
Copy link
Collaborator Author

Sure, moved all easy cases to #18248

github-merge-queue bot pushed a commit that referenced this pull request Oct 29, 2025
- part of #15791

All easy cases from #18185
(that are nicely-ish displayed in git diff).

Note on preserving comments: if it was note about what should happen (or
what will be tested), it's placed on top of the snapshot. If that's
something that comments part of the plan, I put it below the plan
tobixdev pushed a commit to tobixdev/datafusion that referenced this pull request Nov 2, 2025
- part of apache#15791

All easy cases from apache#18185
(that are nicely-ish displayed in git diff).

Note on preserving comments: if it was note about what should happen (or
what will be tested), it's placed on top of the snapshot. If that's
something that comments part of the plan, I put it below the plan
# Conflicts:
#	datafusion/core/tests/physical_optimizer/enforce_distribution.rs
@blaginin blaginin marked this pull request as ready for review November 3, 2025 20:29
@blaginin blaginin self-assigned this Nov 3, 2025
@blaginin blaginin requested a review from alamb November 3, 2025 20:29
@alamb alamb changed the title Insta for enforce_distrubution Complete migrating enforce_distrubution tests to insta Nov 3, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing -- we (well you mostly and @Chen-Yuan-Lai ) did it!

@alamb alamb added the development-process Related to development process of DataFusion label Nov 3, 2025
@alamb alamb enabled auto-merge November 4, 2025 18:13
@alamb alamb added this pull request to the merge queue Nov 5, 2025
Merged via the queue into apache:main with commit acdd263 Nov 5, 2025
28 checks passed
@blaginin
Copy link
Collaborator Author

blaginin commented Nov 5, 2025

Thank you for the review!!!

codetyri0n pushed a commit to codetyri0n/datafusion that referenced this pull request Nov 11, 2025
- part of apache#15791

All easy cases from apache#18185
(that are nicely-ish displayed in git diff).

Note on preserving comments: if it was note about what should happen (or
what will be tested), it's placed on top of the snapshot. If that's
something that comments part of the plan, I put it below the plan
codetyri0n pushed a commit to codetyri0n/datafusion that referenced this pull request Nov 11, 2025
- Closes apache#15791
- Closes apache#15178 🥳

- Surpasses part of apache#16978

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate development-process Related to development process of DataFusion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Migrate core tests to insta [Epic] Add snapshot tests (migrate to insta for tests)

3 participants