Releases: getyourguide/dataframe-expectations
v0.5.0
Features
Tag-based Filtering
Add support for selective expectation execution using custom tags.
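Selecting expectations by tag comes down to a set comparison between each expectation's tags and the tags requested at build time. The following is a minimal sketch of OR-style and AND-style matching, not the library's implementation: `MatchMode`, `matches`, and the behaviour for an empty filter are assumptions made for illustration.

```python
from enum import Enum


class MatchMode(Enum):
    """Hypothetical stand-in for the library's TagMatchMode enum."""
    ANY = "any"  # OR logic: at least one requested tag is present
    ALL = "all"  # AND logic: every requested tag is present


def matches(expectation_tags, requested_tags, mode):
    """Return True if an expectation should be selected by the filter."""
    have = set(expectation_tags)
    want = set(requested_tags)
    if not want:
        # Assumption: an empty filter selects every expectation.
        return True
    if mode is MatchMode.ANY:
        return bool(have & want)  # non-empty intersection
    return want <= have           # requested tags are a subset


# Illustrative names; the tags follow the "key:value" format.
tagged = {
    "age_check": ["priority:high", "env:prod"],
    "name_check": ["priority:high"],
    "row_check": ["priority:low", "env:test"],
}

selected_any = [
    name for name, tags in tagged.items()
    if matches(tags, ["priority:high"], MatchMode.ANY)
]
selected_all = [
    name for name, tags in tagged.items()
    if matches(tags, ["priority:high", "env:prod"], MatchMode.ALL)
]
print(selected_any)  # ['age_check', 'name_check']
print(selected_all)  # ['age_check']
```

Treating tags as opaque strings keeps the comparison simple; a real implementation might parse the key and value separately.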
Key Features:
- New TagMatchMode enum with ANY (OR logic) and ALL (AND logic) options
- Tag expectations with "key:value" format (e.g., "priority:high", "env:prod")
- Filter expectations at build time
Example:
# Tag expectations
suite = (
    DataFrameExpectationsSuite()
    .expect_value_greater_than(column_name="age", value=18, tags=["priority:high", "env:prod"])
    .expect_value_not_null(column_name="name", tags=["priority:high"])
    .expect_min_rows(min_rows=1, tags=["priority:low", "env:test"])
)
# Run only high-priority checks (OR logic)
runner = suite.build(tags=["priority:high"], tag_match_mode=TagMatchMode.ANY)
# Run production-critical checks (AND logic)
runner = suite.build(tags=["priority:high", "env:prod"], tag_match_mode=TagMatchMode.ALL)
Programmatic Result Inspection
Enhanced SuiteExecutionResult for detailed validation analysis.
Key Features:
- Use raise_on_failure=False to inspect results without raising exceptions
- Access comprehensive metrics: total_expectations, total_passed, total_failed, pass_rate, total_duration_seconds
- Inspect individual expectation results with status, violation counts, descriptions, and timing
- View applied tag filters in execution results
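To make the listed metrics concrete, here is a rough sketch of how such aggregates could be derived from per-expectation records. `ExpectationOutcome` and `SuiteSummary` are hypothetical classes written for this illustration; they are not the library's SuiteExecutionResult, and the empty-suite pass rate is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectationOutcome:
    """Hypothetical per-expectation record (not the library's class)."""
    description: str
    status: str  # "passed" or "failed"
    violation_count: int = 0
    duration_seconds: float = 0.0


@dataclass(frozen=True)
class SuiteSummary:
    """Hypothetical aggregate modelled on the metrics listed above."""
    results: tuple  # tuple of ExpectationOutcome

    @property
    def total_expectations(self) -> int:
        return len(self.results)

    @property
    def total_passed(self) -> int:
        return sum(1 for r in self.results if r.status == "passed")

    @property
    def total_failed(self) -> int:
        return sum(1 for r in self.results if r.status == "failed")

    @property
    def pass_rate(self) -> float:
        # Assumption: an empty suite is reported as fully passing.
        if not self.results:
            return 1.0
        return self.total_passed / self.total_expectations

    @property
    def total_duration_seconds(self) -> float:
        return sum(r.duration_seconds for r in self.results)


summary = SuiteSummary(results=(
    ExpectationOutcome("age > 18", "passed", duration_seconds=0.5),
    ExpectationOutcome("name not null", "failed", violation_count=2, duration_seconds=0.25),
    ExpectationOutcome("at least 1 row", "passed", duration_seconds=0.125),
    ExpectationOutcome("id unique", "passed", duration_seconds=0.125),
))
report = f"{summary.total_passed}/{summary.total_expectations} passed ({summary.pass_rate:.2%})"
print(report)  # 3/4 passed (75.00%)
```

Computing the totals as properties keeps them consistent with the underlying records instead of storing counters that could drift.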
Example:
# Get results without raising exceptions
result = runner.run(df, raise_on_failure=False)
# Inspect the results programmatically
print(f"Total expectations: {result.total_expectations}")
print(f"Passed: {result.total_passed}, Failed: {result.total_failed}")
print(f"Pass rate: {result.pass_rate:.2%}")
print(f"Applied filters: {result.applied_filters}")
print(f"Tag match mode: {result.tag_match_mode}")
# Access individual expectation results
for exp_result in result.results:
    if exp_result.status == "failed":
        print(f"Failed: {exp_result.description}")
        print(f"Violation count: {exp_result.violation_count}")
Documentation
- Added tag-based filtering examples to README.md and getting_started.rst
- Updated adding_expectations.rst with proper tag handling patterns for custom expectations
- Documented programmatic result inspection with comprehensive examples
- Reorganized documentation structure: user guide in getting_started.rst, developer notes in adding_expectations.rst
Full Changelog: v0.4.0...v0.5.0
v0.4.0
0.4.0 (2025-11-10)
⚠ BREAKING CHANGES
‼️ BREAKING CHANGE: Major codebase restructuring with a new module organization. Most of the changes affect internal modules, but the main import path has changed.
What changed:
- All internal modules have been reorganized into a core/ package
- Expectation registry simplified from a three-dictionary to a two-dictionary structure with O(1) lookups
- Main imports updated from expectations_suite to suite
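The release notes do not say what the registry's two dictionaries hold, so the following is only a guess at one plausible layout: one dictionary keyed by expectation name and one keyed by suite method name, both giving constant-time lookups on average. Every name here (`ExpectationRegistry`, `register`, `create`, `create_for_method`) is hypothetical.

```python
class ExpectationRegistry:
    """Hypothetical two-dictionary registry; not the library's actual layout."""

    def __init__(self):
        self._factories = {}  # expectation name -> factory callable
        self._by_method = {}  # suite method name -> expectation name

    def register(self, name, method_name, factory):
        """Record a factory under both its name and its suite method name."""
        if name in self._factories:
            raise ValueError(f"{name!r} is already registered")
        self._factories[name] = factory
        self._by_method[method_name] = name

    def create(self, name, **kwargs):
        """Look up a factory by expectation name (a single dict access)."""
        return self._factories[name](**kwargs)

    def create_for_method(self, method_name, **kwargs):
        """Resolve a suite method name, then build the expectation."""
        return self.create(self._by_method[method_name], **kwargs)


registry = ExpectationRegistry()
# The factory returns a plain tuple here purely for demonstration.
registry.register(
    "MinRows",
    "expect_min_rows",
    lambda min_rows: ("MinRows", min_rows),
)
built = registry.create_for_method("expect_min_rows", min_rows=3)
```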
Migration guide:
Update your imports to use the new module structure:
# Before
from dataframe_expectations.expectations_suite import DataFrameExpectationsSuite
# After
from dataframe_expectations.suite import DataFrameExpectationsSuite
Features
- restructure codebase with core/ module and explicit imports (42a233a)
- restructure codebase, and registry refactoring (111bca1)
- simplified registry (c182858)
Bug Fixes
- consolidate imports (9a76467)
- deleted duplicate dataclass and enums from registry (82bec0c)
- deleted duplicate DataFrameExpectation code from expectations package (d47eb8b)
- import enums from types (fa84764)
- manually trigger CI for release-please PRs (49419e6)
- manually trigger CI for release-please PRs (9585cf5)
- return correct version when package is built (82ff343)
Documentation
- remove unused imports (276589d)
Full Changelog: v0.3.0...v0.4.0
v0.3.0
🎯 DataFrame Expectations v0.3.0
⚠️ Breaking Changes
This release introduces a builder pattern for the DataFrameExpectationsSuite that changes how you create and run expectation suites.
Migration Guide:
# Before (v0.2.0)
suite = DataFrameExpectationsSuite()
suite.expect_min_rows(min_rows=3)
suite.run(df)
# After (v0.3.0)
suite = DataFrameExpectationsSuite()
suite.expect_min_rows(min_rows=3)
runner = suite.build() # New: Build a runner
runner.run(df) # Run on the runner
✨ New Features
🏗️ Builder Pattern & Immutable Runners
- Introduces DataFrameExpectationsSuiteRunner, an immutable runner created via .build()
- Allows reusing the same validation logic across multiple DataFrames
- Enables building multiple independent runners from the same suite at different stages
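The builder idea above can be illustrated with a small stand-alone sketch: a mutable suite collects checks, and `build()` snapshots them into a frozen runner. `Suite`, `Runner`, and `expect` are hypothetical names chosen for this example, and plain lists stand in for DataFrames; the library's own classes may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    """Hypothetical immutable runner holding a snapshot of checks."""
    checks: tuple  # tuple of (name, predicate) pairs

    def run(self, rows):
        """Return the names of the checks that fail for the given rows."""
        return [name for name, predicate in self.checks if not predicate(rows)]


class Suite:
    """Hypothetical mutable builder that accumulates checks."""

    def __init__(self):
        self._checks = []

    def expect(self, name, predicate):
        self._checks.append((name, predicate))
        return self  # allow method chaining

    def build(self):
        # Copy into a tuple so later additions do not affect this runner.
        return Runner(checks=tuple(self._checks))


suite = Suite().expect("min_rows", lambda rows: len(rows) >= 1)
early = suite.build()  # snapshot with one check

suite.expect("max_rows", lambda rows: len(rows) <= 2)
late = suite.build()   # independent snapshot with two checks

data = [1, 2, 3]
early_failures = early.run(data)
late_failures = late.run(data)
print(early_failures)  # []
print(late_failures)   # ['max_rows']
```

Because each runner owns its own tuple, the same runner can be applied to many inputs without being affected by later changes to the suite.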
🎨 Decorator Pattern for Automatic Validation
Validate DataFrames returned by functions automatically using the @runner.validate decorator:
@runner.validate
def load_data():
    return pd.DataFrame({"col": [1, 2, 3]})
# Supports optional DataFrame returns
@runner.validate(allow_none=True)
def maybe_load_data():
    if condition:
        return pd.DataFrame(...)
    return None
🔍 Expectation Inspection
- Added expectation_count property to check the number of expectations
- Added list_expectations() method to view all expectations in a runner
📚 Documentation
- Added Spark session initialization to PySpark examples in README and documentation
- Improved example code to be immediately runnable
🔧 Maintenance
- Updated release configuration for simpler tag generation
- Dependency updates: pytest 9.0.0, ruff 0.14.4, pre-commit 4.4.0
📦 What's Changed
- fix: update release please config to generate simple tags by @ryanseq-gyg in #13
- feat!: implement builder pattern for expectation suite runner by @ryanseq-gyg in #18
- build(deps): bump pre-commit from 4.3.0 to 4.4.0 by @dependabot in #17
- build(deps): bump ruff from 0.14.3 to 0.14.4 by @dependabot in #16
- build(deps): bump pytest from 8.4.2 to 9.0.0 in the 01_major-updates group by @dependabot in #15
Full Changelog: v0.2.0...v0.3.0
v0.2.0
This release introduces a major refactoring of the expectation registration system, replacing 800+ lines of boilerplate with dynamic method generation from a central registry. The refactoring maintains full IDE type-ahead support through auto-generated stub files while significantly improving maintainability.
Features
- Dynamic Expectation Registration: Implement dynamic method generation with centralized registry system
  - Replaces manual method definitions in DataFrameExpectationsSuite
  - Maintains IDE type hints through auto-generated .pyi stub files
  - Reduces boilerplate and improves maintainability
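As a rough illustration of generating suite methods from a central registry, the sketch below attaches one method per registry entry with `setattr`. It is an assumption-laden toy, not the project's code: `REGISTRY`, `register`, `Suite`, and the two example expectations are all hypothetical, and plain lists stand in for DataFrames.

```python
REGISTRY = {}  # suite method name -> expectation factory (hypothetical layout)


def register(method_name):
    """Decorator that records a factory under the given method name."""
    def decorator(factory):
        REGISTRY[method_name] = factory
        return factory
    return decorator


@register("expect_min_rows")
def _min_rows_factory(min_rows):
    return lambda rows: len(rows) >= min_rows


@register("expect_not_empty")
def _not_empty_factory():
    return lambda rows: len(rows) > 0


class Suite:
    """Hypothetical suite with no hand-written expect_* methods."""

    def __init__(self):
        self.checks = []


def _make_method(name, factory):
    # A helper function binds name and factory per method, which avoids
    # the late-binding pitfall of defining closures directly in a loop.
    def method(self, **kwargs):
        self.checks.append((name, factory(**kwargs)))
        return self
    method.__name__ = name
    return method


for _name, _factory in REGISTRY.items():
    setattr(Suite, _name, _make_method(_name, _factory))

suite = Suite().expect_min_rows(min_rows=2).expect_not_empty()
failed = [name for name, check in suite.checks if not check([1])]
print(failed)  # ['expect_min_rows']
```

Methods added this way are invisible to static analysis, which is presumably why the release pairs the approach with generated .pyi stub files.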
Bug Fixes
- Handle pandas DataFrame.map() compatibility for older versions
- Convert expectation category to str while generating stubs
Documentation
- Update documentation for new registration system
- Remove API reference button on expectation cards
- Update README with additional badges
Chores
- Add publishing and release workflows
- Pin action commit hashes and update PR template
- Update sanity checks script for dynamic expectation calls
- Update release-please to approved version
What's Changed
- fix: updated release-please hash to approve version by @ryanseq-gyg in #6
- fix: added more badges to readme by @ryanseq-gyg in #8
- Pin uv to commit by @ryanseq-gyg in #9
- Bump ruff from 0.14.2 to 0.14.3 by @dependabot in #10
- Refactor expectation registration by @ryanseq-gyg in #12
- chore(main): release dataframe-expectations 0.2.0 by @github-actions in #7
Full Changelog: v0.1.1...dataframe-expectations-v0.2.0
v0.1.1
What's Changed
- Initial commits by @ryanseq-gyg in #1
- [dependabutler] update .github/dependabot.yml by @gygsecrobot in #2
- Bump ruff from 0.14.1 to 0.14.2 by @dependabot[bot] in #3
- Updated runners in CI by @ryanseq-gyg in #4
- fix: added publishing and release workflows by @ryanseq-gyg in #5
New Contributors
- @ryanseq-gyg made their first contribution in #1
- @gygsecrobot made their first contribution in #2
- @dependabot[bot] made their first contribution in #3
Full Changelog: https://github.com/getyourguide/dataframe-expectations/commits/v0.1.1