Releases: getyourguide/dataframe-expectations

v0.5.0

22 Nov 14:11
d62ce0b

Features

Tag-based Filtering

Add support for selective expectation execution using custom tags.

Key Features:

  • New TagMatchMode enum with ANY (OR logic) and ALL (AND logic) options
  • Tag expectations with "key:value" format (e.g., "priority:high", "env:prod")
  • Filter expectations at build time

Example:

# Tag expectations
suite = (
    DataFrameExpectationsSuite()
    .expect_value_greater_than(column_name="age", value=18, tags=["priority:high", "env:prod"])
    .expect_value_not_null(column_name="name", tags=["priority:high"])
    .expect_min_rows(min_rows=1, tags=["priority:low", "env:test"])
)

# Run only high-priority checks (OR logic)
runner = suite.build(tags=["priority:high"], tag_match_mode=TagMatchMode.ANY)

# Run production-critical checks (AND logic)
runner = suite.build(tags=["priority:high", "env:prod"], tag_match_mode=TagMatchMode.ALL)
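The difference between the two match modes can be illustrated with plain set operations. This is a minimal sketch of the semantics described above, not the library's implementation: `MatchMode`, `matches`, and `select` are hypothetical names, and the expectations are represented only by their tag lists.

```python
from enum import Enum


class MatchMode(Enum):
    """Hypothetical stand-in for the library's TagMatchMode."""
    ANY = "any"
    ALL = "all"


def matches(expectation_tags, filter_tags, mode):
    """Return True if an expectation's tags satisfy the filter."""
    wanted = set(filter_tags)
    have = set(expectation_tags)
    if mode is MatchMode.ANY:
        # OR logic: at least one requested tag is present
        return bool(wanted & have)
    # AND logic: every requested tag is present
    return wanted <= have


# Tag lists copied from the example above; the keys are illustrative labels.
expectations = {
    "age_gt_18": ["priority:high", "env:prod"],
    "name_not_null": ["priority:high"],
    "min_rows": ["priority:low", "env:test"],
}


def select(filter_tags, mode):
    """Names of the expectations that would survive the filter."""
    return [
        name
        for name, tags in expectations.items()
        if matches(tags, filter_tags, mode)
    ]


print(select(["priority:high"], MatchMode.ANY))
print(select(["priority:high", "env:prod"], MatchMode.ALL))
```

Under these assumptions, the ANY filter keeps both high-priority expectations, while the ALL filter keeps only the one tagged with both "priority:high" and "env:prod".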

Programmatic Result Inspection

Enhanced SuiteExecutionResult for detailed validation analysis.

Key Features:

  • Use raise_on_failure=False to inspect results without raising exceptions
  • Access comprehensive metrics: total_expectations, total_passed, total_failed, pass_rate, total_duration_seconds
  • Inspect individual expectation results with status, violation counts, descriptions, and timing
  • View applied tag filters in execution results

Example:

# Get results without raising exceptions
result = runner.run(df, raise_on_failure=False)

# Inspect the results programmatically
print(f"Total expectations: {result.total_expectations}")
print(f"Passed: {result.total_passed}, Failed: {result.total_failed}")
print(f"Pass rate: {result.pass_rate:.2%}")
print(f"Applied filters: {result.applied_filters}")
print(f"Tag match mode: {result.tag_match_mode}")

# Access individual expectation results
for exp_result in result.results:
    if exp_result.status == "failed":
        print(f"Failed: {exp_result.description}")
        print(f"Violation count: {exp_result.violation_count}")
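For logging or alerting, the per-expectation loop above can be wrapped in a small helper. The sketch below relies only on the `status`, `description`, and `violation_count` attributes shown in the example; `failure_lines` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the snippet runs without the library.

```python
from types import SimpleNamespace


def failure_lines(results):
    """Format one line per failed expectation result."""
    return [
        f"{r.description}: {r.violation_count} violation(s)"
        for r in results
        if r.status == "failed"
    ]


# Stand-in objects; a real run would pass result.results instead.
fake_results = [
    SimpleNamespace(status="passed", description="min rows >= 1", violation_count=0),
    SimpleNamespace(status="failed", description="age > 18", violation_count=3),
]

for line in failure_lines(fake_results):
    print(line)
```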

Documentation

  • Added tag-based filtering examples to README.md and getting_started.rst
  • Updated adding_expectations.rst with proper tag handling patterns for custom expectations
  • Documented programmatic result inspection with comprehensive examples
  • Reorganized documentation structure: user guide in getting_started.rst, developer notes in adding_expectations.rst

Full Changelog: v0.4.0...v0.5.0

v0.4.0

10 Nov 16:19
a110faf

0.4.0 (2025-11-10)

⚠ BREAKING CHANGES

  • Major codebase restructuring with a new module organization. Most of the changes affect internal modules only.

What changed:

  • All internal modules have been reorganized into a core/ package
  • Expectation registry simplified from three-dictionary to two-dictionary structure with O(1) lookups
  • Main imports updated from expectations_suite to suite

Migration guide:
Update your imports to use the new module structure:

# Before
from dataframe_expectations.expectations_suite import DataFrameExpectationsSuite

# After
from dataframe_expectations.suite import DataFrameExpectationsSuite
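Code that has to run against both the old and the new layout could resolve the class at import time instead of hard-coding one path. The following is a sketch only: `import_attr` is a hypothetical helper, not part of the library, and the demonstration at the end uses the standard library so it runs without `dataframe_expectations` installed.

```python
import importlib


def import_attr(attr, *module_paths):
    """Return `attr` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # module missing in this version, try the next path
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr!r} not found in any of {module_paths}")


# With the library installed, the two paths from the migration guide would be:
# DataFrameExpectationsSuite = import_attr(
#     "DataFrameExpectationsSuite",
#     "dataframe_expectations.suite",               # 0.4.0 and later
#     "dataframe_expectations.expectations_suite",  # before 0.4.0
# )

# Standard-library demonstration: the first path does not exist, the second does.
OrderedDict = import_attr("OrderedDict", "no_such_module_xyz", "collections")
print(OrderedDict.__name__)
```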

Features

  • restructure codebase with core/ module and explicit imports (42a233a)
  • restructure codebase, and registry refactoring (111bca1)
  • simplified registry (c182858)

Bug Fixes

  • consolidate imports (9a76467)
  • deleted duplicate dataclass and enums from registry (82bec0c)
  • deleted duplicate DataFrameExpectation code from expectations package (d47eb8b)
  • import enums from types (fa84764)
  • manually trigger CI for release-please PRs (49419e6)
  • manually trigger CI for release-please PRs (9585cf5)
  • return correct version when package is built (82ff343)

Full Changelog: v0.3.0...v0.4.0

v0.3.0

09 Nov 12:43
5567760

🎯 DataFrame Expectations v0.3.0

⚠️ Breaking Changes

This release introduces a builder pattern for the DataFrameExpectationsSuite that changes how you create and run expectation suites.

Migration Guide:

# Before (v0.2.0)
suite = DataFrameExpectationsSuite()
suite.expect_min_rows(min_rows=3)
suite.run(df)

# After (v0.3.0)
suite = DataFrameExpectationsSuite()
suite.expect_min_rows(min_rows=3)
runner = suite.build()  # New: Build a runner
runner.run(df)          # Run on the runner

✨ New Features

🏗️ Builder Pattern & Immutable Runners

  • Introduces DataFrameExpectationsSuiteRunner - an immutable runner created via .build()
  • Allows reusing the same validation logic across multiple DataFrames
  • Enables building multiple independent runners from the same suite at different stages
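The idea of an immutable runner built from a mutable suite can be sketched in a few lines. This is an illustration of the pattern, not the library's code: `MiniSuite` and `MiniRunner` are hypothetical, and plain lists stand in for DataFrames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniRunner:
    """Immutable snapshot of the checks that existed at build time."""
    checks: tuple

    @property
    def expectation_count(self):
        return len(self.checks)

    def run(self, rows):
        # Return the names of the checks that fail for this input
        return [name for name, predicate in self.checks if not predicate(rows)]


class MiniSuite:
    """Mutable builder that accumulates checks."""

    def __init__(self):
        self._checks = []

    def expect(self, name, predicate):
        self._checks.append((name, predicate))
        return self  # allow chaining

    def build(self):
        # Copy into a tuple so later additions do not leak into this runner
        return MiniRunner(tuple(self._checks))


suite = MiniSuite().expect("min_rows_1", lambda rows: len(rows) >= 1)
early = suite.build()

suite.expect("min_rows_3", lambda rows: len(rows) >= 3)
late = suite.build()

print(early.expectation_count, late.expectation_count)
print(early.run([1, 2]), late.run([1, 2]))
```

Because each runner holds its own snapshot, `early` keeps a single check even after the suite grows, and either runner can be applied to any number of inputs.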

🎨 Decorator Pattern for Automatic Validation

Validate DataFrames returned by functions automatically using the @runner.validate decorator:

@runner.validate
def load_data():
    return pd.DataFrame({"col": [1, 2, 3]})

# Supports optional DataFrame returns
@runner.validate(allow_none=True)
def maybe_load_data():
    if condition:
        return pd.DataFrame(...)
    return None
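A decorator that works both bare (`@validate`) and with arguments (`@validate(allow_none=True)`) is usually written with an optional first parameter. The sketch below shows that general technique with hypothetical names (`make_validate`, `require_rows`) and lists in place of DataFrames. Raising an error on an unexpected `None` is an assumption here, not confirmed library behavior.

```python
import functools


def make_validate(check):
    """Build a decorator that runs check(result) on a function's return value."""

    def validate(func=None, *, allow_none=False):
        def decorator(f):
            @functools.wraps(f)
            def wrapper(*args, **kwargs):
                result = f(*args, **kwargs)
                if result is None:
                    if allow_none:
                        return None
                    # Assumed behavior: an unexpected None is an error
                    raise ValueError(f"{f.__name__} returned None")
                check(result)
                return result

            return wrapper

        # Bare form (@validate): func is the decorated function.
        # Called form (@validate(allow_none=True)): func is None.
        return decorator(func) if func is not None else decorator

    return validate


def require_rows(rows):
    if len(rows) == 0:
        raise ValueError("expected at least one row")


validate = make_validate(require_rows)


@validate
def load_rows():
    return [1, 2, 3]


@validate(allow_none=True)
def maybe_load_rows(available):
    return [1] if available else None


print(load_rows(), maybe_load_rows(False))
```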

🔍 Expectation Inspection

  • Added expectation_count property to check the number of expectations
  • Added list_expectations() method to view all expectations in a runner

📚 Documentation

  • Added Spark session initialization to PySpark examples in README and documentation
  • Improved example code to be immediately runnable

🔧 Maintenance

  • Updated release configuration for simpler tag generation
  • Dependency updates: pytest 9.0.0, ruff 0.14.4, pre-commit 4.4.0

📦 What's Changed

  • fix: update release please config to generate simple tags by @ryanseq-gyg in #13
  • feat!: implement builder pattern for expectation suite runner by @ryanseq-gyg in #18
  • build(deps): bump pre-commit from 4.3.0 to 4.4.0 by @dependabot in #17
  • build(deps): bump ruff from 0.14.3 to 0.14.4 by @dependabot in #16
  • build(deps): bump pytest from 8.4.2 to 9.0.0 in the 01_major-updates group by @dependabot in #15

Full Changelog: v0.2.0...v0.3.0

v0.2.0

08 Nov 20:16
0170ac5

This release introduces a major refactoring of the expectation registration system, replacing 800+ lines of boilerplate with dynamic method generation from a central registry. The refactoring maintains full IDE type-ahead support through auto-generated stub files while significantly improving maintainability.

Features

  • Dynamic Expectation Registration: Implement dynamic method generation with centralized registry system
    • Replaces manual method definitions in DataFrameExpectationsSuite
    • Maintains IDE type hints through auto-generated .pyi stub files
    • Reduces boilerplate and improves maintainability
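One common way to generate methods from a central registry is to resolve unknown attribute names against it at call time. The sketch below illustrates that general technique. `REGISTRY`, `register`, and `DynamicSuite` are hypothetical, and the library's actual mechanism may differ.

```python
REGISTRY = {}


def register(name):
    """Decorator that records a check factory under a method name."""
    def decorator(factory):
        REGISTRY[name] = factory
        return factory
    return decorator


@register("expect_min_rows")
def _min_rows(min_rows):
    return lambda rows: len(rows) >= min_rows


@register("expect_max_rows")
def _max_rows(max_rows):
    return lambda rows: len(rows) <= max_rows


class DynamicSuite:
    def __init__(self):
        self.checks = []

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            factory = REGISTRY[name]
        except KeyError:
            raise AttributeError(name) from None

        def method(**kwargs):
            self.checks.append((name, factory(**kwargs)))
            return self  # keep the fluent, chainable style

        return method


suite = DynamicSuite().expect_min_rows(min_rows=1).expect_max_rows(max_rows=2)
print([name for name, check in suite.checks if not check([1, 2, 3])])
```

Methods resolved this way are invisible to static analysis, which is why generated .pyi stub files are needed to keep IDE completion working.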

Bug Fixes

  • Handle pandas DataFrame.map() compatibility for older versions
  • Convert expectation category to str while generating stubs
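For context on the first fix: pandas added `DataFrame.map` in version 2.1 and deprecated `DataFrame.applymap`, so code that calls `map` fails on older releases. A compatibility shim along the following lines is one way to handle both. `elementwise` is a hypothetical helper, not the fix that was actually shipped.

```python
def elementwise(df, func):
    """Apply func to every cell, on both newer and older pandas."""
    # Check the class, not the instance: on older pandas, `df.map` could
    # resolve to a column that happens to be named "map".
    if hasattr(type(df), "map"):
        return df.map(func)       # pandas >= 2.1
    return df.applymap(func)      # older pandas
```

A call such as `elementwise(df, str)` then behaves the same way regardless of the installed pandas version.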

Documentation

  • Update documentation for new registration system
  • Remove API reference button on expectation cards
  • Update README with additional badges

Chores

  • Add publishing and release workflows
  • Pin action commit hashes and update PR template
  • Update sanity checks script for dynamic expectation calls
  • Update release-please to approved version

Full Changelog: v0.1.1...dataframe-expectations-v0.2.0

v0.1.1

31 Oct 15:25
3f89e95

Full Changelog: https://github.com/getyourguide/dataframe-expectations/commits/v0.1.1