Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow float in interpolate_by by column #18015

Merged
merged 11 commits into from
Aug 18, 2024

Conversation

agossard
Copy link
Contributor

@agossard agossard commented Aug 2, 2024

Ok, splitting this out in to a simpler PR. Fixes #16794

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars labels Aug 2, 2024
Copy link

codecov bot commented Aug 2, 2024

Codecov Report

Attention: Patch coverage is 87.50000% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.34%. Comparing base (d5265d3) to head (c9ff1a8).
Report is 84 commits behind head on main.

Files Patch % Lines
...ops/src/series/ops/interpolation/interpolate_by.rs 87.50% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18015      +/-   ##
==========================================
- Coverage   80.49%   80.34%   -0.16%     
==========================================
  Files        1496     1496              
  Lines      196786   197684     +898     
  Branches     2817     2821       +4     
==========================================
+ Hits       158407   158820     +413     
- Misses      37858    38343     +485     
  Partials      521      521              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @agossard 🙏 !

implementation looks good! you just need to run make pre-commit

thanks for amending the hypothesis test - could we also have one of the unit tests use a float by column?

@agossard
Copy link
Contributor Author

agossard commented Aug 3, 2024

Thanks Marco. I’ll try to fix the remaining style problem. In the unit test, just to confirm… as is, I’ve made it so the tests cover a float by column… by casting the existing example, like the other data types. Are you looking for a different piece of input data that actually has true non integer data in it?

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @agossard for updating!

sorry I just meant - on top of the hypothesis test, to modify one of the unit tests (say, test_interpolate_by_trailing_nulls) to have a float by column

@@ -143,15 +145,24 @@ def test_interpolate_by_trailing_nulls() -> None:


@given(data=st.data())
def test_interpolate_vs_numpy(data: st.DataObject) -> None:
@pytest.mark.parametrize("x_dtype", [pl.Date, pl.Float64])
Copy link
Collaborator

@MarcoGorelli MarcoGorelli Aug 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be in given, you can use st.sampled_from

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in, x_dtype=st.sampled_from([pl.Date, pl.Float64])

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @agossard , just one pending comment

@@ -143,15 +145,24 @@ def test_interpolate_by_trailing_nulls() -> None:


@given(data=st.data())
def test_interpolate_vs_numpy(data: st.DataObject) -> None:
@pytest.mark.parametrize("x_dtype", [pl.Date, pl.Float64])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in, x_dtype=st.sampled_from([pl.Date, pl.Float64])

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like there's a test failure

__________________________ test_interpolate_vs_numpy ___________________________
[gw3] darwin -- Python 3.12.4 /Users/runner/work/polars/polars/.venv/bin/python3
tests/unit/operations/test_interpolate_by.py:168: in test_interpolate_vs_numpy
    def test_interpolate_vs_numpy(data: st.DataObject, x_dtype: pl.DataType) -> None:
tests/unit/operations/test_interpolate_by.py:228: in test_interpolate_vs_numpy
    assert_series_equal(result, expected)
polars/_utils/deprecation.py:91: in wrapper
    return function(*args, **kwargs)
E   AssertionError: Series are different (nan value mismatch)
E   [left]:  [0.0, 0.0, nan, 0.0]
E   [right]: [0.0, 0.0, 0.0, 0.0]
E   Falsifying example: test_interpolate_vs_numpy(
E       data=data(...),
E       x_dtype=Float64,  # or any other generated value
E   )
E   Draw 1: shape: (5, 2)
E   ┌─────────────┬───────┐
E   │ ts          ┆ value │
E   │ ---         ┆ ---   │
E   │ f64         ┆ f64   │
E   ╞═════════════╪═══════╡
E   │ 0.0         ┆ null  │
E   │ 0.0         ┆ 0.0   │
E   │ 9.9792e291  ┆ null  │
E   │ 9.9792e291  ┆ 0.0   │
E   │ -1.7977e308 ┆ 0.0   │
E   └─────────────┴───────┘
E   Explanation:
E       These lines were always and only run by failing examples:
E           /Users/runner/work/polars/polars/py-polars/polars/series/series.py:4039
E           /Users/runner/work/polars/polars/py-polars/polars/series/series.py:696
E           /Users/runner/work/polars/polars/py-polars/polars/series/series.py:739
E           /Users/runner/work/polars/polars/py-polars/polars/series/series.py:752
E           /Users/runner/work/polars/polars/py-polars/polars/series/series.py:782
E           (and 4 more with settings.verbosity >= verbose)
E   
E   You can reproduce this example by temporarily adding @reproduce_failure('6.108.9', b'AXicY2RhwAn2LEBnMgLR/v8QAOIgACMONgMA/UgKew==') as a decorator on your test case

---------- coverage: platform darwin, python 3.12.4-final-0 ----------
Coverage XML written to file main.xml

Required test coverage of 85.0% reached. Total coverage: 90.40%
=========================== short test summary info ============================
FAILED tests/unit/operations/test_interpolate_by.py::test_interpolate_vs_numpy - AssertionError: Series are different (nan value mismatch)
[left]:  [0.0, 0.0, nan, 0.0]
[right]: [0.0, 0.0, 0.0, 0.0]
Falsifying example: test_interpolate_vs_numpy(
    data=data(...),
    x_dtype=Float64,  # or any other generated value
)
Draw 1: shape: (5, 2)
┌─────────────┬───────┐
│ ts          ┆ value │
│ ---         ┆ ---   │
│ f64         ┆ f64   │
╞═════════════╪═══════╡
│ 0.0         ┆ null  │
│ 0.0         ┆ 0.0   │
│ 9.9792e291  ┆ null  │
│ 9.9792e291  ┆ 0.0   │
│ -1.7977e308 ┆ 0.0   │
└─────────────┴───────┘
Explanation:
    These lines were always and only run by failing examples:
        /Users/runner/work/polars/polars/py-polars/polars/series/series.py:4039
        /Users/runner/work/polars/polars/py-polars/polars/series/series.py:696
        /Users/runner/work/polars/polars/py-polars/polars/series/series.py:739
        /Users/runner/work/polars/polars/py-polars/polars/series/series.py:752
        /Users/runner/work/polars/polars/py-polars/polars/series/series.py:782
        (and 4 more with settings.verbosity >= verbose)

You can reproduce this example by temporarily adding @reproduce_failure('6.108.9', b'AXicY2RhwAn2LEBnMgLR/v8QAOIgACMONgMA/UgKew==') as a decorator on your test case

It's probably OK to keep the float case out of the hypothesis test, the unit tests you've added should be enough

@agossard
Copy link
Contributor Author

agossard commented Aug 9, 2024

OK, hopefully working now. Thanks, Marco!

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice one, thanks @agossard !

@ritchie46 ritchie46 merged commit 49747c1 into pola-rs:main Aug 18, 2024
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interpolate based on other Float64 column
3 participants