improve datachain subtract #352

Merged (4 commits) on Aug 28, 2024

Contributor

@EdwardLi-coder EdwardLi-coder commented Aug 25, 2024

Fix: DataChain.subtract() on differently named columns

Fixes #181
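
For context, here is a minimal sketch of what subtracting on differently named columns means, written in plain Python rather than against the DataChain API. The function `subtract_rows` and the sample column names are hypothetical, and this is not the code from this PR. It only illustrates an anti-join in which each left-hand column is paired with a right-hand column of a different name.

```python
def subtract_rows(left, right, on):
    """Return the rows of `left` that have no match in `right`.

    `on` is a list of (left_column, right_column) pairs.
    Hypothetical helper, for illustration only.
    """
    left_cols = [l for l, _ in on]
    right_cols = [r for _, r in on]
    # Collect the keys present on the right side, read via the right-hand names.
    seen = {tuple(row[c] for c in right_cols) for row in right}
    # Keep only left rows whose key (read via the left-hand names) is absent.
    return [row for row in left if tuple(row[c] for c in left_cols) not in seen]


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
right = [{"other_id": 2}, {"other_id": 3}]

remaining = subtract_rows(left, right, on=[("id", "other_id")])
print(remaining)  # [{'id': 1, 'name': 'a'}]
```

Here `id` on the left is compared with `other_id` on the right, which a single shared column name cannot express.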

codecov bot commented Aug 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.72%. Comparing base (8e0034e) to head (ccf4375).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #352      +/-   ##
==========================================
+ Coverage   86.69%   86.72%   +0.02%     
==========================================
  Files          91       91              
  Lines       10075    10092      +17     
  Branches     2042     2051       +9     
==========================================
+ Hits         8735     8752      +17     
  Misses        985      985              
  Partials      355      355              
Flag        Coverage Δ
datachain   86.65% <100.00%> (+0.02%) ⬆️


@EdwardLi-coder EdwardLi-coder force-pushed the update_dc_subtract branch 6 times, most recently from 1b3468e to 171818b Compare August 26, 2024 00:39
@shcheklein shcheklein requested a review from rlamy August 26, 2024 01:05
@EdwardLi-coder EdwardLi-coder force-pushed the update_dc_subtract branch 2 times, most recently from 41fc4a7 to caccaf4 Compare August 27, 2024 13:01
@@ -296,15 +296,23 @@ def q(*columns):

 @frozen
 class Subtract(DatasetDiffOperation):
-    on: Sequence[str]
+    on: Sequence[Union[str, tuple[str, str]]]
Contributor

Shouldn't this be Sequence[tuple[str, str]]?

Contributor Author

> Shouldn't this be Sequence[tuple[str, str]]?

Yeah, I will update it. Thank you.
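
One way to reconcile the two annotations discussed in this thread, sketched under the assumption that the user-facing method still accepts plain column names: normalize every entry to a `(left, right)` pair before building the operation, so that the internal field can be typed as `Sequence[tuple[str, str]]`. The helper `normalize_on` and the alias `OnSpec` are hypothetical and not taken from this PR.

```python
from collections.abc import Sequence
from typing import Union

# Hypothetical alias: either a shared column name or a (left, right) pair.
OnSpec = Union[str, tuple[str, str]]


def normalize_on(on: Sequence[OnSpec]) -> list[tuple[str, str]]:
    """Expand bare column names into (left, right) pairs."""
    pairs = []
    for item in on:
        if isinstance(item, str):
            # Same column name on both sides.
            pairs.append((item, item))
        else:
            left, right = item
            pairs.append((left, right))
    return pairs


pairs = normalize_on(["path", ("id", "other_id")])
print(pairs)  # [('path', 'path'), ('id', 'other_id')]
```

With this kind of normalization at the boundary, downstream code only ever has to handle one shape.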

@@ -1811,3 +1846,32 @@ def test_from_csv_nan_inf(tmp_dir, test_session):
     assert np.isnan(res[0])
     assert np.isposinf(res[1])
     assert np.isneginf(res[2])
+
+
+def test_subtract_with_different_column_names(test_session):
Contributor

Is this test useful? It seems to duplicate the addition to test_subtract().

Contributor Author

> Is this test useful? It seems to duplicate the addition to test_subtract().

Sorry, I forgot to delete it and will do so now.

Contributor

@rlamy rlamy left a comment

LGTM now, thank you!

@rlamy rlamy merged commit 477d7d5 into iterative:main Aug 28, 2024
32 checks passed
@EdwardLi-coder EdwardLi-coder deleted the update_dc_subtract branch August 28, 2024 12:43
Development

Successfully merging this pull request may close these issues.

DataChain.subtract() on differently named columns
2 participants