Added Numerical Range Data Quality Check #408

Merged: 21 commits into NannyML:main on Jul 19, 2024

jnesfield
Contributor

Added a data quality check for numerical range on continuous columns. I also updated the examples for the existing data quality classes, as they raised errors when I tried to run them. I still need to add tests under ~tree/main/tests/data_quality and will work on those later. Attached is a notebook containing some examples and the testing I did.
Untitled19.zip
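
For readers without the notebook, the idea behind a numerical range check can be sketched as follows. This is a minimal, standalone illustration and not the code in this PR: the function names `fit_range` and `count_out_of_range` are hypothetical, the data is made up, and the sketch assumes the check learns a column's minimum and maximum from reference data and then counts analysis values that fall outside that interval.

```python
from typing import Iterable, Tuple


def fit_range(reference_values: Iterable[float]) -> Tuple[float, float]:
    """Learn the observed [min, max] interval from reference data (assumed behaviour)."""
    values = list(reference_values)
    if not values:
        raise ValueError("reference data must not be empty")
    return min(values), max(values)


def count_out_of_range(values: Iterable[float], bounds: Tuple[float, float]) -> int:
    """Count values that fall strictly outside the learned interval."""
    lower, upper = bounds
    return sum(1 for v in values if v < lower or v > upper)


# Hypothetical data for one continuous column.
reference = [18.0, 25.5, 31.0, 47.2, 64.0]
analysis = [17.0, 22.0, 64.0, 70.5, 30.0]

bounds = fit_range(reference)
out_of_range = count_out_of_range(analysis, bounds)
```

Values equal to a bound count as in range here; whether the actual calculator treats the bounds as inclusive is not stated in this conversation.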

codecov bot commented Jul 12, 2024

Codecov Report

Attention: Patch coverage is 95.58824% with 6 lines in your changes missing coverage. Please review.

Project coverage is 77.43%. Comparing base (a24ab81) to head (5730807).
Report is 34 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| nannyml/data_quality/range/calculator.py | 96.03% | 2 Missing and 2 partials ⚠️ |
| nannyml/data_quality/range/result.py | 92.30% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #408      +/-   ##
==========================================
+ Coverage   76.80%   77.43%   +0.63%     
==========================================
  Files         108      111       +3     
  Lines        9264     9444     +180     
  Branches     1656     1684      +28     
==========================================
+ Hits         7115     7313     +198     
+ Misses       1685     1668      -17     
+ Partials      464      463       -1     


@jnesfield changed the title from "Added Numberical Range Data Quality Check" to "Added Numerical Range Data Quality Check" on Jul 12, 2024
@nnansters
Contributor

What's this!? A shiny new calculator?!

I'll take a closer look at your PR at the end of this week, but already a big fat "thank you" @jnesfield !

@jnesfield
Contributor Author

NP! I was chatting about this with Hakim E. and decided to give it a whirl!

Contributor

@nnansters nnansters left a comment

Some comments, but I'll take care of them quickly. Just wanted to explain the little changes I'll make.

nannyml/data_quality/range/calculator.py (outdated, resolved)
nannyml/data_quality/range/calculator.py (outdated, resolved)
nannyml/data_quality/range/calculator.py (outdated, resolved)
nannyml/data_quality/range/calculator.py (resolved)
nnansters and others added 6 commits July 19, 2024 11:27
Ensure the result object also contains results for the reference period, even if they're all just 0 by definition.
Adjusting some comments and general linting stuff
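
The "0 by definition" remark in the first commit message above follows if the bounds are taken as the minimum and maximum of the reference data: no reference value can then fall outside them. A small sketch of that property, using a hypothetical helper name and made-up, pre-chunked data rather than the PR's actual implementation:

```python
from typing import Dict, List, Sequence


def range_violations_per_chunk(
    chunks: Sequence[Sequence[float]], lower: float, upper: float
) -> List[int]:
    """Return, for each chunk, how many values lie outside [lower, upper]."""
    return [sum(1 for v in chunk if not (lower <= v <= upper)) for chunk in chunks]


# Hypothetical reference and analysis data, already split into chunks.
reference_chunks = [[3.0, 7.5, 4.2], [9.9, 1.0, 5.5]]
analysis_chunks = [[0.5, 4.0, 6.0], [10.0, 12.3, 2.2]]

# Assumption: bounds are learned from the reference period only.
all_reference = [v for chunk in reference_chunks for v in chunk]
lower, upper = min(all_reference), max(all_reference)

# Report both periods so the result covers reference as well as analysis.
results: Dict[str, List[int]] = {
    "reference": range_violations_per_chunk(reference_chunks, lower, upper),
    "analysis": range_violations_per_chunk(analysis_chunks, lower, upper),
}
```

Every reference chunk reports zero violations, while the analysis chunks can report non-zero counts.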
@nnansters
Contributor

There we go, also added some texts. That's good to go for me!

There is quite a bit of boilerplate going on right now; I'm hoping to do something about that soon. Feel free to incorporate any other ideas you might have!

Once again, thank you for your contribution, much appreciated!

@nnansters nnansters merged commit 123d3b7 into NannyML:main Jul 19, 2024
7 of 8 checks passed