Hello,

I was just trying the following check:

    check = Check().add_rule(method="is_empty", column="id", pct=0)

However, this fails because:

    ValueError: Coverage should be between 0 and 1

This is not a very informative error, because the type of the bounds (> or >=?) is not defined and they are asymmetrical (i.e. `pct` can be 1 but cannot be 0).

So, considering this, my suggestion is to add a check such as:

    check = Check().add_rule(method="is_not_empty", column="id", pct=1)

And a more general question: why is zero not allowed as a `pct` value?

Thanks!
Replies: 2 comments
Oh wow... I feel silly now. Nevertheless, my question about informative errors in terms of valid bounds still stands.
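To make the point about bounds concrete, here is a minimal sketch of an error message that states which end of the range is open and which is closed. It assumes the valid range is 0 < pct <= 1, as the behaviour described in this thread suggests; `validate_pct` is a hypothetical helper for illustration, not part of cuallee.

```python
def validate_pct(pct: float) -> float:
    """Hypothetical helper: accept coverage values in the half-open interval (0, 1]."""
    # bool is a subclass of int, so reject it explicitly before the numeric check.
    if isinstance(pct, bool) or not isinstance(pct, (int, float)):
        raise TypeError(f"pct must be a number, got {type(pct).__name__}")
    # Assumption taken from the thread: 0 is rejected, 1 is accepted.
    if not 0 < pct <= 1:
        raise ValueError(
            f"Coverage must satisfy 0 < pct <= 1 (0 excluded, 1 included), got pct={pct!r}"
        )
    return float(pct)
```

With a message like this, a call such as `validate_pct(0)` would still raise `ValueError`, but the text would say which bound was violated and whether it is inclusive.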
Hi @marrov, no worries. I also sometimes overlook the available checks, and choosing a representative name is always tricky.
The Data Management Body of Knowledge, or DAMA-DMBOK, has some nice references when it comes to data quality standards. In table 2, first row, the quality attribute of completeness is cited. From there comes the `is_complete` check.
Reference: https://www.dama-nl.org/wp-content/uploads/2020/09/DDQ-Dimensions-of-Data-Quality-Research-Paper-version-1.2-d.d.-3-Sept-2020.pdf
With regards to the zero `pct`, that is a nice one. I am thinking in the area of logical constructs: just as mathematically there is an inverse function, `cuallee` could invert the function based on the percentage. Mea…