Looking for best approach to validate uniqueness across multiple columns #386
-
FYI, I raised a question on Stack Overflow (I thought it might help generate more awareness of pandera, and maybe help me use pandera more effectively). The Stack Overflow link is here. The gist of the design issue is that I want to add a uniqueness validation check to a combination of columns. For context, the columns are X and Y coordinates, and the data set should contain only one row for each combination of X and Y. Any pointers on how to improve the approach I have used are appreciated.
Replies: 3 comments 5 replies
-
Thanks @TerrySnow1963, just posted an answer: https://stackoverflow.com/a/65721438/3205067
-
Also, maybe we can add a
-
I stumbled upon this answer and thought it would be good to mention that DataFrameSchema now has a unique parameter: https://pandera.readthedocs.io/en/stable/dataframe_schemas.html#validating-the-joint-uniqueness-of-columns.