Reorder columns in returned validated dataframe #1198
Closed
JeremyL-01 started this conversation in Ideas
Replies: 1 comment
Duplicate conversation occurring here: #1317
My team uses Pandera heavily to validate the data coming into our transformations, as well as to "rubber-stamp" the dataframes we generate as output.
Our output dataframes have to explicitly adhere to the datatypes AND column order of our defined Pandera schema.
We use strict="filter" to ensure we cut out any columns not specified in our schema. We use coerce=True to ensure our validated dataframe has the correct data types. However, there's nothing built in (that we've found) to reorder the dataframe columns to align with the schema. Is there any reason this can't be done automatically as part of the returned validated dataframe? Or perhaps an option could be added?
We currently use this manual work-around every time to fix the column order (which, coincidentally, negates the usefulness of strict="filter"): df = df[schema.columns.keys()]
Example:
Validated dataframe doesn't have columns in order specified in schema:
Manual work-around:
Validated dataframe now has columns in the order specified in the schema: