Inspired by #600; further discussion there.

Problem:
DataFrames have dimension checking (number of rows/columns) and column-name checking, but no dtype checking.
This would be particularly useful for schema-aware deserialization: datetimes and numbers are ambiguous in JSON, and are currently loaded back as ints or strings depending on how they were serialized.
Proposal:
add "schema" to df init func, accept a dict of dtypes, same formats as pandas' as_type
for the columns schema is set for, check in _validate - this would likely involve casting the columns specified in schema and failing on error... potentially could also save that to the dataframe so downstream code wouldn't have to do the casting.
allow schema to be used by json deserializer - cast specified columns after pandas.read_json
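A minimal sketch of what the cast-and-fail check could look like, using plain pandas. `SchemaFrame`, its `__init__` signature, and the error handling are illustrative assumptions, not this library's actual API; only the `astype`-style dict format comes from the proposal above:

```python
import pandas as pd

class SchemaFrame:
    """Hypothetical wrapper illustrating the proposed schema check."""

    def __init__(self, df, schema=None):
        # schema mirrors the dict form accepted by pandas' astype,
        # e.g. {"ts": "datetime64[ns]", "count": "int64"}
        self.schema = schema or {}
        self.df = self._validate(df)

    def _validate(self, df):
        if not self.schema:
            return df
        try:
            # astype with a column->dtype dict casts only the listed
            # columns and raises on an impossible cast, which doubles
            # as the dtype check; saving the result means downstream
            # code doesn't have to repeat the cast.
            return df.astype(self.schema)
        except (KeyError, ValueError, TypeError) as exc:
            raise ValueError(f"schema validation failed: {exc}") from exc

# Usage: the string column is cast (and thereby checked) at init time.
sf = SchemaFrame(pd.DataFrame({"ts": ["2021-01-01"]}),
                 schema={"ts": "datetime64[ns]"})
assert sf.df["ts"].dtype == "datetime64[ns]"
```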
Example of the problem with dates - full recovery is only possible when the "iso" string output is used instead of epoch, and the column is cast from str back to datetime by pandas:
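A minimal sketch of the round trip in plain pandas (the column name and values are illustrative):

```python
import io
import pandas as pd

df = pd.DataFrame({"ts": pd.to_datetime(["2021-01-01", "2021-01-02"])})

# Default epoch output: the values come back as plain integers
# (epoch milliseconds) and the datetime dtype is lost.
epoch = pd.read_json(io.StringIO(df.to_json()))
print(epoch["ts"].dtype)  # int64

# "iso" output: the values come back as strings, and the column can be
# fully recovered by casting - the step the proposed schema would
# automate after pandas.read_json.
iso = pd.read_json(io.StringIO(df.to_json(date_format="iso")),
                   convert_dates=False)
iso["ts"] = pd.to_datetime(iso["ts"])
print(iso["ts"].dtype)  # datetime64[ns]
```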