Open
Labels
Bug; IO JSON (read_json, to_json, json_normalize); Strings (String extension data type and string data)
Description
(noticed because of some doctest failures, cf. #61886)
Currently, for strings stored as object dtype, we appear to assume that the object-dtype values are actually strings, and we encode the column as such in the schema part of the JSON Table Schema output:
>>> pd.Series(["a", "b", None], dtype=object).to_json(orient="table", index=False)
'{"schema":{"fields":[{"name":"values","type":"string"}],"pandas_version":"1.4.0"},"data":[{"values":"a"},{"values":"b"},{"values":null}]}'
But the now-default string dtype is still treated as a custom extension dtype:
>>> pd.Series(["a", "b", None], dtype="str").to_json(orient="table", index=False)
'{"schema":{"fields":[{"name":"values","type":"any","extDtype":"str"}],"pandas_version":"1.4.0"},"data":[{"values":"a"},{"values":"b"},{"values":null}]}'
(note the "type":"string" vs "type":"any","extDtype":"str")
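To make the difference easier to see, here is a small stdlib-only sketch that parses the two outputs quoted above and compares them. It does not call pandas; the strings are copied verbatim from this report, and the helper name `schema_fields` is hypothetical.

```python
import json

# The two to_json(orient="table", index=False) outputs quoted above, verbatim.
OBJECT_DTYPE_JSON = (
    '{"schema":{"fields":[{"name":"values","type":"string"}],'
    '"pandas_version":"1.4.0"},'
    '"data":[{"values":"a"},{"values":"b"},{"values":null}]}'
)
STR_DTYPE_JSON = (
    '{"schema":{"fields":[{"name":"values","type":"any","extDtype":"str"}],'
    '"pandas_version":"1.4.0"},'
    '"data":[{"values":"a"},{"values":"b"},{"values":null}]}'
)


def schema_fields(payload):
    """Return the list of field descriptors from a Table Schema JSON string."""
    return json.loads(payload)["schema"]["fields"]


object_fields = schema_fields(OBJECT_DTYPE_JSON)
str_fields = schema_fields(STR_DTYPE_JSON)

# The data payloads are compared separately from the schema.
same_data = (
    json.loads(OBJECT_DTYPE_JSON)["data"] == json.loads(STR_DTYPE_JSON)["data"]
)

print(object_fields)
print(str_fields)
print(same_data)
```

Only the field descriptor differs between the two outputs; the `data` part is the same.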
Given that the Table Schema spec has a "string" type, let's also use that when exporting our string dtype.
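As an illustration of the requested result, the sketch below post-processes the current output into the proposed one. It is not how pandas would implement the change; `use_string_type` is a hypothetical helper, and dropping the `extDtype` key is an assumption, since this report does not say whether the key should be kept.

```python
import json

# Current output for dtype="str", copied verbatim from this report.
CURRENT_STR_JSON = (
    '{"schema":{"fields":[{"name":"values","type":"any","extDtype":"str"}],'
    '"pandas_version":"1.4.0"},'
    '"data":[{"values":"a"},{"values":"b"},{"values":null}]}'
)


def use_string_type(payload):
    """Rewrite fields exported as extDtype "str" to the Table Schema "string" type.

    Hypothetical helper for illustration only; not part of pandas.
    """
    doc = json.loads(payload)
    for field in doc["schema"]["fields"]:
        if field.get("type") == "any" and field.get("extDtype") == "str":
            field["type"] = "string"
            # Assumption: the extDtype key is dropped; the report does not specify.
            del field["extDtype"]
    # Compact separators match the formatting of the outputs quoted above.
    return json.dumps(doc, separators=(",", ":"))


proposed = use_string_type(CURRENT_STR_JSON)
print(proposed)
```

Under that assumption, the rewritten schema for the string dtype matches what the object-dtype Series already produces.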