json_normalize should supply empty columns if record_path are not present #21830
Comments
There's a good chance this was solved by #20399 but there's not enough information to confirm. Please try on the latest release and if that doesn't work re-open with all of the required information here: https://pandas.pydata.org/pandas-docs/stable/contributing.html#bug-reports-and-enhancement-requests Along with a minimally reproducible example: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
Based on #20030, I made up another example about my case, as follows. My problem is that the key of record_path is missing (not the value) while the keys of meta do exist. In such a case, could we include the missing key of record_path as an empty column?
The examples I made up actually come from one dataframe. Each row is a dictionary. So when I loop through the rows, this scenario causes a failure.
@WillAyd Please find my example as above. Thank you!
I'm still not clear on what you are looking for. Can you update to show expectations? The following works:
In [8]: json_normalize(json.loads('[{"desc": "CPU", "id": "e030"}]'))
Out[8]:
  desc    id
0  CPU  e030
Using
Although 'data' doesn't exist for the second record, it does exist in the first record. Obviously, they come from the same source; the key is simply missing for the second one. Hopefully, this time it is clearer.
I get what you are saying but I don't think this is desired behavior. If a user specifies that a particular record path exists but it does not, we raise an error, which is preferable to arbitrarily assuming that all users want to continue on in that case. For your need you should use a try...except and handle the lack of a record_path as desired
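The try...except approach suggested above could look roughly like the sketch below. It is only an illustration, not code from this thread: the helper name `normalize_with_fallback`, the sample records, and the `name`/`value` fields are made up, and it assumes a pandas version that exposes `pd.json_normalize` (older releases import it from `pandas.io.json`) and that every meta key is a plain top-level string.

```python
import pandas as pd


def normalize_with_fallback(records, record_path, meta):
    """Normalize records one at a time; keep only the meta fields
    for any record whose record_path key is missing.

    Hypothetical helper: assumes each entry in `meta` is a top-level key.
    """
    frames = []
    for rec in records:
        try:
            frame = pd.json_normalize(rec, record_path=record_path, meta=meta)
        except KeyError:
            # record_path is absent for this record: emit a single row
            # holding just the meta values.
            frame = pd.DataFrame([{key: rec.get(key) for key in meta}])
        frames.append(frame)
    # Columns missing from a frame are filled with NaN by concat.
    return pd.concat(frames, ignore_index=True, sort=False)


# Made-up sample data: the second record has no "data" key.
records = [
    {"id": "e030", "desc": "CPU", "data": [{"name": "load", "value": 0.7}]},
    {"id": "e031", "desc": "GPU"},
]
result = normalize_with_fallback(records, record_path="data", meta=["id", "desc"])
```

Normalizing record by record is slower than a single call over the whole list, but it keeps the error handling local to the records that actually lack the key.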
I see. Thanks for the suggestion!
The example below always generates an error
The
as I set @WillAyd Could you please explain the best way to handle this?
@WillAyd I disagree. This function is often used on a JSON HTTP response, and HTTP responses are often dynamic and exclude data. A simple way to please everyone would be adding a parameter to the function: "fail_on_missing_record", which defaults to True. If set to False, the normalizing function could create rows that only contain metadata for JSON objects where the record_path is missing.
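To make the proposed semantics concrete, here is a small pure-Python sketch. `fail_on_missing_record` is the parameter suggested in the comment above and is not an existing pandas argument; `flatten_records` and the sample data are likewise hypothetical, and the sketch only handles a single top-level record_path key and top-level meta keys.

```python
def flatten_records(data, record_path, meta, fail_on_missing_record=True):
    """Flatten obj[record_path] into rows, attaching the meta fields to each row.

    Hypothetical illustration of the proposed flag, not pandas code.
    """
    rows = []
    for obj in data:
        meta_values = {key: obj.get(key) for key in meta}
        if record_path not in obj:
            if fail_on_missing_record:
                # Default: same outcome as today, a missing key is an error.
                raise KeyError(record_path)
            # Proposed alternative: keep a row containing only the meta fields.
            rows.append(meta_values)
            continue
        for rec in obj[record_path]:
            rows.append({**rec, **meta_values})
    return rows


# Made-up sample data: the second object has no "data" key.
data = [
    {"id": "e030", "desc": "CPU", "data": [{"name": "load", "value": 0.7}]},
    {"id": "e031", "desc": "GPU"},
]
rows = flatten_records(data, "data", ["id", "desc"], fail_on_missing_record=False)
```

Passing `rows` to a DataFrame constructor would then leave the record columns empty for the second object, which is the outcome requested in this issue.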
Code Sample, a copy-pastable example if possible
Problem description
Suppose that some line of the JSON file has nothing to normalize under data but does have non-empty id and desc information. The function should be able to normalize that line to empty columns while keeping id and desc in the final result.
I am not sure whether this is what "ignore" means, but I don't think excluding such lines from the results is a good choice. I would recommend supplying empty columns if not all columns are missing, both for record_path and meta.