DEPR: Deprecate numpy argument in read_json #28512

Closed
WillAyd opened this issue Sep 18, 2019 · 6 comments · Fixed by #30636
Labels
Deprecate Functionality to remove in pandas good first issue IO JSON read_json, to_json, json_normalize

@WillAyd
Member

WillAyd commented Sep 18, 2019

I've never really been clear on the purpose of the numpy argument in read_json. Some digging brought me to #3876 (comment), where it is explained that this maintains some kind of ordering of elements. To illustrate, here is the only difference I could find:

>>> pd.read_json('[{"a": 1, "b": 2}, {"b":2, "a" :1}]',numpy=False, orient='records')
   a  b
0  1  2
1  1  2

>>> pd.read_json('[{"a": 1, "b": 2}, {"b":2, "a" :1}]',numpy=True, orient='records')
   a  b
0  1  2
1  2  1

I might be missing the point but I don't understand why this would be useful. Objects in JSON are by definition not ordered, so this is non-compliant and I think just plain confusing.

So I think good to deprecate unless anyone has objections.
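
For anyone following along, the ordering point can be reproduced without pandas. This is only a minimal sketch using the standard-library json module: the names `by_key` and `by_position` are made up for illustration, and the positional branch is an assumption about what numpy=True appeared to do in the output above, not a copy of the actual decoder.

```python
import json

payload = '[{"a": 1, "b": 2}, {"b": 2, "a": 1}]'
records = json.loads(payload)

# The two objects are equal as mappings: key order carries no meaning in JSON.
same_object = records[0] == records[1]

# Key-based alignment: look every value up by its column name.
columns = sorted(records[0])
by_key = [[rec[col] for col in columns] for rec in records]

# Position-based alignment (assumed behaviour): take values in the order
# they appear in the text and ignore the keys of later records.
by_position = [list(rec.values()) for rec in records]

print(same_object)
print(by_key)
print(by_position)
```

Key-based alignment gives identical rows for both records, while the positional version swaps the values in the second row, which matches the difference shown in the two read_json calls.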

@WillAyd WillAyd added IO JSON read_json, to_json, json_normalize Deprecate Functionality to remove in pandas labels Sep 18, 2019
@WillAyd
Member Author

WillAyd commented Sep 18, 2019

I should also note that there are no tests that I can see that actually test numpy=True making any kind of difference.

@WillAyd WillAyd added this to the Contributions Welcome milestone Sep 18, 2019
@lucaionescu

Hi, I'd like to work on this one. What would be the best way to go? Removing the argument from the function and adding a deprecation warning to the next release notes?

@WillAyd
Member Author

WillAyd commented Sep 19, 2019

You should be able to apply the deprecate_kwarg decorator to read_json:

def deprecate_kwarg(

And then update the tests in pandas.tests.io.json. Right now this argument is used for parametrization but doesn't actually change anything, so you could probably just remove the parametrization.

@WillAyd
Member Author

WillAyd commented Sep 19, 2019

Don't remove the argument yet, though; just add the warning.
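
As a rough illustration of "warn but keep the argument", here is a self-contained sketch. It does not use pandas' own deprecate_kwarg: `warn_on_kwarg` and `read_json_stub` are hypothetical names, and the choice of FutureWarning and the message text are assumptions.

```python
import functools
import warnings


def warn_on_kwarg(name):
    """Hypothetical stand-in for a deprecation decorator: warn, but keep the keyword working."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Only warn when the caller passes the keyword explicitly.
            if name in kwargs:
                warnings.warn(
                    f"the {name!r} keyword is deprecated and will be "
                    "removed in a future version",
                    FutureWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@warn_on_kwarg("numpy")
def read_json_stub(text, numpy=False):
    # Placeholder body; the real read_json is not reproduced here.
    return text, numpy
```

Calling `read_json_stub("[]", numpy=True)` still returns normally but emits the warning, while `read_json_stub("[]")` stays silent, so existing callers keep working until the argument is actually removed.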

@alimcmaster1
Member

@WillAyd

According to user_guide/io.rst there are also some performance considerations.

Are we still good to deprecate do you think?

If ``numpy=True`` is passed to ``read_json`` an attempt will be made to sniff
an appropriate dtype during deserialization and to subsequently decode directly
to NumPy arrays, bypassing the need for intermediate Python objects.

This can provide speedups if you are deserialising a large amount of numeric
data:
   randfloats = np.random.uniform(-100, 1000, 10000)
   randfloats.shape = (1000, 10)
   dffloats = pd.DataFrame(randfloats, columns=list('ABCDEFGHIJ'))
   jsonfloats = dffloats.to_json()
   
   pd.read_json(jsonfloats) 
   pd.read_json(jsonfloats, numpy=True)

@WillAyd
Member Author

WillAyd commented Jan 3, 2020

Yea should still deprecate

@jreback jreback modified the milestones: Contributions Welcome, 1.0 Jan 9, 2020
4 participants