ENH: Replace skiprows with skip_rows to begin standardizing underscore usage in keyword arguments #22587
Haven't reviewed the change in detail but am rather neutral on the idea, maybe a slight -1. I'm a huge fan of consistency, though the thought of deprecating 12 parameters in one function alone adds a lot of code and headache from an end user perspective that I am not sure is worth it. Even the built in csv module has arguments like Let's see what others think. In the future you'd be better served opening an issue first for the discussion to be had before going in and coding, though it sounds like this exercise was partially to familiarize yourself with the code as well so no big deal |
I am sympathetic to the desire to maintain backwards compatibility, but the lack of consistency in syntax across pandas has gradually grown into a serious drawback. I have taught pandas to dozens of newbies across the country and I can testify from experience that small variations in the naming style of commonly used methods introduce unnecessary frustration, and even reduce user confidence in the quality of the overall product. As a frequent user of pandas, I can also attest that the inconsistencies require me, someone who uses the library daily, to routinely consult the documentation to ensure I use the proper kwarg naming style. All of this can be avoided by selecting a style and sticking to it, just as authors of professional documents outside of programming do. The deprecation warnings, if included, could be temporary, and ultimately removed in a future version, much in the way |
Agreed, @palewire! I know a lot of the common two-word arguments for |
I think this is a fantastic proposal! Pandas serves me greatly almost daily, but almost daily I also find myself consulting the documentation. Some of that may be attributed to my own shortcomings, but part of it is the lack of standardization in Pandas APIs. |
At the cost of internal complexity, there could be a path forward where both styles are accepted and, in a later release, removed. But I struggle to remember which arguments have underscores and which don't. Personally, I'd happily revise old code that needs to work with newer versions of Pandas if it meant consistency in the keyword arguments over the long haul. |
@gfyoung, will do. |
@palewire can you list all of the kwargs that you are attempting to change? I think some of them could be ok to actually deprecate and change (with a suitable deprecation period), but others are just not worth the effort and/or maybe just support both spellings. |
@gfyoung I believe we have an issue for this anyhow? |
I will file an issue shortly. Sorry for the delay.
On Fri, Sep 7, 2018, 9:17 AM gfyoung wrote:
> @jreback: No, we don't. The closest was #13349.
|
Closing this PR as stale. For any discussion, please move to the referenced issue. |
By my rough count, the `read_csv` method has nearly 50 keyword arguments. Of those, 32 arguments are made up of two or more words. Twenty of those multi-word arguments use an underscore to mark the space between words, like `skip_blank_lines` and `parse_dates`. Twelve do not, like `chunksize` and `lineterminator`.
It is my opinion this is a small flaw in pandas' API, and the library would benefit by standardizing how spaces are handled. It will make pandas more legible and consistent, and therefore easier for users of all experience levels.
Since the underscore method is more common and more legible, I propose it be adopted.
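A tally like the one above can be started mechanically. The sketch below is only an illustration: `split_by_underscore` and `fake_read_csv` are hypothetical names, and the stand-in signature lists just a handful of the arguments mentioned here, not the real `read_csv` signature. It separates parameter names by whether they contain an underscore; deciding which of the underscore-free names are actually multi-word (`chunksize`) rather than single-word (`sep`) still needs a manual pass.

```python
import inspect


def split_by_underscore(func):
    """Partition a callable's parameter names by whether they contain an underscore."""
    names = list(inspect.signature(func).parameters)
    with_underscore = [name for name in names if "_" in name]
    without_underscore = [name for name in names if "_" not in name]
    return with_underscore, without_underscore


# Hypothetical stand-in with a few of the arguments named in this PR.
# It is NOT the real read_csv signature.
def fake_read_csv(path, sep=",", skiprows=None, skip_blank_lines=True,
                  parse_dates=False, chunksize=None, lineterminator=None):
    pass


with_us, without_us = split_by_underscore(fake_read_csv)
print("with underscore:", with_us)
print("without underscore:", without_us)
```

With pandas installed, the same helper could presumably be pointed at `pandas.read_csv` instead of the stand-in.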
All existing arguments without an underscore will need to be modified. As a first salvo, I have attempted to change the `skiprows` kwarg to `read_csv` and its sibling methods to `skip_rows`. I have included an experimental deprecation warning that aims to continue to support the old argument for some interval into the future.
Due to my lack of expertise on pandas' internals, I expect this request includes some flaws that would need to be corrected before inclusion. However, I hope that the maintainers of the library will agree with the overall aims of this patch, which is to begin a process of introducing greater consistency to the style of keyword arguments.
If you do agree, I would be pleased to lead an effort to gradually standardize the inputs and put in the work to finish the job.
Thank you for your consideration.