-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
PERF: RangeIndex.format performance #35712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: RangeIndex.format performance #35712
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Do we have benchmarks for this? If not do we want to add them?
pandas/core/indexes/range.py
Outdated
@@ -187,6 +187,11 @@ def _format_data(self, name=None): | |||
# we are formatting thru the attributes | |||
return None | |||
|
|||
def _format_with_header(self, header, na_rep="NaN") -> List[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you type
bbb33d5
to
5ff29f3
Compare
looks good. this is worth a perf mention in 1.2 whatsnew. ping on green. |
The TravisCI issue is just a timeout issue, but In
The source code for this in # This can be removed when the ipython directive fails when there are errors,
# including the `tee sphinx.log` in te previous step (https://github.com/ipython/ipython/issues/11547)
- name: Check ipython directive errors
run: "! grep -B1 \"^<<<-------------------------------------------------------------------------$\" sphinx.log" I don't understand what this check is for or why my PR fails here. @TomAugspurger, you posted that mentioned ipython issue, do you mind looking at why my PR fails on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an exception in the whatsnew/0.25.0.rst file: https://github.com/pandas-dev/pandas/pull/35712/checks?check_run_id=1010393272#step:6:2305
Updated. |
lgtm. I think this is ok for 1.1.2 as a perf regresion, can you move the note? ping on green. |
c08d426
to
667b501
Compare
667b501
to
224b46e
Compare
Updated & green. |
pandas/core/indexes/range.py
Outdated
@@ -187,6 +187,15 @@ def _format_data(self, name=None): | |||
# we are formatting thru the attributes | |||
return None | |||
|
|||
def _format_with_header(self, header: List[str], na_rep: str = "NaN") -> List[str]: | |||
if len(self._range) == 0: | |||
return [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be return header
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
I added tests for this, and found that DateTimeIndex([]).format(name=True)
returns ["None"], where it should return [""]. Same for PeriodIndex. I fixed that bug.
224b46e
to
041d4df
Compare
@topper-123 merge conflicts in whatsnew |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @topper-123 lgtm pending green
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment, if you can address and merge on green.
pandas/core/indexes/range.py
Outdated
@@ -187,6 +187,15 @@ def _format_data(self, name=None): | |||
# we are formatting thru the attributes | |||
return None | |||
|
|||
def _format_with_header(self, header: List[str], na_rep: str = "NaN") -> List[str]: | |||
if len(self._range) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not len(self._range)
That merge brancg master has caused some strange error. anyone know how to fix it? |
Thanks, @simonjayhawkins, were there any git oneliners you used for this? |
I don't profess to know what was going on. resetting HEAD, merging master and then pulling from the branch seemed to fix except for the duplicate release note. |
comment addressed |
Thanks @topper-123 |
@meeseeksdev backport 1.1.x |
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon! If these instruction are inaccurate, feel free to suggest an improvement. |
Co-authored-by: Terji Petersen <contribute@tensortable.com>
#35440 dropped
RangeIndex._format_with_header
, which was functionally not needed, but needed to avoid creating an internal ndarray.This rectifies that + gives some perf. improvements.
Also, now the
_data
attribute isn't called, so this PR gives a perf. & memory improvement in some use cases compared to master: