REF: Special case NumericIndex._append_same_dtype() #17307
42acf07 to
b6c0f19
Compare
Codecov Report
@@ Coverage Diff @@
## master #17307 +/- ##
==========================================
- Coverage 91.03% 91.01% -0.02%
==========================================
Files 162 162
Lines 49567 49572 +5
==========================================
- Hits 45121 45117 -4
- Misses 4446 4455 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17307 +/- ##
==========================================
- Coverage 91.24% 91.23% -0.02%
==========================================
Files 163 163
Lines 50168 50176 +8
==========================================
Hits 45777 45777
- Misses 4391 4399 +8
Continue to review full report at Codecov.
|
|
looks fine |
|
@toobaz indeed append, but question was about asv benchmark, not tests |
Aha, OK, I had no idea of what "asv" meant. Looking into it. |
47fa09f to
1b00189
Compare
|
I guess the benchmarking part is ready... but I wasn't able to test it (by the way, can you confirm that the content of this page is obsolete?). Is it automatically run anywhere? |
|
@toobaz yeah, that page is out of date. New docs are here. I'm tempted to just remove that wiki page. Any objections? Results are run nightly and posted to http://pandas.pydata.org/speed/pandas/ (still setting this up) |
|
I removed the content of the page and put a link to the contributing docs for now. |
jorisvandenbossche
left a comment
small comment about the location of the benchmarks
asv_bench/benchmarks/index_concat.py
Outdated
from .pandas_vb_common import *


class Int64Indexing(object):
Can you put this one in index_object.py? And maybe call the class IndexOps or something (not 'indexing', as append has nothing to do with indexing)
asv_bench/benchmarks/index_concat.py
Outdated
    def setup(self):
        idx = Index(range(10))
        self.ridx = [idx] * 10
        self.iidx = [idx.astye(int)] * 10
astype
asv_bench/benchmarks/index_concat.py
Outdated
class Int64Indexing(object):
    goal_time = 0.2

    def setup(self):
make this bigger like 1000
@toobaz to see what size is best (without being able to run asv), just do the timing yourself with %timeit, and set the size so that it takes somewhere in the range of a few milliseconds to a few tens of milliseconds
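The sizing advice above can also be tried outside IPython with the standard-library `timeit` module. This is only a minimal sketch and not part of the PR: the helper names `best_ms` and `pick_size`, the candidate sizes, and the 2 to 50 ms window are assumptions made for illustration, and a plain list concatenation stands in for the `Index.append` call so that the snippet runs without pandas.

```python
import timeit


def best_ms(func, repeat=3, number=10):
    """Best-of-`repeat` time for one call of `func`, in milliseconds."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number * 1000.0


def pick_size(make_workload, sizes, lo_ms=2.0, hi_ms=50.0):
    """Return (size, ms) for the first size whose timing is in [lo_ms, hi_ms].

    If no size lands in the window, the last size tried is returned.
    """
    result = None
    for n in sizes:
        ms = best_ms(make_workload(n))
        result = (n, ms)
        if lo_ms <= ms <= hi_ms:
            break
    return result


def make_workload(n):
    # Stand-in workload (an assumption): concatenate n small lists.
    # With pandas one would build pieces = [Index(range(10))] * n and
    # time pieces[0].append(pieces[1:]) instead.
    pieces = [list(range(10))] * n

    def run():
        out = []
        for piece in pieces:
            out.extend(piece)
        return out

    return run


if __name__ == "__main__":
    size, ms = pick_size(make_workload, sizes=[10, 100, 1000, 10000, 100000])
    print("N = %d takes about %.2f ms per call" % (size, ms))
```

The measured times depend on the machine, so the size chosen on one system may need adjusting on another.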
bcc85e8 to
3827072
Compare
|
OK, the benchmarks now take ~5 ms. each. |
asv_bench/benchmarks/index_object.py
Outdated
        idx = Index(range(10))
        self.ridx = [idx] * N
        self.iidx = [idx.astype(int)] * N
        self.oidx = [idx.astype(str)] * N
can you post these results (commit before your recent fix of range index concat) to this commit
3827072 to
6324b86
Compare
|
Hello @toobaz! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on October 28, 2017 at 06:48 Hours UTC |
You mean here? (notice I changed the benchmarks so the
Before special-casing
In [19]: %timeit ridx[0].append(ridx[1:])
100 loops, best of 3: 10.3 ms per loop
In [20]: %timeit iidx[0].append(iidx[1:])
100 loops, best of 3: 9.5 ms per loop
In [21]: %timeit oidx[0].append(oidx[1:])
100 loops, best of 3: 10.6 ms per loop
Before special-casing
In [6]: %timeit ridx[0].append(ridx[1:])
100 loops, best of 3: 6.15 ms per loop
In [7]: %timeit iidx[0].append(iidx[1:])
100 loops, best of 3: 10.9 ms per loop
In [8]: %timeit oidx[0].append(oidx[1:])
100 loops, best of 3: 11.7 ms per loop
Now (with 6324b86):
In [6]: %timeit ridx[0].append(ridx[1:])
100 loops, best of 3: 6.08 ms per loop
In [7]: %timeit iidx[0].append(iidx[1:])
100 loops, best of 3: 9.71 ms per loop
In [8]: %timeit oidx[0].append(oidx[1:])
100 loops, best of 3: 11.7 ms per loop |
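To make the posted numbers easier to compare, the relative changes can be worked out directly. This is a small sketch using only the figures quoted in the comment; the labels "first" and "second" for the two earlier timing runs and the helper name `percent_faster` are assumptions, since the original run headings are cut off in the thread.

```python
# Timings in ms per loop, copied from the three %timeit runs in the comment.
# The labels "first" and "second" are assumed: the original headings are
# truncated in the thread.
timings = {
    "first": {"ridx": 10.3, "iidx": 9.5, "oidx": 10.6},
    "second": {"ridx": 6.15, "iidx": 10.9, "oidx": 11.7},
    "now": {"ridx": 6.08, "iidx": 9.71, "oidx": 11.7},
}


def percent_faster(before, after):
    """Relative time saved going from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    for name in ("ridx", "iidx", "oidx"):
        change = percent_faster(timings["second"][name], timings["now"][name])
        print("%s: %+.1f%%" % (name, change))
```

Between the second run and the current commit this gives roughly 11% for iidx, roughly 1% for ridx, and no change for oidx. These are single %timeit runs, so small differences may be within noise.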
6324b86 to
b941cce
Compare
asv_bench/benchmarks/index_object.py
Outdated
    goal_time = 0.2

    def setup(self):
        N = 1000
make this 10000
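For readers piecing the quoted diff fragments together, a complete asv benchmark class along these lines might look as follows. This is a hedged sketch, not the code that was merged: the class name `IndexOps` comes from the reviewer's suggestion, the `time_*` method names are assumed, and the timed statements are taken from the %timeit commands posted earlier in the thread.

```python
from pandas import Index


class IndexOps(object):
    # Class name follows the reviewer's suggestion; method names are assumed.
    goal_time = 0.2

    def setup(self):
        N = 1000  # value from the quoted diff; the review asks for 10000
        idx = Index(range(10))
        # N references to the same small index, one list per dtype
        self.ridx = [idx] * N              # built from a range
        self.iidx = [idx.astype(int)] * N  # integer index
        self.oidx = [idx.astype(str)] * N  # index of strings

    # asv times every method whose name starts with "time_".
    def time_append_range_list(self):
        self.ridx[0].append(self.ridx[1:])

    def time_append_int_list(self):
        self.iidx[0].append(self.iidx[1:])

    def time_append_obj_list(self):
        self.oidx[0].append(self.oidx[1:])
```

Because asv calls `setup` before timing, the cost of building the lists is excluded from the measurement and only the `append` calls are timed.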
|
@toobaz I wanted to see the asv results for these benchmarks; and I don't care about the internal benchmarks (a |
b941cce to
c45d2a5
Compare
OK, when I find the time to set up asv.
Not following you here... indeed I'm testing only public methods. |
|
asv_bench/benchmarks/index_concat.py is included but no changes. |
that's a leftover that should be removed, but the actual benchmarks are in the other file and time |
c45d2a5 to
fb05c39
Compare
oops fixed, sorry |
11b3172 to
9f69db2
Compare
|
@jreback @jorisvandenbossche |
9f69db2 to
795305b
Compare
795305b to
3123fcf
Compare
|
Rebased and ran asv: |
|
IIRC was the purpose of this PR to make code more consistent; small perf bump as a result? |
|
The patch avoids an int -> object -> int roundtrip by not relying on |
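The sentence above is cut off in the thread, so the exact code path the patch bypasses is not shown here. As a rough illustration of why an int -> object -> int roundtrip costs time, the following NumPy sketch compares a concatenation that detours through object dtype with one that stays in int64. It is a stand-in under stated assumptions, not the pandas internals: the function names `concat_via_object` and `concat_direct` and the array sizes are invented for the example.

```python
import numpy as np


def concat_via_object(arrays):
    """Concatenate by casting each piece to object, then back to int64."""
    as_object = np.concatenate([a.astype(object) for a in arrays])
    return as_object.astype(np.int64)


def concat_direct(arrays):
    """Concatenate int64 pieces without leaving the integer dtype."""
    return np.concatenate(arrays)


if __name__ == "__main__":
    import timeit

    pieces = [np.arange(10, dtype=np.int64)] * 1000
    # Both routes must produce the same values.
    assert np.array_equal(concat_via_object(pieces), concat_direct(pieces))
    for func in (concat_via_object, concat_direct):
        secs = timeit.timeit(lambda: func(pieces), number=20) / 20
        print("%s: %.3f ms" % (func.__name__, secs * 1000))
```

Both functions return the same values; only the intermediate dtype differs, which is the kind of detour the comment describes avoiding.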
|
lgtm. pls rebase just to make sure. ping on green. |
3123fcf to
31c8a52
Compare
|
@jreback ping |
|
thanks! |
git diff master -u -- "*.py" | flake8 --diff
Simple patch which results in a modest speedup (~8%) of the following:
I don't know whether this is worth the extra code... just inquiring so that I can finalize #16236.