Continue porting period_helper to cython #19608

Merged (7 commits) on Feb 10, 2018

jbrockmendel
Member

This moves functions to Cython but retains their C names to make comparison easier. The next pass will get rid of the camelCase. A couple of redundant functions are left behind in period_helper and will be cleaned up in a follow-up.

The next hurdle is to get rid of all the adding and subtracting of ORD_OFFSET, WEEK_OFFSET, and BDAY_OFFSET. (The period_helper code puts 0 at Jan 1 0001 instead of Jan 1 1970.) This would be straightforward, but I want to be careful about C division/mod conventions.
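
To illustrate the division/mod concern: C99 integer division truncates toward zero, while Python's `//` floors, so the two disagree on negative operands. (Cython follows the Python convention by default unless the `cdivision` directive is enabled.) Below is a minimal sketch, not the pandas implementation. `c_div`, `c_mod`, and `EPOCH_SHIFT` are illustrative names that do not appear in period_helper, and `EPOCH_SHIFT` is only assumed to play the role of a large positive offset such as ORD_OFFSET.

```python
from datetime import date


def c_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as C99 does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def c_mod(a: int, b: int) -> int:
    """Remainder matching c_div; it takes the sign of the dividend."""
    return a - b * c_div(a, b)


# Assumption: a stand-in for a large positive epoch shift. This is Python's
# proleptic Gregorian ordinal of 1970-01-01 and may differ from the actual
# ORD_OFFSET constant in period_helper.
EPOCH_SHIFT = date(1970, 1, 1).toordinal()

day = -1  # one day before the 1970 epoch

# Without the shift, the two conventions disagree on negative ordinals.
print(c_div(day, 7), day // 7)  # 0 -1
print(c_mod(day, 7), day % 7)   # -1 6

# With the shift the operand is positive, so both conventions agree.
shifted = day + EPOCH_SHIFT
print(c_div(shifted, 7) == shifted // 7)  # True
print(c_mod(shifted, 7) == shifted % 7)   # True
```

Once the offsets are dropped, negative ordinals reach the division directly, which is where the choice of convention starts to matter.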

codecov bot commented Feb 9, 2018

Codecov Report

Merging #19608 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19608      +/-   ##
==========================================
- Coverage   91.62%   91.61%   -0.01%     
==========================================
  Files         150      150              
  Lines       48790    48790              
==========================================
- Hits        44703    44701       -2     
- Misses       4087     4089       +2

Flag        Coverage Δ
#multiple   89.99% <ø> (-0.01%) ⬇️
#single     41.73% <ø> (ø) ⬆️

Impacted Files           Coverage Δ
pandas/util/testing.py   83.64% <0%> (-0.21%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b835127...769feee.

jreback added the "Period" (Period data type) and "Clean" labels Feb 10, 2018
Contributor

jreback left a comment

can you run perf on periods

int64_t absdate, double abstime) nogil:
"""
Set the instance's value using the given date and time.
Assumes GREGORIAN_CALENDAR.
Contributor

ideally add some full doc-strings on these which list the args and types

Member Author

Will do. Mind doing those in the next round to avoid clogging the CI?

Contributor

that's fine

@jbrockmendel
Member Author

taskset 5 asv continuous -f 1.1 -E virtualenv master HEAD -b period -b period -b timeseries
[...]
   before     after       ratio
  [b8351277] [769feee1]
+   19.37μs    23.65μs      1.22  period.PeriodUnaryMethods.time_now('M')
+    2.01ms     2.40ms      1.19  timeseries.ResampleSeries.time_resample('period', '5min', 'mean')
+  133.42ms   154.58ms      1.16  timeseries.DatetimeIndex.time_to_pydatetime('tz_aware')
+    7.40μs     8.44μs      1.14  timeseries.AsOf.time_asof_single_early('Series')
+    2.38ms     2.71ms      1.14  timeseries.ResampleSeries.time_resample('period', '5min', 'ohlc')
+    4.27μs     4.79μs      1.12  timeseries.DatetimeIndex.time_get('dst')
+    2.46ms     2.76ms      1.12  timeseries.ToDatetimeCache.time_dup_seconds_and_unit(False)
-    4.53μs     4.04μs      0.89  timeseries.DatetimeIndex.time_get('tz_naive')
-   71.76μs    63.44μs      0.88  period.PeriodProperties.time_property('M', 'start_time')
-    3.16μs     2.72μs      0.86  timeseries.DatetimeIndex.time_get('repeated')

    before     after       ratio
  [b8351277] [769feee1]
+   20.97ms    25.91ms      1.24  timeseries.IrregularOps.time_add
+  886.50ns     1.01μs      1.14  period.PeriodProperties.time_property('min', 'dayofweek')
+   18.87μs    20.96μs      1.11  period.PeriodUnaryMethods.time_asfreq('M')
-   19.24μs    17.26μs      0.90  timeseries.AsOf.time_asof_single('Series')
-    4.06ms     3.64ms      0.90  timeseries.ToDatetimeISO8601.time_iso8601_nosep
-  962.36ns   852.72ns      0.89  period.PeriodProperties.time_property('min', 'dayofyear')
-  153.68ms   128.59ms      0.84  timeseries.DatetimeIndex.time_to_pydatetime('tz_aware')
-   24.75μs    20.10μs      0.81  period.PeriodUnaryMethods.time_now('M')

jreback added this to the 0.23.0 milestone Feb 10, 2018
jreback merged commit 507a2a2 into pandas-dev:master Feb 10, 2018
jbrockmendel deleted the phelper12 branch February 11, 2018 21:36
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
Labels
Clean Period Period data type
2 participants