BUG: pd.period_range ignores multiple of start frequency #47465

Closed
3 tasks done
jaheba opened this issue Jun 22, 2022 · 3 comments
Labels
Bug, Period

Comments


jaheba commented Jun 22, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

start = pd.Period("2020", freq="3D")

a = pd.period_range(start, periods=2, freq=start.freq)
assert a[1] == start + 1

# Fails
b = pd.period_range(start, periods=2)
assert b[1] == start + 1

Issue Description

The issue lies in _get_ordinal_range:

def _get_ordinal_range(start, end, periods, freq, mult=1):
    if com.count_not_none(start, end, periods) != 2:
        raise ValueError(
            "Of the three parameters: start, end, and periods, "
            "exactly two must be specified"
        )
    if freq is not None:
        freq = to_offset(freq)
        mult = freq.n
    if start is not None:
        start = Period(start, freq)
    if end is not None:
        end = Period(end, freq)
    is_start_per = isinstance(start, Period)
    is_end_per = isinstance(end, Period)
    if is_start_per and is_end_per and start.freq != end.freq:
        raise ValueError("start and end must have same freq")
    if start is NaT or end is NaT:
        raise ValueError("start and end must not be NaT")
    if freq is None:
        if is_start_per:
            freq = start.freq
        elif is_end_per:
            freq = end.freq
        else:  # pragma: no cover
            raise ValueError("Could not infer freq from start/end")
    if periods is not None:
        periods = periods * mult
        if start is None:
            data = np.arange(
                end.ordinal - periods + mult, end.ordinal + 1, mult, dtype=np.int64
            )
        else:
            data = np.arange(
                start.ordinal, start.ordinal + periods, mult, dtype=np.int64
            )
    else:
        data = np.arange(start.ordinal, end.ordinal + 1, mult, dtype=np.int64)
    return data, freq

A fix could be to replace:

    if freq is None:
        if is_start_per:
            freq = start.freq
        elif is_end_per:
            freq = end.freq

with

    if freq is None:
        if is_start_per:
            freq = start.freq
            mult = freq.n
        elif is_end_per:
            freq = end.freq
            mult = freq.n
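
To make the effect of mult concrete, here is a minimal sketch of the ordinal arithmetic only. It is not the pandas implementation: ordinal_range is a hypothetical helper that works on plain integers and uses range in place of np.arange, and 18262 is assumed to be the day ordinal of 2020-01-01 (days since 1970-01-01).

```python
def ordinal_range(start=None, end=None, periods=None, mult=1):
    """Toy model of the arithmetic in _get_ordinal_range (plain ints, no pandas)."""
    if periods is not None:
        # Mirrors `periods = periods * mult` in the original function.
        span = periods * mult
        if start is None:
            # Anchored at `end`: step backwards in multiples of `mult`.
            return list(range(end - span + mult, end + 1, mult))
        # Anchored at `start`: step forwards in multiples of `mult`.
        return list(range(start, start + span, mult))
    return list(range(start, end + 1, mult))


START = 18262  # assumed day ordinal of 2020-01-01

# Current behaviour when freq is omitted: mult keeps its default of 1,
# so consecutive ordinals are one day apart even though the freq is 3D.
without_fix = ordinal_range(start=START, periods=2)

# With mult taken from the inferred freq (freq.n == 3 for "3D").
with_fix = ordinal_range(start=START, periods=2, mult=3)

# The end-anchored branch needs the same multiplier to land on the same values.
from_end = ordinal_range(end=START + 3, periods=2, mult=3)

print(without_fix, with_fix, from_end)
```

With mult=1 the second ordinal is one day after the start, and with mult=3 it is three days after, which is why the proposed change sets mult in both the start and end branches.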

Expected Behavior

Omitting the freq argument of pd.period_range should be the same as passing the freq of the start or end field to it via freq=.

Installed Versions

Not relevant to issue.

jaheba added the Bug and Needs Triage labels Jun 22, 2022
Member

simonjayhawkins commented Jul 5, 2022

Thanks @jaheba for the report.

Expected Behavior

Omitting the freq argument of pd.period_range should be the same as passing the freq of the start or end field to it via freq=.

From the docs:

freq: str or DateOffset, optional
Frequency alias. By default the freq is taken from start or end if those are Period objects. Otherwise, the default is "D" for daily frequency.

So this does indeed appear to be a bug.

start = pd.Period("2020", freq="3D")
b = pd.period_range(start, periods=2)
print(b)
# PeriodIndex(['2020-01-01', '2020-01-02'], dtype='period[3D]')
print(b.freq)
# <3 * Days>

Note: the PeriodIndex frequency is not the difference between any two consecutive elements; see #47227.

However, the result should be consistent with specifying the freq explicitly, as in the OP:

start = pd.Period("2020", freq="3D")
print(start.freq)
# <3 * Days>
print(pd.period_range(start, periods=2, freq=start.freq))
# PeriodIndex(['2020-01-01', '2020-01-04'], dtype='period[3D]')

Contributions and PRs welcome.

simonjayhawkins added this to the Contributions Welcome milestone Jul 5, 2022
simonjayhawkins added the Period label and removed the Needs Triage label Jul 5, 2022
Author

jaheba commented Jul 5, 2022

Thanks @simonjayhawkins -- I have submitted a PR: #47598

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@natmokval
Contributor

I think PR #53709 closes this issue.
If there are no objections, I will close it.
