[BUG] Rolling count aggregations produce different results than Pandas 1.0+ #5580

@brandon-b-miller

Describe the bug
Count aggregations on rolling windows do not match pandas behavior when using pandas 1.0+. This is due to changes in how pandas handles NaNs and in the conditions under which a non-NaN value is allowed to be produced for a given data window. More discussion in #4546.

Steps/Code to reproduce bug

>>> import cudf
>>> import pandas as pd
>>> cudf.Series.from_pandas(pd.Series([1,1,1,None])).rolling(2, min_periods=2, center=True).count()
0    null
1       2
2       2
3    null
dtype: int32

>>> pd.__version__
'1.0.3'
>>> pd.Series([1,1,1,None]).rolling(2, min_periods=2, center=True).count()
0    NaN
1    2.0
2    2.0
3    1.0
dtype: float64
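
One way to read the difference between the two outputs above is as two different gating rules for `min_periods`. The sketch below is pure Python and is not the implementation of either library. The helper `rolling_count` and its `gate_on_valid` flag are hypothetical names introduced only for illustration. It also assumes a trailing window and ignores `center`, which is a simplification that happens to reproduce both outputs for this input.

```python
def rolling_count(values, window, min_periods, gate_on_valid):
    """Count non-null entries in a trailing window (illustrative only).

    gate_on_valid=True:  emit None unless the window holds at least
                         min_periods non-null values.
    gate_on_valid=False: emit None unless the window holds at least
                         min_periods positions, null or not.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        win = values[start:i + 1]
        valid = sum(v is not None for v in win)
        # The quantity compared against min_periods is the only difference.
        observed = valid if gate_on_valid else len(win)
        out.append(valid if observed >= min_periods else None)
    return out


data = [1, 1, 1, None]

# Gating on non-null values reproduces the cuDF result shown above.
cudf_like = rolling_count(data, window=2, min_periods=2, gate_on_valid=True)

# Gating on window positions reproduces the pandas 1.0.3 result shown above.
pandas_like = rolling_count(data, window=2, min_periods=2, gate_on_valid=False)

print(cudf_like)    # [None, 2, 2, None]
print(pandas_like)  # [None, 2, 2, 1]
```

Under this reading, the only disagreement is the last row: the window there has two positions but only one non-null value, so whether a count is emitted depends on which of the two quantities is compared against `min_periods`.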

Expected behavior
Either matching behavior or an understanding of why we differ.

Environment overview

  • Environment location: Bare-metal
  • Method of cuDF install: Source

Environment details
cuDF 0.15

Additional context
pandas-dev/pandas#30923
pandas-dev/pandas#34466

Metadata

Assignees: No one assigned
Labels: Python (Affects Python cuDF API), bug (Something isn't working)
Status: Done