DataFrame.groupby() causes loss of precision on large integer values #5260

Closed
@nisaggarwal

Here's what I have in a DataFrame to start with:

In [298]: g
Out[298]: 
                     ts  level
33  1382016600011617669     32
34  1382016600011625872     32
44  1382016600013590377     43
45  1382016600013598606     43

Running groupby() changes my 'ts' values: the low-order digits lose precision. My guess is that somewhere internally this column is converted to np.float64 and then cast back to np.int64.

In [299]: g.groupby('level').last()
Out[299]: 
                        ts
level                     
32     1382016600011625984
43     1382016600013598720
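
The float64 guess can be checked without pandas. The sketch below is only an illustration of that hypothesis, not a trace of what groupby() does internally. It assumes Python's built-in float is the same 64-bit IEEE 754 double as np.float64, and the helper name `float_round_trip` is made up for this example. Values between 2**60 and 2**61 are spaced 256 apart in a double, so any int64 in that range that is not a multiple of 256 cannot survive the round trip.

```python
# 'ts' values copied from the DataFrame shown in this report.
original = [
    1382016600011617669,
    1382016600011625872,
    1382016600013590377,
    1382016600013598606,
]


def float_round_trip(value):
    """Cast an integer to a 64-bit float and back (hypothetical helper).

    This mimics the suspected int64 -> float64 -> int64 conversion.
    """
    return int(float(value))


round_tripped = [float_round_trip(v) for v in original]

for before, after in zip(original, round_tripped):
    # A non-zero difference means the value was not exactly representable.
    print(before, "->", after, "diff:", after - before)
```

The round-tripped results for the two rows that last() selects (index 34 and 45) are the same numbers that groupby() returned above, which is consistent with a float64 conversion somewhere along the way.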

Type information:

In [300]: g.dtypes
Out[300]: 
ts       int64
level    int64
dtype: object

Version info:

In [303]: pd.__version__
Out[303]: '0.12.0'

In [304]: np.__version__
Out[304]: '1.7.1'
