BUG: diff with multiple columns of type np.int8 fails #14773

Closed
simonm3 opened this issue Nov 30, 2016 · 4 comments · Fixed by #41493
Labels: good first issue, Needs Tests


simonm3 commented Nov 30, 2016

df = pd.DataFrame(np.random.randint(9, size=(4,5)), columns=list("abcde"))
df = df.astype(np.int8)
print(df.astype(int).diff())
print(df.a.diff())
df.diff()

diff works fine on a DataFrame with int columns, and on a single column of type np.int8. However, if there are multiple columns of type np.int8, it fails:

     a    b    c    d    e
0  NaN  NaN  NaN  NaN  NaN
1  6.0 -5.0 -7.0 -7.0 -1.0
2 -5.0  7.0 -1.0  5.0 -7.0
3  2.0 -6.0  8.0  0.0  3.0
0    NaN
1    6.0
2   -5.0
3    2.0
Name: a, dtype: float64
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-145-2724544318c3> in <module>()
      3 print(df.astype(int).diff())
      4 print(df.a.diff())
----> 5 df.diff()

C:\Users\s\Anaconda3\lib\site-packages\pandas\core\frame.py in diff(self, periods, axis)
   4068         """
   4069         bm_axis = self._get_block_manager_axis(axis)
-> 4070         new_data = self._data.diff(n=periods, axis=bm_axis)
   4071         return self._constructor(new_data)
   4072 

C:\Users\s\Anaconda3\lib\site-packages\pandas\core\internals.py in diff(self, **kwargs)
   3172 
   3173     def diff(self, **kwargs):
-> 3174         return self.apply('diff', **kwargs)
   3175 
   3176     def interpolate(self, **kwargs):

C:\Users\s\Anaconda3\lib\site-packages\pandas\core\internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3054 
   3055             kwargs['mgr'] = self
-> 3056             applied = getattr(b, f)(**kwargs)
   3057             result_blocks = _extend_blocks(applied, result_blocks)
   3058 

C:\Users\s\Anaconda3\lib\site-packages\pandas\core\internals.py in diff(self, n, axis, mgr)
   1040     def diff(self, n, axis=1, mgr=None):
   1041         """ return block for the diff of the values """
-> 1042         new_values = algos.diff(self.values, n, axis=axis)
   1043         return [self.make_block(values=new_values, fastpath=True)]
   1044 

C:\Users\s\Anaconda3\lib\site-packages\pandas\core\algorithms.py in diff(arr, n, axis)
   1196     if arr.ndim == 2 and arr.dtype.name in _diff_special:
   1197         f = _diff_special[arr.dtype.name]
-> 1198         f(arr, out_arr, n, axis)
   1199     else:
   1200         res_indexer = [slice(None)] * arr.ndim

pandas\src\algos_common_helper.pxi in pandas.algos.diff_2d_int8 (pandas\algos.c:62940)()

ValueError: Buffer dtype mismatch, expected 'float32_t' but got 'double'
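
For anyone hitting this on an affected version, the report itself suggests a workaround: `df.astype(int).diff()` succeeds where `df.diff()` on the int8 frame does not. Below is a minimal sketch of that idea, assuming it is acceptable to widen the columns before differencing. The seed and the choice of `np.int64` are illustrative assumptions, not part of the original report.

```python
import numpy as np
import pandas as pd

# Build a small all-int8 frame similar to the one in the report.
# The fixed seed is only here to make the sketch reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 9, size=(4, 5)), columns=list("abcde"))
df = df.astype(np.int8)

# Workaround sketch: upcast to a wider integer dtype before calling diff,
# mirroring the df.astype(int).diff() call that works in the report.
widened = df.astype(np.int64).diff()
print(widened)
```

The first row of the result is NaN because there is no previous row to subtract, which is why the output is a float dtype even though the inputs are integers.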

######################################################################
INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 28.7.1
Cython: 0.24
numpy: 1.11.2
scipy: 0.17.1
statsmodels: 0.8.0rc1
xarray: None
IPython: 4.2.0
sphinx: 1.3.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.0
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None
time: 5.55 s

@chris-b1 chris-b1 added the Bug label Nov 30, 2016
@chris-b1 chris-b1 changed the title from "diff with multiple columns of type np.int8 fails with error" to "BUG: diff with multiple columns of type np.int8 fails" Nov 30, 2016
@chris-b1 chris-b1 added this to the Next Major Release milestone Nov 30, 2016
@chris-b1
Contributor

Yep, looks buggy. A PR to fix it is welcome.

@chris-b1
Contributor

xref #4899

kidpixo commented Jan 19, 2021

Just found this bug in my code.

This happens for np.int8 and np.int16, but not for np.uint8, np.uint16, or np.int32.

pd.Series(data=[1,0,1], dtype=np.int8).diff()
pd.Series(data=[1,0,1], dtype=np.int16).diff()
# same output 
*** NotImplementedError
Traceback (most recent call last):
  File "[...]/lib/python3.8/site-packages/pandas/core/series.py", line 2438, in diff
    result = algorithms.diff(self.array, periods)
  File "[...]/lib/python3.8/site-packages/pandas/core/algorithms.py", line 2002, in diff
    algos.diff_2d(arr, out_arr, n, axis, datetimelike=is_timedelta)
  File "pandas/_libs/algos.pyx", line 1211, in pandas._libs.algos.diff_2d
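
To check which integer dtypes are affected on a given installation, a quick sweep along the lines of the two calls above can help. This is only a sketch: the helper name `diff_status` and the list of dtypes are assumptions made for illustration, and the outcome depends on the installed pandas version, so no expected output is shown.

```python
import numpy as np
import pandas as pd


def diff_status(dtype):
    """Return "ok" if Series.diff succeeds for this dtype, else the exception name."""
    s = pd.Series([1, 0, 1], dtype=dtype)
    try:
        s.diff()
    except Exception as exc:  # deliberately broad: we only want to report the failure
        return type(exc).__name__
    return "ok"


# Hypothetical selection of dtypes to probe; extend as needed.
dtypes = [np.int8, np.int16, np.int32, np.int64, np.uint8, np.uint16]
report = {np.dtype(dt).name: diff_status(dt) for dt in dtypes}

for name, status in report.items():
    print(f"{name}: {status}")
```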

@mroeschke
Member

This looks to work on master now. Could use a test.

In [32]: df = pd.DataFrame(np.random.randint(9, size=(4,5)), columns=list("abcde"))
    ...: df = df.astype(np.int8)

In [33]: df.diff()
Out[33]:
     a    b    c    d    e
0  NaN  NaN  NaN  NaN  NaN
1  3.0 -2.0  3.0  4.0 -6.0
2  3.0  4.0 -2.0 -4.0  2.0
3  0.0 -6.0  0.0 -1.0  6.0

In [34]: pd.__version__
Out[34]: '1.3.0.dev0+1485.g6abb567cb1'
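
A regression test for this could look roughly like the sketch below. It is not the test that was eventually merged; the test name and the fixed input values are hypothetical, chosen so the expected differences can be written out by hand. The dtype check is relaxed on purpose, since the float width of the result for small integer dtypes may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Fixed int8 input with several columns, the case that failed in the report.
VALUES = np.array(
    [[1, 5, 3], [7, 0, 4], [2, 7, 3], [4, 1, 9]],
    dtype=np.int8,
)


def test_diff_multiple_int8_columns():
    df = pd.DataFrame(VALUES, columns=list("abc"))

    result = df.diff()

    # Row-wise differences computed by hand; the first row has no predecessor.
    expected = pd.DataFrame(
        [
            [np.nan, np.nan, np.nan],
            [6.0, -5.0, 1.0],
            [-5.0, 7.0, -1.0],
            [2.0, -6.0, 6.0],
        ],
        columns=list("abc"),
    )

    # check_dtype=False: only the values are compared, not the float width.
    pd.testing.assert_frame_equal(result, expected, check_dtype=False)
```

Run under pytest, this fails on versions where `df.diff()` raises for multi-column int8 frames and passes where the behaviour shown above holds.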

@mroeschke mroeschke added the good first issue and Needs Tests labels and removed the Bug, Dtype Conversions, and Reshaping labels May 2, 2021
@mroeschke mroeschke modified the milestones: Contributions Welcome, 1.3 May 16, 2021