Skip to content

BUG: infinite recursion loop when inf interval bounds lead to inf pivot values in an IntervalTree #46658

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 tasks done
johannes-mueller opened this issue Apr 6, 2022 · 1 comment · Fixed by #46664
Closed
3 tasks done
Labels
Bug Interval Interval data type
Milestone

Comments

@johannes-mueller
Copy link
Contributor

johannes-mueller commented Apr 6, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

length = 51  # must be > 50

interval_index = pd.Index([pd.Interval(-np.inf, 0.0), pd.Interval(0.0, 1.0)] * length)

print(interval_index.get_indexer_for([pd.Interval(0.0, 1.0)]))

Issue Description

The above code runs into an infinite recursion loop. The reason for this is, that self.pivot in that case is set to -inf. Then the subset for the new_child_node() is the same set as for the parent node. That leads to an infinite recursion.

Observed output:

  File "pandas/_libs/intervaltree.pxi", line 447, in pandas._libs.interval.Float64ClosedRightIntervalNode.__init__
    if np.isinf(self.pivot):
  File "pandas/_libs/intervaltree.pxi", line 492, in pandas._libs.interval.Float64ClosedRightIntervalNode.new_child_node
    cdef new_child_node(self,

[ lots of repetitions ]

  File "pandas/_libs/intervaltree.pxi", line 441, in pandas._libs.interval.Float64ClosedRightIntervalNode.__init__
    self.indices = indices
  File "<__array_function__ internals>", line 5, in median
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3520, in median
    r, k = _ureduce(a, func=_median, axis=axis, out=out,
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3429, in _ureduce
    r = func(a, **kwargs)
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/lib/function_base.py", line 3576, in _median
    return np.lib.utils._median_nancheck(part, rout, axis, out)
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/lib/utils.py", line 1032, in _median_nancheck
    data = np.moveaxis(data, axis, -1)
  File "<__array_function__ internals>", line 5, in moveaxis
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/core/numeric.py", line 1428, in moveaxis
    source = normalize_axis_tuple(source, a.ndim, 'source')
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/core/numeric.py", line 1358, in normalize_axis_tuple
    axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
  File "/home/jmu3si/Devel/pylife/.venv/lib/python3.9/site-packages/numpy/core/numeric.py", line 1358, in <listcomp>
    axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
RecursionError: maximum recursion depth exceeded while calling a Python object

Expected Behavior

Expected output

[  1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31  33  35
  37  39  41  43  45  47  49  51  53  55  57  59  61  63  65  67  69  71
  73  75  77  79  81  83  85  87  89  91  93  95  97  99 101]

Installed Versions

NSTALLED VERSIONS ------------------ commit : 06d2301 python : 3.9.7.final.0 python-bits : 64 OS : Linux OS-release : 5.4.0-107-lowlatency Version : #121-Ubuntu SMP PREEMPT Thu Mar 24 16:45:08 UTC 2022 machine : x86_64 processor : x86_64 byteorder : little LC_ALL : None LANG : de_DE.UTF-8 LOCALE : de_DE.UTF-8

pandas : 1.4.1
numpy : 1.19.5
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.26
pytest : 6.2.5
hypothesis : 6.35.0
sphinx : 4.3.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 7.31.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : 2022.01.0
gcsfs : None
matplotlib : 3.5.1
numba : 0.55.0
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.20.2
xlrd : None
xlwt : None
zstandard : 0.16.0

@johannes-mueller johannes-mueller added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 6, 2022
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 6, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 6, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 6, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 7, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
@johannes-mueller
Copy link
Contributor Author

johannes-mueller commented Apr 7, 2022

I addressed this issue in a PR. I would probably need some help with the following questions

  • How to fix this error
    subset in this line is a int64_t array and originates from an Int64Vector. So I understand why the cast to the 32bit size_t is considered unsafe. But I don't understand why the failure occurs only in my tests.

  • Is the way I am calculating the pivot actually correct? It works for the cases that I had IRL. But I can still easily trigger the infinite recursion using the internal IntervalTree e.g. when left of an IntervalTree completely contains right like IntervalTree([-2., 2.] * 101, [-1, 1] * 101)

It would be helpful to have some discussion about that.

@jreback jreback added this to the 1.5 milestone Apr 9, 2022
@jreback jreback added Interval Interval data type and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 9, 2022
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 10, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 19, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Apr 26, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue May 31, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.
johannes-mueller added a commit to boschresearch/pandas that referenced this issue May 31, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.

Note that the tests are skipped on 32-bit systems (see pandas-dev#23440)
johannes-mueller added a commit to boschresearch/pylife that referenced this issue Jun 1, 2022
We use integer indices to address the interval index for the haigh diagram and
later replace it by the interval index.

Signed-off-by: Johannes Mueller <johannes.mueller4@de.bosch.com>
johannes-mueller added a commit to boschresearch/pylife that referenced this issue Jun 1, 2022
Merge in FMO/pylife from bugfix/workaround-pandas-46658 to develop

* commit '93bc0846c8202d4191ac625d321623ff3831f034':
  Work around pandas-dev/pandas#46658
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Jun 2, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.

Note that the tests are skipped on 32-bit systems (see pandas-dev#23440)
johannes-mueller added a commit to boschresearch/pandas that referenced this issue Jun 2, 2022
Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.

Note that the tests are skipped on 32-bit systems (see pandas-dev#23440)
mroeschke pushed a commit that referenced this issue Jun 6, 2022
…46664)

Attempts to fix #46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.

Note that the tests are skipped on 32-bit systems (see #23440)
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this issue Jul 13, 2022
…andas-dev#46664)

Attempts to fix pandas-dev#46658.

When the pivot of an IntervalTree becomes ±inf the construction of the
IntervalTree comes to an infinite loop recursion.  This patch tries to fix that
by catching those cases and set the pivot to a reasonable value.

Note that the tests are skipped on 32-bit systems (see pandas-dev#23440)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Interval Interval data type
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants