Calling qcut with too many duplicates now gives an informative error #9030

Closed
edjoesu wants to merge 3 commits

@edjoesu edjoesu commented Dec 6, 2014

Closes #7751.

@@ -165,6 +165,10 @@ def qcut(x, q, labels=None, retbins=False, precision=3):
     else:
         quantiles = q
     bins = algos.quantile(x, quantiles)
+    if len(algos.unique(bins)) < len(bins):
+        bins_sorted = np.sort(bins, axis=None)
Contributor

You can do this as a set operation, set(bins) - set(unique(bins)), once you know that you have too many bins.

Author

That doesn't work. Say bins = [0,0,0,1,1,1]. We want [0,1], but set(bins) - set(unique(bins)) gives set([]), because both sides hold the same distinct values.

Something like it should work, but I can't think of it at the moment.
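
For illustration only, here is a minimal sketch of one counting-based way to recover the repeated edges. It is not the code from this PR. The helper name `duplicated_edges` is hypothetical, and `sorted(set(bins))` is used as a stdlib stand-in for `unique(bins)` on the assumption that both yield the same distinct values.

```python
from collections import Counter


def duplicated_edges(bins):
    """Return the sorted distinct values that appear more than once in bins.

    Hypothetical helper for illustration; not part of pandas.
    """
    counts = Counter(bins)
    return sorted(value for value, n in counts.items() if n > 1)


bins = [0, 0, 0, 1, 1, 1]

# Stand-in for unique(bins): the distinct values, each listed once.
unique_bins = sorted(set(bins))

# Both sets contain exactly {0, 1}, so nothing is left after subtracting.
difference = set(bins) - set(unique_bins)

print(difference)              # set()
print(duplicated_edges(bins))  # [0, 1]
```

Counting occurrences keeps the multiplicity that converting to a set throws away, which is why it can report [0, 1] where the set difference cannot.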

@jreback added the Reshaping and Error Reporting labels Dec 7, 2014
Contributor

jreback commented Dec 7, 2014

  • needs a test
  • needs a release note (bug fix section)
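
As a rough idea of what such a test could look like, the sketch below checks that repeated input values produce duplicate quartile edges and that a uniqueness check rejects them. Everything here is an assumption made for illustration: `quartile_edges` and `check_unique_edges` are hypothetical helpers, `statistics.quantiles` stands in for `algos.quantile`, and the error text is made up. It does not call `qcut` itself.

```python
import statistics


def quartile_edges(values):
    """Hypothetical helper: min, the three quartile cut points, and max."""
    data = sorted(values)
    inner = statistics.quantiles(data, n=4, method="inclusive")
    return [data[0], *inner, data[-1]]


def check_unique_edges(edges):
    """Hypothetical check: raise ValueError if any bin edge is repeated."""
    if len(set(edges)) < len(edges):
        raise ValueError("Bin edges must be unique: %r" % (edges,))
    return edges


def test_repeated_values_raise():
    # Six zeros out of nine values push several quartiles onto the same edge.
    edges = quartile_edges([0, 0, 0, 0, 0, 0, 1, 2, 3])
    try:
        check_unique_edges(edges)
    except ValueError as err:
        assert "unique" in str(err)
    else:
        raise AssertionError("expected ValueError for duplicate edges")


def test_distinct_values_pass():
    # Evenly spread values give five distinct edges, so the check passes.
    edges = quartile_edges(range(9))
    assert check_unique_edges(edges) == edges


test_repeated_values_raise()
test_distinct_values_pass()
```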

@jreback added this to the 0.16.0 milestone Jan 2, 2015
Contributor

jreback commented Jan 2, 2015

pls move the release note to v0.16.0 and rebase. thx

@@ -164,7 +164,7 @@ Bug Fixes
- Bug in ``merge`` where ``how='left'`` and ``sort=False`` would not preserve left frame order (:issue:`7331`)
- Fix: The font size was only set on x axis if vertical or the y axis if horizontal. (:issue:`8765`)
- Fixed division by 0 when reading big csv files in python 3 (:issue:`8621`)

- Fixed an unclear error message in ``qcut`` when repeated values result in duplicate bin edges.
Contributor

pls move this to 0.16.0

Contributor

jreback commented Jan 18, 2015

pls move the release note to 0.16.0 and rebase

@jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
Contributor

jreback commented May 9, 2015

closing, pls reopen if/when updated

@jreback closed this May 9, 2015
Successfully merging this pull request may close these issues.

qcut() should make sure the bins bounderies are unique before passing them to _bins_to_cuts