Value error in histplot with binwidth smaller than half the data range #3646
Comments
Should probably just reject unless
Yeah, the parameter is documented as
Maybe "default to" implies that you can override it and should be "set |
When the data isn't known in advance, sometimes only 0 or 1 data points show up. Without binwidth set, the current behavior is an empty plot for 0 data points and a symmetric bin of width 1 for a single data point. With binwidth set, the code could behave similarly, using that binwidth. A plot showing something reasonable would be friendlier than an error message.
Well, my comment was a suggestion to also allow other integer binwidths for discrete data, shifting the edges by a half. (But maybe this would complicate things too much. Next, people will be asking for discrete units measured in fractions, e.g. 1/10ths.)
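To make the suggestion above concrete, here is a minimal sketch of how bin edges for integer-valued data could be shifted by a half while honoring an integer binwidth. The helper name discrete_bin_edges and its signature are hypothetical and not part of seaborn; it only illustrates the arithmetic.

```python
import math


def discrete_bin_edges(lo, hi, binwidth=1):
    """Hypothetical helper (not seaborn API): edges for integer data, shifted by 0.5."""
    if not isinstance(binwidth, int) or binwidth < 1:
        raise ValueError("binwidth must be a positive integer for discrete data")
    if hi < lo:
        raise ValueError("hi must be >= lo")
    # Number of bins needed so that every integer in [lo, hi] falls in some bin.
    n_bins = math.ceil((hi - lo + 1) / binwidth)
    # Start half a unit below the smallest value, so integers never sit on an edge.
    return [lo - 0.5 + i * binwidth for i in range(n_bins + 1)]


print(discrete_bin_edges(1, 10, 5))  # [0.5, 5.5, 10.5]
```

With edges built this way, every bin covers exactly binwidth consecutive integers, so no bin ends up holding one more distinct value than its neighbors.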
I got this error when all the data in the column of interest was the same, and was pretty confused as to the problem. After figuring it out, I made a MWE:
import seaborn as sns
import pandas as pd
sns.set()
data = pd.DataFrame({
'x': [2] * 10
})
# Works
sns.histplot(data=data, x='x', binwidth=1, discrete=True)
# Works
sns.histplot(data=data, x='x')
# Breaks
sns.histplot(data=data, x='x', binwidth=1)
It seems weird that sns.histplot would just break if you set a binwidth of 1 and the data all have the same value. If you set discrete=True or just don't set binwidth it does work, but the current error message of:
is not very interpretable. At the very least there should be a better error message, but I would think that it shouldn't break at all.
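Until this is handled inside the library, one possible workaround is to compute explicit bin edges yourself and pass them through the bins parameter, which accepts a sequence of break points. The sketch below is only an illustration: centered_bin_edges is a hypothetical helper, and centering the first bin on the smallest value is an assumption about what a sensible layout would be.

```python
def centered_bin_edges(values, binwidth):
    """Hypothetical helper: explicit edges that stay valid when all values are equal."""
    if binwidth <= 0:
        raise ValueError("binwidth must be positive")
    if not values:
        return []  # nothing to bin; the caller should skip plotting
    lo, hi = min(values), max(values)
    # Center the first bin on the smallest value, then extend until hi is covered.
    start = lo - binwidth / 2
    edges = [start]
    i = 0
    while edges[-1] <= hi:
        i += 1
        edges.append(start + i * binwidth)
    return edges


edges = centered_bin_edges([2] * 10, binwidth=1)
print(edges)  # [1.5, 2.5]
# Assumed usage with the MWE above:
# sns.histplot(data=data, x='x', bins=edges)
```

Because the edges are built from the requested width rather than from the data range, constant data still produces one well-formed bin.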
sns.histplot([1, 2, 3], binwidth=7) crashes with "ValueError: bins must be positive, when an integer". It is related to #2721, which is marked as solved, but the error also happens in the dev version.
The cause seems to be line 136 in counting.py with bins = int(round((stop - start) / binwidth)), setting it to 0 for small ranges. Changing this to bins = max(1, int(round((stop - start) / binwidth))) would probably solve it adequately.
(The code also has problems when binwidth=0 (division by zero) or negative (this makes bins negative, which causes numpy to protest)).
By the way, binwidth is ignored when discrete=True. As an example, sns.histplot(titanic[titanic['who'] == 'woman'], x='age', binwidth=5) has some bins with 5 and others with 6 ages. Of course, people who really care can provide their own custom bin edges.