Exception thrown when firstQuartile and thirdQuartile are equal in @vx/stats/computeStats #427
Comments
Yes, computeStats should handle error states such as the one above gracefully. It might take me a bit to get around to fixing this, but I will tag it with a help wanted label in the meantime. Happy to review PRs that come in. |
I'd be happy to send a PR. Do you have any recommendation or guideline on
how to treat it though? Or even an example of how such cases are handled in
other libraries?
On Tue, Feb 26, 2019 at 15:48, Harrison Shoff <notifications@github.com> wrote:
… Yes computeStats should handle error states such as above gracefully.
Might take me a bit to get around to fixing this, but will tag it with a help
wanted label in the mean time. Happy to review PRs that come in.
|
I also think that the median, firstQuartile and thirdQuartile are not computed correctly in computeStats. For example, the median of the array [...] computeStats instead would, in the first case, select 3 as the median. Something similar happens with firstQuartile and thirdQuartile. Unless there is a specific reason for this, I'd correct it as well. |
@conglei ^any thoughts on this computation? |
Thanks @mjsarfatti for pointing it out. Yes, it is a bug. And in terms of the corner case you mentioned, I agree that we should handle that case more carefully. |
Yes, the result should be like this, and as I remember it that should already be the case. Will double check. @mjsarfatti if you haven't started working on the fix PR, I'm happy to fix this issue next week if that is not too late for you. |
I haven't really started on it, so if you are tackling this anyway I'll leave it to you.
export default function computeStats(numericalArray) {
  // Sort a copy so the caller's array is not mutated.
  const points = [...numericalArray].sort((a, b) => a - b);
  const sampleSize = points.length;
  // Median of an already sorted array: the middle element for odd lengths,
  // the mean of the two middle elements for even lengths.
  const getMedian = dataSet => {
    if (dataSet.length % 2 === 1) {
      return dataSet[(dataSet.length - 1) / 2];
    }
    return (dataSet[dataSet.length / 2 - 1] + dataSet[dataSet.length / 2]) / 2;
  };
  const median = getMedian(points);
  // The quartiles are the medians of the lower and upper halves;
  // for odd sample sizes the overall median is left out of both halves.
  const lowerHalfLength = Math.floor(sampleSize / 2);
  const lowerHalf = points.slice(0, lowerHalfLength);
  const firstQuartile = getMedian(lowerHalf);
  const upperHalfLength = Math.ceil(sampleSize / 2);
  const upperHalf = points.slice(upperHalfLength);
  const thirdQuartile = getMedian(upperHalf);
  const IQR = thirdQuartile - firstQuartile;
  [...] |
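To see what the quartile logic in the snippet above yields on concrete input, here is a minimal self-contained sketch. The names `medianOfSorted` and `quartiles` are made up for this illustration and are not part of @vx/stats; the elided remainder of computeStats is not reproduced. It runs the same median-of-halves approach on the array reported in this issue and on an all-equal array.

```javascript
// Illustrative only: mirrors the median-of-halves logic from the snippet above.
// `medianOfSorted` and `quartiles` are hypothetical helper names.
function medianOfSorted(sorted) {
  const n = sorted.length;
  if (n % 2 === 1) {
    return sorted[(n - 1) / 2];
  }
  return (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
}

function quartiles(values) {
  // Sort a copy numerically; the default sort would compare as strings.
  const points = [...values].sort((a, b) => a - b);
  const n = points.length;
  const lowerHalf = points.slice(0, Math.floor(n / 2));
  const upperHalf = points.slice(Math.ceil(n / 2));
  const firstQuartile = medianOfSorted(lowerHalf);
  const thirdQuartile = medianOfSorted(upperHalf);
  return {
    median: medianOfSorted(points),
    firstQuartile,
    thirdQuartile,
    IQR: thirdQuartile - firstQuartile,
  };
}

// The array from this issue: median 10000, firstQuartile 6200,
// thirdQuartile 10000, IQR 3800.
console.log(quartiles([10000, 2400, 10000, 10000]));

// An all-equal array still gives IQR 0, so the degenerate case remains.
console.log(quartiles([10000, 10000, 10000, 10000]).IQR);
```

With this way of computing the quartiles, the reported array no longer produces an IQR of 0, but any input whose lower and upper halves share the same median still does, so the zero-IQR case has to be handled separately.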
In a way, it's a very uncommon case. On the other hand, it can happen fairly often if your dashboard lets users filter data freely.
My data array turned out as follows:
[10000, 2400, 10000, 10000]
If I feed it into computeStats, the function will compute:
firstQuartile: 10000
thirdQuartile: 10000
IQR: 0 (thirdQuartile - firstQuartile)
Now, IQR is used as a multiplier and as a divisor in several places, turning up 0's and NaN's that eventually just crash the program.
Should computeStats "exit gracefully", for example by returning an empty { boxPlot, binData } object and logging a readable console error stating that the stats provided do not allow a boxPlot to be computed?
I guess I can easily wrap the function call in a try/catch block for now, but I wouldn't rely on every developer out there testing their app with such a strange array. Their apps will crash in production once fed with real-world data...
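One possible shape for the "exit gracefully" idea is sketched below. This is only an assumption about how such a guard could look, not the library's actual behaviour: `guardBoxPlotStats` is a hypothetical helper, and the empty `boxPlot` and `binData` shapes are placeholders, since the real return structure of computeStats is not shown in this thread.

```javascript
// Hypothetical guard, not part of @vx/stats. It checks whether the quartiles
// allow an IQR-based box plot before any multiplication or division by IQR.
function guardBoxPlotStats({ firstQuartile, thirdQuartile }) {
  const IQR = thirdQuartile - firstQuartile;

  // IQR of 0 (equal quartiles) or a non-finite value (e.g. NaN from empty
  // input) would propagate zeros and NaNs into later computations.
  if (!Number.isFinite(IQR) || IQR === 0) {
    console.error(
      "computeStats: cannot compute a box plot, " +
        `firstQuartile (${firstQuartile}) and thirdQuartile (${thirdQuartile}) ` +
        "do not give a usable interquartile range."
    );
    // Placeholder "empty" result; the field shapes are assumptions.
    return { ok: false, result: { boxPlot: {}, binData: [] } };
  }

  return { ok: true, IQR };
}

// Usage with the values reported above: logs the error and returns ok: false.
const outcome = guardBoxPlotStats({ firstQuartile: 10000, thirdQuartile: 10000 });
console.log(outcome.ok);
```

Returning a flag next to the empty result lets the calling component decide whether to skip rendering or show a fallback, instead of relying on a try/catch around the call.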