Histogram events & bin hover label improvements #2113

alexcjohnson · 2017-10-23T18:02:38Z

Implements #2071 (histogram events get the input points selected) and #2086 (better display of data ranges for histogram bins) for both histogram and the 2d histogram types (histogram2d and histogram2dcontour)
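For readers consuming these events, here is a minimal sketch of pulling the input points out of the event data. It assumes each event point carries a `pointNumbers` array of indices into the trace's input data; that field name, the helper `collectPointNumbers` and the mock event are all assumptions for illustration, not something stated in this thread.

```javascript
// Hypothetical helper: gather the input-point indices carried by a hover or
// selection event. `pointNumbers` is an assumed field name.
function collectPointNumbers(eventData) {
    var out = [];
    (eventData.points || []).forEach(function(pt) {
        (pt.pointNumbers || []).forEach(function(i) { out.push(i); });
    });
    return out;
}

// Mock event shaped like the assumption above: one hovered bar that was
// built from input points 0, 3 and 4.
var mockEvent = {points: [{curveNumber: 0, pointNumbers: [0, 3, 4]}]};
console.log(collectPointNumbers(mockEvent));
```

On a live graph div this could presumably be attached with something like `gd.on('plotly_hover', function(d) { console.log(collectPointNumbers(d)); })`.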

alexcjohnson · 2017-10-23T18:08:44Z

src/lib/search.js

+// don't trust floating point equality - fraction of bin size to call
+// "on the line" and ensure that they go the right way specified by
+// linelow
+var roundingError = 1e-9;


I'm a little surprised that this hadn't come up before (other than the test below that was actually wrong before), and obviously it's a bit questionable exactly what small fraction to put here... but if we just use straight < vs <= etc it's pretty easy to trick findBin on the edges.
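To illustrate the edge problem, here is a rough sketch of a uniform-bin lookup with and without a tolerance. `findUniformBin` is a hypothetical simplification, not the actual `findBin` code; only the `roundingError` value and the `linelow` idea come from the diff above.

```javascript
// Tolerance as a fraction of bin size, value taken from the diff above.
var roundingError = 1e-9;

// Hypothetical simplified lookup for uniform bins {start, size}.
// linelow=true sends on-the-line values to the lower bin,
// linelow=false sends them to the upper bin.
function findUniformBin(val, bins, linelow) {
    var n = (val - bins.start) / bins.size;
    return linelow ?
        Math.ceil(n - roundingError) - 1 :
        Math.floor(n + roundingError);
}

var bins = {start: 0, size: 0.1};

// 0.3 / 0.1 evaluates to 2.9999999999999996 in double precision, so a bare
// Math.floor puts a value sitting exactly on the lower edge of bin 3 into bin 2.
var naive = Math.floor((0.3 - bins.start) / bins.size);
var upper = findUniformBin(0.3, bins, false);
var lower = findUniformBin(0.3, bins, true);
console.log(naive, upper, lower);
```

With the tolerance, the on-the-line value goes to bin 3 or bin 2 depending only on `linelow`, rather than on which way the division happened to round.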

alexcjohnson · 2017-10-23T18:16:39Z

src/components/fx/hover.js

    if(d.xLabelVal !== undefined) {
-        d.xLabel = ('xLabel' in d) ? d.xLabel : Axes.hoverLabelText(d.xa, d.xLabelVal);
+        d.xLabel = ('xLabel' in d) ? d.xLabel : getDimText(d, 'x');


@etpinard I made getDimText in parallel with you adding d.(x|y)Label and Axes.hoverLabelText - I suppose in principle we could use d.(x|y)Label for this too instead of adding d.(x|y)LabelVal(0|1), maybe that would actually be simpler... let me take a look.

Yeah, better to use d.(x|y)Label -> 51ad8c2

Much better in 51ad8c2

Thanks!

alexcjohnson · 2017-10-23T18:36:31Z

src/traces/histogram/calc.js

-                cdi.p1 = roundFn(binEdges[i + 1], true);
+
+            // pts and p0/p1 don't seem to make much sense for cumulative distributions
+            if(!cumulativeSpec.enabled) {


A questionable decision... this gets at some of the same issues that led to cumulative.currentbin, but in principle a CDF shows the sum of all the data prior to a specific point, not the data within that bin. Perhaps I should shift p0 and p1 depending on currentbin so that the hover label at least gets the description "all the data prior to X" precisely correct, even though it'll be a different value than the center of the bar...

And then what about pts? You could say it should be included and again should be "all the data prior to X" but then it would be meaningless to select a single bar and not all the bars before it. Alternatively pts could contain "all the data that was added in this bar" which might be more useful in terms of selecting a single bar, but it's not really what that bar means.
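A small sketch of the two candidate meanings of pts for a cumulative histogram, purely to make the trade-off concrete. The input shape `perBinPts` and the helper `cumulativePts` are hypothetical, and the cumulative version here includes the current bar, which is only one of the possible currentbin conventions.

```javascript
// perBinPts[i] lists the input-point indices that landed in bin i
// (hypothetical input shape).
function cumulativePts(perBinPts) {
    var running = [];
    return perBinPts.map(function(pts) {
        // concat returns a new array, so each bar gets its own list
        running = running.concat(pts);
        return running;
    });
}

var perBinPts = [[0, 2], [1], [], [3, 4]];

// Option 1: "all the data that was added in this bar"
var addedInBar = perBinPts;

// Option 2: "all the data up to and including this bar"
var priorToEdge = cumulativePts(perBinPts);

console.log(addedInBar[3], priorToEdge[3]);
```

Under option 2, selecting only the last bar already yields every point, which is the "meaningless to select a single bar" concern raised above.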

> so that the hover label at least gets the description "all the data prior to X"

Yeah. I think something along those lines would be useful down the road.

> You could say it should be included and again should be "all the data prior to X" but then it would be meaningless to select a single bar and not all the bars before it. Alternatively pts could contain "all the data that was added in this bar"

Sounds to me like both all-data-prior-to-X and data-added-in-this-bar point lists could be useful, so we might have to emit both in the future. Leaving this discussion for another time is 👌 with me. I doubt that most plotly.js users even know about cumulative histograms.

Great, I'll make a new issue to discuss how this should work for CDFs.

alexcjohnson · 2017-10-23T18:38:49Z

test/jasmine/tests/histogram_test.js

+
+        // too small of a right gap: stop disambiguating data at the edge
+        _test(0, 0.0009, [0, 1, 2], false, [0, 1, 1, 2]);
+        _test(0.1, 0.009, [115, 125, 135], false, [115, 125, 125, 135]);


@etpinard @chriddyp as discussed offline this morning - compare to lines 699 and 701 above that are just on the other side of the cutoff I implemented in ad0e08f - seem reasonable?

To be clear, these tests indicate that the hover labels for these bins will be 115 - 124.99 and 125 - 134.99 on the disambiguated side of the cutoff, and 115 - 125 and 125 - 135 on the too-many-digits-to-disambiguate side.

Looks good 👍

alexcjohnson · 2017-10-23T19:02:26Z

test/jasmine/tests/histogram_test.js

+            [cn1_00, Lib.dateTime2ms('2009-01-01', cn), cn1_10, Lib.dateTime2ms('2019-01-01', cn)]);
+        // occasionally we give extra precision for world dates (month when it should be year
+        // or day when it should be month). That's better than doing the opposite... not going
+        // to fix now, too many edge cases, better not to complicate the logic for them all.


The algorithm I came up with for figuring out how many digits to show is basically:

find the minimum distance from data points to bin edges from either side - this is just two numbers for the whole distribution

for two consecutive bins, find the largest digit that changes in both the left and right gaps (with a little shifting to account for rounding errors)

This algorithm is a little complicated to state, but it's quick to calculate and it's robust as far as I can see, on numeric, category, and gregorian date axes (though even in that case there is a weird edge case that confuses it). However, the larger variability of month and year lengths in world calendars causes it to sometimes think a digit doesn't change when in the actual data it does, and increasing the tolerance to catch this breaks some much more common cases.

I suspect we could fix this by looking at all bin edges independent of each other, but that's far more computationally intensive, and has some other potential weird pitfalls. I don't think providing a bit of extra precision in the label for a few world calendar cases justifies all that extra complexity.
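As a rough sketch of the first step only (the two gap numbers for the whole distribution), assuming uniform numeric bins. The helper name `edgeGaps` and its signature are made up for illustration, and the digit-comparison step is not attempted here.

```javascript
// Hypothetical helper: for uniform bins, return the smallest distance from any
// data point to the left edge of its bin and to the right edge of its bin.
function edgeGaps(data, start, size) {
    var leftGap = Infinity;
    var rightGap = Infinity;
    data.forEach(function(v) {
        var i = Math.floor((v - start) / size);
        var leftEdge = start + i * size;
        leftGap = Math.min(leftGap, v - leftEdge);
        rightGap = Math.min(rightGap, leftEdge + size - v);
    });
    return {leftGap: leftGap, rightGap: rightGap};
}

// Bins [1, 2), [2, 3), [3, 4): the closest any point comes to a left edge
// is 2.25 - 2, and the closest to a right edge is 4 - 3.75.
var gaps = edgeGaps([1.5, 2.25, 3.75], 1, 1);
console.log(gaps);
```

The point of keeping just these two numbers is that the later digit check stays cheap no matter how many data points there are.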

✨ algo. Thanks for stating it in detail.

Can you comment on what modifications (if any) your algo will need when we implement variable-sized bins #360?

The issue with dates is that the actual time increments - the "digits" we're looking to see if they have changed - have different sizes in terms of milliseconds. The bin sizes being different should not be a problem, though people could do things to thwart the algorithm, like having the first two bin edges on fairly round values but some subsequent edges at not-so-round values. The idea that two edges are enough to determine the biggest digit that always changes comes from the observation that if one bin is extra-round, its neighbor will not be. So to do this robustly with nonuniform bins, we could either:

Do this same algorithm on all bin edges. That's different from looking at all bin edges independently as I suggested above for world calendars, though that would solve this problem too - what I'm suggesting here is to calculate just one left-edge-gap and one right-edge-gap as we do now, but then test those gaps against all bin edges instead of the first two.

As a last resort, make a new attribute for folks to set their own rounding.

One common case mentioned in #360 is equally-sized bins on a log axis. There, the whole assumption of a single digit to round to is wrong, and we'll have to do something completely different or just give up on the idea of disambiguating the edges, and let the regular axis-formatting-for-hover system handle it unchanged.

alexcjohnson · 2017-10-23T20:11:23Z

src/plots/cartesian/axes.js

@@ -1221,7 +1221,7 @@ axes.hoverLabelText = function(ax, val) {
    var tx = axes.tickText(ax, ax.c2l(logOffScale ? -val : val), 'hover').text;

    if(logOffScale) {
-        return val === 0 ? '0' : '-' + tx;
+        return val === 0 ? '0' : MINUS_SIGN + tx;


This has been around for a while: a regular dash was used instead of the unicode minus. Tested in 3aa03f7#diff-846cf5b534aa0d22bdd1da2b43ac3cbaR517

I was wondering about that. Thanks 👌

alexcjohnson · 2017-10-23T20:20:12Z

test/jasmine/tests/hover_label_test.js

+                _hover(gd, 250, 200);
+                assertHoverLabelContent({
+                    nums: '3',
+                    axis: '3.3'


⚠️ another potentially controversial decision: whenever every bin contains a single unique value, label exactly that value, not a range, even if those single values are not at the same places within the bins. But as soon as you have any bin with two distinct values in it, we use the range logic.

Note: for histogram2d I applied this independently to each axis, not to each brick. So if one x brick has only the point (2, 3) and one above it has only the point (2.1, 4), this will make the x bins show ranges, because that one x bin has multiple values, even though each brick has unique values in it.
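A minimal sketch of that rule, under the assumption that we already have the raw values per bin. `everyBinHasOneValue` and `binLabels` are hypothetical names, and the range branch prints raw bin edges rather than the rounded range ends the real labels use.

```javascript
// binVals[i] holds the raw values that fell in bin i (hypothetical shape).
// True when no bin contains two distinct values.
function everyBinHasOneValue(binVals) {
    return binVals.every(function(vals) {
        return vals.every(function(v) { return v === vals[0]; });
    });
}

// Label every bin with its single value if ALL bins qualify; otherwise
// fall back to ranges for all bins. Empty bins get an empty label.
function binLabels(binVals, binEdges) {
    var single = everyBinHasOneValue(binVals);
    return binVals.map(function(vals, i) {
        if(!vals.length) return '';
        return single ?
            String(vals[0]) :
            binEdges[i] + ' - ' + binEdges[i + 1];
    });
}

console.log(binLabels([[3.3, 3.3], [4.1]], [3, 4, 5]));
console.log(binLabels([[3.3, 3.4], [4.1]], [3, 4, 5]));
```

For histogram2d, per the note above, the same check would run once over the x bins and once over the y bins rather than per brick.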

Nice touch here. I was worried that category histograms would get e.g. apples - lemons bin hover labels. 👌

That can happen - but only if the bins actually contain apples and lemons.

    Plotly.newPlot(gd, [{
        x: ['apples', 'apples', 'apples', 'lemons', 'lemons', 'peaches', 'peaches', 'pears'],
        type: 'histogram',
        xbins: {start: -0.5, end: 3.5, size: 2}
    }], {
        width: 400,
        height: 400
    });

It's possible to put as many categories as you want into one bin, and as with numbers we'll only show the first and last ones in the label. Never happens automatically since it's in general confusing, but if you want to do it we won't stop you!

> but if you want to do it we won't stop you!

Cool. Would you mind adding a test case for this situation?

> Cool. Would you mind adding a test case for this situation?

0331a16

etpinard

Looks great. I made one blocking comment (test plotly_hover event data).

Once merged, we should open a new issue or leave #2086 open to discuss what to do with cumulative histograms and variable-sized bins #360

etpinard · 2017-10-23T21:23:07Z

src/components/fx/hover.js

Much better in 51ad8c2

Thanks!

etpinard · 2017-10-23T21:23:38Z

src/traces/histogram/hover.js

+
+var barHover = require('../bar/hover');
+
+module.exports = function hoverPoints(pointData, xval, yval, hovermode) {


Very very nice.

etpinard · 2017-10-23T21:36:38Z

src/plots/cartesian/axes.js

I was wondering about that. Thanks 👌

etpinard · 2017-10-23T21:39:08Z

test/jasmine/tests/hover_label_test.js

+            .then(function() {
+                _hover(gd, 250, 200);
+                assertHoverLabelContent({
+                    nums: '\u22125', // unicode minus


Thanks for 🔒 ing this down!

etpinard · 2017-10-23T21:55:02Z

src/traces/histogram/calc.js

+                }
+                else {
+                    cdi.p0 = roundFn(binEdges[i]);
+                    cdi.p1 = roundFn(binEdges[i + 1], true);


❤️ ing that you were able to put all this new rounding logic in calc.

etpinard · 2017-10-24T13:50:01Z

test/jasmine/tests/histogram_test.js

+        // 15-day gap - still days
+        _test(0, 15 * day, [jan17, jan18, jan19], 'gregorian',
+            [jan17, jan18 - day, jan18, jan19 - day]);
+        // 28-day gap STILL gets days - in principle this might happen with data


thanks for writing this down

etpinard · 2017-10-24T13:59:50Z

test/jasmine/tests/histogram_test.js

✨ algo. Thanks for stating it in detail.

Can you comment on what modifications (if any) your algo will need when we implement variable-sized bins #360?

etpinard · 2017-10-24T14:03:17Z

test/jasmine/tests/hover_label_test.js

Nice touch here. I was worried that category histograms would get e.g. apples - lemons bin hover labels. 👌

etpinard · 2017-10-24T14:07:47Z

test/jasmine/tests/hover_label_test.js

+            })
+            .then(function() {
+                _hover(gd, 250, 200);
+                assertHoverLabelContent({


Would you mind checking that plotly_hover does emit the correct event data?

Good catch -> tested in 2696c1c

etpinard · 2017-10-24T14:14:55Z

src/traces/histogram/calc.js

> so that the hover label at least gets the description "all the data prior to X"

Yeah. I think something along those lines would be useful down the road.

> You could say it should be included and again should be "all the data prior to X" but then it would be meaningless to select a single bar and not all the bars before it. Alternatively pts could contain "all the data that was added in this bar"

Sounds to me like both all-data-prior-to-X and data-added-in-this-bar point lists could be useful, so we might have to emit both in the future. Leaving this discussion for another time is 👌 with me. I doubt that most plotly.js users even know about cumulative histograms.

etpinard · 2017-10-24T17:40:30Z

Nicely done 💃

alexcjohnson added 12 commits October 20, 2017 14:32

let timeit work with n=1

59b6463

give Lib.findBin a tiny buffer for on-the-line points

6dc9904

update histograms to include input point numbers in events and smart ranges in hover labels

ccf1a76

lint histogram2d calc

1b356d2

🌴 histogram2d calc

8f8add1

refactor bar/hist hover

7686096

make histogram hover wrap bar hover, rather than being conditionals inside it

factor out bin_label_vals so histogram2d can use it too

8d36d81

extend event data and bin label improvements to histogram2d

a7e2f32

🔪 TODOs we've done

6b74fbe

exclude CDFs from new histogram event & label features

b91d1cb

limit the number of extra digits we give to histogram bin labels

ad0e08f

test bin_label_vals

6cb4e71

alexcjohnson commented Oct 23, 2017

alexcjohnson added 3 commits October 23, 2017 15:07

use unicode minus in Axes.hoverLabelText

7581567

get rid of (x|y)LabelVal(0|1), in favor of (x|y)Label

51ad8c2

and move range-or-single-value logic into Axes.hoverLabelText

tests of histogram hover label range format, and log-negative format

3aa03f7

alexcjohnson commented Oct 23, 2017

alexcjohnson added status: reviewable feature something new labels Oct 23, 2017

etpinard added this to the v1.32.0 milestone Oct 24, 2017

etpinard suggested changes Oct 24, 2017

alexcjohnson mentioned this pull request Oct 24, 2017

Hover labels and event data for cumulative histograms #2115

Closed

alexcjohnson added 3 commits October 24, 2017 11:38

TODO -> comment

cc454dc

test event data for histogram

2696c1c

test case for category range histogram hover labels

0331a16

etpinard approved these changes Oct 24, 2017

alexcjohnson merged commit 9bfbabf into master Oct 24, 2017

alexcjohnson deleted the histogram-events branch October 24, 2017 17:55

This was referenced Oct 25, 2017

Histogram events #2071

Closed

display data range for histogram hover text #2086

Closed

extend zhoverformat to 2d histogram types #2127

Merged

chriddyp mentioned this pull request Jan 11, 2018

add support for new histogram event data plotly/dash-core-components#144

Merged

alexcjohnson mentioned this pull request Mar 7, 2018

Single-bar histograms don't respect manual bins specs when data lies in final bin #1229

Closed

etpinard mentioned this pull request Jun 29, 2018

Hovers on cumulative histograms #2738

Closed

This was referenced May 22, 2019

histogram2d: wrong hover labels #3872

Closed

Use correct index to lookup unique histogram2d y vals #3890

Merged

bklingen mentioned this pull request Jul 21, 2021

wrong range in hover info for basic histogram #5848

Open

