Custom controls for split menu #472

Closed · adrianmroz opened this issue Aug 19, 2019 · 14 comments

@adrianmroz (Collaborator) commented Aug 19, 2019

We need customization of the controls inside the split menu.

A few examples:

  1. For time splits on line/bar charts we shouldn't allow picking a sort (it should always be ascending on the dimension), and the limit should be set to None.
  2. Sometimes we want a custom set of available limits, per split or per datacube. Does it depend on the visualisation?

Food for thought: maybe we should limit the "visualisation resolution" behaviour? Right now Turnilo sometimes changes a lot under the user, and it happens on every change. If we picked good defaults at the start and limited the user's options, most of that behaviour could be deleted. It would be good to keep behaviour for correcting wrong decisions (removing a continuous split on the line chart, or adding a split on totals). The core UX of filter/split/measure should stay flexible; the menus inside could be customised.
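As food for the discussion, a minimal sketch of what such customization could look like. The shape is entirely hypothetical: `SplitMenuConfig`, `limits` and `sortLocked` are invented names, not Turnilo's actual configuration.

```typescript
// Hypothetical per-dimension split-menu configuration (not Turnilo's API).
interface SplitMenuConfig {
  // Limit values offered in the split menu; null stands for "None".
  limits: Array<number | null>;
  // When true, the sort control is hidden and fixed to ascending
  // on the split dimension (e.g. time splits on line/bar charts).
  sortLocked?: boolean;
}

// Time split on a line/bar chart: no sort picking, limit fixed to None.
const timeSplit: SplitMenuConfig = {
  limits: [null],
  sortLocked: true,
};

// A string dimension with a custom set of available limits.
const pagePathSplit: SplitMenuConfig = {
  limits: [5, 10, 25, 50, 100, 500],
};
```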

@adrianmroz (Collaborator, Author)

The Visualisation Resolution mechanism conflates a few responsibilities; a sketch of how they could be separated follows the list.

  1. A way of marking the current visualisation definition as unfit, or less fit than others. For example, when adding an additional split on Totals, we should mark Totals as unfit and switch to Table. That could and should run on every essence change; most of the time it would do nothing.

  2. Adjusting the parameters of the visualisation. For example, correcting sorts on the line chart or correcting limits with nested splits. This should never be needed: we shouldn't allow the user to change to incorrect params in the first place. And if the user changes the visualisation, we should provide good visualisation-specific defaults. For example, after changing to the line chart, we should fix the sort on the continuous dimension.
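A rough sketch of how the two responsibilities could be separated, assuming a heavily simplified essence model; `Essence`, `VisualizationManifest`, `evaluateFitness`, `applyDefaults` and `resolve` are hypothetical names, not Turnilo's actual API.

```typescript
type Fitness = "fits" | "unfit";

interface Essence {
  visualization: string;
  splits: { dimension: string; limit: number | null }[];
}

interface VisualizationManifest {
  name: string;
  // Responsibility 1: a cheap fitness check, run on every essence change.
  evaluateFitness(essence: Essence): Fitness;
  // Responsibility 2: visualization-specific defaults, applied only when
  // the user switches to this visualization (never behind their back).
  applyDefaults(essence: Essence): Essence;
}

const totals: VisualizationManifest = {
  name: "totals",
  evaluateFitness: e => (e.splits.length === 0 ? "fits" : "unfit"),
  applyDefaults: e => ({ ...e, splits: [] }),
};

const table: VisualizationManifest = {
  name: "table",
  evaluateFitness: () => "fits", // a table can render any split combination
  applyDefaults: e => e,
};

// Run on every essence change: if the current visualization became unfit
// (e.g. a split was added on Totals), switch to the first fitting one.
function resolve(essence: Essence, manifests: VisualizationManifest[]): Essence {
  const current = manifests.find(m => m.name === essence.visualization);
  if (current && current.evaluateFitness(essence) === "fits") return essence;
  const fallback = manifests.find(m => m.evaluateFitness(essence) === "fits");
  return fallback
    ? fallback.applyDefaults({ ...essence, visualization: fallback.name })
    : essence;
}
```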

@alexbusu (Contributor) commented Nov 7, 2019

@adrianmroz So, let's have customizable limits for split dimensions. 😄
Currently this value is hardcoded.
About the incorrect params that the user might change to: the developer is the one who sets the available limits, and it's his responsibility to set the right values; the user will only be shown the available options.
[screenshot: the available limit options in the split menu]

@alexbusu (Contributor) commented Nov 7, 2019

An (extended) idea is to configure limits for each level of splitting. E.g. for the 1st level one could allow [50, 100, 200] while for the 2nd and 3rd just [5, 10, 15].
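A tiny sketch of that idea, with invented names (`limitsPerSplitLevel`, `limitOptionsFor`):

```typescript
// Hypothetical limit options configured per nesting level; index 0 is the 1st split.
const limitsPerSplitLevel: number[][] = [
  [50, 100, 200], // 1st split
  [5, 10, 15],    // 2nd split
  [5, 10, 15],    // 3rd split
];

// Splits deeper than the configured levels fall back to the last entry.
function limitOptionsFor(level: number): number[] {
  return limitsPerSplitLevel[Math.min(level, limitsPerSplitLevel.length - 1)];
}
```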

@adrianmroz (Collaborator, Author)

> So, let's have customizable limits for split dimensions. 😄
> Currently this value is hardcoded.

Yup, that's the main driver for this issue :D

> About the incorrect params that the user might change to: the developer is the one who sets the available limits

That's a little bit harder: there are still some dynamic cases where the user can select something that is then immediately changed back. We need something better than that, and it depends on the visualisation and the current state. Hence this issue.

> An (extended) idea is to configure limits for each level of splitting. E.g. for the 1st level one could allow [50, 100, 200] while for the 2nd and 3rd just [5, 10, 15].

I see only one reason to show different limits on consecutive splits: the count of underlying Druid queries. We have that configured, so we could "estimate" the options. If you select 10 on the first split, feel free to select 10 on the next one. But if you pick 1000 on the first, we probably need to limit the second one to 5.
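A sketch of that estimation, assuming plywood issues one group-by per parent bucket for nested splits (the 1 + l1 + l1 * l2 + … pattern worked out later in this thread); `estimateQueryCount`, `availableLimitOptions` and the `maxQueries` budget are hypothetical names:

```typescript
// Queries plywood needs for nested splits: 1 for the first split,
// l1 for the second level, l1 * l2 for the third, and so on.
function estimateQueryCount(limits: number[]): number {
  let queries = 0;
  let parents = 1; // parent buckets feeding the current split level
  for (const limit of limits) {
    queries += parents; // one query per parent bucket
    parents *= limit;
  }
  return queries;
}

// Dynamic select boxes: offer only the limit options that keep
// the total query count within the configured budget.
function availableLimitOptions(
  chosenLimits: number[], // limits already picked on earlier splits
  options: number[],      // candidate limits for the next split
  maxQueries: number      // hypothetical query budget from the config
): number[] {
  return options.filter(o => estimateQueryCount([...chosenLimits, o]) <= maxQueries);
}
```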

@alexbusu (Contributor) commented Nov 7, 2019

> If you select 10 on the first split, feel free to select 10 on the next one. But if you pick 1000 on the first, we probably need to limit the second one to 5.

I'm not sure Druid cares that much about the output limit (well, a little bit in the case of TopN queries, yes, which is what a data split uses; but I saw a threshold value of 5000 hardcoded somewhere). The computation is made over the whole time range and the given filters anyway.

@adrianmroz (Collaborator, Author)

Oh boy, this is a very big issue for the plywood-Druid integration and how plywood generates group-bys for nested splits. We have a setting specifically for this: #205

@alexbusu (Contributor)

> Oh boy, this is a very big issue for the plywood-Druid integration and how plywood generates group-bys for nested splits. We have a setting specifically for this: #205

Can this be left to the developer to decide how much they would like to load the Druid cluster?

@adrianmroz (Collaborator, Author)

What do you mean by "the developer"?

Because of plywood internals, the number of Druid queries depends on what the user selects in the UI. If the user selects a state that yields more than the maximum number of queries (and it is easy to do so), plywood will return incomplete data and Turnilo can't know that. That would result in wrong data being presented, which defeats the purpose of such a tool.

So it is not a decision about load but about correctness.

And this should be decided by the administrator of the data source, in the config. I assume that's what you mean by developer?

@alexbusu (Contributor)

> An (extended) idea is to configure limits for each level of splitting. E.g. for the 1st level one could allow [50, 100, 200] while for the 2nd and 3rd just [5, 10, 15].

I'm talking about the existing functionality: one can now select the first [5, 10, 25, 50, 100] items on each split. A "bad" scenario would be 3 split dimensions with the last (max) value of 100 each.
I was suggesting making these values configurable on a nesting-level basis. For example, on the 1st level one (the dev) could set [10, 50, 100, 200, 500], while on the 2nd+ levels they could set the allowed values to [5, 10, 15]. Then, depending on how this tool (Turnilo) is used, in case Druid can't handle so many queries, these values can be adjusted to fit the business needs.

About the correctness, I can't tell how one could guarantee that, as I don't know how plywood works; I suppose it should throw when no data can be retrieved. Anyway, how can one be sure that in the 1st case I presented (3 split dimensions x 100 rows each) the result would be accurate? And how is this different from the 2nd scenario (1st dimension <= 500 rows, followed by N dimensions <= 15 rows)?

(I hope you're not trying to tell me that if I select 500 rows in the first split, then 500 queries will hit Druid for the second split dimension.)

What do you think?

@adrianmroz (Collaborator, Author)

> I'm talking about the existing functionality: one can now select the first [5, 10, 25, 50, 100] items on each split. A "bad" scenario would be 3 split dimensions with the last (max) value of 100 each.
> I was suggesting making these values configurable on a nesting-level basis. For example, on the 1st level one (the dev) could set [10, 50, 100, 200, 500], while on the 2nd+ levels they could set the allowed values to [5, 10, 15].

We would like to have configuration per dimension, not per split (like custom granularities). So user age would have custom limits and page path would have different limit options. Then we could have some mechanism that adjusts the possible values depending on the other splits' limit settings. And we're talking about limit values around 1000.

So let's say the dimension page path has limits [10, 50, 100, 500, 1000] and the dimension country has [10, 50, 100, 200, 500]. If you select one split, you can select the highest values. But if you select two, you can't select 1000 on the first split and 500 on the second. (A worked example follows below.)
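Using the hypothetical `estimateQueryCount` sketch from earlier in the thread and an assumed budget of `maxQueries = 1000`, the example works out like this:

```typescript
estimateQueryCount([1000]);      // 1    (a single split is always one query)
estimateQueryCount([100, 500]);  // 101  (within a budget of 1000)
estimateQueryCount([1000, 500]); // 1001 (over budget; the select boxes would rule this out)
estimateQueryCount([500, 15]);   // 501  (the case computed below)
```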

> About the correctness, I can't tell how one could guarantee that, as I don't know how plywood works; I suppose it should throw when no data can be retrieved.

The problem is that we need to guess when data didn't arrive. We need to calculate the number of queries and compare it with the maxQueries configuration value. And I think we should not let the user select incorrect values, instead of throwing errors: dynamic select boxes.

> Anyway, how can one be sure that in the 1st case I presented (3 split dimensions x 100 rows each) the result would be accurate?

We can estimate: that's 1 + 100 + 100^2 = 10101 queries.

> And how is this different from the 2nd scenario (1st dimension <= 500 rows, followed by N dimensions <= 15 rows)?

It is 1 + 500 + 500 * 15 = 8001 queries.

> (I hope you're not trying to tell me that if I select 500 rows in the first split, then 500 queries will hit Druid for the second split dimension.)

If you select limit 500 on the first split (an ordinal dimension) and a second split with a limit greater than one, plywood will send 501 queries.

We hadn't considered your idea of configuring limits per split. We'll run it by our users; maybe it could solve some issues!

@alexbusu (Contributor)

> If you select limit 500 on the first split (an ordinal dimension) and a second split with a limit greater than one, plywood will send 501 queries.

Yes, it actually makes sense, because of the limits on the 2nd+ split dimensions.

> I see only one reason to show different limits on consecutive splits: the count of underlying Druid queries. We have that configured, so we could "estimate" the options. If you select 10 on the first split, feel free to select 10 on the next one. But if you pick 1000 on the first, we probably need to limit the second one to 5.

Probably this would be the way to go.

Also, if #539 gets implemented, then these per-dimension limits would have no effect, since that query would be a groupBy and the limit applies to the whole result set (a single query). That "data-grid" feature would also satisfy our needs, since we want more results (~200 rows) for the first split only, with no further splits, and with a single groupBy query one can let the user input their own limit value :)

@adrianmroz (Collaborator, Author)

I wouldn't count on group-by being enforced for #539.

@oshermaf

Bringing this issue back: I'm curious whether any work has been done on it since the end of 2019? I'm mainly talking about customizable limits for split dimensions (or keeping the limit values hard-coded, but adding higher possible values).

@adrianmroz-allegro (Contributor)

Closed; the work will be done in #755 and #756.
