Custom controls for split menu #472
Comments
The Visualisation Resolution mechanism conflates a few responsibilities.
@adrianmroz So, let's have customizable limits for split dimensions. 😄
An (extended) idea is to configure limits for each level of splitting. E.g. for 1st level one can allow |
Yup, that's the main driver for this issue :D
That's a little bit harder - there are still some dynamic cases where the user can select something and it is immediately changed back. We need something better now. And it depends on the visualisation and the current state. Hence this issue.
I see only one reason to show different limits on consecutive splits - the count of underlying Druid queries. We have that configured, so we could "estimate" the options. If you select 10 on the first split, feel free to select 10 on the next one. But if you pick 1000 on the first, we probably need to limit the second one to 5.
I'm not sure Druid cares that much about the output limit (well, a little bit in the case of TopN queries, which is the case for a data split; but I saw a value of 5000 for the threshold, hardcoded somewhere). The computation is made on the whole time range and the given filters anyway.
Oh boy, it is a very big issue for the plywood-druid integration and for how plywood generates group-bys for nested splits. We have a setting specifically for this: #205
Can it be left to the developer to decide how much load they want to put on the Druid cluster?
What do you mean by the developer? Because of plywood internals, the number of Druid queries depends on what the user selects in the UI. If the user selects a state that yields more than max queries (and it is easy to do so), plywood would return incomplete data and Turnilo can't know that. That would result in wrong data being presented, and that defeats the purpose of such a tool. So it is not a decision about load but about correctness. And this should be decided by the administrator of the data source in the config. I assume that's what you mean by developer?
I'm talking about the existing functionality - one can now select first About the correctness, can't tell how one could guarantee that, as I don't know how plywood works, I suppose it should throw when no data can be retrieved. Anyway, how can one be sure that in 1st case I presented (3 split dimensions x 100 rows each) the result would be accurate? And how is this different than the 2nd scenario (1st dimension <=500 rows, followed by N dim. <= 15 rows). (I hope you're not trying to tell me that if I select 500 rows in first split, then 500 queries will hit Druid for the second split dimension). What do you think? |
We would like to have configuration per dimension, not per split (like custom granularity). So user age has custom limits and page path has different limit options. Then we could have some mechanism that adjusts the possible values depending on the limit settings of the other splits. And we’re talking about limit values around 1000. So let’s say dimension page path has limits [10, 50, 100, 500, 1000] and dimension country has [10, 50, 100, 200, 500]. If you select one split, you’re able to select the highest values. But if you select two, you can’t select 1000 on the first split and 500 on the second.
The problem is that we need to guess when data didn’t arrive. We need to calculate the number of queries and compare it with the maxQueries configuration value. And I think we should not let the user select incorrect values, instead of throwing errors. Dynamic select boxes.
We can estimate that’s 1 + 100 + 100^2 queries.
It is 1 + 500 + 500 * 15 queries.
If you select limit 500 on the first split (ordinal dimension) and a second split limit greater than one - plywood would send 501 queries. We didn’t consider your idea about configuring the limit per split. We'll run that by our users and maybe it could solve some issues!
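The estimates quoted in this thread (1 + 100 + 100^2, 1 + 500 + 500 * 15, and 501) all follow one pattern: one query for the first split, then one query per row of every earlier level. A minimal sketch of that worst-case estimate, and of how a "dynamic select box" could filter its options against it, might look like the following. The function names `estimateSplitQueries` and `allowedLimits` and the `maxQueries` value of 500 used in the example are assumptions for illustration only, not Turnilo or plywood APIs.

```typescript
// Worst-case estimate of split queries, assuming every split returns exactly
// `limit` rows and each row triggers one query for the next split.
// Hypothetical helper, not part of Turnilo or plywood.
function estimateSplitQueries(limits: number[]): number {
  let queries = 0;
  let rowsAtLevel = 1; // rows produced by the previous level (1 for the root)
  for (const limit of limits) {
    queries += rowsAtLevel; // one query per row of the previous level
    rowsAtLevel *= limit;
  }
  return queries;
}

// Limit options for the split at `index` that keep the estimate within
// `maxQueries`, given the limits currently chosen for all splits.
function allowedLimits(
  options: number[],
  currentLimits: number[],
  index: number,
  maxQueries: number
): number[] {
  return options.filter(option => {
    const candidate = currentLimits.map((limit, i) => (i === index ? option : limit));
    return estimateSplitQueries(candidate) <= maxQueries;
  });
}

console.log(estimateSplitQueries([100, 100, 100])); // 1 + 100 + 100^2 = 10101
console.log(estimateSplitQueries([500, 15, 15]));   // 1 + 500 + 500 * 15 = 8001
console.log(estimateSplitQueries([500, 5]));        // 1 + 500 = 501

// Page path as the first of two splits, with an assumed maxQueries of 500.
console.log(allowedLimits([10, 50, 100, 500, 1000], [1000, 500], 0, 500)); // [10, 50, 100]
```

Under this estimate the limit of the last split does not change the query count, so only the earlier splits would need their options restricted. Whether that matches plywood's actual behaviour in every case is not confirmed here.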
Yes, it makes sense actually, because of the limit for the 2nd+ split dimension.
Probably this would be the way to go. Also, if #539 gets implemented, then these limits, per dimension, would have no effect, since that query would be a |
I wouldn’t hope that we enforce group-by for #539. |
Bringing this issue back: I'm curious whether any work has been done on it since the end of 2019? I'm mainly talking about customizable limits for split dimensions (or keeping the limit values hard-coded, but adding higher possible values).
We need customization of the controls inside the Split menu.
A few examples:
Food for thought: Maybe we should limit the "visualisation resolution" behaviour? Right now, Turnilo sometimes changes a lot under the user, and it happens on every change. If we could pick good defaults at the start and limit the user's options, most of it could be deleted. It would be good to keep the behaviour for correcting wrong decisions (removing a continuous split on a line chart or adding a split on totals). The core UX of filter/split/measure should stay flexible. The menus inside could be customised.