Autogenerated Flux code does not use "aggregateWindow" #13775

nathanielc · 2019-05-03T19:23:32Z

Use the query builder to create a Flux query and add at least one function.

For example click:

telegraf - bucket
disk - measurement
used_percent - field
mean - function

This code is generated

from(bucket: "telegraf")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "cpu" or r._measurement == "disk")
  |> filter(fn: (r) => r._field == "used_percent")
  |> window(period: v.windowPeriod)
  |> mean()
  |> group(columns: ["_value", "_time", "_start", "_stop"], mode: "except")
  |> yield(name: "mean")

The above code can be simplified to use aggregateWindow like:

from(bucket: "telegraf")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "cpu" or r._measurement == "disk")
  |> filter(fn: (r) => r._field == "used_percent")
  |> aggregateWindow(every: v.windowPeriod, fn: mean)
  |> yield(name: "mean")

As users will likely learn Flux best by example I think it is important to introduce good concepts in the Flux code we generate. Using aggregateWindow is something we want users to learn to do. This way they do not have to mange the groups so directly.

Previously only aggregate functions like mean could be used with aggregateWindow while selector functions like max could not. This is no longer the case and so I think we should push users to learn aggregateWindow.

The text was updated successfully, but these errors were encountered:

chnn · 2019-05-06T17:33:07Z

@nathanielc There seems to be a special case when using median with aggregateWindow. This query:

from(bucket: "telegraf")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "mem")
  |> filter(fn: (r) => r._field == "active")
  |> aggregateWindow(every: 1m, fn: median)

returns this error for me:

error in evaluating AST while starting program: type error 5:6-5:56: missing required parameter ""column""

The query seems OK as far as I can tell. Is there a Flux query you would suggest we generate instead?

Maybe also relevant: before we had special handling of median in our generation code that inserted toFloat() before the median call:

from(bucket: "telegraf")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "mem")
  |> filter(fn: (r) => r._field == "active")
  |> window(period: v.windowPeriod)
  |> toFloat()
  |> median()
  |> group(columns: ["_value", "_time", "_start", "_stop"], mode: "except")
  |> yield(name: "median")

Without the value conversion, the following error is thrown:

unsupported aggregate column type int

If we can fix or work around that first error above, should we still be using toFloat when generating a query? Or should the median function itself be updated to accept any type of number?

Every other function I tested with aggregateWindow is working great, and I'm updating the code generation now.

nathanielc · 2019-05-06T18:44:14Z

@chnn The median function should work, so that is a bug on our side that we will fix.

As for the toFloat issue that is an interesting question. So far the design has been to only implement support for differing numeric types when the algorithm can take advantage of it. So for example the sum function supports both the float and int types and the return type is the same as the input type. For median the algorithm only works for floats so we only implemented it for floats and expected the users to handle converting their data. That decision means that the UI needs to be aware an inject toFloat when needed as the UI is taking on some of the role of the user.

Median is not the only function like this and as we add more functions these kinds of problems are going to arise. It seems like we will have a general problem of the UI constructing Flux scripts and the users data not being to correct data type based on the function they want to use. Do you think there is a general solution to this problem on the UI? Something like allowing the user to select type conversion functions? Or possibly the UI auto inserting them when it discovers a type mismatch?

nathanielc · 2019-05-06T18:47:36Z

See for fixing the median aggregateWindow issue influxdata/flux#1244

chnn · 2019-05-06T21:18:05Z

Do you think there is a general solution to this problem on the UI? Something like allowing the user to select type conversion functions? Or possibly the UI auto inserting them when it discovers a type mismatch?

We could certainly build features like that in the UI, and they seem generally useful to me for the query building experience as a whole. They would mesh well with some other updates we are making to the query builder at the moment.

For schema mismatches though, the approach that seems to most appealing to me is to attempt converting values to the expected type automatically, either via an explicit toFloat/etc. call or within Flux itself.

In the case of autogenerated queries, the conversion seems either necessary or harmless/idempotent.

In the case of a user writing queries in an interactive environment, it still seems desirable. It seems like the current UX is

Write + submit a query
Read the error describing the schema mismatch
Tweak the query to avoid it

With automatic value conversion it seems like this flow could be reduced to step 1 only in many cases.

I'm not sure I understand the full picture though—what would be lost by attempting to convert values for a user automatically in a case like this?

nathanielc · 2019-05-06T21:30:51Z

I'm not sure I understand the full picture though—what would be lost by attempting to convert values for a user automatically in a case like this?

In general it is not safe to convert between numeric types. Not all integers can be represented as floats, so if a user has really large ints in their data and they are autoconverted to floats, we destroy the accuracy of the results. This is the motivation for making users explicitly convert types because the assumption is that only the user knows the domain well enough to know if it is safe to convert between the types.

So concretely the median function doesn't support ints because the algorithm would just convert the ints to floats before doing anything, which is not a safe operation. In other words the functions are only implemented for numeric types for which it is safe to implement.

As for the user experience we hope to extend Flux's type system to understand these kinds of collisions, so the workflow could be similar to finding a syntax error in a Flux script, thus reducing the steps/time a user spends to find/fix the issue.

chnn · 2019-05-06T21:38:47Z

Gotcha, makes sense.

Seems appealing to automate some of the troubleshooting around this, along the lines of

possibly the UI auto inserting them when it discovers a type mismatch?

Or perhaps just suggesting the fix, clippy style.

I'll bring it up with @russorat. If we do something like this in the UI, it would be helpful to have a some more information in the schema mismatch error. Perhaps a specific error code and some information about the line/col in the user's Flux script where the error originates? Don't know how hard it is to track that information.

russorat added team/ui kind/enhancement labels May 3, 2019

chnn self-assigned this May 6, 2019

chnn mentioned this issue May 6, 2019

Generate more idiomatic Flux in query builder #13800

Merged

chnn mentioned this issue May 6, 2019

Use aggregateWindow when generating a Flux query with a median call #13806

Closed

nathanielc closed this as completed May 6, 2019

nathanielc reopened this May 6, 2019

chnn closed this as completed in #13800 May 6, 2019

chnn mentioned this issue May 8, 2019

Improve queries in system dashboard template #13847

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Autogenerated Flux code does not use "aggregateWindow" #13775

Autogenerated Flux code does not use "aggregateWindow" #13775

nathanielc commented May 3, 2019 •

edited

Loading

chnn commented May 6, 2019

nathanielc commented May 6, 2019

nathanielc commented May 6, 2019

chnn commented May 6, 2019

nathanielc commented May 6, 2019

chnn commented May 6, 2019

Autogenerated Flux code does not use "aggregateWindow" #13775

Autogenerated Flux code does not use "aggregateWindow" #13775

Comments

nathanielc commented May 3, 2019 • edited Loading

chnn commented May 6, 2019

nathanielc commented May 6, 2019

nathanielc commented May 6, 2019

chnn commented May 6, 2019

nathanielc commented May 6, 2019

chnn commented May 6, 2019

nathanielc commented May 3, 2019 •

edited

Loading