Skip to content

Improved error for expand wildcard rule #15004

@alamb

Description

@alamb

Is your feature request related to a problem or challenge?

Found by @mhilton upstream in InfluxDB IOx

Running this query, results in a confusing error message

create table foo(a int, b int, c timestamp) as values (1,2,'2025-01-01T12:01:02');

SELECT *
FROM foo
WHERE c >= NOW() - INTERVAL '1 hour'
GROUP BY a;
> SELECT *
FROM foo
WHERE c >= NOW() - INTERVAL '1 hour'
GROUP BY a;
expand_wildcard_rule
caused by
Schema error: No field named foo.b. Valid fields are foo.a.

If you explicitly list out the column (b instead of *) you get a better message:

> SELECT b
FROM foo
WHERE c >= NOW() - INTERVAL '1 hour'
GROUP BY a;
Error during planning: Projection references non-aggregate values: Expression foo.b could not be resolved from available columns: foo.a

Describe the solution you'd like

I would like a better error

Perhaps something like

While expanding wildcard, column "b" must appear in the GROUP BY clause or must be part of an aggregate function.

Bonus points for also improving the message Error during planning: Projection references non-aggregate values: Expression foo.b could not be resolved from available columns: foo.a

Describe alternatives you've considered

No response

Additional context

Here is what DuckDB reports

D SELECT *
  FROM foo
  WHERE c >= NOW() - INTERVAL '1 hour'
  GROUP BY a;
Binder Error:
column "b" must appear in the GROUP BY clause or must be part of an aggregate function.
Either add it to the GROUP BY list, or use "ANY_VALUE(b)" if the exact value of "b" is not important.

Here is what postgres says

postgres=# SELECT *
  FROM foo
  WHERE c >= NOW() - INTERVAL '1 hour'
  GROUP BY a;
ERROR:  column "foo.b" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT *

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions