@xiaohuanlin
Contributor

The group_by(Log.id) clause was redundant since Log.id is the primary key and doesn't affect the query results. This simplifies the query without changing functionality.

Fixes #53695
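
The redundancy described above can be checked with a minimal sketch. It uses Python's built-in `sqlite3` module rather than Airflow's SQLAlchemy models, and the `log` table here is a hypothetical stand-in with only a few columns, not the real schema. Grouping by a primary key puts each row in its own group, so the result should match the ungrouped query.

```python
import sqlite3

# Hypothetical stand-in for the Log table; only the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, dttm TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO log (dttm, event) VALUES (?, ?)",
    [
        ("2025-07-24T10:00:00", "trigger"),
        ("2025-07-24T10:05:00", "clear"),
        ("2025-07-24T10:05:00", "clear"),  # same values, but a distinct primary key
    ],
)

with_group_by = conn.execute(
    "SELECT id, dttm, event FROM log GROUP BY id ORDER BY id"
).fetchall()
without_group_by = conn.execute(
    "SELECT id, dttm, event FROM log ORDER BY id"
).fetchall()

# Each group holds exactly one row, so both queries return the same rows.
same_rows = with_group_by == without_group_by
print(same_rows)
```

SQLite is used only because it runs without any setup; it is more permissive about ungrouped columns than PostgreSQL, so this sketch shows the equal results but not the reported error.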



@boring-cyborg

boring-cyborg bot commented Jul 24, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Jul 24, 2025
Member

@pierrejeambrun pierrejeambrun left a comment

I can't reproduce the linked issue locally, but that group_by sure looks suspicious. Is any specific setup needed to reproduce the 500 error? I tried on MySQL, PostgreSQL and SQLite without success.

Member

@jason810496 jason810496 left a comment

Nice! Thanks for the PR and LGTM.

It would be nice to provide the steps to reproduce the error.

@potiuk
Member

potiuk commented Jul 27, 2025

Yeah. Unit tests would be great.

@xiaohuanlin
Contributor Author

If you want to reproduce this error in PostgreSQL, you'll need to use a version older than 9.1, since PostgreSQL 9.1 and later versions allow non-aggregated columns in SELECT if they are functionally dependent on the GROUP BY columns.

https://www.postgresql.org/message-id/20100807024409.35E3A7541D7@cvs.postgresql.org
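
As a rough illustration of the rule mentioned above, here is a simplified single-table model written as a Python function. It is a sketch under assumptions, not PostgreSQL's actual implementation: the function name `ungrouped_columns` and its parameters are hypothetical, and it only covers the case where the GROUP BY list contains the table's whole primary key.

```python
def ungrouped_columns(selected, group_by, primary_key):
    """Return the selected columns that this simplified check would reject.

    Assumption: a column is accepted if it is listed in GROUP BY, or if the
    GROUP BY list contains every column of the table's primary key.
    """
    grouped = set(group_by)
    pk = set(primary_key)
    if pk and pk <= grouped:
        # Every column of the table is functionally dependent on its primary key.
        return []
    return [col for col in selected if col not in grouped]


# Primary key known: grouping by id covers all other columns.
print(ungrouped_columns(["id", "dttm", "event"], ["id"], ["id"]))

# No primary key information (modelling the older behaviour): dttm and event are rejected.
print(ungrouped_columns(["id", "dttm", "event"], ["id"], []))
```

In this model, passing an empty `primary_key` stands in for a server or table where the functional dependency cannot be established, which is the situation that would produce a "must appear in the GROUP BY clause" style error.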

@potiuk
Member

potiuk commented Jul 27, 2025

So why do we need this change at all?

[Screenshot attached: 2025-07-27 at 22:57:17]

@xiaohuanlin
Contributor Author

I removed the GROUP BY clause because it’s not needed—there’s no aggregation in the query.
Keeping an unnecessary GROUP BY can slow down performance, especially when querying large event logs. Databases like PostgreSQL may still perform sorting or hashing even if the results don’t change.
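
One way to check a claim like this is to compare the query plans with and without the clause. The sketch below does that with Python's built-in `sqlite3` module and a hypothetical `log` table; the helper name `query_plan` is an assumption for illustration. SQLite's planner is not PostgreSQL's, so the printed plans say nothing about PostgreSQL itself, where `EXPLAIN` on the real query would be the relevant tool.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Hypothetical stand-in for the Log table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, dttm TEXT, event TEXT)")

plan_with_group_by = query_plan(conn, "SELECT id, dttm, event FROM log GROUP BY id")
plan_without_group_by = query_plan(conn, "SELECT id, dttm, event FROM log")

# Print both plans side by side; any extra sort or hash step would show up here.
print("with GROUP BY:   ", plan_with_group_by)
print("without GROUP BY:", plan_without_group_by)
```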

@pierrejeambrun pierrejeambrun added the backport-to-v3-1-test Mark PR with this label to backport to v3-1-test branch label Jul 28, 2025
@pierrejeambrun
Member

That should indeed be removed. Also, the reporter of the issue mentioned that he was using Postgres 15.12.

@pierrejeambrun pierrejeambrun merged commit 85d80cd into apache:main Jul 28, 2025
104 checks passed
@boring-cyborg

boring-cyborg bot commented Jul 28, 2025

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

github-actions bot pushed a commit that referenced this pull request Jul 28, 2025
…53733)

(cherry picked from commit 85d80cd)

Co-authored-by: xiaohuanlin <33860497+xiaohuanlin@users.noreply.github.com>
@github-actions

Backport successfully created: v3-0-test

Status Branch Result
v3-0-test PR Link

github-actions bot pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Jul 28, 2025
…pache#53733)

(cherry picked from commit 85d80cd)

Co-authored-by: xiaohuanlin <33860497+xiaohuanlin@users.noreply.github.com>
pierrejeambrun pushed a commit that referenced this pull request Jul 28, 2025
…53733) (#53807)

(cherry picked from commit 85d80cd)

Co-authored-by: xiaohuanlin <33860497+xiaohuanlin@users.noreply.github.com>
ferruzzi pushed a commit to aws-mwaa/upstream-to-airflow that referenced this pull request Aug 7, 2025
fweilun pushed a commit to fweilun/airflow that referenced this pull request Aug 11, 2025
Development

Successfully merging this pull request may close these issues.

column "log.dttm" must appear in the GROUP BY clause or be used in an aggregate function