@nguymin4
Contributor

@nguymin4 nguymin4 commented Apr 1, 2025

ConsumeFromTopicOperator documents that :param max_messages: defaults to None, which means the operator reads to the end of the topic.

However, when max_messages=None, the operator also logs a very misleading warning, e.g. max_batch_size (1000) > max_messages (True). Setting max_messages to 1000

This warning contradicts the documented behavior, and in fact max_messages is not set to max_batch_size either.
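A minimal sketch of why the message prints `max_messages (True)`, assuming the operator guards the warning with a direct comparison between `max_batch_size` and the stored `max_messages`. The function name `old_style_check` and the exact guard are hypothetical and only illustrate the Python behavior involved; they are not copied from the provider source.

```python
import logging

log = logging.getLogger("consume_sketch")


def old_style_check(max_messages, max_batch_size=1000):
    """Hypothetical stand-in for the operator's constructor logic.

    Returns the stored max_messages value and whether a warning was logged.
    """
    stored = max_messages or True  # None (or 0) collapses to True
    warned = False
    # Assumed guard: bool is a subclass of int, so True compares as 1
    # and 1000 > True evaluates to True.
    if stored and max_batch_size > stored:
        log.warning(
            "max_batch_size (%s) > max_messages (%s). Setting max_messages to %s",
            max_batch_size, stored, max_batch_size,
        )
        warned = True
    return stored, warned


print(old_style_check(None))   # (True, True)
print(old_style_check(5000))   # (5000, False)
```

With `max_messages=None` the comparison silently succeeds, so the warning fires even though the caller asked to read to the end of the topic.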



@boring-cyborg

boring-cyborg bot commented Apr 1, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@eladkal eladkal requested a review from jason810496 April 30, 2025 23:36
Member

@jason810496 jason810496 left a comment

Thanks for the PR.

IMHO, instead of removing the warning, we should fix this part:

self.max_messages = max_messages or True
self.max_batch_size = max_batch_size
self.poll_timeout = poll_timeout
if self.max_messages is True:
    self.read_to_end = True
else:
    self.read_to_end = False

by keeping the self.max_messages attribute as int or None instead of an ambiguous bool (we also have to check how self.max_messages is used further on and adapt that logic to the type change if needed).
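One way the suggestion above could look, as a sketch only: `max_messages` stays `int` or `None`, `read_to_end` is derived from `None`, and the warning is skipped when no limit was given. The helper name `normalize_limits` is hypothetical, and raising `max_messages` to `max_batch_size` is an assumption taken from the wording of the warning message, not from the merged change.

```python
from __future__ import annotations

import logging

log = logging.getLogger("consume_sketch")


def normalize_limits(
    max_messages: int | None, max_batch_size: int = 1000
) -> tuple[int | None, bool]:
    """Hypothetical helper: return (max_messages, read_to_end)."""
    # None now means "no limit" directly; no bool sentinel is needed.
    read_to_end = max_messages is None
    # Only compare when an explicit limit was supplied.
    if max_messages is not None and max_batch_size > max_messages:
        log.warning(
            "max_batch_size (%s) > max_messages (%s). Setting max_messages to %s",
            max_batch_size, max_messages, max_batch_size,
        )
        # Assumption: do what the message says and raise the limit.
        max_messages = max_batch_size
    return max_messages, read_to_end


print(normalize_limits(None))   # (None, True)
print(normalize_limits(10))     # (1000, False)
print(normalize_limits(5000))   # (5000, False)
```

Keeping the attribute typed as `int | None` means any later code that counts messages must handle `None` explicitly instead of relying on `True` behaving like a number.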

@nguymin4 nguymin4 force-pushed the nguymin4/remove-misleading-kafka-warning branch from 95d35ac to 95896f1 Compare May 3, 2025 07:01
@nguymin4 nguymin4 changed the title from "Remove misleading max_messages warning of Kafka ConsumeFromTopicOperator" to "Fix max_messages warning of Kafka ConsumeFromTopicOperator" May 3, 2025
@nguymin4
Contributor Author

nguymin4 commented May 3, 2025

@jason810496 Sounds OK to me if we want to go that way. I have now changed the max_messages logic so that it is no longer cast to bool.

@nguymin4 nguymin4 requested a review from jason810496 May 3, 2025 07:06
@nguymin4 nguymin4 force-pushed the nguymin4/remove-misleading-kafka-warning branch 5 times, most recently from 6b5042a to 0d67aaa Compare May 3, 2025 12:42
@nguymin4 nguymin4 force-pushed the nguymin4/remove-misleading-kafka-warning branch from 0d67aaa to 32356dd Compare May 3, 2025 12:43
Member

@jason810496 jason810496 left a comment

Thanks for the change!

@jason810496 jason810496 requested review from Lee-W and rawwar May 4, 2025 07:53
@Lee-W Lee-W merged commit 464da35 into apache:main May 6, 2025
64 checks passed
@boring-cyborg

boring-cyborg bot commented May 6, 2025

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

mvfc pushed a commit to mvfc/airflow that referenced this pull request May 6, 2025

* Fix Kafka consume operator max_messages warning

* Add tests for Kafka consume operator