@jedcunningham
Member

jedcunningham commented Aug 2, 2025

This PR syncs v3-0-stable with v3-0-test to release 3.0.4.

Release notes and version bumps added in

65c448e
4ff5f3a

Lzzz666 and others added 30 commits July 11, 2025 22:47
…ache#50371) (apache#52698)

* Add back dag parsing pre-import optimization (apache#50371)

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>

* Apply suggestions from code review

---------

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
…pache#52902) (apache#52925)

(cherry picked from commit 1996171)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
…mmit's (apache#52931) (apache#52942)

We are still waiting for a release of Lucas's pre-commit hooks that
includes Lucas-C/pre-commit-hooks#103, which has already been
merged. For now we need to use `bleeding-edge` for
the repo.

Also upgrades zizmor and fixes the use of "env." in shell commands,
which might lead to security issues.
(cherry picked from commit 535d71d)
…lient to re matches (apache#52960) (apache#52961)

(cherry picked from commit 8e5c284)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
Follow-up to apache#52967. Later discussions showed that the problem is
not that ~= is wrong or ambiguous, but that upper-binding the Python
version is generally considered a bad idea. This is not only Astral's
view: the general consensus is that upper-binding "python-requires"
is bad. Since ~= implies an upper bound, simply replacing it with >=
is likely the best option we can choose.

(cherry picked from commit e9eb481)
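
To make the difference between the two operators concrete, here is a minimal sketch in plain Python. It is not the code from the commit and does not use any packaging library; the helper names and the version numbers are hypothetical, and the comparison only handles simple dotted numeric versions with the same number of components.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn a simple dotted version such as '3.10' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def compatible_release_allows(base: str, candidate: str) -> bool:
    """Approximate `~=base`: candidate >= base AND same leading components.

    For example `~=3.9` behaves like `>=3.9, ==3.*`, so it carries an
    implicit upper bound.
    """
    base_parts = parse(base)
    candidate_parts = parse(candidate)
    prefix = base_parts[:-1]  # every component except the last one is pinned
    return candidate_parts >= base_parts and candidate_parts[: len(prefix)] == prefix


def minimum_allows(base: str, candidate: str) -> bool:
    """Approximate `>=base`: only a lower bound, no implicit upper bound."""
    return parse(candidate) >= parse(base)


# Hypothetical versions, chosen only to show the implicit upper bound.
print(compatible_release_allows("3.10.0", "3.11.0"))  # rejected by ~=3.10.0
print(minimum_allows("3.10.0", "3.11.0"))             # accepted by >=3.10.0
```

The point of the sketch is only that `~=` rejects versions above the pinned prefix, which is the upper-binding behaviour the commit message argues against.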
) (apache#52987)

Follow-up to apache#52980 - there are still a few more places where
the ~= was used in requires-python.
(cherry picked from commit 3f6f1db)
…and AIP-72 (apache#52197) (apache#53117)

* docs: update public interface doc to reflect airflow.sdk and AIP-72

- Added a note under "Using Airflow Public Interfaces" to recommend using `airflow.sdk` as the official interface from Airflow 3.0.
- Referenced AIP-72 and linked related documentation.
- Encouraged users to prefer REST API and Python Client for integrations.

* Update airflow-core/docs/public-airflow-interface.rst

Great



---------
(cherry picked from commit e142ab9)

Co-authored-by: N R Navaneet <156576749+nrnavaneet@users.noreply.github.com>
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
…illed (apache#53058) (apache#53143)

Until apache#44354 is implemented, users have no way of knowing when a task is killed externally or when the supervisor process dies unexpectedly.

This has been a blocker for Airflow 3.0 adoption for some:

- apache#44354
- https://apache-airflow.slack.com/archives/C07813CNKA8/p1751057525231389

apache#44354 is more involved and we might not get to it for Airflow 3.1 -- so this is a good fix until then, similar to how we run the Dag Run callback.

(cherry-picked from a5211f2)
…oml (apache#53179) (apache#53185)

When the flag is specified in pyproject toml, it also forces no
binary installation of xmlsec and lxml for local uv sync which might
fail if some system libraries are not installed, so it is better
to do it in the image by passing the right flags to installer
directly.
(cherry picked from commit 47bbe55)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
…3210)

(cherry picked from commit 45cc0d4)

Co-authored-by: GPK <gopidesupavan@gmail.com>
* Draft: Build python from source

Builds python from source, also installs
golang from official distribution. Does
both of these for the ci image only.

* Updates path

* Adds version upgrade check for python version

Adds support for using the airflow api to fetch
the newest python patch version available for specific
major_minor pair

* Updated to use args in dockerfile for python

* Added support for golang upgrade

* Fixed go version sorting in pre_commit install

* Added github token usage and fixed version regex

Updated python fetch request during upgrade to use
github token and fixed the regex

* Updated dockerfile.ci file

* Added support for multiple python versions

Adds python version from global consts into
build args now for the docker ci.

* Updated python install

* Increases timeout for ci image build

(cherry picked from commit 5aec2d5)

Co-authored-by: Aritra Basu <24430013+aritra24@users.noreply.github.com>
(cherry picked from commit 13c28dc)

Co-authored-by: GPK <gopidesupavan@gmail.com>
…#53221)

The last update check failed when it should not have: it hit a
rate-limit failure when checking for Python updates because
GITHUB_TOKEN was not set.
(cherry picked from commit a24962c)
…inutes (apache#53227) (apache#53230)

apache#53212 changed the quick-image-build check to run only on
canary builds, but this was not the intention. The image build started
to fail because the timeout minutes were too short after we added
building Python from sources.

This PR fixes it "properly" - changes timeout minutes to be slightly
longer than the timeout (900 seconds) we specify in build command
and brings back building the image on regular PRs.
(cherry picked from commit 5579edd)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
(cherry picked from commit 43d8c3b)

Co-authored-by: Bugra Ozturk <bugraoz93@users.noreply.github.com>
(cherry picked from commit 234c7f0)

Co-authored-by: Bugra Ozturk <bugraoz93@users.noreply.github.com>
…ache#52751) (apache#52756)

When generating issue content for releases, some PRs were being filtered out
during processing (e.g., dependabot PRs, doc-only PRs) but still remained in
the `pull_requests` dictionary. This caused a `KeyError` when the Jinja2 template
tried to access `user_logins` for PRs that had no corresponding user data.

The fix ensures that `pr_list` variable only contains PRs that have corresponding entries
in the users dictionary, preventing the template from accessing undefined keys.

Fixes issue where `breeze release-management generate-issue-content-core`
command failed with "UndefinedError: dict object has no element <PR_NUMBER>".
(cherry picked from commit 19da281)

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
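
The fix described above can be illustrated with a small sketch. It is not the breeze code itself: the function name, the sample PR numbers and the shape of the dictionaries are assumptions made for illustration. Only the idea comes from the commit message, namely that `pr_list` should contain just the PRs that also have an entry in the users dictionary.

```python
def select_prs_with_users(pull_requests: dict, users: dict) -> list:
    """Keep only PR numbers that have a matching entry in `users`.

    Iterating over this list instead of over `pull_requests` means a
    template never looks up a PR number that is missing from `users`.
    """
    return [pr_number for pr_number in pull_requests if pr_number in users]


# Hypothetical data: PR 102 was filtered out during processing (for example
# a dependabot PR), so it has no entry in the users dictionary.
pull_requests = {101: "Fix scheduler bug", 102: "Bump dependency", 103: "Improve logging"}
users = {101: {"alice"}, 103: {"bob", "carol"}}

pr_list = select_prs_with_users(pull_requests, users)
for pr_number in pr_list:
    # Safe lookup: every number in pr_list is guaranteed to be in `users`.
    print(pr_number, sorted(users[pr_number]))
```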
…) (apache#53245)

* Update mypy script with --warn-unused-ignores

* Add doc information
(cherry picked from commit 1d7e0ae)

Co-authored-by: GPK <gopidesupavan@gmail.com>
(cherry picked from commit 555ca21)

Co-authored-by: GPK <gopidesupavan@gmail.com>
…he#53167)

* Add note for new usage of LogMetadata

* Add _stream_parsed_lines_by_chunk

* Refactor _read_from_local/logs_server as return stream

* Refactor _interleave_logs with K-Way Merge

* Add _get_compatible_log_stream

* Refactor _read method to return stream with compatible interface

- Add compatible interface for executor, remote_logs
- Refactor skip log_pos with skip for each log source

* Refactor log_reader to adapt stream

* Fix _read_from_local open closed file error

* Refactor LogReader by yielding in batch

* Add ndjson header to get_log openapi schema

* Fix _add_log_from_parsed_log_streams_to_heap
- Add comparator for StructuredLogMessage
- Refactor parsed_log_streams from list to dict for removing empty logs

* Fix _interleave_logs dedupe logic
- should check the current logs with default timestamp

* Refactor test_log_handlers
- Fix events utils
- Add convert_list_to_stream, mock_parsed_logs_factory utils
- Fix the following test after refactoring FileTaskHandler
    - test_file_task_handler_when_ti_value_is_invalid
    - test_file_task_handler
    - test_file_task_handler_running
    - test_file_task_handler_rotate_size_limit
    - test__read_when_local
    - test__read_served_logs_checked_when_done_and_no_local_or_remote_logs
    - test_interleave_interleaves
    - test_interleave_logs_correct_ordering
    - test_interleave_logs_correct_dedupe
- Add new test for refactoring FileTaskHandler
    - test__stream_lines_by_chunk
    - test__log_stream_to_parsed_log_stream
    - test__sort_key
    - test__is_sort_key_with_default_timestamp
    - test__is_logs_stream_like
    - test__add_log_from_parsed_log_streams_to_heap

* Move test_log_handlers utils to test_common

* Fix unit/celery/log_handlers test

* Fix mypy-providers static check

* Fix _get_compatible_log_stream
- sequential yield instead of parallel yield for all log_stream

* Fix amazon task_handler test

* Fix wasb task handler test

* Fix elasticsearch task handler test

* Fix opensearch task handler test

* Fix TaskLogReader buffer
- don't concat buffer with empty str, yield directly from buffer

* Fix test_log_reader

* Fix CloudWatchRemoteLogIO.read mypy

* Fix test_gcs_task_handler

* Fix core_api test_log

* Fix CloudWatchRemoteLogIO._event_to_str dt format

* Fix TestCloudRemoteLogIO.test_log_message

* Fix es/os task_handler convert_list_to_stream

* Fix compat tests

* Refactor es,os task handler for 3.0 compat

* Fix compat for RedisTaskHandler

* Fix ruff format for test_cloudwatch_task_handler after rebase

* Fix 2.10 compat TestCloudwatchTaskHandler

* Fix 3.0 compat test for celery, wasb

Fix wasb test, spelling

* Fix 3.0 compat test for gcs

* Fix 3.0 compat test for cloudwatch, s3

* Set get_log API default response format to JSON

* Remove "first_time_read" key in log metadata

* Remove "<source>_log_pos" key in log metadata

* Add LogStreamCounter for backward compatibility

* Remove "first_time_read" with backward "log_pos" for tests

- test_log_reader
- test_log_handlers
- test_cloudwatch_task_handler
- test_s3_task_handler
- celery test_log_handler
- test_gcs_task_handler
- test_wasb_task_handler
- fix redis_task_handler
- fix log_pos

* Fix RedisTaskHandler compatibility

* Fix chores in self review

- Fix typo in _read_from_logs_server
- Remove unused parameters in _stream_lines_by_chunk
- read_log_stream
    - Fix doc string by removing outdated note
    - Only add buffer for full_download
- Add test ndjson format for get_log API

* Fine-tune HEAP_DUMP_SIZE

* Replace get_compatible_output_log_stream with iter

* Remove buffer in log_reader

* Fix log_id not found compat for es_task_handler

* Fix review comments
- rename LogStreamCounter as LogStreamAccumulator
- simplify for-yield to yield-from in log_reader
- add type annotation for LogStreamAccumulator

* Refactor LogStreamAccumulator._capture method
- use itertools.islice to get chunk

* Fix type hint, joinedload for ti.dag_run after merge

* Replace _sort_key as _create_sort_key

* Add _flush_logs_out_of_heap common util

* Fix review nits

 - _is_logs_stream_like
    - add type annotation
    - reduce to 1 isinstance call
- construct log_streams in _get_compatible_log_stream inline
- use TypedDict for LogMetadata
- remove len(logs) to check empty
- revert typo of self.log_handler.read in log_reader
- log_stream_accumulator
    - refactor flush logic
    - make total_lines as property
    - make stream as property

* Fix mypy errors after merge

* Fix redis task handler test

* Refactor _capture logic in LogStreamAccumulator

* Add comments for ignore LogMetadata TypedDict

* Add comment for offset; Fix comment for LogMessages

* Refactor with from_iterable, islice

* Fix nits in test

- refactor structured_logs fixtures in TestLogStreamAccumulator
- use f-string in test_file_task_handler
- assert actual value of _create_sort_key
- add details comments in test__add_log_from_parsed_log_streams_to_heap

* Refactor test_utils

* Add comment for lazy initialization

* Fix error handling for _stream_lines_by_chunk

* Fix mypy error after merge

* Fix final review nits

* Fix mypy error

(cherry picked from commit ee54fe9)
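
The commit list above mentions a K-way merge for interleaving logs, a dedupe step, and `itertools.islice` for chunking. The following is a generic standard-library sketch of those three ideas, not the Airflow implementation: the record shape `(timestamp, message)` and the names `interleave` and `in_chunks` are hypothetical.

```python
import heapq
from collections.abc import Iterable, Iterator
from itertools import islice

# Hypothetical record shape: (ISO-8601 timestamp, message).
LogRecord = tuple[str, str]


def interleave(*streams: Iterable[LogRecord]) -> Iterator[LogRecord]:
    """Lazily merge already-sorted streams and drop adjacent duplicates.

    heapq.merge performs the K-way merge with a heap, so only one pending
    record per stream is held in memory at a time.
    """
    previous = None
    for record in heapq.merge(*streams):
        if record != previous:
            yield record
        previous = record


def in_chunks(stream: Iterable[LogRecord], size: int) -> Iterator[list[LogRecord]]:
    """Yield lists of at most `size` records, using islice on one iterator."""
    iterator = iter(stream)
    while chunk := list(islice(iterator, size)):
        yield chunk


# Two hypothetical sources that overlap on one line.
local = [("2025-08-02T10:00:01", "start"), ("2025-08-02T10:00:03", "done")]
remote = [("2025-08-02T10:00:02", "running"), ("2025-08-02T10:00:03", "done")]

for batch in in_chunks(interleave(local, remote), size=2):
    print(batch)
```

Each input stream must already be sorted for `heapq.merge` to produce sorted output, and the dedupe here only removes identical records that end up next to each other after merging.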
…53344) (apache#53347)

(cherry picked from commit a6efa53)

Co-authored-by: Zach <zach.gottesman@datadoghq.com>
…#53404)

(cherry picked from commit 06065f9)

Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
…#53451)

(cherry picked from commit 6aeee15)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
…53342) (apache#53351)

(cherry picked from commit afb2e8a)

Co-authored-by: Kacper Muda <mudakacper@gmail.com>
…ache#53516) (apache#53522)

(cherry picked from commit 11a6361)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
…lers (apache#53452) (apache#53457)

When running the "update-installers-and-pre-commit" pre-commit, the script
is virtually guaranteed to fail with "rate limits" reached if you do
not use GITHUB_TOKEN. This change makes the GITHUB_TOKEN variable mandatory
for the pre-commit: it prints helpful information and a URL
that makes it easy to create such a GITHUB_TOKEN.
(cherry picked from commit aeb0d4c)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
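
A fail-fast check of this kind could look roughly like the sketch below. It is not the actual pre-commit script: the function name and the message text are hypothetical, and the real script also prints a URL for creating the token, which is omitted here.

```python
import os
from typing import Mapping, Optional


def require_github_token(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return GITHUB_TOKEN or stop early with a helpful message.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    token = env.get("GITHUB_TOKEN", "").strip()
    if not token:
        # Hypothetical wording; failing here avoids a confusing
        # rate-limit error later in the run.
        raise SystemExit(
            "GITHUB_TOKEN is not set. Without it, GitHub API calls are very "
            "likely to hit rate limits. Create a token and export it first."
        )
    return token


# Usage with an explicit mapping instead of the real environment.
print(require_github_token({"GITHUB_TOKEN": "example-token"}))
```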
…y is needed (apache#53535) (apache#53539)

Without those extras pandas does not impose limitations on the sqlalchemy
version used - but it is really an implicit dependency of pandas
when it is used for sqlalchemy interactions.

This will drop the constraint version of pandas for Airflow 3 to 2.1, and
it is really a conflicting dependency for Python 3.13, where only pandas
2.2.3 can be used but it requires sqlalchemy 2. This extra should
be added once the migration to sqlalchemy 2 is complete.
(cherry picked from commit ff084ed)
…sing uses default value for safe_mode, which resolves to value in configuration (apache#52694) (apache#53507)

(cherry picked from commit e311c6b)

Co-authored-by: BBQing <33732539+BBQing@users.noreply.github.com>
apache#53082) (apache#53518)

(cherry picked from commit e9e6ac2)

Co-authored-by: Amogh Desai <amoghrajesh1999@gmail.com>
@potiuk
Member

potiuk commented Aug 2, 2025

IMHO - we should only log error message in this case, but not the item itself.

@ashb
Member

ashb commented Aug 2, 2025

I guess the thinking was "it's very hard to hit this path, and when it does the secret is already going to be logged (as the redaction failed) so being explicit about what the problem is to enable a good bug report is worth it"

(I'm not sure why it's only flagging it on this PR though, as nothing about that code path has changed in a while.)

@potiuk
Member

potiuk commented Aug 2, 2025

> (I'm not sure why it's only flagging it on this PR though, as nothing about that code path has changed in a while.)

As explained above - because CodeQL re-evaluates and reapplies checks when the code around a finding has been modified (it has been, with secrets masking), and you previously marked it as a false positive - that's why it did not report it again until secrets masking was modified.

I think that issue is really dangerous - despite its "rarity". This is quite prone to targeted attacks. If an attacker (for example a UI user) finds a way to crash the secrets masker/redaction - for example by injecting a bad parameter - they could inject such bad data and deliberately trigger printing of non-redacted redactable information (say connection passwords).

I'd say we should print the error as warning but the item should only be printed in debug mode.

@ashb
Member

ashb commented Aug 2, 2025

The main point though: we already return item just below this flagged log item, so unless you are saying we should change that to return "<redaction-failed>" or something then this alert is not an issue.

@potiuk
Member

potiuk commented Aug 2, 2025

> The main point though: we already return item just below this flagged log item, so unless you are saying we should change that to return "<redaction-failed>" or something then this alert is not an issue.

Yep. I think so.
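
As a rough illustration of the behaviour agreed on in this exchange, the sketch below returns a placeholder instead of the original item when redaction raises, and keeps the item itself out of the log. It is not the code from #54046: the function name `safe_redact`, the `redact` callable and the log wording are hypothetical, and only the `"<redaction-failed>"` placeholder comes from the discussion.

```python
import logging
from typing import Any, Callable

log = logging.getLogger(__name__)

REDACTION_FAILED = "<redaction-failed>"


def safe_redact(item: Any, redact: Callable[[Any], Any]) -> Any:
    """Apply `redact` to `item`; never fall back to the raw item on error."""
    try:
        return redact(item)
    except Exception as exc:
        # Log only type names, never the item or the exception message,
        # since either could contain the secret that failed to be masked.
        log.warning(
            "Unable to redact value of type %s (%s); returning placeholder",
            type(item).__name__,
            type(exc).__name__,
        )
        return REDACTION_FAILED


def broken_redactor(value: Any) -> Any:
    """Hypothetical redactor that always fails, to exercise the error path."""
    raise ValueError("cannot handle this value")


print(safe_redact("s3cr3t", lambda value: "***"))
print(safe_redact("s3cr3t", broken_redactor))
```

With this shape, a caller who manages to crash the redactor gets the placeholder back rather than the unredacted value, which addresses the targeted-attack concern raised above.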

@ashb
Member

ashb commented Aug 2, 2025

K, PR coming

@potiuk
Member

potiuk commented Aug 2, 2025

Good point BTW.

@ashb
Member

ashb commented Aug 2, 2025

#54046

github-actions bot and others added 3 commits August 2, 2025 10:15
…apache#54047)

(cherry picked from commit e809161)

Co-authored-by: Yeonguk Choo <choo121600@gmail.com>
…pache#53383)

(cherry picked from commit ea5dbcf)

Co-authored-by: Jeongseok Kang <jskang@lablup.com>
…an error (apache#54046) (apache#54048)

* [v3-0-test] Make log redaction safer in edge case when redaction has an error (apache#54046)
(cherry picked from commit 93ba1c5)

* Fix bad mocking in test

Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
apache#54051) (apache#54053)

When the debugger option was added to breeze in apache#51763, the ports were added
to base-port.yml, but the environment variables were only set when debugging is
enabled. This however caused warnings that the variables are not set and
are defaulting to an empty string when no debug components were used.

Instead - we moved the debugger ports to separate compose file and
only use the compose file when debug components are used.
(cherry picked from commit 939700f)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
@ashb ashb force-pushed the 3.0.4-from-v30test branch from 4ff5f3a to f81a80f Compare August 2, 2025 18:42
@ashb
Member

ashb commented Aug 2, 2025

Updated this to pull in discussed changes (and the others that were on v3-0-test as they were okay/dev only)

@ashb
Member

ashb commented Aug 2, 2025

@potiuk FYI #54053 seems to be causing problems, so I've pulled it out of this for now

AttributeError: 'ShellParams' object has no attribute 'debug_components'

@ashb ashb force-pushed the 3.0.4-from-v30test branch from f81a80f to 76a19d7 Compare August 2, 2025 20:10
rawwar and others added 5 commits August 3, 2025 14:07
* debug support in breeze

* fix celery env var

* Add debug options for Airflow components and update related constants

* Add debug options for Airflow components and update related constants

* refactor

* update documentation and add setup_vscode

* Remove section on overriding default debug ports in VSCode setup documentation

* fix base-ports

* fix base-ports

* update docs

* add webserver details in docs

* update docs

* update docs

* update docs

(cherry picked from commit 55d648e)
This change came from apache#46891, but that was a much bigger change to enable
Python 3.13 which we don't want to backport

(cherry picked from commit 488744c)
…apache#54076)

Breeze had the possibility of installing airflow from a branch of
any GitHub repo by providing a VCS URL, but it has been broken since
the split of distributions. This PR adds the capability of using
`owner/repo:branch` as `--use-airflow-version`, and the installation
will be done using this GitHub repo.

Note! Such an installation will NOT (currently) compile the assets,
so you will not be able to run api_server easily after such an installation.
We might want to improve that in the future.

(cherry picked from commit 40ccc55)
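
Recognising an `owner/repo:branch` value could be sketched as follows. This is not breeze's parsing code: the regular expression, the `GitHubRef` type and the function name are assumptions for illustration, and the sketch does not show how breeze turns the result into an installation source.

```python
import re
from typing import NamedTuple, Optional


class GitHubRef(NamedTuple):
    """Hypothetical container for the three parts of `owner/repo:branch`."""

    owner: str
    repo: str
    branch: str


# Assumed shape: owner and repo use word characters, dots or dashes;
# the branch is any run of non-whitespace (so it may contain slashes).
_GITHUB_REF = re.compile(r"^(?P<owner>[\w.-]+)/(?P<repo>[\w.-]+):(?P<branch>\S+)$")


def parse_github_ref(value: str) -> Optional[GitHubRef]:
    """Return the parsed parts, or None when `value` has another form."""
    match = _GITHUB_REF.match(value)
    return GitHubRef(**match.groupdict()) if match else None


print(parse_github_ref("apache/airflow:main"))
print(parse_github_ref("3.0.4"))  # a plain version string is not a GitHub ref
```

Returning `None` for anything that does not match lets a caller fall back to treating the value as an ordinary version.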
@ashb ashb force-pushed the 3.0.4-from-v30test branch from 488744c to 11bd8db Compare August 4, 2025 10:17
@ashb ashb force-pushed the 3.0.4-from-v30test branch from 11bd8db to 68c25d4 Compare August 4, 2025 10:47
@ashb ashb merged commit 68c25d4 into apache:v3-0-stable Aug 4, 2025
156 of 160 checks passed
@ashb ashb deleted the 3.0.4-from-v30test branch August 4, 2025 13:23

Labels

area:dev-tools
area:production-image (Production image improvements and fixes)
backport-to-v3-1-test (Mark PR with this label to backport to v3-1-test branch)
kind:documentation
