Sync `v2-1-stable` and `v2-1-test` to release `2.1.4` #18163

kaxil · 2021-09-11T11:22:39Z

Just need an approval, so that I can push to v2-1-stable which is a protected branch

* Improve cross-links to operators and hooks references
* fixup! Improve cross-links to operators and hooks references
* Update docs/apache-airflow/concepts/operators.rst

(cherry picked from commit cff5f18)

Currently it shows up as:

```
apache/airflow:|version| - the versioned Airflow image with default Python version (3.6 currently)
```

Example: http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/docker-stack/index.html or even https://airflow.apache.org/docs/docker-stack/index.html

This commit fixes it (cherry picked from commit 8cdda20)

Updates `pip` version from `21.2.2` to `21.2.4` (cherry picked from commit ebf3b4a)

The providers operators/hooks reference contained only a top-level list of provider groups, which made it less usable than it could be: users did not see links to particular operators/hooks on this page, it was not really visible what is "available" (discoverability), and the more detailed "Service" and "Transfer" pages are not readable enough to give an "at a glance" overview of what is available. This change improves that, removes the many repetitions of "operators and hooks", which were kind of annoying, and increases the TOC level to 3, giving a nice overview of all available and exposed operators and hooks. (cherry picked from commit c7f37a9)

The ``hook-class-names`` provider's meta-data property has been deprecated and is now replaced by ``connection-types`` property. This documents the change. (cherry picked from commit be75dcd)

The documentation of provider packages was rather disconnected from the apache-airflow documentation. It was hard to find out how Apache Airflow's core extensions are implemented by the community-managed providers - you needed to know what you were looking for, and you could not find links to the summary of the core functionality extended by providers when you were looking at that functionality (like logging/secret backends/connections/auth). This PR introduces much more comprehensive cross-linking between the Airflow core functionality and the community-managed providers that provide extensions to it. (cherry picked from commit bcc7665)

The #17939 did not finally fix the problem. It turned out that one more change was needed - since we now always upgrade to the latest dependencies in `push` and `schedule` types of build, we do not need to check for the variable UPGRADE_TO_NEWER_DEPENDENCIES (which was not set in the "Build Image" step). This fixes it, but also changes the constraint generation to add comments in the generated constraint files, describing how and why the files are generated. (cherry picked from commit bec006e)

…e data (#17319) (cherry picked from commit 2c1880a)

) The Top-Level best practices were a little misleading. They suggested that no code should be written at the top level of a DAG file other than just creating operators, but the story is a little more nuanced. A better explanation is given, along with examples of how you can deal with the situation when you need to generate your data based on some meta-data. From the Slack discussion it seems that it is not obvious at all what the best ways to handle that are, so two alternatives are presented: generating a meta-data file, and generating importable Python code containing the meta-data. During that change, I also noticed that config sections and config variables were not sorted, which made it very difficult to search for them in the index. All the config variables are now sorted, so the references to the right sections/variables make much more sense now. (cherry picked from commit 1be3ef6)
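The second alternative mentioned above (importable Python code containing the meta-data) could look roughly like the sketch below. It is only an illustration under stated assumptions: the module name `generated_meta`, the `ALL_TABLES` constant, and the table names are hypothetical, and a temporary directory stands in for wherever the generated file would really live.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical meta-data that would be too slow to compute at DAG parse time.
metadata = ["customers", "orders", "invoices"]

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "generated_meta.py"

    # Step 1 (run outside DAG parsing): write the meta-data as importable Python.
    module_path.write_text(f"ALL_TABLES = {metadata!r}\n")

    # Step 2 (what the DAG file would do): load the generated module, which is
    # cheap compared to recomputing the meta-data on every parse.
    spec = importlib.util.spec_from_file_location("generated_meta", module_path)
    generated_meta = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(generated_meta)

# The DAG file can now loop over plain constants to create its tasks.
task_ids = [f"load_{name}" for name in generated_meta.ALL_TABLES]
print(task_ids)
```

In a real deployment the generated file would sit next to the DAG file and be loaded with an ordinary `import` statement; the `importlib` calls are used here only to keep the sketch self-contained.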

… the PR (#18060)" (#18086) This reverts commit 0dba2e0. Revert "Revert "Build CI images for the merge result of a PR, not the tip of the PR (#18060)" (#18063)" (#18086) (cherry picked from commit 9496235)

The automated upgrade of dependencies in main broke building of the Airflow documentation in the main build. After a lot of experimentation, it has been narrowed down to the upgrade of dnspython from 1.16.0 to 2.+, which was brought in by upgrading eventlet to 0.32.0. This PR limits the dnspython library to < 2.0.0. An issue has been opened: rthalley/dnspython#681 (cherry picked from commit 022b4e0)

As of August 2021, the buster-slim Python images no longer contain python2 packages. We still support running Python 2 via PythonVirtualenvOperator, and our tests started to fail when we ran them in `main` - those tests always pull and build the images using the latest available buster-slim images. Our system to prevent PR failures in this case has proven to be useful - the main tests failed, so the base images we have are still using the previous buster-slim images, which still contain Python 2. This PR adds python2 to the installed packages - on both CI images and PROD images. For CI images it is needed to pass tests; for PROD images it is needed for backwards compatibility. (cherry picked from commit 6898a2f)

We are now generating constraints with a better description, and we include information about DEFAULT_BRANCH (main/v2-1-test etc.). The scripts that generate the constraints need to get the variable passed to docker. Also, the names of the generated files were wrong, so the constraints did not update the right constraint files. (cherry picked from commit afd4ba6)

Since we released the Celery provider with celery 5, we should limit celery to < 5 for Airflow 2.1 EAGER_UPGRADE limits. EAGER_UPGRADE limits are only used during constraint generation.

When we have a prolonged issue with flaky tests or GitHub runner instabilities, our automated constraint and image refresh might not work, so we might need to manually refresh the constraints and images. Documentation about that was in CONTRIBUTING.rst, but it is more appropriate to keep it in ``dev`` as it only applies to committers. Also, while testing the parallel refresh without delays, an error was discovered which prevented the parallel check of the random image hash during the build. This has been fixed, and parallel image cache building should work flawlessly now. (cherry picked from commit 36c5fd3)

The previous regexp parsing was, well, not perfect, closely following the ancient Chinese proverb "If you have a problem, introduce regexp - you will have two problems". This PR replaces regexp matching with the Python urlsplit method. Fixes: #17828 (cherry picked from commit 275e0d1)
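As a rough illustration of the approach described above (not the actual patch), the sketch below uses `urllib.parse.urlsplit` from the standard library to extract the components of a URL instead of matching them with a regular expression. The `parse_url` helper and the example URL are hypothetical.

```python
from urllib.parse import unquote, urlsplit


def parse_url(url: str) -> dict:
    """Split a URL into its parts with the standard library (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # Credentials may be percent-encoded, so decode them after splitting.
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "path": parts.path,
    }


result = parse_url("postgres://user:p%40ss@db.example.com:5432/airflow")
print(result)
```

Letting `urlsplit` handle the delimiters avoids the edge cases (optional ports, missing credentials, encoded characters) that a hand-written pattern has to enumerate one by one.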

When the DAG appears again in the UI and we rerun it (say, with catchup set to True), the running task instances that were not deleted would be rerun, and the LocalTaskJob would detect an external state change of the task instances and send SIGTERM to the task runner. This change resolves this by making sure that DAGs are not deleted while task instances are still running. (cherry picked from commit 5a64c1c)

We were not passing the root to the `/tree_data` api call. Therefore, filtering upstream of a task would be reset during auto-refresh even though root was still defined. (cherry picked from commit c645d7a)
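The idea of the fix can be sketched as follows. This is a simplified Python illustration, not the UI code itself: the `tree_data_url` helper and the `dag_id` and `num_runs` parameter names are assumptions made for the example; only `/tree_data` and `root` come from the description above.

```python
from typing import Optional
from urllib.parse import urlencode


def tree_data_url(dag_id: str, num_runs: int, root: Optional[str] = None) -> str:
    """Build the auto-refresh URL (hypothetical helper and parameter names)."""
    params = {"dag_id": dag_id, "num_runs": num_runs}
    if root:
        # Forward the active filter so a refresh keeps the filtered view.
        params["root"] = root
    return f"/tree_data?{urlencode(params)}"


url = tree_data_url("example", 25, root="task_a")
print(url)
```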

Fix wrong query on my PR about deleting running dags #17630 Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com> (cherry picked from commit 84df864)

(cherry picked from commit 96f7e3f)

* Only draw once during initial graph setup. The previous behavior could cause significant slowness when loading the graph view for large DAGs with many task groups.
* Improve name and fix camelCased
* Fix indent
* PR suggestions remove args

(cherry picked from commit bfdda08)

After clicking on the Pause/Unpause toggle, the element remained in focus and therefore the toggle wouldn't go away. After a change event we will also trigger a blur event to remove the focus so the tooltip will only appear on hover. Fixes: #16500 (cherry picked from commit ee93935)

The "color" method seems to have been removed. (cherry picked from commit a1d9172)

The `enrich_errors` method assumes the first argument to the function it patches is a TaskInstance, when in fact it can also be a LocalTaskJob. This is now handled by extracting the task_instance from the LocalTaskJob. Closes #18118 (cherry picked from commit f97ddf1)
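A minimal sketch of the pattern described above, assuming stand-in classes rather than Airflow's real ones: the `TaskInstance` and `LocalTaskJob` dataclasses, the decorator body, and the `run` function below are all hypothetical and only show how a wrapper can accept either kind of first argument.

```python
import functools
from dataclasses import dataclass


@dataclass
class TaskInstance:  # hypothetical stand-in, not Airflow's class
    task_id: str


@dataclass
class LocalTaskJob:  # hypothetical stand-in, not Airflow's class
    task_instance: TaskInstance


def enrich_errors(func):
    """Hypothetical decorator that needs a TaskInstance for error context."""

    @functools.wraps(func)
    def wrapper(obj, *args, **kwargs):
        # The first argument may be a TaskInstance or a job wrapping one.
        ti = obj.task_instance if isinstance(obj, LocalTaskJob) else obj
        try:
            return func(obj, *args, **kwargs)
        except Exception as exc:
            raise RuntimeError(f"task {ti.task_id} failed: {exc}") from exc

    return wrapper


@enrich_errors
def run(obj):
    raise ValueError("boom")
```

Calling `run(LocalTaskJob(TaskInstance("t1")))` or `run(TaskInstance("t1"))` raises the same enriched error, because the wrapper normalizes its first argument before using it.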

The task_stats, last_dagruns, blocked etc. endpoints expect dag_ids, not dagIds. This caused the endpoint to return all DAGs the user had access to by default. Closes: #18083 (cherry picked from commit d6e48cd)
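The failure mode can be reproduced with a small sketch. The `select_dag_ids` helper below is hypothetical and much simpler than the real views; it only shows why a misnamed request key silently widens the result to every permitted DAG.

```python
from typing import Dict, List, Set


def select_dag_ids(form: Dict[str, List[str]], allowed: Set[str]) -> Set[str]:
    """Hypothetical view helper: filter by the `dag_ids` key, else return all allowed."""
    requested = set(form.get("dag_ids", []))
    # With no recognised filter, the helper falls back to every permitted DAG.
    return requested & allowed if requested else set(allowed)


allowed = {"a", "b", "c"}
wrong = select_dag_ids({"dagIds": ["a"]}, allowed)   # misnamed key is ignored
right = select_dag_ids({"dag_ids": ["a"]}, allowed)  # expected key filters correctly
print(sorted(wrong), sorted(right))
```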

(cherry picked from commit 683fbd4)

Currently, tasks can be run even if the dagrun is queued. Task instances of queued dagruns should only be run when the dagrun is in the running state. This PR makes sure that TIs of queued dagruns are not run, thereby properly checking task concurrency. Also, we check max_active_runs when parsing the DAG, which is no longer needed, since dagruns are created in the queued state and the scheduler controls when to change queued dagruns to running, taking max_active_runs into account. This PR removes the checking of max_active_runs in the DAG too. (cherry picked from commit ffb81ea)
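The intended behaviour can be sketched with a toy model. Everything here is an assumption for illustration (the `DagRun` dataclass, the state strings, and both helper functions); it is not the scheduler's code, only the rule that queued runs are promoted up to `max_active_runs` and that only running runs may execute task instances.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DagRun:  # hypothetical stand-in, not Airflow's model
    run_id: str
    state: str  # "queued" or "running"


def start_queued_runs(runs: List[DagRun], max_active_runs: int) -> None:
    """Promote queued runs to running until max_active_runs is reached."""
    active = sum(1 for run in runs if run.state == "running")
    for run in runs:
        if active >= max_active_runs:
            break
        if run.state == "queued":
            run.state = "running"
            active += 1


def runs_allowed_to_execute(runs: List[DagRun]) -> List[str]:
    """Only task instances of running dagruns may be executed."""
    return [run.run_id for run in runs if run.state == "running"]


runs = [DagRun("r1", "running"), DagRun("r2", "queued"), DagRun("r3", "queued")]
start_queued_runs(runs, max_active_runs=2)
print(runs_allowed_to_execute(runs))
```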

This hides the variable import form if the user does not have the "can create on variable" permission. (cherry picked from commit 7b3a5f9)

The graph view should show the "Download Log" and "View Logs in {remote logging system}", like is done on the tree view. (cherry picked from commit 83f1f07)

(cherry picked from commit 6868ca4)

The way dumb-init propagated signals by default made the celery worker not handle termination well. The default behaviour of dumb-init is to propagate signals to the process group rather than to the single child it uses. This is protective behaviour in case a user runs a 'bash -c' command without 'exec' - in this case signals should be sent not only to bash but also to the process(es) it creates, otherwise bash exits without propagating the signal and you need a second signal to kill all processes. However, some airflow processes (in particular the airflow celery worker) behave in a responsible way and handle the signals appropriately - when the first signal is received, the worker will switch to offline mode and let all workers terminate (until the grace period expires), resulting in a Warm Shutdown. Therefore, in the Helm Chart, we can disable the protection of dumb-init and let it propagate the signal only to the single child it spawns. Documentation of the image was also updated to include an explanation of signal propagation. For explicitness, the DUMB_INIT_SETSID variable has been set to 1 in the image as well. Fixes #18066 (cherry picked from commit 9e13e45)

Regression on PID reset to allow task start after heartbeat Co-authored-by: Nicolas MEHRAEIN <nicolas.mehraein@adevinta.com> (cherry picked from commit ed99eaa)

github-actions · 2021-09-11T12:03:42Z

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

The recently updated docker-compose had somewhat broken behaviour for non-Linux users. It expected the .env file to always be created, but the instructions to create it were not working on Windows. This fixes the problem by turning the error into a warning and directing the users to the right instructions per operating system. Also, the recent ``DUMB_INIT_SETSID`` variable was added for the worker so that signals are handled properly in our quick-start docker-compose as well. (cherry picked from commit bd77689)

eladkal · 2021-09-11T14:15:54Z

I think #17269 was missed from releases

potiuk · 2021-09-11T15:56:48Z

> I think #17269 was missed from releases

Why do you think it should be there @eladkal :)? We do not cherry-pick all bugfixes to "patchlevel" releases, it's always risky - I think 2.1.4 is mostly about stabilising 2.1.3 (which had a number of stability issues) + some doc-only changes that are mostly clarifications, guiding our users better, improvements in communication (generally 0 or low-risk changes). The #17269 is also some new behaviour rather than bugfix IMHO so it should go to 2.2

eladkal · 2021-09-11T16:02:51Z

> Why do you think it should be there @eladkal :)? We do not cherry-pick all bugfixes to "patchlevel" releases, it's always risky - I think 2.1.4 is mostly about stabilising 2.1.3 (which had a number of stability issues) + some doc-only changes that are mostly clarifications, guiding our users better, improvements in communication (generally 0 or low-risk changes). The #17269 is also some new behaviour rather than bugfix IMHO so it should go to 2.2

Mostly as this seems to be a bug fix for #18102, but I'm OK with it waiting for 2.2 as it should follow shortly

This test wasn't working on python > 3.7. (cherry picked from commit 4a0711c)

This PR separates out the section on installing Airflow from sources and also fixes the links for the binary source, which had a `-bin` suffix that we don't use anymore. It also adds a section on verifying integrity and more details with examples. (cherry picked from commit f9969c1)

Similar to #6240 and #17706

potiuk · 2021-09-11T20:26:47Z

Woohoo!

mik-laj and others added 30 commits August 28, 2021 16:49

Improve cross-links to operators and hooks references (#17622)

4d63c0f

* Improve cross-links to operators and hooks references
* fixup! Improve cross-links to operators and hooks references
* Update docs/apache-airflow/concepts/operators.rst

(cherry picked from commit cff5f18)

Bump pip version to 21.2.4 (#17746)

4f2486f

Updates `pip` version from `21.2.2` to `21.2.4` (cherry picked from commit ebf3b4a)

Update description about the new connection-types provider meta-data

4f560d3

The ``hook-class-names`` provider's meta-data property has been deprecated and is now replaced by ``connection-types`` property. This documents the change. (cherry picked from commit be75dcd)

Suggest to use secrets backend for variable when it contains sensitiv…

61332ba

…e data (#17319) (cherry picked from commit 2c1880a)

Reapply "Build CI images for the merge result of a PR, not the tip of…

bc5e75f

… the PR (#18060)" (#18086) This reverts commit 0dba2e0. Revert "Revert "Build CI images for the merge result of a PR, not the tip of the PR (#18060)" (#18063)" (#18086) (cherry picked from commit 9496235)

Eager upgrade for Airflow 2.1. should now include celery 4 limit

fada02f

Since we released the Celery provider with celery 5, we should limit celery to < 5 for Airflow 2.1 EAGER_UPGRADE limits. EAGER_UPGRADE limits are only used during constraint generation.

Add root to tree refresh url (#17633)

34fbe0d

We were not passing the root to the `/tree_data` api call. Therefore, filtering upstream of a task would be reset during auto-refresh even though root was still defined. (cherry picked from commit c645d7a)

Fix wrong query on running tis (#17631)

6866a7c

Fix wrong query on my PR about deleting running dags #17630 Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com> (cherry picked from commit 84df864)

Increase width for Run column (#17817)

8ecc710

(cherry picked from commit 96f7e3f)

Limit colorlog version (6.x is incompatible) (#18099)

9f98a7d

The "color" method seems to have been removed. (cherry picked from commit a1d9172)

Fixes incorrect parameter passed to views (#18083) (#18085)

6c59622

The task_stats, last_dagruns, blocked etc expect dag_ids not dagIds. This caused the endpoint to return all dags the user had access to by default closes: #18083 (cherry picked from commit d6e48cd)

Fix Clear task instances endpoint resets all DAG runs bug (#17961)

d338aa7

(cherry picked from commit 683fbd4)

Hide variable import form if user lacks permission (#18000)

24ead01

This hides the variable import form if the user does not have the "can create on variable" permission. (cherry picked from commit 7b3a5f9)

Fix log links on graph TI modal (#17862)

c2298bc

The graph view should show the "Download Log" and "View Logs in {remote logging system}", like is done on the tree view. (cherry picked from commit 83f1f07)

Avoid endless redirect loop when user has no roles (#17613)

464e1e1

(cherry picked from commit 6868ca4)

kaxil and others added 4 commits September 11, 2021 11:59

Bump version to 2.1.4

58b562c

Regression on pid reset to allow task start after heartbeat (#17333)

ccdc121

Regression on PID reset to allow task start after heartbeat Co-authored-by: Nicolas MEHRAEIN <nicolas.mehraein@adevinta.com> (cherry picked from commit ed99eaa)

Add Changelog for 2.1.4

1621890

kaxil requested review from ashb, potiuk and jedcunningham September 11, 2021 11:22

boring-cyborg bot added area:API Airflow's REST/HTTP API area:dev-tools area:production-image Production image improvements and fixes area:providers area:Scheduler including HA (high availability) scheduler provider:Apache labels Sep 11, 2021

potiuk approved these changes Sep 11, 2021

github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Sep 11, 2021

potiuk force-pushed the v2-1-test branch from f3b3cf2 to ba8a8d1 Compare September 11, 2021 12:06

potiuk force-pushed the v2-1-test branch from ba8a8d1 to 1f938d6 Compare September 11, 2021 12:08

potiuk and others added 4 commits September 11, 2021 18:05

Fix spelling mistake

e4df035

Fix missing create_dummy_dag fixture

7763949

Fix TestSecurity.test_current_user_has_permissions (#17916)

af0e684

This test wasn't working on python > 3.7. (cherry picked from commit 4a0711c)

kaxil force-pushed the v2-1-test branch from a525a4e to bae5be0 Compare September 11, 2021 19:14

Clearly document no breaking change for >=2.1.2, <=2.1.4

9168a0b

Similar to #6240 and #17706

kaxil merged commit 9168a0b into v2-1-stable Sep 11, 2021
