Sync v2-1-stable and v2-1-test to release 2.1.4 #18163

Merged
merged 47 commits into from
Sep 11, 2021


kaxil
Member

@kaxil kaxil commented Sep 11, 2021

Just need an approval, so that I can push to v2-1-stable which is a protected branch



mik-laj and others added 30 commits August 28, 2021 16:49
* Improve cross-links to operators and hooks references

* fixup! Improve cross-links to operators and hooks references

* Update docs/apache-airflow/concepts/operators.rst

(cherry picked from commit cff5f18)
currently it shows up as:

```
apache/airflow:|version| - the versioned Airflow image with default Python version (3.6 currently)
```

Example: http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/docker-stack/index.html or even https://airflow.apache.org/docs/docker-stack/index.html

This commit fixes it

(cherry picked from commit 8cdda20)
Updates `pip` version from `21.2.2` to `21.2.4`

(cherry picked from commit ebf3b4a)
The providers operators/hooks reference contained only a top-level list of
groups of providers, which made it less usable than it could be:
users did not see links to particular operators/hooks on this page,
it was not really visible what is "available" (discoverability), and
the more detailed "Service" and "Transfer" pages are not
readable enough to give an "at a glance" overview of what is available.

This change improves that, removes the repeated
"operators and hooks" text, which was kind of annoying, and increases
the TOC level to 3, giving a nice overview of all available and
exposed operators and hooks.

(cherry picked from commit c7f37a9)
The ``hook-class-names`` provider's meta-data property has been deprecated and
is now replaced by ``connection-types`` property. This documents the
change.

(cherry picked from commit be75dcd)
The documentation of provider packages was rather disconnected
from the apache-airflow documentation. It was hard to find out
how Apache Airflow's core extensions are implemented by
the community-managed providers: you needed to know what you were
looking for, and you could not find links to the summary of the
core functionality extended by providers when you were looking at
that functionality (such as logging, secret backends, connections, auth).

This PR introduces much more comprehensive cross-linking between
the Airflow core functionality and the community-managed providers
that provide extensions to the core functionality.

(cherry picked from commit bcc7665)
#17939 did not fully fix the problem. It turned out that
one more change was needed: since we now always upgrade to the latest
dependencies in `push` and `schedule` types of builds, we do not need
to check for the variable UPGRADE_TO_NEWER_DEPENDENCIES (which
was not set in the "Build Image" step).

This fixes it, but also changes the constraint generation to add
comments in the generated constraint files, describing how and
why the files are generated.

(cherry picked from commit bec006e)

The top-level best practices were a little misleading. They
suggested that no code should be written at the top level of a DAG file other
than creating operators, but the story is a little more nuanced.

A better explanation is given, along with examples of how you can deal
with the situation when you need to generate your data based on
some meta-data. From the Slack discussion it seems that the best ways
to handle that are not obvious at all, so two
alternatives are presented: generating a meta-data file
and generating importable Python code containing the meta-data.

During that change, I also noticed that config sections and
config variables were not sorted, which made it very difficult to
search for them in the index. All the config variables are now
sorted, so the references to the right sections/variables make much
more sense now.

(cherry picked from commit 1be3ef6)
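To illustrate the first alternative mentioned above (a generated meta-data file), here is a minimal sketch. It is not taken from the Airflow docs: the file name, the `tables` key and both function names are hypothetical, and no Airflow APIs are used. The idea is only that the expensive lookup happens outside DAG parsing, while top-level DAG code reads a small local file.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def write_metadata(path: Path, tables: List[str]) -> None:
    """Run outside DAG parsing (for example in a separate job).

    The expensive lookup happens once and the result is cached in a file.
    """
    path.write_text(json.dumps({"tables": tables}))


def load_task_ids(path: Path) -> List[str]:
    """Cheap enough for top-level DAG code: only reads a small local file."""
    metadata = json.loads(path.read_text())
    return [f"process_{name}" for name in metadata["tables"]]


with tempfile.TemporaryDirectory() as tmp:
    metadata_file = Path(tmp) / "dag_metadata.json"  # hypothetical file name
    write_metadata(metadata_file, ["orders", "customers"])
    task_ids = load_task_ids(metadata_file)
```

In a real DAG file, the returned ids would be used to create operators in a loop; the second alternative replaces the JSON file with a generated Python module that the DAG file imports.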
… the PR (#18060)" (#18086)

This reverts commit 0dba2e0.

Revert "Revert "Build CI images for the merge result of a PR, not the tip of the PR (#18060)" (#18063)" (#18086)

(cherry picked from commit 9496235)
The automated upgrade of dependencies in main broke the building of
the Airflow documentation in the main build.

After a lot of experimentation, it has been narrowed down
to the upgrade of dnspython from 1.16.0 to 2.+, which was brought in
by upgrading eventlet to 0.32.0.

This PR limits the dnspython library to < 2.0.0. An issue
has been opened:
rthalley/dnspython#681

(cherry picked from commit 022b4e0)
As of August 2021, the buster-slim python images no longer
contain python2 packages. We still support running Python 2 via
PythonVirtualenvOperator, and our tests started to fail when
we ran them in `main`: those tests always pull and build
the images using the latest available buster-slim images.

Our system to prevent PR failures in this case has proven to be
useful: the main tests failed, so the base images
we have are still using the previous buster-slim images, which still
contain Python 2.

This PR adds python2 to installed packages - on both CI images
and PROD images. For CI images it is needed to pass tests, for
PROD images, it is needed for backwards-compatibility.

(cherry picked from commit 6898a2f)
We are now generating constraints with a better description, and
we include information about DEFAULT_BRANCH (main/v2-1-test etc.)

The scripts that generate the constraints need to get the variable
passed to docker.

Also, the names of the generated files were wrong. The constraints did
not update the right constraint files.

(cherry picked from commit afd4ba6)
Since we released the Celery provider with celery 5, we should
limit celery to < 5 in the Airflow 2.1 EAGER_UPGRADE limits.

EAGER_UPGRADE limits are only used during constraint generation.

When we have a prolonged issue with flaky tests or GitHub runner
instabilities, our automated constraint and image refresh might
not work, so we might need to manually refresh the constraints
and images. Documentation about that was in CONTRIBUTING.rst,
but it is more appropriate to keep it in ``dev`` as it only applies
to committers.

Also, while testing the parallel refresh without delays, an error
was discovered which prevented the parallel check of the random image
hash during the build. This has been fixed and parallel
image cache building should work flawlessly now.

(cherry picked from commit 36c5fd3)
The previous regexp parsing was, well, not perfect, closely following the
ancient Chinese proverb "If you have problem, introduce regexp - you
will have two problems".

This PR replaces regexp matching with Python's urlsplit method.

Fixes: #17828
(cherry picked from commit 275e0d1)
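As a rough illustration of why `urlsplit` is a better fit than a hand-written regexp, the sketch below splits a connection-style URL with the standard library. The URL and the `parse_connection_url` helper are made up for this example and are not the code changed in the commit.

```python
from urllib.parse import urlsplit


def parse_connection_url(url: str) -> dict:
    """Split a URL into its parts without any hand-written regexp."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,  # lower-cased by urlsplit
        "port": parts.port,  # already converted to int (or None)
        "path": parts.path.lstrip("/"),
    }


# Hypothetical URL used only for illustration.
details = parse_connection_url("postgresql://user:secret@db.example.com:5432/airflow")
```

Optional components such as the port or the credentials simply come back as `None` when absent, which is the kind of edge case a regexp tends to get wrong.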
When the DAG appears again in the UI and we rerun it (say, with catchup set to True),
those running task instances that were not deleted would be rerun, and an external state change
of the task instances would be detected by the LocalTaskJob, which would then send SIGTERM to the task runner.

This change resolves this by making sure that DAGs are not deleted while their task instances are still
running.

(cherry picked from commit 5a64c1c)
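The guard described above could look roughly like the sketch below. It uses plain dictionaries instead of Airflow models, and `DagHasRunningTasks` and `delete_dag` are hypothetical names chosen for this example, not the actual implementation.

```python
class DagHasRunningTasks(Exception):
    """Raised when a DAG still has running task instances (hypothetical name)."""


def delete_dag(dag_id, dags, task_instances):
    """Remove a DAG only if none of its task instances are still running."""
    running = [
        ti for ti in task_instances
        if ti["dag_id"] == dag_id and ti["state"] == "running"
    ]
    if running:
        raise DagHasRunningTasks(
            f"{dag_id} has {len(running)} running task instance(s)"
        )
    dags.pop(dag_id, None)


dags = {"etl": object(), "report": object()}
task_instances = [
    {"dag_id": "etl", "state": "running"},
    {"dag_id": "report", "state": "success"},
]

delete_dag("report", dags, task_instances)  # nothing running, so it is removed
try:
    delete_dag("etl", dags, task_instances)
    refused = False
except DagHasRunningTasks:
    refused = True  # deletion is refused while a task instance is running
```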
We were not passing the root to the `/tree_data` api call. Therefore, filtering upstream of a task would be reset during auto-refresh even though root was still defined.

(cherry picked from commit c645d7a)
Fix wrong query on my PR about deleting running dags #17630

Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
(cherry picked from commit 84df864)
* Only draw once during initial graph setup

The previous behavior could cause significant slowness when loading
the graph view for large dags with many task groups.

* Improve name and fix camelCased

* Fix indent

* PR suggestions remove args

(cherry picked from commit bfdda08)
After clicking on the Pause/Unpause toggle, the element remained in focus and therefore the tooltip wouldn't go away. After a change event we now also trigger a blur event to remove the focus, so the tooltip only appears on hover.

Fixes: #16500
(cherry picked from commit ee93935)
The "color" method seems to have been removed.

(cherry picked from commit a1d9172)
The `enrich_errors` method assumes the first argument to the function
it patches is a TaskInstance, when in fact it can also be a LocalTaskJob.

This is now handled by extracting the task_instance from the
LocalTaskJob.

Closes #18118

(cherry picked from commit f97ddf1)
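A simplified sketch of the fix described above is shown below. `TaskInstance` and `LocalTaskJob` are tiny stand-ins for the Airflow classes, and the `captured` list replaces the real error-reporting call, so this only illustrates the `isinstance` dispatch rather than the actual integration code.

```python
from functools import wraps


class TaskInstance:
    """Stand-in for the Airflow class of the same name."""

    def __init__(self, task_id):
        self.task_id = task_id


class LocalTaskJob:
    """Stand-in for the Airflow class; it wraps a task instance."""

    def __init__(self, task_instance):
        self.task_instance = task_instance


captured = []  # stand-in for the error-reporting backend


def enrich_errors(func):
    """Record which task instance failed, whatever the first argument is."""

    @wraps(func)
    def wrapper(_self, *args, **kwargs):
        try:
            return func(_self, *args, **kwargs)
        except Exception:
            # The patched function may be bound to either class.
            if isinstance(_self, LocalTaskJob):
                task_instance = _self.task_instance
            else:
                task_instance = _self
            captured.append(task_instance.task_id)
            raise

    return wrapper


@enrich_errors
def run(obj):
    raise RuntimeError("boom")


for obj in (TaskInstance("a"), LocalTaskJob(TaskInstance("b"))):
    try:
        run(obj)
    except RuntimeError:
        pass
```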
The task_stats, last_dagruns, blocked etc. endpoints expect `dag_ids`, not `dagIds`.

This caused the endpoint to return all dags the user had access to by
default.

closes: #18083
(cherry picked from commit d6e48cd)
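The effect of the misnamed parameter can be sketched as follows. `select_dag_ids` is a hypothetical helper that assumes the server falls back to every permitted DAG when no `dag_ids` value is supplied, which is the behaviour the commit message describes.

```python
def select_dag_ids(form, allowed_dag_ids):
    """Return the DAG ids to report on, given the submitted form data."""
    requested = set(form.get("dag_ids", []))
    if requested:
        return requested & allowed_dag_ids
    # Nothing requested: fall back to everything the user may see.
    return set(allowed_dag_ids)


allowed = {"etl", "report", "cleanup"}

# Misnamed key: the server never sees the filter and returns everything.
wrong = select_dag_ids({"dagIds": ["etl"]}, allowed)

# Correct key: only the requested DAG comes back.
right = select_dag_ids({"dag_ids": ["etl"]}, allowed)
```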
Currently, tasks can be run even if the dagrun is queued. Task instances of queued dagruns
should only be run when the dagrun is in the running state. This PR makes sure task instances of queued dagruns
are not run, thereby properly checking task concurrency.

Also, we check max_active_runs when parsing the dag, which is no longer needed since dagruns
are created in the queued state and the scheduler controls when to change the queued dagruns
to running, taking max_active_runs into account.
This PR removes the checking of max_active_runs in the dag too.

(cherry picked from commit ffb81ea)
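The rule described above, that only task instances of running dagruns are eligible, can be sketched with plain dictionaries. The function name and data layout are hypothetical; the real scheduler expresses this as a database query.

```python
def runnable_task_instances(task_instances, dagrun_states):
    """Keep only task instances whose dagrun is in the running state."""
    return [
        ti for ti in task_instances
        if dagrun_states.get(ti["run_id"]) == "running"
    ]


dagrun_states = {"run_1": "running", "run_2": "queued"}
tis = [
    {"task_id": "extract", "run_id": "run_1"},
    {"task_id": "extract", "run_id": "run_2"},
]

runnable = runnable_task_instances(tis, dagrun_states)
```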
This hides the variable import form if the user does not have the "can
create on variable" permission.

(cherry picked from commit 7b3a5f9)
The graph view should show the "Download Log" and "View Logs in {remote
logging system}" links, as is done on the tree view.

(cherry picked from commit 83f1f07)
kaxil and others added 4 commits September 11, 2021 11:59
The way dumb-init propagated signals by default
prevented the celery worker from handling termination well.

The default behaviour of dumb-init is to propagate signals to the
process group rather than to the single child it spawns. This is
protective behaviour for the case where a user runs a 'bash -c' command
without 'exec': in this case signals should be sent not only
to bash but also to the process(es) it creates, otherwise
bash exits without propagating the signal and you need a second
signal to kill all processes.

However, some airflow processes (in particular the airflow celery worker)
behave in a responsible way and handle the signals appropriately:
when the first signal is received, the worker switches to offline
mode and lets all its workers terminate (until the grace period expires),
resulting in a Warm Shutdown.

Therefore we can disable the protection of dumb-init and let it
propagate the signal to only the single child it spawns in the
Helm Chart. Documentation of the image was also updated to include
an explanation of signal propagation. For explicitness the
DUMB_INIT_SETSID variable has been set to 1 in the image as well.

Fixes #18066

(cherry picked from commit 9e13e45)
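The difference between the two propagation modes can be sketched in a few lines. This is an illustration of the behaviour described above, not dumb-init's source: `signal_target` is a hypothetical helper, and it assumes that setting `DUMB_INIT_SETSID` to `0` disables process-group propagation while any other value (or no value) keeps the default.

```python
import os


def signal_target(child_pid, environ=os.environ):
    """Return the pid to pass to a kill(2)-style call.

    A negative pid addresses the whole process group, a positive pid
    addresses only the single child.
    """
    use_setsid = environ.get("DUMB_INIT_SETSID", "1") != "0"
    return -child_pid if use_setsid else child_pid


# Default: the signal goes to the whole process group.
group_target = signal_target(42, {})

# Protection disabled: only the direct child receives it, so it can
# perform its own warm shutdown.
child_target = signal_target(42, {"DUMB_INIT_SETSID": "0"})
```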
Regression on PID reset to allow task start after heartbeat

Co-authored-by: Nicolas MEHRAEIN <nicolas.mehraein@adevinta.com>
(cherry picked from commit ed99eaa)
@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:dev-tools area:production-image Production image improvements and fixes area:providers area:Scheduler including HA (high availability) scheduler provider:Apache labels Sep 11, 2021
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Sep 11, 2021
The recently updated docker-compose had somewhat broken behaviour
for non-Linux users. It expected the .env file to always be
created, but the instructions to create it were not working
on Windows. This fixes the problem by turning the error
into a warning and directing users to the right instructions
for their operating system.

Also, the recent ``DUMB_INIT_SETSID`` variable was added for the worker so
that signals are handled properly in our quick-start
docker-compose as well.

(cherry picked from commit bd77689)
@eladkal
Contributor

eladkal commented Sep 11, 2021

I think #17269 was missed from releases

@potiuk
Member

potiuk commented Sep 11, 2021

> I think #17269 was missed from releases

Why do you think it should be there @eladkal :)? We do not cherry-pick all bugfixes to "patchlevel" releases, it's always risky - I think 2.1.4 is mostly about stabilising 2.1.3 (which had a number of stability issues) + some doc-only changes that are mostly clarifications, guiding our users better, improvements in communication (generally 0 or low-risk changes). The #17269 is also some new behaviour rather than bugfix IMHO so it should go to 2.2

@eladkal
Contributor

eladkal commented Sep 11, 2021

> Why do you think it should be there @eladkal :)? We do not cherry-pick all bugfixes to "patchlevel" releases, it's always risky - I think 2.1.4 is mostly about stabilising 2.1.3 (which had a number of stability issues) + some doc-only changes that are mostly clarifications, guiding our users better, improvements in communication (generally 0 or low-risk changes). The #17269 is also some new behaviour rather than bugfix IMHO so it should go to 2.2

Mostly because this seems to be a bug fix for #18102, but I'm OK with it waiting for 2.2, as it should follow shortly.

potiuk and others added 4 commits September 11, 2021 18:05
This test wasn't working on python > 3.7.

(cherry picked from commit 4a0711c)
This PR separates out the "installing Airflow from sources" section and also fixes the links for the binary source, which had a `-bin` suffix that we don't use anymore. I have also added a section on verifying integrity, and more details with examples.

(cherry picked from commit f9969c1)
@kaxil kaxil merged commit 9168a0b into v2-1-stable Sep 11, 2021
@potiuk
Member

potiuk commented Sep 11, 2021

Woohoo!
