Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compatibility with mssql #9973

Merged
merged 36 commits into from
May 26, 2021

Conversation

aneesh-joseph
Copy link
Contributor

@aneesh-joseph aneesh-joseph commented Jul 24, 2020

  • Improve compatibility with MSSQL
  • Make it easier to test against mssql locally ( ./breeze --backend=mssql )
  • Add MSSQL based tests to Airflow's CI

^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in #9926 I would like us to improve the compatibility but won't like to add more test to our existing already large CI runs.

@potiuk is going to carry out a survey and then start the discussion on mailing list about it. Just a couple of days back, the large number of running test on v1-10-test causes a long-waiting queue.

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops I see you did not add tests, just breeze configs which allows you to run tests on your machine. I am completely fine with that 👍

Dockerfile.ci Outdated Show resolved Hide resolved
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. But I think we should add somewhere (in README.md) next to supported versions, that mssql support is experimental and that we are evaluating it.

Dockerfile.ci Outdated Show resolved Hide resolved
@aneesh-joseph
Copy link
Contributor Author

doesn't new pushes to the branch cancel previous workflow runs? - looks like the CI workflows are running to completion for every push I made to this branch and the previous ones are not getting cancelled out. Is it something to do with the ci.yml in this branch?

@potiuk
Copy link
Member

potiuk commented Jul 24, 2020

doesn't new pushes to the branch cancel previous workflow runs? - looks like the CI workflows are running to completion for every push I made to this branch and the previous ones are not getting cancelled out. Is it something to do with the ci.yml in this branch?

They should cancel it. Let me see.

@potiuk
Copy link
Member

potiuk commented Jul 24, 2020

Ah... It seems that PRs run from the fork do not have access to cancel the already running PRs :( (see the last error message)... Hmm that makes the whole cancel approach we have only work for master merges and direct pushes to v1-10-test branch :(.

Maybe we will have to figure out something else. I guess you cannot cancel running PR of yours @aneesh-joseph ? Only committers can do i t I believe?

{
  eventName: 'pull_request',
  sha: 'f5cc73b67463ce2c21e74f7a11d13bc98d08f2e0',
  headSha: '09dff0cd3a4acac259f21785b604ce41c28f70a1',
  branch: 'improve-mssql-compat',
  owner: 'apache',
  repo: 'airflow'
}
Found input: 1029499
Found token: yes
Found 5 runs total.
Found 2 runs in progress.
Cancelling another run:  {
  id: 181255422,
  head_sha: '93f198dbd05927ae3e855f95cd13af5a196be190',
  status: 'queued'
}
Error while cancelling workflow_id 1029499: Resource not accessible by integration
Done.

@aneesh-joseph
Copy link
Contributor Author

Ah... It seems that PRs run from the fork do not have access to cancel the already running PRs :( (see the last error message)... Hmm that makes the whole cancel approach we have only work for master merges and direct pushes to v1-10-test branch :(.

Maybe we will have to figure out something else. I guess you cannot cancel running PR of yours @aneesh-joseph ? Only committers can do i t I believe?

{
  eventName: 'pull_request',
  sha: 'f5cc73b67463ce2c21e74f7a11d13bc98d08f2e0',
  headSha: '09dff0cd3a4acac259f21785b604ce41c28f70a1',
  branch: 'improve-mssql-compat',
  owner: 'apache',
  repo: 'airflow'
}
Found input: 1029499
Found token: yes
Found 5 runs total.
Found 2 runs in progress.
Cancelling another run:  {
  id: 181255422,
  head_sha: '93f198dbd05927ae3e855f95cd13af5a196be190',
  status: 'queued'
}
Error while cancelling workflow_id 1029499: Resource not accessible by integration
Done.

oops, and yes don't seem to have access to cancel runs from my PR

@potiuk
Copy link
Member

potiuk commented Jul 24, 2020

oops, and yes don't seem to have access to cancel runs from my PR

That's the root of the problem - the running PRs have only read-access token to apache/airflow repository :(. Not sure if we will be able to do much about it

@kaxil
Copy link
Member

kaxil commented Jul 24, 2020

oops, and yes don't seem to have access to cancel runs from my PR

That's the root of the problem - the running PRs have only read-access token to apache/airflow repository :(. Not sure if we will be able to do much about it

That issue has caused all sorts of issues with Github Action in general :(

@potiuk
Copy link
Member

potiuk commented Jul 24, 2020

That issue has caused all sorts of issues with Github Action in general :(

Yeah - but that's for a very good reason ... We do not want random fork's PR to make any changes to our repo before it gets merged. But I think eventually some of those permissions (in this case we need "cancel any PR from the same fork and only from the same fork" permission) will be added to GA.

@kaxil
Copy link
Member

kaxil commented Jul 24, 2020

That issue has caused all sorts of issues with Github Action in general :(

Yeah - but that's for a very good reason ... We do not want random fork's PR to make any changes to our repo before it gets merged. But I think eventually some of those permissions (in this case we need "cancel any PR from the same fork and only from the same fork" permission) will be added to GA.

The permission should have been configurable I feel similar to how we did with the bot. The number of people asking for this feature has increased significantly - hopefully Github now updates that :)

@mik-laj mik-laj changed the title improve compatibility with mssql Improve compatibility with mssql Jul 25, 2020
Copy link
Member

@feluelle feluelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I am wondering how much performance difference there is after changing the models code regarding the sql alchemy queries you had to modify.

If that is not necessary to modify, can/should we revert it? If not this code should be in sql alchemy mssql?

@jarkkorantala
Copy link

LGTM, but I am wondering how much performance difference there is after changing the models code regarding the sql alchemy queries you had to modify.

The change on airflow/models/dag.py is necessary, as it's moving the rendering decision from a hard-coded Postgres syntax is_paused IS True to a variable dialect syntax picked by SQLAlchemy (eg. is_paused IS True for Postgres, is_paused = 1 for SQL Server).

The changes on airflow/models/dagcode.py and airflow/models/serialized_dag.py look like they could incur a performance penalty as the query optimizer is not aware that only the first result will be read. Does SQL Server not support the original exists query?

@aneesh-joseph
Copy link
Contributor Author

aneesh-joseph commented Jul 28, 2020

LGTM, but I am wondering how much performance difference there is after changing the models code regarding the sql alchemy queries you had to modify.

The change on airflow/models/dag.py is necessary, as it's moving the rendering decision from a hard-coded Postgres syntax is_paused IS True to a variable dialect syntax picked by SQLAlchemy (eg. is_paused IS True for Postgres, is_paused = 1 for SQL Server).

The changes on airflow/models/dagcode.py and airflow/models/serialized_dag.py look like they could incur a performance penalty as the query optimizer is not aware that only the first result will be read. Does SQL Server not support the original exists query?

it does support exists, but doesn't support EXISTS expression in the columns clause of a SELECT.

Failing airflow tests with mssql and exists checks - https://github.com/aneesh-joseph/airflow-tests/pull/15/checks?check_run_id=918457634

so I had to do the change in this PR or will have to change it to something like:

from sqlalchemy.sql.expression import literal
return  session.query(literal(True)).filter(
    session.query(cls)
    .filter(cls.dag_id == dag_id)
    .exists()
).scalar()

@jarkkorantala
Copy link

jarkkorantala commented Jul 28, 2020

from sqlalchemy.sql.expression import literal
return  session.query(literal(True)).filter(
    session.query(cls)
    .filter(cls.dag_id == dag_id)
    .exists()
).scalar()

This looks like a better solution than the one with .first(), but it still finds reads through all of the matching values.
Adding a limit should fix that:

from sqlalchemy import true
return (
    session.query(true())
    .filter(
        session.query(cls).filter(cls.dag_id == dag_id).exists()
    )
    .limit(1)
    .scalar()
)

Maybe scalar already does that though?
Update: no, scalar doesn't add limits.

Update: first() does add limits though, so the version in the PR is already close to optimal. it does however deserialize the entire object, so could be improved with something like

from sqlalchemy import select, true
return (
    session.execute(
        select([true()])
        .where(cls.dag_id == dag_id)
        .limit(1)
    ).scalar()
)

@aneesh-joseph aneesh-joseph force-pushed the improve-mssql-compat branch 3 times, most recently from b4bfc73 to 23b6831 Compare July 29, 2020 06:21
@aneesh-joseph
Copy link
Contributor Author

aneesh-joseph commented Jul 29, 2020

This looks like a better solution than the one with .first()

@jarkkorantala yes, but it breaks on My SQL - sqlalchemy/sqlalchemy#5481

it does however deserialize the entire object

I have updated the PR so that it queries for literal(True) instead of querying the complete object.

These checks are now getting translated into the below queries

MySQL

SELECT 1 AS param_1
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474
LIMIT 1

MS SQL

SELECT TOP 1 1 AS param_1
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474

SQLite

SELECT 1 AS param_1
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474
LIMIT 1 OFFSET 0

Postgres

SELECT True AS param_1
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474
LIMIT 1

Before this PR, they were getting translated into

MySQL

SELECT EXISTS (SELECT *
FROM dag_code
WHERE dag_code.fileloc_hash = %s) AS anon_1

MS SQL(wrong query)

SELECT EXISTS (SELECT *
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474) AS anon_1

SQLite

SELECT EXISTS (SELECT *
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474) AS anon_1

Postgres

SELECT EXISTS (SELECT *
FROM dag_code
WHERE dag_code.fileloc_hash = 15056821532142474) AS anon_1

@aneesh-joseph aneesh-joseph force-pushed the improve-mssql-compat branch 4 times, most recently from 25b5ace to 4eb8032 Compare July 30, 2020 13:33
@aneesh-joseph
Copy link
Contributor Author

@potiuk thank you, that sorted it.. btw does the master build run on self hosted runners which have more memory.. should we enable mssql integration tests for master builds?

@potiuk
Copy link
Member

potiuk commented May 26, 2021

@potiuk thank you, that sorted it.. btw does the master build run on self hosted runners which have more memory.. should we enable mssql integration tests for master builds?

It does and it is already sorted out :). the change I added is in the branch that is only executed when memory on the instance is less than < 32 G or so - this means that integration tests will run on master in MSSQL as well :).

if [[ ${test_types_to_run} == *"Integration"* ]]; then
        if (( MEMORY_AVAILABLE_FOR_DOCKER < MEMORY_REQUIRED_FOR_INTEGRATION_TEST_PARALLEL_RUN )) ; then
            # In case of Integration tests - they need more resources (Memory) thus we only run them in

@potiuk potiuk dismissed ashb’s stale review May 26, 2021 09:25

All points addressed and seems we are finally ready to merge that one !

@potiuk potiuk merged commit f2b7d05 into apache:master May 26, 2021
@potiuk
Copy link
Member

potiuk commented May 26, 2021

🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉 🎉

@aneesh-joseph aneesh-joseph deleted the improve-mssql-compat branch May 26, 2021 09:28
@aneesh-joseph
Copy link
Contributor Author

awesome, thank you Jarek 👍

@kaxil
Copy link
Member

kaxil commented May 26, 2021

Thanks @aneesh-joseph this is great, it took lot of time and pings but thank you for your patience

@@ -897,7 +898,9 @@ def get_num_active_runs(self, external_trigger=None, session=None):
)

if external_trigger is not None:
query = query.filter(DagRun.external_trigger == external_trigger)
query = query.filter(
DagRun.external_trigger == (expression.true() if external_trigger else expression.false())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? What is the type of external_trigger here to require this workaround?

[(dm.dag_id, dm.next_dagrun) for dm in dag_models]
)

active_dagruns = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a join here or CTE rather than making multiple queries?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Feel free to give it a shot if you like :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.