
[AIRFLOW-1945] Autoscale celery workers for airflow added #3989

Merged 1 commit into apache:master on Oct 15, 2018
Conversation

phanindhra876 (Contributor)

Dear Airflow Maintainers,

This adds a provision to autoscale celery workers, instead of running the same number of workers irrespective of the number of running tasks.

Please accept this PR that addresses the following issues:
https://issues.apache.org/jira/browse/AIRFLOW-1945

Testing Done:

Manually tested by passing arguments via the CLI.
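
For context, Celery documents its autoscale option as a "max,min" pair: the worker pool grows toward the maximum under load and shrinks back toward the minimum when idle, and a fixed concurrency is not used. Below is a minimal, hypothetical sketch of wiring such a setting into a Celery worker invocation; the function and argument names are assumptions for illustration, not the exact code in this PR.

# Hypothetical sketch: turning an autoscale setting into Celery worker options.
# When --autoscale is set, a fixed --concurrency is not used.
def build_worker_options(autoscale=None, concurrency=16):
    options = ["worker", "-O", "fair"]
    if autoscale:
        # e.g. "16,12": keep at least 12 pool processes, grow to 16 under load
        options.append("--autoscale={}".format(autoscale))
    else:
        options.append("--concurrency={}".format(concurrency))
    return options

print(build_worker_options(autoscale="16,12"))
# ['worker', '-O', 'fair', '--autoscale=16,12']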


codecov-io commented Oct 3, 2018

Codecov Report

Merging #3989 into master will decrease coverage by 3.04%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3989      +/-   ##
==========================================
- Coverage   75.92%   72.87%   -3.05%     
==========================================
  Files         199      199              
  Lines       15954    17003    +1049     
==========================================
+ Hits        12113    12391     +278     
- Misses       3841     4612     +771
Impacted Files Coverage Δ
airflow/bin/cli.py 58.76% <0%> (-6.3%) ⬇️
airflow/hooks/druid_hook.py 67.36% <0%> (-20.64%) ⬇️
airflow/task/task_runner/base_task_runner.py 60.97% <0%> (-18.34%) ⬇️
airflow/models.py 75.55% <0%> (-16.4%) ⬇️
airflow/example_dags/example_python_operator.py 78.94% <0%> (-15.79%) ⬇️
airflow/utils/configuration.py 85.71% <0%> (-14.29%) ⬇️
airflow/configuration.py 85.07% <0%> (-3.78%) ⬇️
airflow/operators/s3_file_transform_operator.py 93.87% <0%> (-2.35%) ⬇️
airflow/www_rbac/views.py 72.18% <0%> (-0.36%) ⬇️
... and 13 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a581cba...959ca5d.

ashb (Member) left a comment


Please follow the contribution guidelines.

# based on number of queued tasks. pick these numbers based on resources on
# worker box and the nature of the task. If autoscale option is available worker_concurrency
# will be ignored
worker_autoscale = 12,16
Member:

We probably shouldn't set a default value for this and worker_concurrency. Please comment this one out.

Member:

A link to the section of the celery docs about this in the comment would help too.

Contributor Author:

I agree on not using default values, but we don't have a way to use shell functions here, so we can't determine the number of cores at execution time. What can be done in these scenarios?

Contributor Author:

@ashb this value has been commented out. Please review.

msumit (Contributor) commented Oct 8, 2018

@phani8996 what's the harm in letting my machine run with the full capacity of workers all the time? If I am allowing it to grow to a max, then it means my machine has the capacity to handle that many workers anyway.

phanindhra876 (Contributor Author)

> @phani8996 what's the harm in letting my machine run with the full capacity of workers all the time? If I am allowing it to grow to a max, then it means my machine has the capacity to handle that many workers anyway.

We can run it at full capacity, but what advantage do we get from a bunch of idle workers? Instead, this feature spawns workers on demand, so you get only what is required and workers are no longer under-utilised.

# "airflow worker" command. Pick these numbers based on resources on
# worker box and the nature of the task. If autoscale option is available worker_concurrency
# will be ignored
#worker_autoscale = 12,16

Contributor Author:

@Fokko Link has been added. Please check.

Contributor:

@phani8996 a space after # would be better, i.e. # worker_autoscale = 12,16

Fokko (Contributor) left a comment


LGTM

msumit (Contributor) commented Oct 15, 2018

@phani8996 please rebase your commits into a single commit. Also, the commit message could be like this: [AIRFLOW-1945] Add Autoscale config for Celery workers

phanindhra876 (Contributor Author)

> @phani8996 please rebase your commits into a single commit. Also, the commit message could be like this: [AIRFLOW-1945] Add Autoscale config for Celery workers

@msumit the commits have been rebased and the commit message updated. Please check.

phanindhra876 (Contributor Author)

Mistakenly closed the PR.

phanindhra876 reopened this Oct 15, 2018
msumit merged commit 6097f82 into apache:master Oct 15, 2018
aliceabe pushed a commit to aliceabe/incubator-airflow that referenced this pull request Jan 3, 2019
ashb pushed a commit to ashb/airflow that referenced this pull request Mar 5, 2019
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
ddelange (Contributor)

FYI, it seems this functionality has been removed in celery v4.x :(
celery/celery#4003

potiuk (Member) commented Sep 27, 2019

Taking into account the comments in the celery bug quoted by @ddelange - should we really enable this, @Fokko @ashb? It seems that the autoscale option does not work in celery 4.x at all, and this is the version we are using.

ashb (Member) commented Sep 27, 2019

D'oh. Well, that's annoyingly frustrating. I guess we should remove it in our next point release, with a note in UPDATING.md along the lines of "We've removed this option as celery didn't respect it."

Having an extra config directive in our files won't cause any problems/errors on our side so it can be in 1.10.16. @ddelange Fancy creating such a PR? (Against master please, and I will deal with back-porting it to the release)

ddelange (Contributor)

I actually really like the potential of celery autoscale (e.g. putting one massive worker on an autoscaling AWS EC2 instance to temporarily scale airflow for a sudden heavy load).
I saw that autoscaling is back on the celery milestones for 4.5/4.6/4.7, or else for sure v5, so I've left the autoscaling option in our airflow config file (linking to the celery issue in a comment, below the comments from default_config.cfg). I hope they fix it in a future celery version, and I assume that at some point airflow will bump celery and it will start working again :)

Right now having this option doesn't break anything: celery just ignores it internally and permanently keeps the pool at the minimum concurrency specified. So from my side all is good for now. I just wanted to let you know (and anyone else who may stumble upon this PR via git blame) ^^
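
As a side note, for anyone who wants to check whether their workers really autoscale, here is a minimal sketch using Celery's remote-inspection API. The broker URL is an assumption for illustration, and the exact keys in the stats payload can vary by pool implementation.

from celery import Celery

# Hypothetical probe app; point the broker URL at your own broker.
app = Celery("probe", broker="redis://localhost:6379/0")

# stats() returns None when no workers reply, hence the "or {}".
stats = app.control.inspect().stats() or {}
for worker_name, info in stats.items():
    pool = info.get("pool", {})
    # If autoscale is being ignored, the reported pool size tends to sit at
    # the configured minimum rather than growing under load.
    print(worker_name, pool.get("max-concurrency"), pool.get("processes"))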

ashb (Member) commented Sep 27, 2019

Sounds good. I'm okay leaving it in, so long as Celery still documents it (even if it doesn't "work" right now).

msumit (Contributor) commented Sep 27, 2019 via email
