Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-3698] Add documentation for AWS Connection #4514

Merged

Conversation

danabananarama
Copy link
Contributor

@danabananarama danabananarama commented Jan 13, 2019

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    This PR is to allow setting the AWS region in the EmrJobFlowSensor so that we can sense EMR job flow completion in any region.
    The AWS region setting functionality is available via the extra param in the Connection class, so adding a section in the docs to describe this

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Amending test, did not add new test for functionality as it's essentially just persisting a new init param. Let me know if a new test needs to be added!
    Doc change only

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
    • All the public functions and the classes in the PR contain docstrings that explain what it does

Code Quality

  • Passes flake8

@danabananarama danabananarama force-pushed the AIRFLOW-3698_set_jobflow_sensor_region branch 3 times, most recently from bb98cc2 to 46b95a8 Compare January 14, 2019 02:40
Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is required - it is possibly to specify the region via the extra field of the connection: {"region_name": "ap-southeast-2"}

@danabananarama
Copy link
Contributor Author

Thanks for looking @ashb! And thanks for the tip, it does the trick. Did I miss this in the documentation? (or if not, where would you recommend as a sensible place to add it? Here?)

@ashb
Copy link
Member

ashb commented Jan 15, 2019

You probably didn't miss it - it's a hole in the docs :(

@danabananarama danabananarama force-pushed the AIRFLOW-3698_set_jobflow_sensor_region branch from 46b95a8 to e9e204a Compare January 16, 2019 23:52
@danabananarama danabananarama changed the title [AIRFLOW-3698] Add region param for EMR jobflow sensor [AIRFLOW-3698] Add documentation for AWS Connection Jan 16, 2019
@danabananarama
Copy link
Contributor Author

danabananarama commented Jan 16, 2019

I've updated the PR to instead add a bit in the Managing Connections page in the docs, please let me know if there's a more suitable home for it or if anything needs rewording (:

@mik-laj
Copy link
Member

mik-laj commented Jan 17, 2019

Could you provide a description on how to set up a connection via environment variables?
Similar to GCP: https://github.com/apache/airflow/pull/4523/files

@danabananarama
Copy link
Contributor Author

@mik-laj Oh cool, I didn't know you could do that! I'm afraid I haven't set up the AWS Connection that way before so I'm not aware of any AWS-specific things you might need to specify when setting it up that way.

@mik-laj
Copy link
Member

mik-laj commented Jan 18, 2019

@mik-laj What specific things do you need?
schema part must be equals aws. Query parameters must cotanins extra parameters. For example: aws_access_key_id=AA&aws_secret_access_key=BB&CC=aws_account_id'. You should also write what the default con_id` is in hooks and operators for this platforms.

@danabananarama
Copy link
Contributor Author

Since the query parameter bit is the same as for other Connection types, perhaps it makes more sense for the general case to be documented in the "Creating a Connection with Environment Variables" section to avoid repetition?

I'll add the general default conn_id, but I'm not sure it's worth including all of the hooks and operators for this, there are quite a lot and it would be tough to ensure this doc section is completely maintained going forward.

@mik-laj
Copy link
Member

mik-laj commented Jan 18, 2019

This is already described when my PR is accepted.I personally think that this is not enough. Unfortunately people are lazy and if they are to write a long command, they will rather look for it, only to copy it. I see it myself. It's easy to make a mistake, so I type in Google to get a full and ready command.

@mik-laj
Copy link
Member

mik-laj commented Jan 18, 2019

I do not think that it is worth writing a list of operators and hooks. just mention the default conn_id is enough

docs/howto/manage-connections.rst Outdated Show resolved Hide resolved
docs/howto/manage-connections.rst Show resolved Hide resolved
docs/howto/manage-connections.rst Outdated Show resolved Hide resolved
Password (optional)
Specify the AWS secret access key.

Extra (optional)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may be good to mention that users can specify endpoint in Extra section by giving "host". This will be useful in cases like they're using S3-compatible services (but not AWS service.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I missed that, thanks for the spot!

@danabananarama
Copy link
Contributor Author

Thanks for the feedback @ashb, @XD-DENG , I think I've addressed all the comments. Please let me know if anything else needs tweaking!

@codecov-io
Copy link

Codecov Report

Merging #4514 into master will increase coverage by 0.31%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4514      +/-   ##
==========================================
+ Coverage   73.81%   74.12%   +0.31%     
==========================================
  Files         421      421              
  Lines       27662    27700      +38     
==========================================
+ Hits        20418    20532     +114     
+ Misses       7244     7168      -76
Impacted Files Coverage Δ
airflow/bin/cli.py 64.64% <0%> (-0.44%) ⬇️
airflow/contrib/hooks/gcp_dataflow_hook.py 75.75% <0%> (ø) ⬆️
airflow/contrib/hooks/spark_submit_hook.py 79.82% <0%> (+0.08%) ⬆️
airflow/models/__init__.py 92.33% <0%> (+0.14%) ⬆️
airflow/contrib/operators/spark_submit_operator.py 92.3% <0%> (+0.2%) ⬆️
airflow/hooks/dbapi_hook.py 79.03% <0%> (+0.8%) ⬆️
airflow/hooks/hive_hooks.py 75.26% <0%> (+1.84%) ⬆️
airflow/models/connection.py 64.02% <0%> (+1.99%) ⬆️
airflow/contrib/hooks/gcp_api_base_hook.py 81.39% <0%> (+3.78%) ⬆️
airflow/utils/sqlalchemy.py 81.81% <0%> (+4.54%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c0949cb...742914d. Read the comment docs.

1 similar comment
@codecov-io
Copy link

Codecov Report

Merging #4514 into master will increase coverage by 0.31%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4514      +/-   ##
==========================================
+ Coverage   73.81%   74.12%   +0.31%     
==========================================
  Files         421      421              
  Lines       27662    27700      +38     
==========================================
+ Hits        20418    20532     +114     
+ Misses       7244     7168      -76
Impacted Files Coverage Δ
airflow/bin/cli.py 64.64% <0%> (-0.44%) ⬇️
airflow/contrib/hooks/gcp_dataflow_hook.py 75.75% <0%> (ø) ⬆️
airflow/contrib/hooks/spark_submit_hook.py 79.82% <0%> (+0.08%) ⬆️
airflow/models/__init__.py 92.33% <0%> (+0.14%) ⬆️
airflow/contrib/operators/spark_submit_operator.py 92.3% <0%> (+0.2%) ⬆️
airflow/hooks/dbapi_hook.py 79.03% <0%> (+0.8%) ⬆️
airflow/hooks/hive_hooks.py 75.26% <0%> (+1.84%) ⬆️
airflow/models/connection.py 64.02% <0%> (+1.99%) ⬆️
airflow/contrib/hooks/gcp_api_base_hook.py 81.39% <0%> (+3.78%) ⬆️
airflow/utils/sqlalchemy.py 81.81% <0%> (+4.54%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c0949cb...742914d. Read the comment docs.

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy. Any further comments @XD-DENG?

@XD-DENG
Copy link
Member

XD-DENG commented Jan 22, 2019

@ashb , LGTM.

Thanks @danabananarama

@ashb ashb merged commit 89dd877 into apache:master Jan 22, 2019
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants