
[AIRFLOW-2224] Add support CSV files in MySqlToGoogleCloudStorageOperator #4738

Merged
merged 1 commit into from
Mar 6, 2019

Conversation

@ttanay (Contributor) commented Feb 19, 2019

MySqlToGoogleCloudStorageOperator previously supported export from MySQL in newline-delimited JSON format only.

This PR adds support for export in CSV format, with the option of specifying a field delimiter.

Thanks to Bernardo Najlis (@bnajlis) for the original PR (#3139).
I made some changes to the original PR.
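For illustration only (this is not code from the PR), here is a stdlib-only sketch of how the same rows serialize in the two export formats the operator now supports: newline-delimited JSON versus CSV with a custom field delimiter. The column names and row data are made up for the example:

```python
import csv
import io
import json

# Example rows as a MySQL cursor might return them (hypothetical data).
columns = ["id", "name", "city"]
rows = [(1, "Alice", "Paris"), (2, "Bob", "Berlin")]

# Newline-delimited JSON: one JSON object per row, one row per line.
ndjson = "\n".join(json.dumps(dict(zip(columns, row))) for row in rows)

# CSV with a custom field delimiter, e.g. "|" instead of the default ",".
buf = io.StringIO()
writer = csv.writer(buf, delimiter="|")
writer.writerow(columns)
writer.writerows(rows)
csv_output = buf.getvalue()

print(ndjson)
print(csv_output)
```

The CSV path also emits a header row of column names, which the JSON format does not need since every line carries its own keys.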

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Added tests:
  • test_init
  • test_exec_success_json
  • test_exec_success_csv
  • test_exec_success_csv_with_delimiter
  • test_file_splitting
  • test_schema_file
    Replaced the pre-existing test with the one @bnajlis wrote, since the new tests cover that case too.
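As an aside, the behavior that test_file_splitting exercises — starting a new export file once the current one reaches a size threshold — can be sketched with the stdlib alone. The helper name and threshold below are hypothetical, not the operator's actual code:

```python
import json

def split_rows(rows, max_bytes):
    """Group serialized rows into chunks, starting a new chunk once the
    current one would exceed max_bytes (a sketch of the operator's
    approximate max-file-size splitting)."""
    chunks, current, size = [], [], 0
    for row in rows:
        line = json.dumps(row) + "\n"
        if current and size + len(line) > max_bytes:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks

# Ten 10-byte lines with a 30-byte cap split into chunks of 3, 3, 3, 1 rows.
rows = [{"id": i} for i in range(10)]
chunks = split_rows(rows, max_bytes=30)
```

The threshold is approximate in the same spirit as the operator's: a chunk is closed just before it would overflow, so every row stays whole.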

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.
    • All the public functions and classes in the PR contain docstrings that explain what they do

Code Quality

  • Passes flake8

@ttanay (Contributor, Author) commented Feb 20, 2019

The test failure is in job 11.

test_integration_run_dag_with_scheduler_failure (tests.contrib.minikube.test_kubernetes_executor.KubernetesExecutorTest) ... pod "airflow-7688fb5758-8h4dz" deleted
api call failed. trying again. error HTTPConnectionPool(host='10.20.6.62', port=30809): Max retries exceeded with url: /api/experimental/dags/example_kubernetes_executor_config/dag_runs/2019-02-20T00:27:25+00:00/tasks/start_task (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7faf733be6a0>: Failed to establish a new connection: [Errno 111] Connection refused',))

This is unrelated to the changes in this PR.

@ttanay (Contributor, Author) commented Feb 22, 2019

Job 7 fails with this message:

ERROR: for redis  Get https://registry-1.docker.io/v2/library/redis/manifests/5.0.1: Get https://auth.docker.io/token?scope=repository%3Alibrary%2Fredis%3Apull&service=registry.docker.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Traceback (most recent call last):
  File "bin/docker-compose", line 6, in <module>
  File "compose/cli/main.py", line 71, in main
  File "compose/cli/main.py", line 127, in perform_command
  File "compose/cli/main.py", line 716, in pull
  File "compose/project.py", line 558, in pull
TypeError: sequence item 0: expected a bytes-like object, str found
[4622] Failed to execute script docker-compose
The command "docker-compose -f scripts/ci/docker-compose.yml pull --quiet --parallel" failed and exited with 255 during .
Your build has been stopped.

There was a failure in the build itself for this job.
Can this individual job alone be restarted?

@zhongjiajie (Member) commented:

@ttanay Maybe you could rebase on master and git push -f to your branch to restart the CI test.
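The rebase-and-force-push suggestion amounts to something like the following, assuming the fork's remote is named origin and the Apache repo is configured as upstream (these remote names are conventions, not taken from the thread):

```shell
# Bring the PR branch up to date with apache/master, then force-push
# so Travis re-runs CI against the rebased commits.
git fetch upstream master
git rebase upstream/master
git push --force-with-lease origin HEAD
```

`--force-with-lease` is a safer variant of the `git push -f` mentioned above: it refuses to overwrite the remote branch if someone else has pushed to it in the meantime.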

@ttanay (Contributor, Author) commented Feb 22, 2019

> @ttanay Maybe you could rebase on master and git push -f to your branch to restart the CI test.

Done

@ttanay (Contributor, Author) commented Feb 22, 2019

Job 9 failed with:

Authenticating as principal admin/admin with password.
kadmin: Cannot resolve network address for admin server in requested realm while initializing kadmin interface
ERROR: InvocationError for command '/app/scripts/ci/2-setup-kdc.sh' (exited with code 1)
___________________________________ summary ____________________________________
ERROR:   py35-backend_postgres-env_docker: commands failed
The command "if [ -z "$KUBERNETES_VERSION" ]; then docker-compose --log-level ERROR -f scripts/ci/docker-compose.yml run airflow-testing /app/scripts/ci/run-ci.sh; fi" exited with 1.

Unrelated to the changes in this PR. Is there a chance something is breaking in the Travis tests?
Force pushing again to trigger the tests.

…ator

MySqlToGoogleCloudStorageOperator supported export from MySQL in
newline-delimited JSON format only. Added support for export
from MySQL in CSV format with the option of specifying a field
delimiter. Thanks to Bernardo Najlis (@bnajlis) for the original
PR (apache#3139). I made some changes to the original PR.
@codecov-io commented
Codecov Report

Merging #4738 into master will increase coverage by 0.17%.
The diff coverage is 97.36%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4738      +/-   ##
==========================================
+ Coverage   74.61%   74.78%   +0.17%     
==========================================
  Files         431      431              
  Lines       28044    28064      +20     
==========================================
+ Hits        20925    20989      +64     
+ Misses       7119     7075      -44
Impacted Files Coverage Δ
airflow/contrib/operators/mysql_to_gcs.py 90.07% <97.36%> (+38%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4745910...09e9936. Read the comment docs.

@ttanay (Contributor, Author) commented Mar 6, 2019

All tests pass.

@Fokko (Contributor) left a comment

Looks good. The default behavior is maintained, and the tests are there.

@Fokko Fokko merged commit ab4d0f6 into apache:master Mar 6, 2019
@ttanay (Contributor, Author) commented Mar 6, 2019

Thanks @Fokko!

ashb pushed a commit that referenced this pull request Mar 21, 2019
ashb pushed a commit that referenced this pull request Mar 22, 2019
andriisoldatenko pushed a commit to andriisoldatenko/airflow that referenced this pull request Jul 26, 2019
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
Labels: None yet
Projects: None yet
4 participants