[AIRFLOW-58] Add bulk_dump abstract method to DbApiHook #1471

Merged
merged 1 commit into from
May 6, 2016

underyx
Contributor

@underyx underyx commented May 6, 2016

Dear Airflow Maintainers,

Please accept this PR that addresses the following issues:

@mistercrunch
Member

LGTM, merging

@mistercrunch mistercrunch merged commit aff5d8c into apache:master May 6, 2016
@r39132
Contributor

r39132 commented May 6, 2016

I don't think a tab-delimited dump, or any text-format dump, is fool-proof, and I'm -1 on this: for example, what if a text column contains tabs?

Please remove this. I'd prefer a hook that provided serialization options (e.g. avro, thrift, protobufs), all of which support schema'd binary formats.
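
To make the tab concern concrete, here is a minimal Python sketch. It is not Airflow code: `naive_dump`, `naive_load`, `quoted_dump`, and `quoted_load` are hypothetical helpers written only for this illustration. It shows a naive tab-joined dump splitting one value into two fields, and the standard `csv` module's quoting round-tripping the same value.

```python
import csv
import io

# Two rows; the second has a text column that itself contains a tab.
rows = [(1, "plain text"), (2, "text with\ttab")]


def naive_dump(rows):
    """Hypothetical dump: join fields with tabs, no quoting or escaping."""
    return "\n".join("\t".join(str(value) for value in row) for row in rows)


def naive_load(text):
    """Hypothetical load: split every line on tabs."""
    return [line.split("\t") for line in text.split("\n")]


def quoted_dump(rows):
    """Tab-delimited dump that quotes fields containing the delimiter."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


def quoted_load(text):
    """Parse the quoted tab-delimited text back into rows of strings."""
    return list(csv.reader(io.StringIO(text), delimiter="\t"))


naive = naive_load(naive_dump(rows))
quoted = quoted_load(quoted_dump(rows))

print(len(naive[1]))   # the embedded tab produced an extra field
print(quoted[1])       # the quoted variant keeps the value in one field
```

Quoting fixes the delimiter problem but still returns every value as a string, so it does not address the request for schema'd formats.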

@underyx
Contributor Author

underyx commented May 6, 2016

Airflow has supported loading tab-delimited data into MySQL since 1.5.2.

I have no strong opinion on the matter, and would be okay with the removal of the feature, but I'd note that the dbapi hooks are already subject to various similar limitations that users seem to be expected to discover on their own — such as serialization of unknown data types being just str(obj).
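
As a rough illustration of the `str(obj)` limitation mentioned above, the sketch below uses a hypothetical `to_cell` helper (an assumption for this example, not the hook's actual serializer) to show how a plain `str()` fallback loses type information.

```python
import datetime
from decimal import Decimal


def to_cell(value):
    """Hypothetical fallback serializer: every value becomes str(value)."""
    return str(value)


values = [None, "None", b"\x00\x01", Decimal("1.10"), datetime.date(2016, 5, 6)]
cells = [to_cell(value) for value in values]

# A NULL and the literal string "None" become indistinguishable,
# and bytes are written as their Python repr rather than raw data.
for value, cell in zip(values, cells):
    print(type(value).__name__, "->", repr(cell))
```

A reader of the dumped text cannot tell which of these cells was originally a NULL, a string, or binary data, which is the kind of limitation users would have to discover on their own.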

yiqingj pushed a commit to yiqingj/airflow that referenced this pull request May 27, 2016
* master:
  AIRFLOW-92 Avoid unneeded upstream_failed session closes apache#1485
  AIRFLOW-52 Warn about overwriting tasks in a DAG
  Add logic to lock DB and avoid race condition
  Handle queued tasks from multiple jobs/executors
  [AIRFLOW-80] Move example_twitter dag to contrib/example_dags as it requires hive
  [AIRFLOW-75] Fix bug in S3 config file parsing
  Use getfqdn to make sure urls are fully qualified
  [AIRFLOW-52] Fix bottlenecks when working with many tasks
  Add bulk_dump abstract method to DbApiHook (apache#1471)
  Fix corner case with joining processes/queues (apache#1473)
  [AIRFLOW-53] Adding DagBag stats report to CLI's list_dags (apache#1468)