From d28caac32545c66df4964e3504204faf90ff3e87 Mon Sep 17 00:00:00 2001
From: Dushyant Bhalgami
Date: Sun, 2 Jun 2024 17:53:58 +0200
Subject: [PATCH 1/2] fix(ingestion/airflow-plugin): updated the document for developers

---
 docs/lineage/airflow.md          |  2 +-
 metadata-ingestion/developing.md | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/docs/lineage/airflow.md b/docs/lineage/airflow.md
index f0952309c328a..1745c23cb1923 100644
--- a/docs/lineage/airflow.md
+++ b/docs/lineage/airflow.md
@@ -69,7 +69,7 @@ enabled = True # default
 | -------------------------- | -------------------- | ---------------------------------------------------------------------------------------- |
 | enabled | true | If the plugin should be enabled. |
 | conn_id | datahub_rest_default | The name of the datahub rest connection. |
-| cluster | prod | name of the airflow cluster |
+| cluster | prod | name of the airflow cluster, this is equivalent to the `env` of the instance |
 | capture_ownership_info | true | Extract DAG ownership. |
 | capture_tags_info | true | Extract DAG tags. |
 | capture_executions | true | Extract task runs and success/failure statuses. This will show up in DataHub "Runs" tab. |
diff --git a/metadata-ingestion/developing.md b/metadata-ingestion/developing.md
index c0d004e961059..60806203d0f81 100644
--- a/metadata-ingestion/developing.md
+++ b/metadata-ingestion/developing.md
@@ -34,6 +34,27 @@ cd metadata-ingestion-modules/airflow-plugin
 ../../gradlew :metadata-ingestion-modules:airflow-plugin:installDev
 source venv/bin/activate
 datahub version # should print "DataHub CLI version: unavailable (installed in develop mode)"
+
+# start the airflow web server
+export AIRFLOW_HOME=~/airflow
+airflow webserver --port 8090 -d
+
+# start the airflow scheduler
+airflow scheduler
+
+# access the airflow service and run any of the DAG
+open http://localhost:8090/
+select any DAG and click on the `play arrow` button to start the DAG
+
+# add the debug lines in the codebase, i.e. in ./src/datahub_airflow_plugin/datahub_listener.py
+logger.debug("this is the sample debug line")
+
+# run the DAG again and you can see the debug lines in the task_run log at,
+1. click on the `timestamp` in the `Last Run` column
+2. select the task
+3. click on the `log` option
+
+P.S. if you are not able to see the log lines, then restart the `airflow scheduler` and rerun the DAG
 ```
 
 ### (Optional) Set up your Python environment for developing on Dagster Plugin

From cbd249e8a469a22680699c704e994b2338fa46d6 Mon Sep 17 00:00:00 2001
From: Dushyant Bhalgami
Date: Wed, 5 Jun 2024 18:24:12 +0200
Subject: [PATCH 2/2] fix(ingestion/airflow-plugin): fixed review comments

---
 metadata-ingestion/developing.md | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/metadata-ingestion/developing.md b/metadata-ingestion/developing.md
index 60806203d0f81..e0dbc7c8d4b14 100644
--- a/metadata-ingestion/developing.md
+++ b/metadata-ingestion/developing.md
@@ -43,19 +43,21 @@ airflow webserver --port 8090 -d
 airflow scheduler
 
 # access the airflow service and run any of the DAG
-open http://localhost:8090/
-select any DAG and click on the `play arrow` button to start the DAG
+# open http://localhost:8090/
+# select any DAG and click on the `play arrow` button to start the DAG
 
 # add the debug lines in the codebase, i.e. in ./src/datahub_airflow_plugin/datahub_listener.py
 logger.debug("this is the sample debug line")
 
 # run the DAG again and you can see the debug lines in the task_run log at,
-1. click on the `timestamp` in the `Last Run` column
-2. select the task
-3. click on the `log` option
-
-P.S. if you are not able to see the log lines, then restart the `airflow scheduler` and rerun the DAG
+#1. click on the `timestamp` in the `Last Run` column
+#2. select the task
+#3. click on the `log` option
 ```
+
+
+> **P.S. if you are not able to see the log lines, then restart the `airflow scheduler` and rerun the DAG**
+
 
 ### (Optional) Set up your Python environment for developing on Dagster Plugin
 From the repository root: