Git DAG Bundle clone fail when using repo_url arguments for private repo #54642

@adrian-edbert

Description

Apache Airflow Provider(s)

git

Versions of Apache Airflow Providers

apache-airflow-providers-git==0.0.6

Apache Airflow version

3.0.3

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Docker-Compose

Deployment details

image: apache/airflow:3.0.3-python3.12

What happened

When the Git DAG bundle is configured with a repo_url parameter that points to a private repository over HTTPS, the initial bare clone fails: the git command exits with code 128.
This happens because git clone uses the repo_url from the bundle parameters instead of the URL from the Git hook, which should already include the username and password.

What you think should happen instead

The Git DAG bundle should clone the repository successfully.
When a Git connection is configured, the bundle should clone using the repo_url from the Git hook rather than the raw repo_url parameter.
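
To illustrate the expected behaviour, here is a minimal sketch of how an authenticated clone URL could be derived from the connection's login and password. The function name `with_credentials` is hypothetical and is not part of the git provider; it only shows the shape of the URL that git clone would need to receive.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(repo_url: str, username: str, token: str) -> str:
    """Return repo_url with username and token embedded (HTTPS URLs only).

    Hypothetical helper for illustration; not the provider's implementation.
    """
    parts = urlsplit(repo_url)
    if parts.scheme != "https" or "@" in parts.netloc:
        # Leave SSH-style URLs and URLs that already carry credentials untouched.
        return repo_url
    # Percent-encode both values so characters such as "@" or "/" stay valid.
    userinfo = f"{quote(username, safe='')}:{quote(token, safe='')}"
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

With the connection from the reproduction steps, the clone would then target a URL of the form https://<token name>:<token value>@gitlab.com/... instead of the bare repo_url.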

How to reproduce

Create the Git connection:

airflow connections add 'git-dags-conn' \
    --conn-type git \
    --conn-login "$GITLAB_TOKEN_NAME" \
    --conn-password "$GITLAB_TOKEN_VALUE" \
    --conn-host "https://gitlab.com/$GITLAB_USER/$GITLAB_REPO.git"

Set AIRFLOW__DAG_PROCESSOR__DAG_BUNDLE_CONFIG_LIST for the DAG processor, with repo_url set to the HTTPS URL of the private repo:

{
  "name": "data_platform_test_migration_dag",
  "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
  "kwargs": {
    "tracking_ref": "main",
    "git_conn_id": "git-dags-conn",
    "repo_url": "https://gitlab.com/$GITLAB_USER/$GITLAB_REPO.git"
  }
},

Run the DAG processor.
It fails with:

...
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git clone -v --bare -- https://gitlab.com/$GITLAB_USER/$GITLAB_REPO.git 

fatal: could not read Username for 'https://gitlab.com': No such device or address

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/dag_processing/manager.py", line 503, in _refresh_dag_bundles
    bundle.initialize()
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/git/bundles/git.py", line 111, in initialize
    self._initialize()
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/git/bundles/git.py", line 94, in _initialize
    self._clone_bare_repo_if_required()
  File "/home/airflow/.local/lib/python3.12/site-packages/airflow/providers/git/bundles/git.py", line 142, in _clone_bare_repo_if_required
    raise AirflowException("Error cloning repository") from e
airflow.exceptions.AirflowException: Error cloning repository

As a workaround, the token can be embedded in the repo_url parameter, but this may expose the token.
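
A rough sketch of that workaround, assuming the bundle config is generated at deploy time so the token is not committed to a file. The function `bundle_config` is hypothetical; the config keys are the ones from the reproduction steps. Whether git_conn_id is still needed once the token is in repo_url is not verified here, so it is kept as in the original config.

```python
import json
from urllib.parse import quote


def bundle_config(user: str, repo: str, token_name: str, token_value: str) -> str:
    """Build a JSON bundle config list with the token embedded in repo_url.

    Hypothetical helper for illustration only.
    """
    # Percent-encode the credentials so special characters do not break the URL.
    auth = f"{quote(token_name, safe='')}:{quote(token_value, safe='')}"
    repo_url = f"https://{auth}@gitlab.com/{user}/{repo}.git"
    return json.dumps(
        [
            {
                "name": "data_platform_test_migration_dag",
                "classpath": "airflow.providers.git.bundles.git.GitDagBundle",
                "kwargs": {
                    "tracking_ref": "main",
                    "git_conn_id": "git-dags-conn",
                    "repo_url": repo_url,
                },
            }
        ]
    )
```

The returned string would be exported as AIRFLOW__DAG_PROCESSOR__DAG_BUNDLE_CONFIG_LIST. The token is then visible in the process environment, and because the error output above prints the git command line including the URL, it can also end up in logs.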

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
