missing pyarrow dependency in google provider? #42924

Closed
2 tasks done
saucoide opened this issue Oct 11, 2024 · 3 comments · Fixed by #42996
Labels: area:dependencies, area:providers, good first issue, kind:bug, needs-triage, provider:google

Comments

@saucoide
Contributor

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

6.1, but it looks like it's the same in >10

Apache Airflow version

2.2

Operating System

tested linux, macos

Deployment

Virtualenv installation

Deployment details

linux/macos & uv pip to install the packages

What happened

I don't know if I'm missing some obvious reason for this, but pyarrow is not specified as a dependency for the google provider, while it definitely depends on it: https://github.com/apache/airflow/blob/main/providers/src/airflow/providers/google/cloud/transfers/sql_to_gcs.py#L29

If I do an install in a venv with:

dependencies = [
	"apache-airflow==2.2",
	"apache-airflow-providers-google==6.3",
	"google-cloud-bigquery>=1"
]

pyarrow won't be installed, and importing from sql_to_gcs will raise an exception.
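A quick way to confirm the symptom in a given venv is to ask Python whether pyarrow can be located at all. This is only a minimal sketch: the helper name `is_importable` is made up for illustration, and it assumes the failure comes from the `pyarrow` import that the linked line of sql_to_gcs.py points at.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    # find_spec returns None when no finder can locate the top-level module.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # If this prints False, importing from sql_to_gcs is expected to fail
    # in this environment (assumption: pyarrow is the only missing piece).
    print("pyarrow importable:", is_importable("pyarrow"))
```

Running this inside the venv created from the dependency list above shows whether the resolver pulled pyarrow in transitively or not.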

If I remove google-cloud-bigquery, it WILL be installed. I have no idea what causes this behavior, since google-cloud-bigquery does list pyarrow as a dependency. But when pyarrow does get installed, it is because pandas-gbq depends on it.

What you think should happen instead

IMO, if a package is used directly, then it's a direct dependency, and the provider shouldn't rely on it being available via indirect dependencies.

I can just add pyarrow myself and solve my problem, but I think the dependency should be explicitly defined in the provider.
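To check whether an installed distribution actually declares a package it imports, the standard library's `importlib.metadata.requires` can be used. The sketch below is illustrative only: `requirement_names` and `declares` are hypothetical helper names, the name parsing is a simplified regex rather than a full PEP 508 parser, and requirements that only apply to an extra are counted as declared.

```python
import re
from importlib import metadata


def _normalize(name):
    # PEP 503 normalization: lowercase, runs of "-", "_", "." become one "-".
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(requirements):
    """Extract normalized project names from requirement strings."""
    names = set()
    for req in requirements or []:
        # Simplified: take the leading project name, ignore specifiers/markers.
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match:
            names.add(_normalize(match.group(1)))
    return names


def declares(dist_name, dependency):
    """Return True if the installed distribution lists `dependency` in its metadata."""
    try:
        reqs = metadata.requires(dist_name)  # may be None if nothing is declared
    except metadata.PackageNotFoundError:
        return False
    return _normalize(dependency) in requirement_names(reqs)
```

For example, `declares("apache-airflow-providers-google", "pyarrow")` reports whether the installed provider version names pyarrow among its requirements, which is the gap this issue describes.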

How to reproduce

[project]
name = "tests"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.8"
dependencies = [
	"apache-airflow==2.2",
	"apache-airflow-providers-google==6.3",
	"google-cloud-bigquery>=1"
]

uv pip install .

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

saucoide added the area:providers, kind:bug, and needs-triage labels on Oct 11, 2024

boring-cyborg bot commented Oct 11, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.

dosubot bot added the area:dependencies and provider:google labels on Oct 11, 2024
@potiuk
Member

potiuk commented Oct 13, 2024

Sure - feel free to add it to "provider.yaml" in the "providers/google/..." folder. I marked it as a good first issue to fix. Just add it in the same way as in other provider.yaml files:

 - pyarrow>=14.0.1

@potiuk
Member

potiuk commented Oct 13, 2024

BTW. Temporary workaround - until you fix it and we release a new provider - is to explicitly add "pyarrow>=14.0.1" in your dependencies.
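After applying the workaround, one way to sanity-check that the environment satisfies the suggested lower bound is a small version comparison. This is a rough sketch, not a full PEP 440 implementation: `release_tuple` and `meets_minimum` are hypothetical helpers that look only at the leading numeric release segment, so pre-release or epoch versions are not handled properly.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Parse the leading numeric release segment, e.g. '14.0.1' -> (14, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare release segments, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        installed = metadata.version("pyarrow")
        print("pyarrow", installed, "ok:", meets_minimum(installed, "14.0.1"))
    except metadata.PackageNotFoundError:
        print("pyarrow is not installed")
```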
