New dependency on pyarrow introduces heavyweight numpy sub-dependency
#1196
Comments
Possibly a duplicate of #1142
For some background, the reason we added
Concur with the post. We were using this as a replacement for the now-bloated google-api-python-client, and we will be forced to pin to the latest non-problematic version because we use this from an AWS Lambda environment, which has very specific restrictions on deployment size.
If someone were to send a PR to do this, I'd be open to it, given the extra dependency does appear to block more use cases than I anticipated. We'd want to:
To implement this, we'd basically want to revert #776. I doubt a simple "revert" will be sufficient at this point, as 2.x has diverged a bit from 3.x.
I saw my binary size grow from 30-40 MB to 234 MB, and I found out it's because I upgraded google-cloud-bigquery and picked up a dependency on pyarrow and numpy. Any update on this bug? I don't think I use the parts of the bigquery package that require these dependencies, so the extra dependencies are just slowing down deployment and startup time (new binary is 5-6x bigger).
So now you'll be getting issues from people whose binary size is exploding. Are you sure you landed on the right side of this trade-off? ;-) Can't you just add a dynamic version check after the import?
Is your feature request related to a problem? Please describe.
The new dependency on pyarrow, introduced in #1178, creates a new sub-dependency on numpy. Without fully understanding why these dependencies were introduced, a required dependency on numpy feels unnecessarily large for this library.

Describe the solution you'd like
Make the pyarrow and numpy dependencies optional (via extras).

Describe alternatives you've considered
Pin my usage of google-cloud-bigquery back to a version that does not have these dependencies, or find a way to remove it entirely.

Additional context
The dependency on numpy in pyarrow: https://github.com/apache/arrow/blob/4a90e3994fc9fc10b968ab3439dec636385dec22/python/setup.py#L589-L591

(PS, thanks for your work on this library!)
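To make the "via extras" request concrete, here is a minimal sketch of how a setup.py could declare pyarrow as an optional extra. The extra names and the version bound are placeholders for illustration, not the library's real configuration; numpy would then only arrive transitively when the extra is requested.

```python
# Illustrative fragment only: the version bound and extra names are placeholders.
install_requires = [
    # core dependencies would be listed here, without pyarrow or numpy
]

extras_require = {
    "pyarrow": ["pyarrow >= 3.0.0"],
}

# Convenience extra that pulls in every optional dependency declared above.
extras_require["all"] = sorted(
    {req for reqs in extras_require.values() for req in reqs}
)

# Both values would then be passed to
# setuptools.setup(install_requires=..., extras_require=...).
```

Assuming the extra were named this way, users who need Arrow support would opt in with something like pip install "google-cloud-bigquery[pyarrow]", while a plain install would stay small.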