Hi @nkarpov, as you might have guessed, we turned off some of these parameters to keep the size of the layer below the Lambda limit (250 MB unzipped, 50 MB zipped). The PyArrow dataset module is one of them, as it was deemed not essential to our methods.
We would be willing to reconsider as long as the impact on the layer size is reasonable. So my suggestion, if you accept, is for your team to go through the exercise of building our layer with the Arrow arguments you require for yours. Once you have them, and if the layer size is still reasonable, we will look into publishing it.
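As a rough aid for the size exercise described above, here is a minimal sketch that measures an unpacked layer directory (and optionally its zip) against the limits quoted in this thread. The helper names, the result keys, and the conversion of 1 MB to 1024 * 1024 bytes are assumptions for illustration; nothing here is part of the repository's build scripts.

```python
import os

# Limits as quoted in the thread. The byte conversion is an assumption.
UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024
ZIPPED_LIMIT_BYTES = 50 * 1024 * 1024


def directory_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked shared libraries are not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_layer_limits(unzipped_dir, zip_path=None):
    """Compare a layer directory (and optional zip file) with the limits."""
    unzipped = directory_size(unzipped_dir)
    report = {
        "unzipped_bytes": unzipped,
        "unzipped_ok": unzipped <= UNZIPPED_LIMIT_BYTES,
    }
    if zip_path is not None:
        zipped = os.path.getsize(zip_path)
        report["zipped_bytes"] = zipped
        report["zipped_ok"] = zipped <= ZIPPED_LIMIT_BYTES
    return report
```

Running `fits_layer_limits` on the layer before and after enabling an extra Arrow component would show how much that component costs.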
I've created PR #1977, which I hope is a tolerable bump in size. I think other packages are also likely to benefit from this change in the future, since it brings this internal PyArrow build closer to the published pip version.
We're exploring building a compatible layer for aws-sdk-pandas (delta-io/delta-rs#1108) now that `deltalake` is integrated with #1834.

Today the pyarrow build in `./building/lambda/build-lambda-layer.sh` is not generating some of the optional components found in https://github.com/apache/arrow/blob/master/docs/source/developers/cpp/building.rst#optional-components and which `deltalake` requires.

Notably, `dataset`, as the error in delta-io/delta-rs#1108 shows, but perhaps more too (need to investigate).

Ideally, we'd have the pyarrow build in this repository supply all the required libraries (some are turned on already), so that the delta-rs/deltalake repo can publish its own compatible layer.

Thoughts?
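To help with the "need to investigate" part, here is a small sketch that probes which optional PyArrow submodules import successfully in a given environment, such as inside the built layer. The candidate list and the pairing with Arrow CMake option names are illustrative assumptions; only `dataset` is confirmed as missing in this thread, and the set `deltalake` actually needs is not established here.

```python
import importlib

# Hypothetical candidate list: optional PyArrow submodules paired with the
# Arrow C++ build option that typically enables them. Only pyarrow.dataset
# is named in this issue; the others are included for illustration.
CANDIDATE_MODULES = {
    "pyarrow.dataset": "ARROW_DATASET",
    "pyarrow.parquet": "ARROW_PARQUET",
    "pyarrow.flight": "ARROW_FLIGHT",
    "pyarrow.orc": "ARROW_ORC",
}


def probe_modules(names):
    """Return {module_name: True/False} for whether each module imports."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Covers both a missing package and a missing compiled extension.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in probe_modules(CANDIDATE_MODULES).items():
        flag = CANDIDATE_MODULES[name]
        status = "available" if ok else "MISSING"
        print(f"{name:20s} {status:10s} (build option: {flag})")
```

Running this in the Lambda layer environment and in a plain `pip install pyarrow` environment would show which components differ between the two builds.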