Error while trying to write data to Google Cloud Storage: Generic GCS error: Unsupported ApplicationCredentials type: service_account #1978

Open
saumyasuhagiya opened this issue Dec 17, 2023 · 3 comments
Labels
bug Something isn't working

saumyasuhagiya commented Dec 17, 2023

Environment

Delta-rs version:
python-v0.14.0

Binding:
Python

Environment:

  • Cloud provider: Google
  • OS: macOS
  • Other: delta 3.0

Bug

I'm getting this error while trying to write data to Google Cloud Storage.

I have environment variables set to the service account key path:

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_key'
os.environ['SERVICE_ACCOUNT'] = 'path_to_json_key'

I'm trying to run the following command, where pd is a pandas DataFrame:

write_deltalake("gs://write/to/this/folder/", pd, filesystem = fs.GcsFileSystem, mode='append')

I also tried with a regular DataFrame named df:

write_deltalake("gs://write/to/this/folder/", df, filesystem = fs.GcsFileSystem, mode='append')

File "/Users/xyz/write_to_delta/get_write_to_delta_data.py", line 85, in <module>
    write_data()
  File "/Users/xyz/write_to_delta/get_write_to_delta_data.py", line 83, in write_data
    write_deltalake("gs://write/to/this/folder/", pd, filesystem = fs.GcsFileSystem, mode='append')  # should use fs
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xyz/venv/lib/python3.11/site-packages/deltalake/writer.py", line 238, in write_deltalake
    table, table_uri = try_get_table_and_table_uri(table_or_uri, storage_options)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xyz/venv/lib/python3.11/site-packages/deltalake/writer.py", line 608, in try_get_table_and_table_uri
    table = try_get_deltatable(table_or_uri, storage_options)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xyz/venv/lib/python3.11/site-packages/deltalake/writer.py", line 621, in try_get_deltatable
    return DeltaTable(table_uri, storage_options=storage_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xyz/venv/lib/python3.11/site-packages/deltalake/table.py", line 259, in __init__
    self._table = RawDeltaTable(
                  ^^^^^^^^^^^^^^
OSError: Generic GCS error: Unsupported ApplicationCredentials type: service_account

What happened:
The error shown above was thrown.

What you expected to happen:
Should be able to write successfully.

How to reproduce it:
The details above should be enough to reproduce it. Below are the package versions in my requirements.txt:

delta-spark==3.0.0
deltalake==0.14.0
pyarrow==14.0.1
pyspark==3.5.0

More details:
I followed #878 and #570, but I'm not sure whether this is supported. Based on this doc, https://delta-io.github.io/delta-rs/api/delta_writer/#deltalake.write_deltalake, it seems it is.

@saumyasuhagiya saumyasuhagiya added the bug Something isn't working label Dec 17, 2023
@ion-elgreco
Collaborator

Passing a custom filesystem is not implemented. Pass the credentials as storage options instead; the available options are listed here: https://docs.rs/object_store/latest/object_store/gcp/struct.GoogleCloudStorageBuilder.html
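For anyone landing here, a minimal sketch of what passing storage options could look like. The key name "google_service_account" is taken from the object_store GCS configuration keys linked above; the helper names gcs_storage_options and append_to_table are made up for illustration. Whether this alone resolves the error on 0.14.0 is not confirmed in this thread.

```python
import os


def gcs_storage_options(key_path=None):
    """Build storage options that point at a service account key file."""
    # Fall back to the environment variable used in the original report.
    key_path = key_path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        raise ValueError("no service account key path given")
    # Assumption: "google_service_account" is the object_store config key
    # for the path to a service account JSON file.
    return {"google_service_account": key_path}


def append_to_table(table_uri, data, key_path=None):
    """Append data to the Delta table at table_uri (hypothetical helper)."""
    # Imported lazily so the options helper works without deltalake installed.
    from deltalake import write_deltalake

    write_deltalake(
        table_uri,
        data,
        mode="append",
        storage_options=gcs_storage_options(key_path),
    )
```

Usage would be along the lines of append_to_table("gs://write/to/this/folder/", df, "path_to_json_key"), with no filesystem argument.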

@rtyler rtyler added this to the Rust v0.18 milestone Feb 6, 2024

sor-droneup commented Feb 12, 2024

I'm having a similar error. I have the path to the service account JSON stored in the environment variables GOOGLE_APPLICATION_CREDENTIALS and GOOGLE_SERVICE_ACCOUNT, as described in the link above.

To read the table I'm using:
dt = DeltaTable("gs://bucket-name/path/to/my/table")
But I keep getting an error:

-->  self._table = RawDeltaTable(
       str(table_uri),
       version=version,
       storage_options=storage_options,
       without_files=without_files,
       log_buffer_size=log_buffer_size,
     )

OSError: Generic GCS error: Invalid RSA key: InconsistentComponents
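One variation that may be worth trying is to pass the key contents explicitly instead of relying on environment variables. This is only a sketch: "google_service_account_key" is assumed to be the object_store config key for a serialized service account key, the helper names are hypothetical, and nothing in this thread confirms that it avoids the RSA error.

```python
import json


def storage_options_from_key_file(key_path):
    """Read a service account key file and pass its contents inline."""
    with open(key_path, encoding="utf-8") as f:
        key = json.load(f)  # fails early if the file is not valid JSON
    # Assumption: "google_service_account_key" accepts the key as a JSON string.
    return {"google_service_account_key": json.dumps(key)}


def open_table(table_uri, key_path):
    """Open a Delta table with explicit credentials (hypothetical helper)."""
    # Imported lazily so the options helper works without deltalake installed.
    from deltalake import DeltaTable

    return DeltaTable(
        table_uri,
        storage_options=storage_options_from_key_file(key_path),
    )
```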

My service account key has the following format:

{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "xxx",
  "private_key": "-----BEGIN PRIVATE KEY-----xxx-----END PRIVATE KEY-----\n",
  "client_email": "sa-email@my-project.iam.gserviceaccount.com",
  "client_id": "123",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/my-service-acc%40my-project.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}

The key works for other workloads (e.g. reading Delta through Spark).
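To rule out a damaged key file (for example smart quotes or a mangled PEM header introduced by copy and paste), a quick structural check along these lines might help. It is a rough sketch: the list of required fields is my assumption based on the format shown above, check_service_account_key is a hypothetical helper, and it does not validate the RSA components themselves.

```python
import json

# Assumption: fields expected in a service account key, based on the format above.
REQUIRED_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "token_uri",
)


def check_service_account_key(text):
    """Return a list of structural problems found in a key file's contents."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level value is not a JSON object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    if key.get("type") != "service_account":
        problems.append(f"unexpected type: {key.get('type')!r}")

    pem = key.get("private_key")
    if not isinstance(pem, str):
        pem = ""
    # Only the PEM markers are checked, not the key material between them.
    if not pem.strip().startswith("-----BEGIN PRIVATE KEY-----"):
        problems.append("private_key does not start with a PEM header")
    if not pem.strip().endswith("-----END PRIVATE KEY-----"):
        problems.append("private_key does not end with a PEM footer")
    return problems
```

An empty list only means the file looks structurally plausible; the InconsistentComponents error could still come from elsewhere.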


dvigh8 commented Jun 27, 2024

#2192 mentions the same error. The service account file is valid and works with other packages, e.g. pandas, but does not work with deltalake as of version 0.18.1 on aarch64. I was able to get it to work by switching the VM to x86_64 with the same deltalake version.
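Since the behaviour seems to depend on CPU architecture, it may help if others who hit this include their platform details when reporting. A small sketch that collects them using only the standard library (environment_report is a hypothetical helper name):

```python
import platform
from importlib import metadata


def environment_report(packages=("deltalake", "pyarrow")):
    """Collect architecture, OS, Python, and package versions for a bug report."""
    report = {
        "machine": platform.machine(),  # CPU architecture as reported by the OS
        "system": platform.system(),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for field, value in environment_report().items():
        print(f"{field}: {value}")
```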
