use s3fs authentication if provided

I have a use case where I need to download dataframes from multiple s3 buckets with different credentials.

By default, s3fs uses environment variables such as `AWS_PROFILE`, `AWS_ACCESS_KEY_ID`, etc. to determine credentials. However, this will not work for me, as I need different credentials for different buckets.

The s3fs docs show you can alternatively authenticate like so:
https://fs-s3fs.readthedocs.io/en/latest/#authentication
```python
s3fs = open_fs('s3://<access key>:<secret key>@mybucket')
```

I attempted to use this idea with pandas:
```python
df = pd.read_csv("s3://<access key>:<secret key>@mybucket/csv_key")
```

but this raised an exception deep within s3fs saying the bucket name is invalid, potentially caused by the stripping logic here:
https://github.com/pandas-dev/pandas/blob/master/pandas/io/s3.py#L29
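
To illustrate the suspected cause, here is a minimal sketch. It assumes the stripping logic simply drops the `s3://` scheme and keeps the network location plus path; `strip_schema` below is a hypothetical stand-in, not the pandas function itself. Under that assumption, the credentials stay attached to what s3fs would treat as the bucket name:

```python
from urllib.parse import urlparse


def strip_schema(url):
    # Hypothetical stand-in for the stripping step: drop "s3://", keep the rest.
    parsed = urlparse(url)
    return parsed.netloc + parsed.path


url = "s3://ACCESS:SECRET@mybucket/csv_key"
stripped = strip_schema(url)

# The first path component is what would be interpreted as the bucket name.
bucket_seen = stripped.split("/", 1)[0]
print(bucket_seen)  # credentials are still embedded in the "bucket"

# urlparse can already separate the pieces if we ask for them explicitly.
parsed = urlparse(url)
print(parsed.username, parsed.password, parsed.hostname)
```

If this is indeed what happens, the fix would need to pull the credentials out before the bucket name is derived.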

I think we could easily support authentication using this syntax:
```python
pd.read_csv("s3://<access key>:<secret key>@mybucket/csv_key")
```

By modifying the code here:
https://github.com/pandas-dev/pandas/blob/master/pandas/io/s3.py#L27

The idea is to first try to match `filepath_or_buffer` against a pattern that captures the access key and secret key. If it matches, we pass these into the s3fs filesystem constructor:

```python
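import re
import s3fs

# `pattern` is expected to capture (access_key, secret_key, bucket_name)
match = re.match(pattern, filepath_or_buffer)
if match is not None:
    access_key, secret_key, bucket_name = match.groups()
    # s3fs takes explicit credentials through the `key` and `secret` arguments
    fs = s3fs.S3FileSystem(key=access_key, secret=secret_key)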
...
```
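
For the matching step itself, something along these lines might work. This is only a sketch: `S3_AUTH_PATTERN` and `split_s3_credentials` are hypothetical names, and the pattern assumes neither key contains `/` or `@` (a secret key containing `/` would need to be percent-encoded first, which is not handled here).

```python
import re

# Hypothetical pattern for s3://<access key>:<secret key>@<bucket>/<key>
S3_AUTH_PATTERN = re.compile(
    r"^s3://(?P<access_key>[^:/@]+):(?P<secret_key>[^@/]+)"
    r"@(?P<bucket>[^/]+)/(?P<key>.+)$"
)


def split_s3_credentials(url):
    """Return (access_key, secret_key, url_without_credentials), or None
    if the URL carries no embedded credentials."""
    match = S3_AUTH_PATTERN.match(url)
    if match is None:
        return None
    # Rebuild a plain s3:// URL so the bucket name no longer includes the keys.
    clean_url = "s3://{bucket}/{key}".format(**match.groupdict())
    return match["access_key"], match["secret_key"], clean_url


with_auth = split_s3_credentials("s3://AKIAEXAMPLE:s3cr3t@mybucket/csv_key")
without_auth = split_s3_credentials("s3://mybucket/csv_key")
```

Returning `None` for URLs without credentials would let the existing code path (env variables, profiles) keep working unchanged.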



use s3fs authentication if provided #33639
