Allow writing to S3 paths #8508
I'll try to get to this by 0.16
any PRs open on this? I thought of this too and think it'd be great. I can take a shot at it if not.
Go for it!
https://github.com/wrobstory/pgshift/blob/master/pgshift/pgshift.py#L99 could be a helpful template
I also wanted the ability to write a DF as a CSV to S3 (acronym overload...), and wrote the following snippet:
Hope this helps anyone trying to do the same thing. IMO, this feature shouldn't be added to pandas. I think this snippet (or a better version) should simply be documented.
Edit: This is using Python 3.5 & boto3. I'm sure a similar snippet will work for 2.7 or the old boto.
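The snippet itself was not preserved in this copy of the thread. A minimal sketch of the approach described (Python 3 with boto3; the bucket and key names are placeholders, and the function names are illustrative, not pandas API):

```python
import io

import pandas as pd


def df_to_csv_bytes(df):
    # Serialize the DataFrame to CSV in an in-memory text buffer, then
    # encode to bytes, since S3 object bodies are binary.
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    return buf.getvalue().encode("utf-8")


def upload_df_as_csv(df, bucket, key):
    # boto3 is imported lazily so the pure serialization helper above
    # can be used (and tested) without AWS dependencies installed.
    import boto3

    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=df_to_csv_bytes(df))
```

The two-step split keeps the serialization logic separate from the upload, which makes the CSV-encoding part easy to test without S3 credentials.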
Any hope of supporting writing to S3 for the new release? Now that Parquet is supported this becomes doubly interesting.
@maximveksler this is not very hard to do, as we already have a dep for S3 interactions with https://pypi.python.org/pypi/s3fs.
@jreback sure, please point me to some relevant locations and I'll gladly PR. I would appreciate focus on what from s3fs and pyarrow I should be looking into for more details.
http://pandas.pydata.org/pandas-docs/stable/contributing.html
Here are tests for reading; writing routines should be in pandas/io/s3.py
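Wherever the writer ends up living, it would first need to split an `s3://` URL into bucket and key. A hypothetical helper for that step (the name is illustrative, not existing pandas API):

```python
from urllib.parse import urlparse


def split_s3_url(url):
    # "s3://my-bucket/some/key.csv" -> ("my-bucket", "some/key.csv")
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError("not an S3 URL: %r" % url)
    return parsed.netloc, parsed.path.lstrip("/")
```

In practice s3fs accepts the `bucket/key` form directly, so a helper like this is mainly useful for validation and error messages.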
Hi! Was the PR created? |
This is related to #19135
@maximveksler what about other writers, like .to_csv()?
@CrossNox Looking at the implementation, I think it "should work". Could you please test and update?
Python v2.7.12
Raises the following error:
So, right now I'm saving the file as:
@TomAugspurger I would like to sort this out, as well as writing to GCS, as a follow-up to #20729. Is there a reason that reading and writing generally seem to use two different methods for accessing file-like objects (get_filepath_or_buffer vs. get_handle)? For S3 specifically there is another issue (sort of captured by #9712), which is that writing CSVs into S3 involves text vs. binary mode handling.
Great! I'm not sure about get_filepath_or_buffer vs. get_handle. I'm not that familiar with the parser code.
take |
Using version 0.25.1, I can do the following:

```python
df = pd.DataFrame({"a": range(5)})
df.to_csv("s3://test-key/test.csv")
```

So only the documentation is missing.
It would be really great if `to_(filetype)` supported writing to S3. Here is an example upload-to-S3 function that takes in a local file and places it on an S3 bucket:
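The example function from the issue body was not preserved in this copy. A hedged sketch of such a helper (boto3; all names are illustrative placeholders):

```python
import os


def s3_key_for(local_path, prefix=""):
    # Derive the object key from the local filename (illustrative
    # choice; any key scheme would do).
    return prefix + os.path.basename(local_path)


def upload_file_to_s3(local_path, bucket, prefix=""):
    # upload_file streams the local file, switching to multipart
    # uploads automatically for large files.
    import boto3

    boto3.client("s3").upload_file(local_path, bucket, s3_key_for(local_path, prefix))
```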