What is the proper way to upload a file? #665

Closed
SaintRod opened this issue Jan 16, 2024 · 2 comments

Comments

@SaintRod

SaintRod commented Jan 16, 2024

Hello, all. I'm reviving an old issue since the codebase has been updated.

I am confused about how to upload a file to an AWS S3 bucket. I'm new to AWS and AWS.jl, but I can successfully call S3.list_objects_v2 and S3.get_object, which implies I have valid credentials and can access the bucket via AWS.jl.

I have a directory /proj_root/data/output containing Arrow files such as data_01.arrow. I want to upload these files to my bucket. I've tried S3.put_object, but it doesn't do what I expect: the call doesn't error, and an object called data_01.arrow does appear in the bucket afterward, but it is a text file (.txt) of size 0 KB. For example:

aws_params = Dict("Body" => "/proj_root/data/output/data_01.arrow")
S3.put_object(bucket, key, aws_params; aws_config)

I've also tried the below, but it resulted in an error. One of the columns/fields in the arrow table is of type date.

aws_params = Dict("Body" => Arrow.Table("/proj_root/data/output/data_01.arrow"))
S3.put_object(bucket, key, aws_params; aws_config)

ERROR: MethodError: no method matching iterate(::Dates.Date)

Closest candidates are:
  iterate(::Union{LinRange, StepRangeLen})
   @ Base range.jl:880
  iterate(::Union{LinRange, StepRangeLen}, ::Integer)
   @ Base range.jl:880
  iterate(::T) where T<:Union{Base.KeySet{<:Any, <:Dict}, Base.ValueIterator{<:Dict}}
   @ Base dict.jl:698
  ...

Could someone help me with this? Much appreciated.

PS. I've looked online but found little regarding Julia. I then searched for answers in Python via boto3 in case they'd help. However, most of the answers online reference streaming, which, based on my reading of related closed issues, AWS.jl no longer supports.

@ericphanson
Member

The body there doesn't look right: it should probably be the contents of the file (e.g. obtained with read), not the path. Alternatively, you can use AWSS3.jl, which might be easier and uses AWS.jl internally.
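To illustrate the suggestion above, here is a minimal sketch of passing the file's bytes rather than its path. It assumes AWS.jl takes the request payload under the lowercase "body" key (worth checking against the AWS.jl documentation for your version); the helper names upload_file and upload_table are hypothetical and not part of either package.

```julia
using AWS, Arrow
@service S3

# Hypothetical helper: upload a file that already exists on disk.
# `read(path)` returns the raw bytes (Vector{UInt8}) of the file.
# Assumption: AWS.jl uses the lowercase "body" key as the request payload.
function upload_file(bucket, key, path; aws_config=global_aws_config())
    bytes = read(path)
    return S3.put_object(bucket, key, Dict("body" => bytes); aws_config)
end

# Hypothetical helper: upload an in-memory table without touching disk.
# An Arrow.Table cannot be used as the body directly, so it is first
# serialized to bytes in the Arrow format.
function upload_table(bucket, key, tbl; aws_config=global_aws_config())
    io = IOBuffer()
    Arrow.write(io, tbl)   # write the table in Arrow format into the buffer
    bytes = take!(io)      # extract the bytes written so far
    return S3.put_object(bucket, key, Dict("body" => bytes); aws_config)
end
```

With these, a call such as upload_file(bucket, key, "/proj_root/data/output/data_01.arrow") would send the Arrow file's contents, which should avoid the 0 KB object described in the question.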

@SaintRod
Author

SaintRod commented Jan 17, 2024

Thanks @ericphanson. I had looked at S3 previously but struggled to wrap my head around the file system stuff. Happy to say that after revisiting AWSS3.jl things clicked.

I was able to put the arrow file(s) into my bucket. I'll close the ticket. I've added an example below in case this is relevant for someone in the future.

using AWS, AWSS3, Arrow
@service S3

aws_config = global_aws_config()

# `data` is the table to upload (anything Arrow.write accepts)
aws_s3_path = S3Path("s3://bucket_name"; config=aws_config)
Arrow.write(joinpath(aws_s3_path, "data_01.arrow"), data)

# Check the upload by reading the object back;
# this should return an Arrow.Table with n rows, p columns
S3.get_object(
    "bucket_name",
    "data_01.arrow";
    aws_config
) |> Arrow.Table
