When calling .shutdown() on a CRTTransferManager instance, failures are not raised. After some digging into this repo, I saw the code below (referenced from here).
The pass statement should probably re-raise the exception, so that errors are actually thrown. What do you think?
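For context, the shutdown path in question swallows transfer errors with a bare pass. The sketch below is a paraphrase of that pattern, not the exact source (method names are illustrative); replacing pass with raise would surface the failure to the caller:

# Paraphrased sketch of the shutdown path in question (method names are
# illustrative, not copied from the s3transfer source):
def _shutdown(self, cancel=False):
    if cancel:
        self._cancel_transfers()
    try:
        self._finish_transfers()      # waits on in-flight transfers
    except KeyboardInterrupt:
        self._cancel_transfers()
    except Exception:
        pass  # <-- failure silently swallowed; `raise` here would surface it
    finally:
        self._wait_transfers_done()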
For the record, my code was this:
import boto3
import botocore.config
import boto3.s3.transfer as s3transfer  # assumed import; provides TransferConfig and create_transfer_manager
from io import BytesIO
from typing import List

def parallel_upload(bucket: str, data: List[dict], workers=100):
    """Performs parallel upload of records.

    Args:
        bucket (str): the bucket
        data (List[dict]): records to process, e.g.
            [{"key": "path/file.type", "data": "data"}]
        workers (int, optional): number of parallel workers. Defaults to 100.
    """
    botocore_config = botocore.config.Config(
        max_pool_connections=workers,
        tcp_keepalive=True
    )
    s3client = boto3.Session().client('s3', config=botocore_config)
    transfer_config = s3transfer.TransferConfig(max_concurrency=workers)
    s3t = s3transfer.create_transfer_manager(s3client, transfer_config)
    upload_bytes = []
    for item in data:
        # IMPORTANT! We need to retain the reference to BytesIO outside of this
        # loop, otherwise the call to shutdown will fail all transfers as there
        # is no longer a reference to the bytes being transferred - hence the
        # use of a list to maintain them.
        upload_bytes.append(BytesIO(item['data'].encode()))
        s3t.upload(
            upload_bytes[-1], bucket, item['key']
        )
    # wait for transfers to complete
    s3t.shutdown()
    print(f"\t - {len(data)} objects uploaded to {bucket} successfully.")
As a workaround, I manually collect the futures returned by s3t.upload and check their results, like below:
def parallel_upload(bucket: str, data: List[dict], workers=100):
    """Performs parallel upload of records.

    Args:
        bucket (str): the bucket
        data (List[dict]): records to process, e.g.
            [{"key": "path/file.type", "data": "data"}]
        workers (int, optional): number of parallel workers. Defaults to 100.
    """
    botocore_config = botocore.config.Config(
        max_pool_connections=workers,
        tcp_keepalive=True
    )
    s3client = boto3.Session().client('s3', config=botocore_config)
    transfer_config = s3transfer.TransferConfig(max_concurrency=workers)
    s3t = s3transfer.create_transfer_manager(s3client, transfer_config)
    upload_bytes = []
    futures = []
    for item in data:
        # IMPORTANT! We need to retain the reference to BytesIO outside of this
        # loop, otherwise the call to shutdown will fail all transfers as there
        # is no longer a reference to the bytes being transferred - hence the
        # use of a list to maintain them.
        upload_bytes.append(BytesIO(item['data'].encode()))
        # keep track of all submissions for validation of successful uploads
        futures.append(
            s3t.upload(
                upload_bytes[-1], bucket, item['key']
            )
        )
    # ensure none of the transfers raised an exception
    for f in futures:
        f.result()  # if an upload failed, the exception is re-raised here
    print(f"\t - {len(data)} objects uploaded to {bucket} successfully.")
I simply iterate over the futures after queueing the uploads and call .result() on each; this works as expected, letting me upload concurrently while still detecting errors.
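A minimal extension of the same idea, in case it is useful to others: instead of failing fast on the first bad future, collect every failure and report them together. This is just a sketch built on the workaround above, not part of the s3transfer API:

# Sketch: gather all failed uploads instead of raising on the first one.
# `futures` and `data` are the lists built inside parallel_upload above.
failures = []
for item, f in zip(data, futures):
    try:
        f.result()
    except Exception as exc:  # any transfer error surfaces here
        failures.append((item['key'], exc))

if failures:
    for key, exc in failures:
        print(f"upload failed for {key}: {exc}")
    raise RuntimeError(f"{len(failures)} of {len(futures)} uploads failed")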