s3 cp with max_concurrent_requests #5941

Closed
3 of 5 tasks
henriqueribeiro opened this issue Feb 12, 2021 · 11 comments
Labels: closed-for-staleness; guidance (Question that needs advice or information.); investigating (This issue is being investigated and/or work is in progress to resolve the issue.); response-requested (Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.)

Comments

@henriqueribeiro

Confirm by changing [ ] to [x] below:

Issue is about usage on:

  • Service API : I want to do X using Y service, what should I do?
  • CLI : passing arguments or cli configurations.
  • Other/Not sure.

Platform/OS/Hardware/Device
What are you running the cli on?

aws --version
aws-cli/2.1.0 Python/3.7.3 Linux/5.8.0-43-generic exe/x86_64.ubuntu.20

Describe the question
I want to reduce the number of threads used to copy a file from S3. I have been experimenting with max_concurrent_requests and its behavior doesn't make sense to me. I'm trying to copy a single file (~41 GB) to my local machine: if I set max_concurrent_requests to 1, I still get 5 threads copying the file, and if I change it to 2, I get 6 threads. What is the rationale behind this?

Also, looking into the documentation:

The aws s3 transfer commands are multithreaded. At any given time, multiple requests to Amazon S3 are in flight. For example, if you are uploading a directory via aws s3 cp localdir s3://bucket/ --recursive, the AWS CLI could be uploading the local files localdir/file1, localdir/file2, and localdir/file3 in parallel. The max_concurrent_requests specifies the maximum number of transfer commands that are allowed at any given time.

Does this configuration limit only the number of files concurrently copied and not the number of threads used for a single multipart file?
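
The issue does not say how the thread counts above were measured. As one possible way to reproduce the observation on Linux, the sketch below reads the `Threads:` field from `/proc/<pid>/status`. The helper names `parse_thread_count` and `thread_count` are hypothetical and are not part of the AWS CLI.

```python
def parse_thread_count(status_text):
    """Extract the thread count from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            # The field looks like "Threads:\t5"
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def thread_count(pid):
    """Return the current number of threads of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_thread_count(status_file.read())
```

Calling `thread_count(pid)` with the PID of a running `aws s3 cp` process, once for each value of max_concurrent_requests, would give numbers to compare against the ones reported here.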

Logs/output
Get full traceback and error logs by adding --debug to the command.

@henriqueribeiro henriqueribeiro added guidance Question that needs advice or information. needs-triage This issue or PR still needs to be triaged. labels Feb 12, 2021
@kdaily kdaily self-assigned this Feb 12, 2021
@kdaily kdaily added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels Feb 12, 2021
@kdaily
Member

kdaily commented Feb 12, 2021

Hi @henriqueribeiro, I'm going to dig into it further, but I do not think it directly controls the number of threads, only the number of active requests.
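
To illustrate the distinction between threads and active requests, here is a minimal, purely conceptual sketch. It is not the AWS CLI's actual implementation, and the function name, pool size, and simulated request are all assumptions. A fixed pool of worker threads exists regardless of the limit, while a semaphore caps how many simulated requests are in flight at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_transfers(num_parts, pool_threads, max_in_flight):
    """Simulate num_parts requests.

    Returns (peak number of simultaneous requests, worker threads used).
    """
    gate = threading.Semaphore(max_in_flight)  # caps active requests
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0}
    workers = set()

    def fake_request(part):
        with lock:
            workers.add(threading.get_ident())  # record which thread ran
        with gate:  # wait here until a request slot is free
            with lock:
                state["in_flight"] += 1
                state["peak"] = max(state["peak"], state["in_flight"])
            time.sleep(0.01)  # stand-in for a ranged GET of one part
            with lock:
                state["in_flight"] -= 1

    with ThreadPoolExecutor(max_workers=pool_threads) as pool:
        list(pool.map(fake_request, range(num_parts)))
    return state["peak"], len(workers)
```

In this sketch, `run_transfers(20, pool_threads=5, max_in_flight=1)` never has more than one simulated request active, even though the pool may start up to five threads. Counting threads from the outside would therefore not reveal the request limit.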

@henriqueribeiro
Author

thank you @kdaily

@kdaily
Member

kdaily commented Apr 13, 2021

Apologies for the delay. max_concurrent_requests does not directly control the number of threads. There is currently a minimum number of threads irrespective of that value, and there isn't a way to adjust it directly; adding one would be a feature request. While researching this, I came across an issue that looks the same as the one we were working on with @ttheyer (#5876):

boto/boto3#1670

We have it as a backlog item to look into the memory usage pattern to be sure that it matches expectations, but I don't have an update on when that will be done.

I think we can close this in favor of #5876 and boto/boto3#1670, since they are related and likely share the same underlying issue. I don't think directly adjusting the number of threads is really what you want; you just want to be sure that memory is managed properly.

@kdaily kdaily added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Apr 13, 2021
@github-actions

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.

@github-actions github-actions bot added closing-soon This issue will automatically close in 4 days unless further comments are made. closed-for-staleness and removed closing-soon This issue will automatically close in 4 days unless further comments are made. labels Apr 20, 2021

3 participants