This repository has been archived by the owner on Nov 1, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 200
Add azcopy upload support and switch to the default max_concurrency #1556
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Let me know if there is anything else that you need from my side on this request. |
@puhley is it ok if I do "Update branch" on your PR first before approving ? |
I tried performing an update branch and two checks failed. Both are related to the new api-service code which is unrelated to this merge. One of them was a whitespace issue with one of the new files. The other was a missing zip file. Would you override these checks or should I just wait and merge again? |
this PR should fix the issue: |
stishkin
approved these changes
May 11, 2022
@puhley - what's the workflow you'd prefer ? should I merge the PR or you will merge it ? |
@stishkin I am not a project maintainer. Therefore, I don't have the authority to merge pull requests. |
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary of the Pull Request
This pull request is to address issue #1477 where low bandwidth connections could not support multiple concurrent upload streams without encountering a timeout. To address this issue, two changes are being introduced.
The first change is to modify the
upload_file
method to use the azcopy command as its default approach. It was determined that azcopy has more robust code for handling different connection speeds than the Azure Python SDK. Therefore, the code will now attempt to use azcopy first. It is possible that the azcopy command could fail due to it not being present, anazcopy login
step needing to be performed, or a similar scenario. If this occurs, then the code will fall back to using the Azure Python SDK. If they both fail, then the code will re-attempt both approaches up to the retry limit. In order to support this, a new method was added to azcopy.py for performingazcopy copy
commands.The second change is to modify
upload_file_data
method to useupload_file
. This change will ensure that both file upload approaches are consistent in their use of azcopy,max_concurrency
, and retries.PR Checklist
Info on Pull Request
This pull request adds
azcopy copy
support. The existing azcopy code leveraged thesync
functionality which is for coordinating two files or directories that already exist. When creating a new job, you may be uploading a new file to a new destination container. Therefore, theazcopy_copy
method was created for copy operations.The
max_concurrency
of 10 was removed from the Azure SDKupload_blob
call withinupload_file
so that the default is used instead. This will help resolve an issue where low bandwidth connections could not support multiple concurrent uploads without encountering a timeout.The
upload_file_data
function was converted to callupload_file
rather than have its own upload code. This approach ensures that bothupload_file
andupload_file_data
have the same behavior as requested in the issue discussion. The data passed as a string toupload_file_data
is now written to a temporary file for azcopy to access withinupload_file
. Testing for this approach is limited sinceupload_file_data
is not currently used within onefuzz.Validation Steps Performed
Creating a fuzzing job will leverage the
upload_file
functionality. This was tested creating a job with that required both the target_exe and extra_files flags to upload files.It was noted in the original discussion that low latency tests can be performed using: