Improve hashing management for large file parts #245

Open
ppolewicz opened this issue May 17, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

@ppolewicz
Collaborator

Currently, during sync, the main thread does all the hashing aggressively, which can put it hundreds or thousands of parts ahead of the uploading threads.

I suggest we introduce a new parameter (to emerge executor?) that creates a multiprocessing pool and farms those hashing requests out to separate processes, so the hashing is parallelized over many cores and the operation is generally faster. This also partially addresses the pycurl problem, where performance may increase to the point that we'll want to avoid hashing in the main thread.
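
To make the idea concrete, here is a rough sketch of farming part hashing out to a process pool. It is not the existing code: `hash_part`, `part_ranges`, `hash_parts` and `hash_parts_in_processes` are hypothetical names, SHA-1 as the per-part checksum is an assumption, and how the pool would be passed into the emerge executor is left open.

```python
import hashlib
from concurrent.futures import Executor, ProcessPoolExecutor

READ_CHUNK = 1024 * 1024  # read 1 MiB at a time to keep worker memory flat


def hash_part(path: str, offset: int, length: int) -> str:
    """Return the SHA-1 hex digest of one byte range of a file.

    Hypothetical helper: each worker opens the file itself, so only the
    path and the range cross the process boundary, not the part data.
    """
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = f.read(min(remaining, READ_CHUNK))
            if not chunk:  # file shorter than expected; stop early
                break
            sha1.update(chunk)
            remaining -= len(chunk)
    return sha1.hexdigest()


def part_ranges(file_size: int, part_size: int) -> list:
    """Split a file of file_size bytes into (offset, length) part ranges."""
    return [
        (offset, min(part_size, file_size - offset))
        for offset in range(0, file_size, part_size)
    ]


def hash_parts(path: str, file_size: int, part_size: int, executor: Executor) -> list:
    """Hash every part on the given executor; digests come back in part order."""
    futures = [
        executor.submit(hash_part, path, offset, length)
        for offset, length in part_ranges(file_size, part_size)
    ]
    return [future.result() for future in futures]


def hash_parts_in_processes(path: str, file_size: int, part_size: int, workers=None) -> list:
    """Same as hash_parts, but on a dedicated pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return hash_parts(path, file_size, part_size, pool)
```

Since `hash_parts` accepts any `concurrent.futures.Executor`, the new parameter could be an optional pool, with the current main-thread hashing kept as the default when it is not given. This sketch still submits every part up front, so it does not by itself limit how far hashing runs ahead of the uploading threads.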

@ppolewicz ppolewicz added the enhancement New feature or request label May 17, 2021