Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

👌 IMPROVE: Legacy tar file archive migration performance #5275

Merged
merged 3 commits into from
Jan 15, 2022

Conversation

chrisjsewell
Copy link
Member

Tar files do not allow performant single file streaming,
and so the file is fully extracted to disk,
before streaming to the new (zip file) archive.

Tar files do not allow performant single file streaming,
and so the file is fully extracted to disk,
before streaming to the new (zip file) archive.
@chrisjsewell chrisjsewell requested a review from unkcpz December 15, 2021 08:49
@chrisjsewell chrisjsewell changed the title 👌 IMPROVE: Tar file archive migration performance 👌 IMPROVE: Legacy tar file archive migration performance Dec 15, 2021
@codecov
Copy link

codecov bot commented Dec 15, 2021

Codecov Report

Merging #5275 (11a3d2b) into develop (ec6fa17) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5275      +/-   ##
===========================================
+ Coverage    82.00%   82.01%   +0.01%     
===========================================
  Files          533      533              
  Lines        38230    38243      +13     
===========================================
+ Hits         31348    31362      +14     
+ Misses        6882     6881       -1     
Flag Coverage Δ
django 77.06% <100.00%> (+0.01%) ⬆️
sqlalchemy 76.36% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...implementations/sqlite/migrations/legacy_to_new.py 96.67% <100.00%> (+0.32%) ⬆️
aiida/transports/plugins/local.py 81.71% <0.00%> (+0.26%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec6fa17...11a3d2b. Read the comment docs.

Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisjsewell Thanks for the quick fix. I make the test on the tar achieve and it works great! Just a minor request from my side. And for the base_parts change, I have to admit I am not familiar with this part, so maybe @ramirezfranciscof can have another eye on this?

with get_progress_reporter()(desc='Converting repo', total=length) as progress:
for subpath in path.glob('**/*'):
progress.update()
parts = subpath.parts
parts = subpath.parts[base_parts:]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have no clue what this change is for. But I think this is just because I miss the big picture of how migration work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are inside an archive, then the path is always relative to the archive (base_parts = 0), but if we are inside a file system then the path may be absolute, and we want to get back to the relative path.

@chrisjsewell
Copy link
Member Author

Ok, as it works I'm gonna merge 😄

@chrisjsewell chrisjsewell requested a review from unkcpz January 15, 2022 08:37
Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Thanks @chrisjsewell

@chrisjsewell chrisjsewell merged commit 163b383 into aiidateam:develop Jan 15, 2022
@chrisjsewell
Copy link
Member Author

Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants