This PR implements a ZipFolder class, based on my previous PR #3215.
The idea is very similar to the TarDataset issue at pytorch/pytorch#49440.
It archives an ImageFolder directory as a zip file without any compression. The functionality is almost the same as ImageFolder.
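To illustrate how class labels could be recovered from such an archive, here is a minimal sketch. It assumes the [root_folder_name]/[target_class]/[img_file] layout listed under the discussion points in this PR; the helper name index_zip is hypothetical and is not part of the PR or of torchvision.

```python
import zipfile


def index_zip(zf):
    """Hypothetical helper: build (member_name, class_index) samples from a zip.

    Assumes member names look like root/class/.../file, as in layout (a).
    """
    members = []
    for name in zf.namelist():
        parts = name.split("/")
        # Skip directory entries and anything shallower than root/class/file.
        if name.endswith("/") or len(parts) < 3:
            continue
        members.append((name, parts[1]))  # parts[1] is the class folder
    # Sorted class names get consecutive indices, similar in spirit to ImageFolder.
    classes = sorted({cls for _, cls in members})
    class_to_idx = {cls: i for i, cls in enumerate(classes)}
    samples = [(name, class_to_idx[cls]) for name, cls in sorted(members)]
    return classes, samples
```

A dataset's `__getitem__` could then call `zf.read(name)` for a sample and decode the bytes with whatever image loader is in use.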
Advantage: a single archive file is better for long-term use, and it makes loading and transferring faster and more convenient by avoiding small-file I/O (when memory=True), especially on HDDs.
When the memory argument is set to True, all bytes of the zip are read into memory at initialization. Otherwise, zipfile loads members lazily by default, which gives the same mechanism as ImageFolder.
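The two loading modes can be sketched with the standard library alone. This is only an illustration of the idea, not the PR's actual code; the helper name open_archive is hypothetical.

```python
import io
import zipfile


def open_archive(path, memory=False):
    """Hypothetical helper: open a zip eagerly (in memory) or lazily (from disk)."""
    if memory:
        # Read every byte up front and wrap it in a file-like object,
        # so later member reads never touch the disk.
        with open(path, "rb") as f:
            return zipfile.ZipFile(io.BytesIO(f.read()), mode="r")
    # Lazy mode: zipfile parses the central directory now and reads
    # each member from disk only when it is requested.
    return zipfile.ZipFile(path, mode="r")
```

In both modes, `archive.read(member_name)` returns the raw bytes of one image; the difference is only where those bytes come from.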
Besides the basic utility, I also added a staticmethod, initialize_from_folder, that converts a folder (one that follows the ImageFolder requirements) into the zip format.
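A conversion along these lines might look like the sketch below. It is not the PR's implementation: the function name folder_to_zip is one of the candidate names discussed in this PR, the archive is assumed to be written next to the source folder as [root_folder_name]_store.zip, and member names are assumed to follow the [root_folder_name]/[target_class]/[img_file] layout.

```python
import os
import zipfile


def folder_to_zip(root):
    """Hypothetical sketch: pack an ImageFolder-style directory into an uncompressed zip."""
    root = os.path.abspath(root)  # also drops any trailing separator
    parent, name = os.path.split(root)
    # Assumption: the archive is written next to the source folder.
    zip_path = os.path.join(parent, name + "_store.zip")
    with zipfile.ZipFile(zip_path, mode="w", compression=zipfile.ZIP_STORED) as zf:
        for class_name in sorted(os.listdir(root)):
            class_dir = os.path.join(root, class_name)
            if not os.path.isdir(class_dir):
                continue  # only class subfolders are archived
            for dirpath, _, filenames in os.walk(class_dir):
                for fname in sorted(filenames):
                    full = os.path.join(dirpath, fname)
                    # Member name keeps the root folder: root/class/file.
                    zf.write(full, os.path.relpath(full, parent))
    return zip_path
```

ZIP_STORED writes the file bytes without compression, which is the property the PR relies on.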
Needs discussion:
Method initialize_from_folder might need a better name. (Candidates: init_from_folder, folder_to_zip)
It might not be appropriate to use io.BytesIO as a type annotation.
Potential file structures for the zip file (zip filename == [root_folder_name]_store.zip):
a. (current) [root_folder_name]/[target_class]/[img_file]
b. [target_class]/[img_file]
We need to check that the compression type is ZIP_STORED.
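One possible way to perform the ZIP_STORED check from the last discussion point is to inspect each member's compress_type when the archive is opened. The helper name check_stored is hypothetical; this is a sketch, not the PR's code.

```python
import zipfile


def check_stored(zf):
    """Hypothetical helper: raise if any member of the zip is compressed."""
    for info in zf.infolist():
        if info.compress_type != zipfile.ZIP_STORED:
            raise ValueError(
                f"{info.filename} is compressed (type {info.compress_type}); "
                "expected ZIP_STORED"
            )
```

Running the check once at construction time would keep the cost out of the per-sample loading path.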
Unit tests and docs still need to be written, if any reviewer thinks this PR is worth it.
🚀 Feature
This issue corresponds to my PR #3510.
Implement a ZipFolder class, based on my previous PR #3215.
The idea is very similar to the TarDataset issue at pytorch/pytorch#49440.
It archives an ImageFolder directory as a zip file without any compression. The functionality is almost the same as ImageFolder.
Advantage: a single archive file is better for long-term use, and it makes loading and transferring faster and more convenient by avoiding small-file I/O (when memory=True), especially on HDDs.
When the memory argument is set to True, all bytes of the zip are read into memory at initialization. Otherwise, zipfile loads members lazily by default, which gives the same mechanism as ImageFolder.
Besides the basic utility, I also added a staticmethod, initialize_from_folder, that converts a folder (one that follows the ImageFolder requirements) into the zip format.
Needs discussion:
initialize_from_folder might need a better name. (Candidates: init_from_folder, folder_to_zip)
It might not be appropriate to use io.BytesIO as a type annotation.
Potential file structures for the zip file (zip filename == [root_folder_name]_store.zip):
a. (current) [root_folder_name]/[target_class]/[img_file]
b. [target_class]/[img_file]
Unit tests and docs still need to be written, if any reviewer thinks this PR is worth it.
cc @pmeier