A TFDS-based Python package for building data sets.
The currently supported data sets are listed below. Use the following command to (re)build all data sets.
make build_dataset
This data set from Li et al. 2022 contains 589 labeled T2-weighted images, which are split into training, validation, and test sets.
Use the following command at the root of this repository (i.e. under ImgX/) to automatically download and build the data set, which will be built under the ~/tensorflow_datasets folder. Optionally, add the flag --overwrite to rebuild/overwrite the data set.
tfds build imgx/datasets/male_pelvic_mr
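To confirm that a build produced output, the following minimal sketch lists the sub-folders found for a data set under the data directory. The helper name `list_built_subdirs` is hypothetical, and it assumes the built folder carries the same name as the builder folder (e.g. `male_pelvic_mr`), which is not confirmed here.

```python
from pathlib import Path

# Default location mentioned in this README.
DEFAULT_DATA_DIR = Path.home() / "tensorflow_datasets"


def list_built_subdirs(name: str, data_dir: Path = DEFAULT_DATA_DIR) -> list[str]:
    """Return the sorted sub-folder names under data_dir/name.

    Hypothetical helper: assumes the built data set lives in a folder
    named after the data set. An empty list means nothing was found.
    """
    dataset_dir = data_dir / name
    if not dataset_dir.is_dir():
        return []
    return sorted(p.name for p in dataset_dir.iterdir() if p.is_dir())
```

Calling `list_built_subdirs("male_pelvic_mr")` after a build should then return a non-empty list if the folder-name assumption holds.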
This data set from Ji et al. 2022 contains 500 labeled CT images, which have been split into 200, 100, and 200 images for the training, validation, and test sets. Because the test set labels were not released, the original validation set is further split into 10 and 90 images for the validation and test sets.
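The split arithmetic described above can be checked with a short sketch. The dictionary keys are illustrative names, and the sketch assumes the 200 training images are kept unchanged.

```python
# Official split sizes as stated in the text.
official = {"train": 200, "valid": 100, "test": 200}
assert sum(official.values()) == 500

# Test labels are unavailable, so the official validation set is re-split.
# Assumption: the training set is used as released.
resplit = {"train": official["train"], "valid": 10, "test": 90}
assert resplit["valid"] + resplit["test"] == official["valid"]

# Number of images that have labels and are therefore usable.
usable = sum(resplit.values())
print(usable)  # 300
```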
Use the following command at the root of this repository (i.e. under ImgX/) to automatically download and build the data set, which will be built under the ~/tensorflow_datasets folder. Optionally, add the flag --overwrite to rebuild/overwrite the data set.
tfds build imgx/datasets/amos_ct
This data set from Marzola et al. 2021 contains 3910 labeled images, which have been split into 2531, 666, and 713 images for the training, validation, and test sets.
Use the following command at the root of this repository (i.e. under ImgX/) to automatically download and build the data set, which will be built under the ~/tensorflow_datasets folder. Optionally, add the flag --overwrite to rebuild/overwrite the data set.
tfds build imgx/datasets/muscle_us
This data set from Baid et al. 2021 contains 1251 labeled images, which are split into training, validation, and test sets.
This data set requires manually downloading the data from Kaggle using the Kaggle API. The authentication token must be obtained and stored at ~/.kaggle/kaggle.json.
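Before running the download, it can help to confirm that the token file is in place. The sketch below is a hypothetical helper (`kaggle_token_problems` is not part of this package or of the Kaggle API); it only checks that the file exists, parses as JSON, and contains the `username` and `key` fields that a Kaggle token file normally holds.

```python
import json
from pathlib import Path

# Token location stated in this README.
DEFAULT_TOKEN_PATH = Path.home() / ".kaggle" / "kaggle.json"


def kaggle_token_problems(token_path: Path = DEFAULT_TOKEN_PATH) -> list[str]:
    """Return a list of human-readable problems; empty means the file looks usable."""
    if not token_path.is_file():
        return [f"{token_path} does not exist"]
    try:
        token = json.loads(token_path.read_text())
    except json.JSONDecodeError:
        return [f"{token_path} is not valid JSON"]
    if not isinstance(token, dict):
        return [f"{token_path} does not contain a JSON object"]
    # Report each expected field that is absent.
    return [f"missing field: {field}" for field in ("username", "key") if field not in token]
```

An empty return value does not prove that the credentials are valid, only that the file has the expected shape.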
Then, execute the following commands to download and unzip the files. Afterward, return to the ImgX/ folder (/app/ImgX for Docker).
mkdir -p ~/tensorflow_datasets/downloads/manual/BraTS2021_Kaggle/BraTS2021_Training_Data/
cd ~/tensorflow_datasets/downloads/manual/BraTS2021_Kaggle/BraTS2021_Training_Data/
kaggle datasets download -d dschettler8845/brats-2021-task1
unzip brats-2021-task1.zip
tar xf BraTS2021_Training_Data.tar
rm BraTS2021_00495.tar
rm BraTS2021_00621.tar
rm BraTS2021_Training_Data.tar
rm brats-2021-task1.zip
This way, BraTS2021_Kaggle/BraTS2021_Training_Data/ contains one folder per sample. For example, the files corresponding to the uid BraTS2021_01666 should be located at ~/tensorflow_datasets/downloads/manual/BraTS2021_Kaggle/BraTS2021_Training_Data/BraTS2021_01666/, which should contain five files:
- BraTS2021_01666_flair.nii.gz
- BraTS2021_01666_t1.nii.gz
- BraTS2021_01666_t1ce.nii.gz
- BraTS2021_01666_t2.nii.gz
- BraTS2021_01666_seg.nii.gz
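The expected layout can be verified before building. The following is a minimal sketch with a hypothetical helper, `missing_brats_files`, which reports which of the five files listed above are absent for a given uid.

```python
from pathlib import Path

# File name suffixes taken from the example listing above.
FILE_SUFFIXES = ("flair", "t1", "t1ce", "t2", "seg")


def missing_brats_files(training_dir: Path, uid: str) -> list[str]:
    """Return the expected file names that are missing under training_dir/uid."""
    sample_dir = training_dir / uid
    expected = [f"{uid}_{suffix}.nii.gz" for suffix in FILE_SUFFIXES]
    return [name for name in expected if not (sample_dir / name).is_file()]
```

For instance, pass `Path.home() / "tensorflow_datasets/downloads/manual/BraTS2021_Kaggle/BraTS2021_Training_Data"` as `training_dir` and `"BraTS2021_01666"` as `uid`; an empty list means all five files are present.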
Use the following command at the root of this repository (i.e. under ImgX/) to automatically build the data set, which will be built under the ~/tensorflow_datasets folder. Optionally, add the flag --overwrite to rebuild/overwrite the data set.
tfds build imgx/datasets/brats2021_mr
This data set from Bernard et al. 2018 contains 150 samples, which are split into 100 and 50 for the training and test sets. Each sample contains:
- a 4D image (a sequence of 3D MR images)
- a 3D image and corresponding segmentation label for end-diastolic (ED) frame
- a 3D image and corresponding segmentation label for end-systolic (ES) frame
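The sample structure listed above can be sketched as a small record of array shapes. All field names are hypothetical and are not taken from the package. The sketch also assumes that time is the last axis of the 4D image and that the ED and ES frames share its spatial shape, which may not hold after preprocessing.

```python
from dataclasses import dataclass

Shape = tuple[int, ...]


@dataclass(frozen=True)
class AcdcSampleShapes:
    """Shapes of the arrays in one sample (hypothetical field names)."""

    image_4d: Shape  # sequence of 3D MR images; time assumed to be the last axis
    image_ed: Shape  # 3D image at the end-diastolic frame
    label_ed: Shape  # segmentation label for the ED frame
    image_es: Shape  # 3D image at the end-systolic frame
    label_es: Shape  # segmentation label for the ES frame

    def is_consistent(self) -> bool:
        """Check that every 3D array matches the spatial shape of the 4D image."""
        if len(self.image_4d) != 4:
            return False
        frame_shape = self.image_4d[:3]
        frames = (self.image_ed, self.label_ed, self.image_es, self.label_es)
        return all(shape == frame_shape for shape in frames)
```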
Use the following command at the root of this repository (i.e. under ImgX/) to automatically download and build the data set, which will be built under the ~/tensorflow_datasets folder. Optionally, add the flag --overwrite to rebuild/overwrite the data set.
tfds build imgx/datasets/acdc_mr
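The per-data-set build commands above can also be assembled programmatically, for example to rebuild a chosen subset. This is a sketch only: `build_commands` is a hypothetical helper, it merely produces the command strings shown in this document, and it makes no claim about what `make build_dataset` runs internally.

```python
import shlex

# Builder folders as they appear in the commands in this README.
DATASET_DIRS = [
    "imgx/datasets/male_pelvic_mr",
    "imgx/datasets/amos_ct",
    "imgx/datasets/muscle_us",
    "imgx/datasets/brats2021_mr",
    "imgx/datasets/acdc_mr",
]


def build_commands(overwrite: bool = False) -> list[str]:
    """Return one `tfds build` command line per data set."""
    commands = []
    for dataset_dir in DATASET_DIRS:
        args = ["tfds", "build", dataset_dir]
        if overwrite:
            # Flag described in this README for rebuilding an existing data set.
            args.append("--overwrite")
        commands.append(shlex.join(args))
    return commands
```

The resulting strings are meant to be run from the repository root. Note that brats2021_mr still needs its manual download to be completed first.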