DeepInPy -- Getting Started

This document is intended to show how to get started with a training experiment.

In the current setup, training is driven by a data file and a config file. The data file specifies the training data, and the config file specifies the training procedure. Right now, the config file exposes options for every training procedure implemented, even though many options are only relevant to a specific training algorithm, model, etc.

Data format:

DeepInPy expects data in a complex-valued, multi-channel MRI format. Even when the data are single-coil, this format should be followed.

The data format is an HDF5 (.h5) file consisting of the following fields:

imgs: [Ntraining, N1, N2, ..., NT, X, Y, Z]: np.complex
masks: [Ntraining, N1, N2, ..., NT, X, Y, Z]: np.float
maps: [Ntraining, Ncoil, N1, N2, ..., NT, X, Y, Z]: np.complex
ksp: [Ntraining, Ncoil, N1, N2, ..., NT, X, Y, Z]: np.complex
  • Ntraining is the number of training examples. If Ntraining=1, it should still be included as a singleton dimension.
  • N1, N2, ..., NT are higher-order dimensions, which can be used for multi-phase data (e.g. temporal, contrast, coefficients, phases, etc.). These dimensions are optional and can be excluded.
  • X, Y, Z are spatial dimensions. In the case of 2D data, the Z dimension can be excluded.
  • "imgs" are the fully sampled images, used as ground truth for calculating NRMSE and for supervised learning.
  • "ksp" is the k-space data that will be down-sampled by the mask. It should be fully sampled, but it technically doesn't have to be.
  • "masks" are multiplied with "ksp" at each training step to simulate undersampling.

Except for the masks, all data should be stored as complex-valued arrays.

Example: 2D 8-coil data with 100 training examples

imgs: [100, 256, 256]: np.complex
masks: [100, 256, 256]: np.float
maps: [100, 8, 256, 256]: np.complex
ksp: [100, 8, 256, 256]: np.complex

Example: 2D single-coil data with 1 training example

imgs: [1, 256, 256]: np.complex
masks: [1, 256, 256]: np.float
maps: [1, 1, 256, 256]: np.complex
ksp: [1, 1, 256, 256]: np.complex

Note that the maps array can be all-ones in this case

Example: 2D 8-coil data with 100 training examples and 20 temporal phases

imgs: [100, 20, 256, 256]: np.complex
masks: [100, 20, 256, 256]: np.float
maps: [100, 8, 20, 256, 256]: np.complex
ksp: [100, 8, 20, 256, 256]: np.complex


Example: 2D 8-coil data, solving for each channel separately

We use the same interface by treating the coil dimension as a higher-order dimension and creating an all-ones maps array. We tell the code that this is "one-channel" data with a higher-order dimension of size 8; a sketch of the all-ones maps array follows the shapes below.

imgs: [100, 8, 256, 256]: np.complex
masks: [100, 8, 256, 256]: np.float
maps: [100, 1, 8, 256, 256]: np.complex
ksp: [100, 1, 8, 256, 256]: np.complex
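
As a minimal sketch, the all-ones maps array for this case could be constructed as follows (shapes are taken from the example above):

import numpy as np

# all-ones sensitivity maps, treating the 8 coils as a higher-order dimension
maps = np.ones((100, 1, 8, 256, 256), dtype=np.complex128)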

Writing/reading data file

To write a data file, you can use the deepinpy.utils.utils.h5_write function. The function takes the path to the target h5 file and a dictionary of key-value pairs:

# example data writer for 2D images with 10 training examples and 8 coils

import numpy as np
from deepinpy.utils.utils import h5_write

# complex-valued images, maps, and k-space; real-valued masks
imgs = np.random.randn(10, 256, 256) + 1j * np.random.randn(10, 256, 256)
masks = np.random.rand(10, 256, 256)
maps = np.random.randn(10, 8, 256, 256) + 1j * np.random.randn(10, 8, 256, 256)
ksp = np.random.randn(10, 8, 256, 256) + 1j * np.random.randn(10, 8, 256, 256)

data = {'imgs': imgs, 'masks': masks, 'maps': maps, 'ksp': ksp}

h5_write('mydata.h5', data)

There is also a similar h5_read function to load the training set.
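
A minimal sketch of reading the data back, assuming h5_read takes the file path and returns the dictionary of arrays:

from deepinpy.utils.utils import h5_read

data = h5_read('mydata.h5')
imgs, masks, maps, ksp = data['imgs'], data['masks'], data['maps'], data['ksp']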

Config parameters

DeepInPy is controlled by passing command-line arguments to the main.py script. To view the available command-line args, run

python main.py --help

Config file

The recommended way to pass command-line args is through the use of a config file:

python main.py --config configs/example.json

The config file is a JSON-formatted file containing the names of the command-line args, and the values to pass. These args will automatically be logged to tensorboard, so that they can be queried/reused.

Note: Not all command-line args will be used; which ones apply depends on the specific model that you use. (TODO: organize command-line args by model/module).
Note 2: By default, DeepInPy will use the CPU for training. To train on GPU, you should specify the GPUs to use.

The main config parameters that are necessary to run a training experiment:

  • data_file: specifies the path to the data file in hdf5 format (see above section)
  • recon: the reconstruction method to use (for example, "modl", "cgsense", "resnet", etc.)
  • network: the neural network to use within the recon, if applicable (for example, "ResNet")

Other config parameters that are not required, but strongly recommended to set (a minimal example config combining both groups is shown after this list):

  • name: name of the experiment, which will be tracked in tensorboard
  • gpu: specify a string of comma-separated gpu numbers for training (e.g. "0" or "0, 1")
  • step: step size, or learning rate, for training
  • num_epochs: number of training epochs to run
  • shuffle: set to true to shuffle the dataset
  • num_data_sets: controls the number of training samples to use for training
  • stdev: set to non-zero to add complex-valued white Gaussian noise to the data
  • self_supervised: set to true to evaluate the loss in the measurement domain
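
A minimal example config combining the required and strongly recommended parameters (the data file, experiment name, and values below are placeholders; "modl" and "ResNet" are example choices from the lists above):

{
    "data_file": "mydata.h5",
    "recon": "modl",
    "network": "ResNet",
    "name": "example_experiment",
    "gpu": "0",
    "step": 0.001,
    "num_epochs": 100,
    "shuffle": true,
    "self_supervised": false
}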

Distributed training

It is possible to run simple distributed training by splitting each training epoch over multiple GPUs/CPUs. For example, if the training set contains 100 samples and four GPUs are used, then each GPU will receive 25 training samples per epoch.

  • To run distributed training on GPU, simply specify multiple GPUs in the config: gpu: "0, 1, 2, 3"
  • To run distributed training on CPU, do not set the gpu variable, and instead export the OpenMP environment variable before running the code, as in the example after this list: export OMP_NUM_THREADS=20
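
For example, a CPU-only distributed run with 20 OpenMP threads (the config path is the example from above):

export OMP_NUM_THREADS=20
python main.py --config configs/example.json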

Hyperparameter optimization

The config can be used to enable hyperparameter optimization/tuning with support for parallelization across CPUs/GPUs. To enable hyperparameter optimization:

  • set hyperopt to true

  • set num_workers to the number of experiments to run in parallel. For example, with four GPUs, set num_workers to 4.

  • set gpu to the list of GPUs to use (or leave blank to use CPU)

  • set num_trials to the number of experiments to run. For example, set num_trials to 10 to run 10 experiments with different hyperparameters

  • Example: 100 trials using 4 GPUs with two experiments per GPU running at once:

"hyperopt": true,
"num_workers": 8,
"gpu": "0, 1, 2, 3",
"num_trials": 100

Currently, one must manually set which config options are tunable via hyperparameter optimization. DeepInPy uses TestTube to control this. By default, the step size is the only tunable parameter, defined in main.py:

parser.opt_range('--step', type=float, dest='step', default=.001, help='step size/learning rate', tunable=True, nb_samples=100, low=.0001, high=.001)

Notice that it is an opt_range, meaning that values will be sampled between low and high. Also notice that nb_samples=100, meaning at most 100 different values will be sampled for this hyperparameter. Finally, notice that tunable=True; if we change this to False, the parameter will not be used for hyperparameter optimization.

For example, currently the solver is not tunable:

parser.opt_list('--solver', action='store', dest='solver', type=str, tunable=False, options=['sgd', 'adam'], help='optimizer/solver ("adam", "sgd")', default="sgd")

If we change tunable to True, then hyperopt will choose between the values under options for each experiment.
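
For example, the same line with only the tunable flag flipped to True:

parser.opt_list('--solver', action='store', dest='solver', type=str, tunable=True, options=['sgd', 'adam'], help='optimizer/solver ("adam", "sgd")', default="sgd")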

In this way, we can set multiple hyperparameters to tunable=True. Each hyperopt experiment will then choose one value for each tunable parameter, letting us sweep a large number of parameters at once.

The default policy is random search. This can be modified by changing the strategy argument of HyperOptArgumentParser to grid search, from

parser = HyperOptArgumentParser(usage=usage_str, description=description_str, formatter_class=argparse.ArgumentDefaultsHelpFormatter, strategy='random_search')

to

parser = HyperOptArgumentParser(usage=usage_str, description=description_str, formatter_class=argparse.ArgumentDefaultsHelpFormatter, strategy='grid_search')

Learning rate scheduler

DeepInPy has capabilities for learning rate scheduling. An example usage has been included in the default config. The general use is:

"lr_scheduler": [x,y]

where x is the epoch at which the multiplicative factor is applied and y is the factor that scales the current learning rate. The learning rate is scaled again at each successive multiple of x epochs (2x, 3x, 4x, etc.).
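
For example, assuming the interpretation above and a starting step of 0.001:

"step": 0.001,
"lr_scheduler": [50, 0.5]

the learning rate would be multiplied by 0.5 at epoch 50, again at epoch 100, and so on (0.001 → 0.0005 → 0.00025 → ...).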