Members: Vegard Bergsvik Øvstegård
Supervisors: Jim Tørresen
This repository aims to implement and produce trained networks for semantic image segmentation of orthophotos. The current network architecture is U-Net.
git clone https://github.com/gil-uav/semantic-image-segmentation-unet.git
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
conda create --name seg --file spec-file.txt
conda activate seg
pip install kornia
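After either route, a quick sanity check can confirm the environment (the CUDA line prints False on CPU-only machines):

```python
# Quick post-install sanity check.
import torch
import kornia

print(torch.__version__)
print(kornia.__version__)
print(torch.cuda.is_available())  # True if a usable GPU is visible
```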
The application fetches some configuration values and parameters from a .env file, if one exists.
Run
python train.py --help
to see all other arguments. The package uses PyTorch Lightning and inherits all of its Trainer arguments.
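For orientation, the sketch below shows how such a setup is typically wired with PyTorch Lightning 1.x's add_argparse_args/from_argparse_args helpers; the --dp flag is the repository's, but the rest of the wiring is an assumption about train.py, not its actual code.

```python
# Hypothetical sketch (PyTorch Lightning 1.x): expose both repo-specific
# flags and every Trainer flag on one parser.
from argparse import ArgumentParser
import pytorch_lightning as pl

parser = ArgumentParser()
parser.add_argument("--dp", type=str, default="data/")  # dataset path (repo flag)
parser = pl.Trainer.add_argparse_args(parser)           # adds --gpus, --default_root_dir, ...
args = parser.parse_args()

trainer = pl.Trainer.from_argparse_args(args)           # Trainer built from CLI flags
```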
The data is expected to be structured like this:
data/
images/
masks/
The path to the data is set using the --dp argument.
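As an illustration of how that layout can be consumed, here is a minimal paired image/mask Dataset sketch; the class name, the assumption of identical file names in images/ and masks/, and the transform signature are all illustrative, not the repository's actual dataset code.

```python
# Minimal sketch of a paired image/mask dataset over the layout above.
import os
from PIL import Image
from torch.utils.data import Dataset

class OrthophotoDataset(Dataset):  # hypothetical name
    def __init__(self, root="data/", transform=None):
        self.img_dir = os.path.join(root, "images")
        self.mask_dir = os.path.join(root, "masks")
        self.files = sorted(os.listdir(self.img_dir))  # assumes masks share file names
        self.transform = transform

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        name = self.files[idx]
        image = Image.open(os.path.join(self.img_dir, name)).convert("RGB")
        mask = Image.open(os.path.join(self.mask_dir, name)).convert("L")
        if self.transform:
            image, mask = self.transform(image, mask)
        return image, mask
```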
This example stores checkpoints and logs under default_root_dir, uses all available GPUs, and fetches training data from the path given by --dp.
python train.py --default_root_dir=/shared/use/this/ --gpus=-1 --dp=/data/is/here/
Only the following arguments are fetched from .env; the rest must be passed through the CLI.
# Model config
N_CHANNELS=3
N_CLASSES=1
BILINEAR=True
# Hyperparameters
EPOCHS=300 # Epochs
BATCH_SIZE=4 # Batch size
GROUP_NORM=0 # Group normalization group size. If 0, batch normalization is used.
LRN_RATE=0.001 # Learning rate
VAL_PERC=15 # Validation percent
TEST_PERC=15 # Testing percent
IMG_SIZE=512 # Image size
VAL_INT_PER=1 # Validation interval percentage
ACC_GRAD=4 # Accumulated gradients, number = K.
GRAD_CLIP=1.0 # Clip gradients with norm above the given value
EARLY_STOP=10 # Early stopping patience (epochs)
# Other
PROD=False # Turn on or off debugging APIs
FAST_DEV_RUN=False # Do a fast test.
DIR_DATA="data/" # Where dataset is stored
DIR_ROOT_DIR="/shared/use/this/" # Where logs and checkpoint will be stored
WORKERS=4 # Number of workers for the training and validation data loaders
DISCORD_WH=httpsomethingwebhoowawnserisalways42
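For reference, here is a sketch of how such a file is commonly read with python-dotenv (an assumed dependency; train.py's actual loading code may differ):

```python
# Sketch: reading hyperparameters from .env with python-dotenv.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory, if present

epochs = int(os.getenv("EPOCHS", 300))        # fall back to defaults when unset
batch_size = int(os.getenv("BATCH_SIZE", 4))
lrn_rate = float(os.getenv("LRN_RATE", 0.001))
prod = os.getenv("PROD", "False") == "True"   # env values arrive as strings
```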
- Try different numbers of workers, but use more than 0. A good starting point is workers = cores * (threads per core), as sketched below.
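A quick way to compute that starting point (os.cpu_count() already reports logical CPUs, i.e. cores times threads per core on most systems):

```python
# Sketch: derive a starting WORKERS value from the logical CPU count.
import os

workers = os.cpu_count() or 4  # logical CPUs = cores * threads per core; fallback if undetectable
print(f"Suggested WORKERS={workers}")
```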
- Group normalization
- Auto learning rate tuner
- Distributed data parallel training
- Early stopping
- ADAM optimizer
- Gradient clipping
- ReduceLROnPlateau learning rate scheduler
- Logging to Tensorboard
- Metrics:
- Loss
- F1
- Precision
- Recall
- Visualise images
- Gradient accumulation (NB: might conflict with batch norm; see the sketch after this list)
- Hyper-parameters and arguments configurable from the console
- Load hyper-parameters from .env
- Training finished notification to Discord
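As a rough sketch, several of these features map directly onto PyTorch Lightning 1.x Trainer flags and LightningModule hooks; the values mirror the .env defaults above, while the class name and the monitored metric name are assumptions, not the repository's exact code.

```python
# Sketch: wiring early stopping, DDP, gradient clipping/accumulation,
# ADAM and ReduceLROnPlateau in PyTorch Lightning 1.x.
import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping

trainer = pl.Trainer(
    gpus=-1,                    # all available GPUs
    accelerator="ddp",          # distributed data parallel (PL 1.x flag name)
    gradient_clip_val=1.0,      # GRAD_CLIP
    accumulate_grad_batches=4,  # ACC_GRAD
    callbacks=[EarlyStopping(monitor="val_loss", patience=10)],  # EARLY_STOP
)

class SegModel(pl.LightningModule):  # hypothetical module name
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=0.001)  # LRN_RATE
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
        return {
            "optimizer": optimizer,
            "lr_scheduler": {"scheduler": scheduler, "monitor": "val_loss"},
        }
```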
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.