Joint-ID: Transformer-Based Joint Image Enhancement and Depth Estimation for Underwater Environments
IEEE Sensors Journal 2023
This repository is the official implementation of the paper "Transformer-Based Joint Image Enhancement and Depth Estimation for Underwater Environments".
Geonmo Yang, Gilhwan Kang, Juhui Lee, Younggun Cho
We propose a novel approach for enhancing underwater images that leverages the benefits of joint learning for simultaneous image enhancement and depth estimation. We introduce Joint-ID, a transformer-based neural network that can obtain high-perceptual image quality and depth information from raw underwater images. Our approach formulates a multi-modal objective function that addresses invalid depth, lack of sharpness, and image degradation based on color and local texture.
- Run the demo locally (requires a GPU and nvidia-docker2; see the Installation Guide).
- Optionally, we provide instructions for using Docker in several ways (we recommend docker compose; see the Installation Guide).
- The code requires python>=3.8, as well as pytorch>=1.7 and torchvision>=0.8. We do not provide instructions for installing PyTorch and TorchVision; please use nvidia-docker2 instead. Installing both PyTorch and TorchVision with CUDA support is strongly recommended.
- This code was tested on:
  - Ubuntu 22.04 LTS, Python 3.10.12, CUDA 11.7, GeForce RTX 3090 (pip)
  - Ubuntu 22.04 LTS, Python 3.8.6, CUDA 12.0, RTX A6000 (pip)
  - Ubuntu 20.04 LTS, Python 3.10.12, CUDA 12.1, GeForce RTX 3080 Ti (pip)
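If you install the dependencies yourself rather than using the Docker image, a quick check against the minimum versions listed above may save some debugging. The following is a minimal sketch, not part of the repository: the helper names `parse_version`, `meets_minimum`, and `report_environment` are hypothetical, and the version parsing is deliberately simple (it ignores local suffixes such as `+cu117`).

```python
import importlib
import re
import sys


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring local/build suffixes."""
    numbers = []
    for part in text.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions as stated in this README (module names are the import names).
REQUIREMENTS = {"torch": "1.7", "torchvision": "0.8"}


def report_environment():
    """Return status lines for Python, PyTorch, and TorchVision."""
    python_version = ".".join(str(n) for n in sys.version_info[:3])
    lines = [f"python {python_version}: ok={sys.version_info >= (3, 8)}"]
    for name, minimum in REQUIREMENTS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            lines.append(f"{name}: not installed (need >= {minimum})")
            continue
        ok = meets_minimum(module.__version__, minimum)
        lines.append(f"{name} {module.__version__}: ok={ok}")
    return lines


if __name__ == "__main__":
    for line in report_environment():
        print(line)
```

The check only compares version numbers; it does not confirm that PyTorch was built with CUDA support.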
- Prepare Repository & Checkpoints
- Prepare Dataset
- Prepare Docker Image and Run the Docker Container
- Training for Joint-ID on Joint-ID Dataset
- Testing for Joint-ID on Joint-ID Dataset
- Testing for Joint-ID on Standard or Custom Dataset
- Clone the repository (requires git):

      git clone https://github.com/sparolab/Joint_ID.git
      cd Joint_ID

- Let's call the path where Joint-ID's repository is located ${Joint-ID_root}.
- Download the checkpoint joint_id_ckpt.pth of our model to the path ${Joint-ID_root}/Joint_ID.
- Download Joint_ID_Dataset.zip.
- Next, unzip Joint_ID_Dataset.zip, where ${dataset_root_path} is the path it was downloaded to.

      sudo unzip ${dataset_root_path}/Joint_ID_Dataset.zip
      # ${dataset_root_path} requires at least 2.3 GB of space.
      # ${dataset_root_path} must be an absolute path, not a relative path.

- After unzipping, you should see the following file structure in the Joint_ID_Dataset folder:

      Joint_ID_Dataset
      ├── train
      │   ├── LR                    # GT for training dataset
      │   │   ├── 01_Warehouse
      │   │   │   ├── color         # enhanced image
      │   │   │   │   ├── in_00_160126_155728_c.png
      │   │   │   │   └── ...
      │   │   │   └── depth_filled  # depth image
      │   │   │       ├── in_00_160126_155728_depth_filled.png
      │   │   │       └── ...
      │   │   └── ...
      │   └── synthetic             # synthetic distorted dataset
      │       ├── LR@01_Warehouse@color...7512.jpg
      │       └── ...
      └── test                      # 'test' folder has the same structure as the 'train' folder
          └── ...
- For additional details about the dataset, see the project page.
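Before mounting the dataset into the container, it can help to confirm that the extracted folder matches the structure described above. The sketch below is an illustration only: `EXPECTED_DIRS`, `missing_dataset_dirs`, and `count_scene_pairs` are hypothetical names, and the assumption that `test` contains `LR` and `synthetic` sub-folders comes from the statement that `test` mirrors `train`.

```python
from pathlib import Path

# Sub-folders taken from the file structure described in this README.
# 'test' is assumed to mirror 'train', as the README states.
EXPECTED_DIRS = [
    "train/LR",
    "train/synthetic",
    "test/LR",
    "test/synthetic",
]


def missing_dataset_dirs(dataset_dir):
    """Return the expected sub-folders that are absent under dataset_dir."""
    root = Path(dataset_dir)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


def count_scene_pairs(dataset_dir, split="train"):
    """Count PNG files in color/ and depth_filled/ for each scene in <split>/LR."""
    counts = {}
    lr_dir = Path(dataset_dir) / split / "LR"
    if not lr_dir.is_dir():
        return counts
    for scene in sorted(p for p in lr_dir.iterdir() if p.is_dir()):
        color = len(list((scene / "color").glob("*.png")))
        depth = len(list((scene / "depth_filled").glob("*.png")))
        counts[scene.name] = (color, depth)
    return counts
```

For example, `missing_dataset_dirs("/data/Joint_ID_Dataset")` returns an empty list when all four folders exist, and a scene whose two counts differ in `count_scene_pairs` may indicate an incomplete extraction.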
To run a docker container, we need to create a docker image. There are two ways to create a docker image and run the docker container.
1. Use docker pull:

      # download the docker image
      docker pull ygm7422/official_joint_id:latest

      # run the docker container
      nvidia-docker run \
        --privileged \
        --rm \
        --gpus all -it \
        --name joint-id \
        --ipc=host \
        --shm-size=256M \
        --net host \
        -v /tmp/.X11-unix:/tmp/.X11-unix \
        -e DISPLAY=unix$DISPLAY \
        -v /root/.Xauthority:/root/.Xauthority \
        --env="QT_X11_NO_MITSHM=1" \
        -v ${dataset_root_path}/Joint_ID_Dataset:/root/workspace/dataset_root \
        -v ${Joint-ID_root}/Joint_ID:/root/workspace \
        ygm7422/official_joint_id:latest

2. Use docker compose (this builds the docker image and runs the container in one step):

      cd ${Joint-ID_root}/Joint_ID

      # build the docker image and run the container simultaneously
      bash run_docker.sh up gpu ${dataset_root_path}/Joint_ID_Dataset

      # open a shell inside the container
      docker exec -it Joint_ID bash
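Whichever method you use, you should now have a docker container running. Note that method 1 names the container joint-id (via --name joint-id), while the docker exec command in method 2 refers to a container named Joint_ID.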
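The placeholders `${dataset_root_path}` and `${Joint-ID_root}` have to be replaced with real absolute paths before running the commands above. In particular, `${Joint-ID_root}` typed literally in bash is parsed as the variable `Joint` with the default value `ID_root`, so it will not expand to your repository path. One way to avoid substitution mistakes is to assemble the argument list programmatically. The following is a sketch under stated assumptions: `build_docker_run_args` is a hypothetical helper, it mirrors only the core flags of method 1, and the X11/display flags are left out for brevity.

```python
import shlex
from pathlib import PurePosixPath


def build_docker_run_args(dataset_root_path, joint_id_root):
    """Assemble an `nvidia-docker run` argument list based on method 1.

    Both arguments stand in for the README placeholders and must be
    absolute host paths. X11-related flags are intentionally omitted.
    """
    dataset_root = PurePosixPath(dataset_root_path)
    repo_root = PurePosixPath(joint_id_root)
    for path in (dataset_root, repo_root):
        if not path.is_absolute():
            raise ValueError(f"expected an absolute path, got {path}")
    dataset_dir = dataset_root / "Joint_ID_Dataset"
    repo_dir = repo_root / "Joint_ID"
    return [
        "nvidia-docker", "run",
        "--privileged", "--rm", "--gpus", "all", "-it",
        "--name", "joint-id",
        "--ipc=host", "--shm-size=256M", "--net", "host",
        "-v", f"{dataset_dir}:/root/workspace/dataset_root",
        "-v", f"{repo_dir}:/root/workspace",
        "ygm7422/official_joint_id:latest",
    ]


if __name__ == "__main__":
    # Example paths are placeholders; substitute your own locations.
    args = build_docker_run_args("/data", "/home/user/code")
    print(shlex.join(args))
```

Printing the command with `shlex.join` lets you inspect it before running it in a terminal.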
- First, move to the /root/workspace folder inside the docker container. Then, run the following command to start training.

      # move to workspace
      cd /root/workspace

      # start training on the Joint-ID Dataset
      python run.py local_configs/arg_joint_train.txt

- The model's checkpoints and log files are saved in the /root/workspace/save folder.
- If you want to change the default variable settings for training, see Inference settings below.
- First, move to the /root/workspace folder inside the docker container. Then, run the following command to start testing.

      # move to workspace
      cd /root/workspace

      # start testing on the Joint-ID Dataset
      python run.py local_configs/arg_joint_test.txt

- The test images and results are saved in the result_joint.diml.joint_id folder.
- If you want to change the default variable settings for testing, see Inference settings below.
- Set the dataset-related variables in the local_configs/cfg/joint.diml.joint_id.py file. Enter the input image path in the sample_test_data_path variable, as in the following excerpt.

      ...
      # If you want to adjust the image size, adjust the `image_size` below.
      image_size = dict(input_height=288, input_width=512)
      ...
      # Dataset
      dataset = dict(
          train_data_path='dataset_root/train/synthetic',
          ...
          # sample_test_data_path='${your standard or custom dataset path}',
          sample_test_data_path='demo',
          video_txt_file=''
      )
      ...
- First, move to the /root/workspace folder inside the docker container. Then, run the following command to start testing.

      # move to workspace
      cd /root/workspace

      # start testing on standard datasets
      python run.py local_configs/arg_joint_samples_test.txt

- The test images and results are saved in the sample_eval_result_joint.diml.joint_id folder.
We set the hyperparameters in local_configs/cfg/joint.diml.joint_id.py.
- depth_range: the range of depth to estimate.
- image_size: the size of the input image data. If you set this variable, make sure to set auto_crop to False in train_dataloader_cfg, eval_dataloader_cfg, test_dataloader_cfg, or sample_test_cfg below. If you do not want to set image_size, set auto_crop to True; the input data is then fed to the model at its original size.
- train_parm: hyperparameters used for training.
- test_parm: hyperparameters used for testing.
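The relationship between image_size and auto_crop described above is easy to get wrong when editing several loader configs at once. The following is a minimal sketch of a consistency check, not code from the repository: `crop_conflicts` is a hypothetical helper, and it assumes each loader config is a dict holding an `auto_crop` flag, which may not match the actual layout of the config file.

```python
def crop_conflicts(image_size, loader_cfgs):
    """List loader names whose auto_crop flag contradicts the image_size rule.

    image_size: a dict such as dict(input_height=288, input_width=512),
                or None when no fixed size is wanted.
    loader_cfgs: mapping of loader name -> dict with an 'auto_crop' flag
                 (assumed layout, not taken from the repository).
    """
    # Rule from the README: a fixed image_size goes with auto_crop=False,
    # and no image_size goes with auto_crop=True.
    expected_auto_crop = image_size is None
    return [
        name
        for name, cfg in loader_cfgs.items()
        if cfg.get("auto_crop") != expected_auto_crop
    ]


if __name__ == "__main__":
    image_size = dict(input_height=288, input_width=512)
    loaders = {
        "train_dataloader_cfg": dict(auto_crop=False),
        "sample_test_cfg": dict(auto_crop=True),
    }
    # sample_test_cfg is reported because image_size is set but auto_crop is True.
    print(crop_conflicts(image_size, loaders))
```

Pass only the loader configs you actually use, since the README asks for the flag to be set in whichever of them applies.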
Please cite our paper:
@article{yang2023joint,
title={Joint-ID: Transformer-based Joint Image Enhancement and Depth Estimation for Underwater Environments},
author={Yang, Geonmo and Kang, Gilhwan and Lee, Juhui and Cho, Younggun},
journal={IEEE Sensors Journal},
year={2023},
publisher={IEEE}
}
Geonmo Yang: ygm7422@gmail.com
Project Link: https://sites.google.com/view/joint-id/home
This work is licensed under the GPL License, Version 3.0 (as defined in the LICENSE).