TF-Keras implementation of AIHIQnet, as described in *Attention Integrated Hierarchical Networks for No-Reference Image Quality Assessment* by Junyong You and Jari Korhonen.
- Clone this repository.
- Install the required Python packages. The code was developed with PyCharm in Python 3.7, and requirements.txt was generated by PyCharm. Please note that it targets running the models on an older 1080Ti GPU with TF 2.2.
- The models were actually trained on an RTX 3090 with tf-nightly-gpu (version 2.6), and the latest version of the code runs under both TF 2.2 and TF nightly. This guide (https://www.reddit.com/r/tensorflow/comments/jsalkw/rtx_3090_and_tensorflow_for_windows_10_step_by/) was followed to install TF on the RTX 3090; other packages should also be updated. A quick sanity check for the setup follows below.
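Before training, it can help to verify that TensorFlow actually sees the GPU. A minimal sanity check (not part of the repo's scripts):

```python
import tensorflow as tf

# Print the installed TF version and the GPUs TensorFlow can see.
print('TF version:', tf.__version__)
print('GPUs:', tf.config.list_physical_devices('GPU'))
```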
Examples of training AIHIQnet on the KonIQ-10k and SPAQ databases can be found in image_quality/bin. Argparse could be used, but the authors prefer to define the parameters in a dictionary; converting the scripts to take command-line arguments is straightforward (see the sketch after the parameter list). In principle, the following parameters can be defined:
```python
args = {}
args['multi_gpu'] = 1  # Set whether multi-GPU training should be used, 1: multi-GPU, 0: single GPU
args['gpu'] = 0  # If a single GPU is used, choose which GPU on a multi-GPU PC
args['result_folder'] = r'..\databases\results\phiqnet_koniq_distribution'  # Define the result folder
args['n_quality_levels'] = 5  # Choose 1 for MOS prediction (KonIQ-MOS) or 5 for score distribution (KonIQ)
# Choose between 'resnet50', 'densnet121', 'vgg16', 'resnet152v2', 'resnest50'
args['backbone'] = 'resnet50'
# args['backbone'] = 'densnet121'
# args['backbone'] = 'vgg16'
# Choose between False and True, default: False
args['naive_backbone'] = False
# Images and scores must be provided
args['images_scores_file'] = r'..\databases\train_val_test_koniq.pkl'  # File containing the train/val/test images with MOS values or voter distributions
args['image_folder'] = r'..\databases\image_quality_koniq10k\koniq_all'  # Image folder path, used if absolute image paths are not specified in the file above
args['initial_epoch'] = 0
args['lr_base'] = 1e-4 / 2
args['lr_schedule'] = True
args['batch_size'] = 16
args['epochs'] = 100
args['feature_fusion'] = True  # Flag to use feature fusion or not
args['attention_module'] = True  # Flag to use the attention module or not
args['image_aug'] = True  # Flag to use image augmentation or not
# Depending on which backbone is used, choose the corresponding ImageNet pretrained weights file; set to None if no pretrained weights are to be used.
args['weights'] = r'..\pretrained_weights\resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
# args['weights'] = None
args['do_finetune'] = True  # Flag to fine-tune or not
```
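If command-line arguments are preferred, the dictionary entries map directly onto argparse options. A minimal sketch (the option names are illustrative and only a few entries are shown; the rest follow the same pattern):

```python
import argparse

# Illustrative conversion of the parameter dictionary to command-line arguments.
parser = argparse.ArgumentParser(description='Train AIHIQnet')
parser.add_argument('--multi_gpu', type=int, default=1)
parser.add_argument('--n_quality_levels', type=int, default=5)
parser.add_argument('--backbone', type=str, default='resnet50')
parser.add_argument('--batch_size', type=int, default=16)
parser.add_argument('--epochs', type=int, default=100)
args = vars(parser.parse_args())  # Yields the same dict interface as above
```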
After the model has been trained and the weights have been stored in an h5 file, it can be used to predict the quality of images with arbitrary sizes:
```python
args = {}
args['n_quality_levels'] = 5
args['naive_backbone'] = False
args['backbone'] = 'resnet50'
args['feature_fusion'] = True
args['weights'] = 'phiqnet.h5'

model = phiq_net(n_quality_levels=args['n_quality_levels'],
                 naive_backbone=args['naive_backbone'],
                 backbone=args['backbone'],
                 feature_fusion=args['feature_fusion'])
model.load_weights(args['weights'])
```
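The loaded model can then be run on individual images. A minimal sketch (`predict_quality` and the image path are hypothetical, and the plain [0, 1] scaling is an assumption; use the same preprocessing as the repo's data generators for meaningful predictions):

```python
import numpy as np
from PIL import Image

def predict_quality(model, image_path):
    # Hypothetical helper: predict the quality of one image of arbitrary size.
    # NOTE: plain [0, 1] scaling is assumed here; replace it with the exact
    # preprocessing used during training.
    image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
    prediction = model.predict(np.expand_dims(image, axis=0))[0]
    # With n_quality_levels=5, the network outputs a distribution over five
    # quality levels; the predicted MOS is taken as its expectation.
    return float(np.sum(prediction * np.arange(1, 6)))

print(predict_quality(model, r'..\databases\image_quality_koniq10k\koniq_all\example.jpg'))
```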
Please see the scripts in model_evaluation.
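No-reference IQA models are typically evaluated with PLCC, SROCC, and RMSE between the predicted and ground-truth MOS. A minimal sketch of these metrics (not the repo's evaluation code):

```python
import numpy as np
from scipy import stats

def iqa_metrics(predicted_mos, true_mos):
    # Standard IQA criteria between predicted and ground-truth MOS values.
    predicted_mos = np.asarray(predicted_mos, dtype=np.float64)
    true_mos = np.asarray(true_mos, dtype=np.float64)
    plcc = stats.pearsonr(predicted_mos, true_mos)[0]    # linear correlation
    srocc = stats.spearmanr(predicted_mos, true_mos)[0]  # rank-order correlation
    rmse = np.sqrt(np.mean((predicted_mos - true_mos) ** 2))
    return plcc, srocc, rmse
```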
This work uses two publicly available databases: KonIQ-10k ("KonIQ-10k: An ecologically valid database for deep learning of blind image quality assessment" by V. Hosu, H. Lin, T. Sziranyi, and D. Saupe) and SPAQ ("Perceptual quality assessment of smartphone photography" by Y. Fang, H. Zhu, Y. Zeng, K. Ma, and Z. Wang).
Please see the README in databases for details.
If you encounter any problems, please first check the paths to images, weights, etc. Please report any bugs in Issues.
- To be added