A simple demonstration of JPEG compression detection and quality estimation using machine learning.
The aim is to produce a tensor of fixed-length representing an image of arbitrary size. The custom feature extraction process is the following.
Let
-
Image padding. Assure that both image dimensions
$W$ and$H$ are divisible by$B$ , thus$\tilde{W} = W + \Delta w$ such that$\exists k \in \mathbb{N}$ , such that$\tilde{W} = Bk$ . The same applied for$\tilde{H}$ . If necessary, expand the image by copying the edge values. -
Block splitting. Split the image of size
$\tilde{W} \times \tilde{H}$ into equal$B \times B$ blocks. Let$N$ denote the number of produced blocks. -
Block reshaping. Merge all the
$N$ extracted blocks into a single tensor of shape$N \times B \times B$ . -
Reduction. Apply
$R$ different statistical reductions, each time using a different function, e.g., min, max, mean, standard deviation, or median. Each reduction will produce$R$ distinct$B \times B$ matrices. -
Zig-zag selection and Concatenation. For subsequent visualization sake, a zig-zag selection is utilized as a substitute for a flatten operation (the order of indices does not affect ML algorithms). Thus, each of the
$R$ matrices with shape$B \times B$ is converted into a single-dimensional vector of length$B^2$ . The resulting vectors are concatenated to form the final feature vector of length$R \cdot B^2$ .
There are several notable observations. Given the feature extraction strategy above, the contribution of the minimum and maximum statistics is the least significant. It is completely negligible. As a result, the model trained using just mean, standard deviation, and median performs just as well with considerably fewer parameters.
- Robinson, Jonathan, and Vojislav Kecman. "Combining support vector machine learning with the discrete cosine transform in image compression." IEEE Transactions on Neural Networks 14.4 (2003): 950-958.
- Retraint, Florent, and Cathel Zitzmann. "Quality factor estimation of jpeg images using a statistical model." Digital Signal Processing 103 (2020): 102759.