4xNomosRealWeb Dataset
Purpose: Train 4x upscaling models for upscaling images downloaded from the web. Realistic degradations were used: for example, my Ludvae200 model provided the realistic noise, and umzi helped me create a milder realistic lens blur by implementing it in wtp dataset destroyer.
I have trained the 4xNomosWebPhoto_RealPLKSR, 4xNomosWebPhoto_atd and 4xNomosWebPhoto_esrgan models with it, and am releasing this dataset now (16.08.2024).
This dataset is the culmination of my experience with the 4xRealWebPhoto datasets, which were reworked up to version 4. I have included a PDF with details about 4xRealWebPhoto_v2 just as background information. The information about this dataset can be found in this PDF.
This dataset was created for upscaling images that were downloaded from the web, with realistic degradations in mind; models trained on it should be able to handle a variety of such degradations.
The use case in mind: a photograph is taken on a mobile phone with a medium-quality sensor, so there is noise and maybe slight blur in the image. It is then uploaded to a social media platform, where the service provider compresses the image (with webp or jpg) for faster page loading. Someone sees that photograph (maybe of a sunset), likes it, downloads it, and re-uploads it to their own profile, so the service provider compresses the image once again.
Therefore this dataset has been degraded with realistic degradations. The Assets below contain several degradation folders:
- one where all images have only been downscaled x4, with different scaling algorithms
- one where they have been scaled and then compressed
- one where they have been scaled, had noise added, and then been compressed
- one where they have been scaled, had noise added, been compressed, and then been recompressed
- and so forth
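To make these chains concrete, here is a minimal Python sketch of such a degradation pipeline, assuming Pillow and NumPy. Simple Gaussian noise and random jpg/webp quality settings are stand-ins for illustration only; the dataset itself used the Ludvae200 model for realistic noise and wtp dataset destroyer for the lens blur, so this is not the actual pipeline.

```python
# Illustrative sketch of the degradation chains, not the real dataset pipeline.
import io
import random

import numpy as np
from PIL import Image

SCALE_ALGOS = [
    Image.Resampling.BICUBIC,
    Image.Resampling.BILINEAR,
    Image.Resampling.LANCZOS,
    Image.Resampling.BOX,
]

def downscale_x4(img: Image.Image) -> Image.Image:
    """Downscale by 4 with a randomly chosen scaling algorithm."""
    w, h = img.size
    return img.resize((w // 4, h // 4), resample=random.choice(SCALE_ALGOS))

def add_noise(img: Image.Image, sigma: float = 5.0) -> Image.Image:
    """Gaussian noise as a stand-in (the dataset uses Ludvae200 noise instead)."""
    arr = np.asarray(img).astype(np.float32)
    arr += np.random.normal(0.0, sigma, arr.shape)
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

def compress(img: Image.Image) -> Image.Image:
    """Round-trip through jpg or webp at a random quality, like a web upload."""
    fmt = random.choice(["JPEG", "WEBP"])
    buf = io.BytesIO()
    img.save(buf, format=fmt, quality=random.randint(60, 95))
    buf.seek(0)
    return Image.open(buf).convert("RGB")

# Example chains mirroring the folder layout ("hr/2.png" is just an example path):
hr = Image.open("hr/2.png").convert("RGB")
lr_scale_only = downscale_x4(hr)
lr_scale_compress = compress(downscale_x4(hr))
lr_scale_noise_compress = compress(add_noise(downscale_x4(hr)))
lr_scale_noise_compress_recompress = compress(compress(add_noise(downscale_x4(hr))))
```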
I also provide an lr1 folder where these different degradations have been mixed together. I named it lr1 because multiple lr sets can be created out of the degraded dataset folders to increase degradation variance. Or, if the training software allows it, all degradation folders could even be provided as lr folders, and the software would then pick each image (like "2.png") randomly from one of them. This greatly increases the degradation variance the upscaling model gets to learn from.
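As a sketch only (the folder names below are placeholders, not the names used in the release), assembling such a mixed lr set could look like this:

```python
# Build one lr set by cycling through the degradation folders (repeating rotation),
# assuming numbered filenames ("1.png", "2.png", ...) that match the hr folder.
import shutil
from pathlib import Path

DEG_FOLDERS = [
    "lr_scale",                            # scaled only
    "lr_scale_compress",                   # scaled, compressed
    "lr_scale_noise_compress",             # scaled, noise, compressed
    "lr_scale_noise_compress_recompress",  # scaled, noise, compressed, recompressed
    # ... one entry per degradation folder (12 in total, as implied by the
    #     1, 13, 25 rotation described below)
]

def build_lr(out_dir: str, num_images: int = 6000) -> None:
    """Image 1 comes from folder 0, image 2 from folder 1, ..., wrapping around.

    Picking the source folder with random.choice() instead would give an
    additional, differently mixed lr set.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i in range(1, num_images + 1):
        src_folder = DEG_FOLDERS[(i - 1) % len(DEG_FOLDERS)]
        shutil.copy(Path(src_folder) / f"{i}.png", out / f"{i}.png")

build_lr("lr1")
```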
lr1 folder repeating rotation (image 1, 13, 25, ... is scaled only; image 2, 14, 26, ... is scaled and compressed; and so forth):
This is a view of the degradation folders created, each containing 6000 images corresponding to the hr folder:
The hr folder is the nomosv2 dataset as created by musl, which can be found at this link.
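For completeness, a small hypothetical sanity check that each degradation folder holds the same 6000 filenames as the hr folder (the lr_* folder names are again placeholders, and .png files are assumed):

```python
# Verify that every degradation folder mirrors the hr folder's 6000 filenames.
from pathlib import Path

hr_names = {p.name for p in Path("hr").glob("*.png")}
assert len(hr_names) == 6000

for folder in sorted(Path(".").glob("lr_*")):
    lr_names = {p.name for p in folder.glob("*.png")}
    missing = hr_names - lr_names
    print(f"{folder.name}: {len(lr_names)} images, {len(missing)} missing")
```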