
The best place to find AI Upscaling models

OpenModelDB is a community-driven database of AI Upscaling models. We aim to provide a better way to find and compare models than existing sources.

## 4xTextures_GTAV_rgt-s

Github Release Link

**Scale:** 4
**Architecture:** RGT
**Architecture Option:** RGT-S
**License:** CC-BY-4.0
**Purpose:** Restoration
**Subject:** Game Textures
**Input Type:** Images
**Release Date:** 04.05.2024
**Dataset:** GTAV_512_Textures
**Dataset Size:** 8492
**OTF (on-the-fly augmentations):** No
**Pretrained Model:** RGT_S_x4
**Iterations:** 165'000
**Batch Size:** 6, 4
**GT Size:** 128, 256

**Description:** A model to upscale game textures, trained on GTAV textures; handles JPG compression down to quality 80.

**Showcase:** Slow Pics
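Models in this database are distributed as PyTorch `.pth` checkpoints. A minimal single-image inference sketch, assuming the community spandrel library (`pip install spandrel`), which auto-detects architectures such as RGT; the file names below are placeholders:

```python
# Minimal upscale sketch using spandrel (assumed installed). File names
# are placeholders, not files shipped with any particular release.
import cv2
import torch
from spandrel import ImageModelDescriptor, ModelLoader

model = ModelLoader().load_from_file("4xTextures_GTAV_rgt-s.pth")
assert isinstance(model, ImageModelDescriptor)  # image-to-image model
model.cuda().eval()

# BGR uint8 image -> RGB float32 NCHW tensor in [0, 1]
img = cv2.imread("input.png", cv2.IMREAD_COLOR)
x = torch.from_numpy(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)).float().div(255)
x = x.permute(2, 0, 1).unsqueeze(0).cuda()

with torch.no_grad():
    y = model(x).clamp(0, 1)

out = (y.squeeze(0).permute(1, 2, 0).cpu().numpy() * 255).round().astype("uint8")
cv2.imwrite("output.png", cv2.cvtColor(out, cv2.COLOR_RGB2BGR))
```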
## 4xRealWebPhoto_v4_drct-l

Link to Github Release

**Scale:** 4
**Architecture:** DRCT
**Architecture Option:** DRCT-L
**Author:** Philip Hofmann
**License:** CC-BY-4.0
**Purpose:** Restoration
**Subject:** Realistic, Photography
**Input Type:** Images
**Release Date:** 02.05.2024
**Dataset:** 4xRealWebPhoto_v4
**Dataset Size:** 8492
**OTF (on-the-fly augmentations):** No
**Pretrained Model:** 4xmssim_drct-l_pretrain
**Iterations:** 260'000
**Batch Size:** 6, 4
**GT Size:** 128, 192

**Description:** The first real-world DRCT model, or at least my attempt at one; others may well get better results. If you want a real-world model for upscaling photos downloaded from the web, I'd recommend my 4xRealWebPhoto_v3_atd model over this one. This model is based on my previously released DRCT pretrain. Trained with mixup, cutmix, and resizemix augmentations, and with MS-SSIM, perceptual, GAN, DISTS, LDL, focal frequency, gradient variance, color, and luma losses.

**Showcase:** Slow.pics
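For reference, mixup (one of the augmentations listed above) blends two training pairs with the same interpolation factor so the GT and LQ crops stay aligned. A sketch of the standard formula applied to paired SR batches; this is not neosr's actual implementation:

```python
# Sketch of mixup for paired super-resolution batches: blend two random
# pairs with the same lambda. Standard mixup formula, not neosr's code.
import torch

def mixup_pair(gt: torch.Tensor, lq: torch.Tensor, alpha: float = 0.2):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(gt.size(0))          # shuffled indices within batch
    gt_mix = lam * gt + (1 - lam) * gt[perm]   # blend ground-truth crops
    lq_mix = lam * lq + (1 - lam) * lq[perm]   # blend matching low-quality crops
    return gt_mix, lq_mix
```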
## 4xRealWebPhoto_v4_dat2

Link to Github Release

**Scale:** 4
**Architecture:** DAT
**Architecture Option:** DAT-2
**Author:** Philip Hofmann
**License:** CC-BY-4.0
**Purpose:** Compression Removal, Deblur, Denoise, JPEG, WEBP, Restoration
**Subject:** Photography
**Input Type:** Images
**Release Date:** 04.04.2024
**I/O Channels:** 3 (RGB) -> 3 (RGB)
**Dataset:** Nomos8k
**Dataset Size:** 8492
**OTF (on-the-fly augmentations):** No
**Pretrained Model:** DAT_2_x4
**Iterations:** 243'000
**Batch Size:** 4-6
**GT Size:** 128-256

**Description:** 4x upscaling model for photos from the web. The dataset consists of downscaled photos (to handle good-quality input), downscaled and compressed photos (uploaded to the web and compressed by the service provider), and downscaled, compressed, rescaled, and recompressed photos (downloaded from the web and re-uploaded). Applied lens blur, realistic noise with my Ludvae200 model, JPG and WEBP compression (quality 40-95), and down_up, linear, cubic_mitchell, lanczos, gaussian, and box downsampling algorithms. For details on the degradation process, check out the PDF with its explanations and visualizations. This is basically a DAT-2 version of my previous 4xRealWebPhoto_v3_atd model, but trained with somewhat stronger noise values and a single image per variant, which drastically reduced the training dataset size.

**Showcase:** 12 Slowpics Examples
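A rough sketch of one round of the web degradation described above: downscale with a randomly chosen kernel, then JPG- or WEBP-compress at quality 40-95. OpenCV's interpolation flags stand in for the listed kernels here, and the real pipeline (see the release PDF) is more involved:

```python
# Sketch of a web-degradation round: random downscale kernel, then
# JPG or WEBP compression at quality 40-95 as in the description.
import random
import cv2

INTERPS = [cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_LANCZOS4, cv2.INTER_AREA]

def degrade_once(img, scale=0.5):
    h, w = img.shape[:2]
    small = cv2.resize(img, (int(w * scale), int(h * scale)),
                       interpolation=random.choice(INTERPS))
    q = random.randint(40, 95)
    if random.random() < 0.5:
        _, buf = cv2.imencode(".jpg", small, [cv2.IMWRITE_JPEG_QUALITY, q])
    else:
        _, buf = cv2.imencode(".webp", small, [cv2.IMWRITE_WEBP_QUALITY, q])
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)

img = cv2.imread("photo.png")
once = degrade_once(img)               # "uploaded to the web" variant
twice = degrade_once(once, scale=1.0)  # "re-uploaded": recompression round
```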
## Ludvae200

A 1x realistic noise degradation model.

Github Release Link

Name: Ludvae200
License: CC BY 4.0
Author: Philip Hofmann
Network: LUD-VAE
Scale: 1
Release Date: 25.03.2024
Iterations: 190'000
H_size: 64
n_channels: 3
dataloader_batch_size: 16
H_noise_level: 8
L_noise_level: 3
Dataset: RealLR200
Number of train images: 200
OTF Training: No
Pretrained_Model_G: None

Description: 1x realistic noise degradation model, trained on the RealLR200 dataset as released on the SeeSR GitHub repo. Next to the ludvae200.pth model file, I provide a ludvae200.zip file which contains not only the code but also an inference script to run this model on a dataset of your choice. Adapt the ludvae200_inference.py script by adjusting the file paths at the beginning to your input folder, your output folder, the folder holding the ludvae200.pth model, and the folder where the log text file should be generated. Text-file generation works the same way as in Kim's Dataset Destroyer: each image file is logged together with the values used to degrade it, and the log is append-only, never overwritten. You can also adjust the strength settings inside the inference script to fit your needs. If you generally want weaker noise, lower the temperature upper limit from 0.4 to 0.2 or even further; for example, in line 96 change "temperature_strength = uniform(0.1,0.4)" to "temperature_strength = uniform(0.1,0.2)". These defaults reflect the last dataset degradation workflow I used, but feel free to adjust them. You can also do what I did and temporarily use deterministic values over multiple runs to determine the minimum and maximum noise strengths suitable for your dataset. See the examples for what this looked like in my last dataset workflow.
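To make the adjustment concrete, here is an illustrative top-of-script configuration. Only the `temperature_strength` line is quoted from the release notes (line 96 of the script); the path variable names are placeholders, not the script's actual identifiers:

```python
# Illustrative configuration section for ludvae200_inference.py.
# Path variable names below are placeholders, not the script's real ones.
from random import uniform

input_folder = "datasets/my_hr_images"      # images to degrade
output_folder = "datasets/my_noisy_images"  # degraded output
model_path = "models/ludvae200.pth"         # folder holding ludvae200.pth
log_folder = "logs"                         # append-only degradation log

# Weaker noise: lower the upper limit, e.g. uniform(0.1, 0.2) instead
# of the default quoted in the description.
temperature_strength = uniform(0.1, 0.4)
```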
## 4xNomos8k_atd_jpg

A 4x photo upscaler that handles JPG compression.

Link to Github Release

Name: 4xNomos8k_atd_jpg
License: CC BY 4.0
Author: Philip Hofmann
Network: ATD
Scale: 4
Release Date: 22.03.2024
Iterations: 240'000
Epoch: 152
batch_size: 6, 3
HR_size: 128, 192
Dataset: nomos8k
Number of train images: 8492
OTF Training: Yes
Pretrained_Model_G: 003_ATD_SRx4_finetune

Description: 4x photo upscaler which handles JPG compression; this model will preserve noise. Trained on the recently released (~2 weeks ago) Adaptive-Token-Dictionary network. Training details: AdamW optimizer with a U-Net SN discriminator and BFloat16. Degraded with OTF JPG compression down to quality 40, re-compression down to 40, together with resizes and the blur kernels. Losses: pixel loss using CHC (Clipped Huber with Cosine Similarity Loss), perceptual loss using Huber, GAN loss, LDL using Huber, YCbCr color loss (BT.601), and luma loss (CIE XYZ), on neosr.

7 Examples: Slowpics
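Going only by the expansion of the CHC name above, a pixel loss of this kind combines a Huber term with a cosine-similarity term; a rough torch sketch, with the caveat that neosr's actual CHC implementation will differ in its clipping scheme and weighting:

```python
# Rough "Huber + cosine similarity" pixel-loss sketch, inferred from the
# CHC name only; neosr's real CHC loss differs in details.
import torch
import torch.nn.functional as F

def huber_cosine_loss(pred, target, cos_weight=0.5):
    huber = F.huber_loss(pred, target)  # robust pixel-wise term
    # cosine distance between per-pixel RGB vectors (channel dim = 1)
    cos = 1.0 - F.cosine_similarity(pred, target, dim=1).mean()
    return huber + cos_weight * cos
```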
## 4xRealWebPhoto_v3_atd

A 4x upscaler for photos downloaded from the web.

Link to Github Release

Name: 4xRealWebPhoto_v3_atd
License: CC BY 4.0
Author: Philip Hofmann
Network: ATD
Scale: 4
Release Date: 22.03.2024
Iterations: 250'000
Epoch: 10
batch_size: 6, 3
HR_size: 128, 192
Dataset: 4xRealWebPhoto_v3
Number of train images: 101'904
OTF Training: No
Pretrained_Model_G: 003_ATD_SRx4_finetune

Description: 4x real web photo upscaler, meant for upscaling photos downloaded from the web. Trained on v3 of my 4xRealWebPhoto dataset, it should be able to handle noise, JPG and WEBP (re)compression, and (re)scaling, plus a little bit of lens blur, while also handling good-quality input. Trained on the recently released (~2 weeks ago) Adaptive-Token-Dictionary network. My 4xRealWebPhoto dataset tries to simulate the use case of a photo being uploaded to the web and processed by the service provider (as on a social media platform), so compression/downscaling, then perhaps being downloaded and re-uploaded by another user, where it is again processed by the service provider. I included different variants of this in the dataset. The PDF with info on the v2 dataset can be found in the release, while the changes in v3 are summarized in the 4xRealWebPhoto_v3 PNG. Training details: AdamW optimizer with a U-Net SN discriminator and BFloat16. Degraded with OTF JPG compression down to quality 40, re-compression down to 40, together with resizes and the blur kernels. Losses: pixel loss using CHC (Clipped Huber with Cosine Similarity Loss), perceptual loss using Huber, GAN loss, LDL using Huber, focal frequency, gradient variance with Huber, YCbCr color loss (BT.601), and luma loss (CIE XYZ), on neosr with norm: true.

11 Examples: Slowpics
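The YCbCr color loss (BT.601) named in these training details compares chroma after a BT.601 color conversion; a sketch using the textbook full-range coefficients, with the caveat that the neosr loss may weight or normalize differently:

```python
# Sketch of a BT.601 YCbCr color loss: convert RGB to YCbCr with the
# textbook matrix and penalize chroma (Cb/Cr) differences only.
import torch
import torch.nn.functional as F

# BT.601 full-range RGB -> YCbCr rows: Y, Cb, Cr
_BT601 = torch.tensor([[ 0.299,     0.587,     0.114   ],
                       [-0.168736, -0.331264,  0.5     ],
                       [ 0.5,      -0.418688, -0.081312]])

def ycbcr_color_loss(pred, target):
    m = _BT601.to(pred)  # match device and dtype of the input
    p = torch.einsum("nchw,kc->nkhw", pred, m)
    t = torch.einsum("nchw,kc->nkhw", target, m)
    return F.l1_loss(p[:, 1:], t[:, 1:])  # compare Cb/Cr channels only
```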
## 4xRealWebPhoto_v2_rgt_s

A 4x real web photo upscaler, meant for upscaling photos downloaded from the web.

Link to Github Release

Name: 4xRealWebPhoto_v2_rgt_s
License: CC BY 4.0
Author: Philip Hofmann
Network: RGT
Network Option: RGT-S
Scale: 4
Release Date: 10.03.2024
Iterations: 220'000
Epoch: 5
batch_size: 16
HR_size: 128
Dataset: 4xRealWebPhoto_v2 (see details in the PDF attached to the GitHub release)
Number of train images: 1'086'976 (or 543'488 pairs)
OTF Training: No
Pretrained_Model_G: RGT_S_x4

Description: 4x real web photo upscaler, meant for upscaling photos downloaded from the web. Trained on v2 of my 4xRealWebPhoto dataset, it should be able to handle realistic noise, JPG and WEBP compression and re-compression, scaling and rescaling with multiple downsampling algorithms, and a little bit of lens blur. Though the examples feature degraded images, this model should also be able to handle good-quality input. Details about the approach and dataset I made to train this model (and therefore what this model should be capable of handling) are in the PDF attached to the GitHub release. My previous tries at this dataset, v0 and v1, will get a separate entry, though this version is recommended over them.

12 Examples on Slowpics
## IllustrationJaNai_V1_DAT2

Architecture: DAT (DAT-2)
Scale: 4x

A 4x model for color images including manga covers and color illustrations, digital art, visual novel art, artbooks, and more. The DAT2 version is the highest-quality version but also the slowest; see the ESRGAN version for faster performance.

Showcase: https://slow.pics/c/GfArurPG
## IllustrationJaNai_V1_ESRGAN

Architecture: ESRGAN
Scale: 4x

A 4x model for color images including manga covers and color illustrations, digital art, visual novel art, artbooks, and more. The ESRGAN version is high quality with balanced performance; see the DAT2 version for maximum quality.

Showcase: https://slow.pics/c/GfArurPG
## SwatKats Compact

Architecture: Compact
Scale: 1x

A 1x model for restoring older cartoons. This is yet another retrain of SaurusX's SwatKats_Lite model. The dataset was reprocessed with my Find Misaligned Images script, along with the new ImgAlign update, which drastically reduced artifacts and increased the model's capabilities. This particular model is roughly on par with or slightly behind the original, doing better in some spots and worse in others; refer to the attached examples. The advantage over the original is the speed improvement of Compact over ESRGAN-lite: in a 480p test on an RTX 4090, the original ESRGAN-lite model took 0.28 seconds per frame vs. Compact's 0.13 seconds.

Showcase: https://slow.pics/s/dF3Icjpv or https://imgsli.com/MjQxMzc1/0/1
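The per-frame numbers above suggest a simple way to compare any two models yourself. A sketch of per-frame GPU timing with CUDA events (model loading omitted; `model` is any loaded upscaling model, and the 480p input mirrors the test above):

```python
# Per-frame GPU timing sketch; CUDA events avoid mis-timing the GPU's
# asynchronous work. `model` is any loaded image model (loading omitted).
import torch

def time_per_frame(model, iters=20, h=480, w=854):
    x = torch.rand(1, 3, h, w, device="cuda")
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    with torch.no_grad():
        model(x)                      # warm-up pass
        torch.cuda.synchronize()
        start.record()
        for _ in range(iters):
            model(x)
        end.record()
        torch.cuda.synchronize()
    return start.elapsed_time(end) / iters / 1000  # seconds per frame
```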
## NomosUni rgt multijpg

Architecture: RGT
Scale: 4x

A 4x universal DoF-preserving upscaler, pair-trained with JPG degradation (down to quality 40) and multiscale downsampling (down_up, bicubic, bilinear, box, nearest, lanczos) in neosr with AdamW, a U-Net discriminator, and pixel, perceptual, GAN, and color losses. Similar to the last model I released, with the same dataset, but this is a full RGT model in comparison. An FP32 ONNX conversion is provided in the Google Drive folder.

6 Examples (to check JPG compression handling see Example Nr. 4; to check depth-of-field handling see Examples Nr. 1 & Nr. 6): Slowpics
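Since an FP32 ONNX conversion is provided, the model can also be run without PyTorch. A sketch using onnxruntime; the file name is a placeholder, and the exported input layout is assumed to be NCHW float32 in [0, 1]:

```python
# Sketch of running a provided FP32 ONNX conversion with onnxruntime.
# File name is a placeholder; input layout assumed NCHW float32 in [0, 1].
import cv2
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "4xNomosUni_rgt_multijpg_fp32.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
inp = sess.get_inputs()[0].name

img = cv2.cvtColor(cv2.imread("input.png"), cv2.COLOR_BGR2RGB)
x = img.astype(np.float32).transpose(2, 0, 1)[None] / 255.0

(y,) = sess.run(None, {inp: x})  # assumes a single output tensor
out = (np.clip(y[0], 0, 1).transpose(1, 2, 0) * 255).round().astype(np.uint8)
cv2.imwrite("output.png", cv2.cvtColor(out, cv2.COLOR_RGB2BGR))
```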
## WTP ColorDS

Architecture: ESRGAN
Scale: 4x

The model was trained for some tests, but it may be useful if you need to remove screentone from a color image; in addition to screentone, it handles small halftones quite well.