Text2HD v.1
by asterixcool
Purpose: Upscale very low-quality text to normal quality.
The upscale model is specifically designed to enhance lower-quality text images, improving their clarity and readability by upscaling them by 2x. It excels at processing moderately sized text, effectively transforming it into high-quality, legible scans. However, the model may encounter challenges when dealing with very small text, as its performance is optimized for text of a certain minimum size. For best results, input images should contain text that is not excessively small.
| Architecture | RealPLKSR |
|---|---|
| Scale | 2x |
| Color Mode | |
| License | CC-BY-SA-4.0 |
| Date | 2024-08-27 |
| Dataset | Scanned books and text |
| Dataset size | 8000 |
| Training iterations | 168000 |
| Training batch size | 8 |
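The card notes that performance degrades on very small text but does not give a concrete minimum size. One workaround is to pre-upscale tiny text (e.g. with bicubic) before running the 2x model. The sketch below computes a power-of-two pre-scale factor; the threshold value is a hypothetical illustration, not a documented property of Text2HD.

```python
MIN_TEXT_HEIGHT_PX = 16  # hypothetical threshold, NOT from the model card

def pre_scale_factor(text_height_px: int) -> int:
    """Smallest power-of-two pre-upscale so text meets the assumed minimum height."""
    if text_height_px <= 0:
        raise ValueError("text height must be positive")
    scale = 1
    while text_height_px * scale < MIN_TEXT_HEIGHT_PX:
        scale *= 2  # each bicubic pre-pass doubles the text height
    return scale

print(pre_scale_factor(20))  # large enough already: no pre-upscale needed
print(pre_scale_factor(5))   # needs a 4x pre-upscale before the 2x model
```

Pre-scaling with a plain resampler only makes the text big enough for the model to latch onto; the actual quality gain still comes from the 2x pass.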
Similar Models
ESRGAN
2x


2x Pooh V4
by SharekhaN
A 2x model for Compression Removal, Noise Reduction, Line Correction, MPEG2 / LD Artifact Removal.
This is my first model release. It was inspired by a personal project I have been pursuing for some time, which this model aims to solve. The model upscales low-resolution hand-drawn animation from the 1970s to 2000. Colors are retained with effective noise control, and details and textures are maintained to a good degree for animation. Color spills are also corrected, depending on the colors involved; shades of white and yellow have been difficult. It also makes lines slightly sharper and thinner, which could be a plus depending on your source. The model is also temporally stable across my tests, with few observable issues.
**Showcase:**
Images - https://imgsli.com/MjYwNzY1/12/13
Video Sample - https://t.ly/Jp7-w vs Upscale - https://t.ly/PdsKs
SRResNet
2x
Waifaux NL3 SRResNet
by Joey
Emulating Waifu2x at Noise Level 3. NOTE: You can't use this with regular ESRGAN forks or the bot; it has to be run through BasicSR.
ESRGAN
2x
Waifaux NL3 SuperLite
by Joey
I trained this model to see whether I could get essentially the same results as Waifu2x, but with ESRGAN.
Compact
2x


Bubble AnimeScale Compact v1
A 2x model. Pretrained using 4x_muy4_035_1.pth.
This is my first model, so it's not perfect, but I wanted to see if I could train an upscaling model that didn't result in a lot of detail loss and deblurring like the current Compact upscaling models. I believe I accomplished this, but I was unable to reduce contrast shifting. The contrast shifting may cause skin tones to appear incorrect on bright frames, but it's not too bad overall! I'll list a few examples below; more can be found by clicking the Overview link on the Github release page.
Pretrained model: Kim2091_CrappyCompactV2.pth
SwinIR
2x


Bubble AnimeScale SwinIR Small v1
2x_Bubble_AnimeScale_SwinIR_Small_v1 was trained to upscale anime frames faithfully, without the major contrast shifting of my compact model. Although much slower than my compact model, the results look significantly better! A few example upscales are listed below; more can be found by clicking the Overview link on the Github release page.
ESRGAN
2x
BSTexty
by BlackScout
As the name might suggest, this model aims to upscale text with less distortion than other models. It generally does a good job, but don't expect a state-of-the-art model that can upscale magazines and the like. It makes text more readable, but since it was trained on B/W pictures, it desaturates color images.
ESRGAN
2x

RealESRGAN_x2Plus
by xinntao
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
SwinIR
2x
SwinIR-M-x2 (classicalSR-DF2K-s64w8)
by JingyunLiang
SwinIR: Image Restoration Using Swin Transformer
Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.
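The architecture described in the abstract can be sketched structurally: shallow feature extraction, a stack of residual Swin Transformer blocks (RSTBs) with a global residual connection, then reconstruction. The stand-in "layers" below operate on plain numbers purely to show how the residual connections compose; this is not the real SwinIR implementation.

```python
def rstb(x, layers):
    """One RSTB: apply several layers, then add a residual connection."""
    y = x
    for layer in layers:
        y = layer(y)
    return x + y

def swinir_sketch(x, shallow, blocks, reconstruct):
    """shallow features -> stack of RSTBs -> global residual -> reconstruction."""
    f0 = shallow(x)
    f = f0
    for layers in blocks:
        f = rstb(f, layers)
    return reconstruct(f0 + f)  # global residual before the upsampler

out = swinir_sketch(
    1.0,
    shallow=lambda v: v * 2,     # stand-in for the shallow conv
    blocks=[[lambda v: v + 1]],  # one RSTB with one stand-in layer
    reconstruct=lambda v: v,     # stand-in for the reconstruction stage
)
```

The residual connections at both the block level and the global level are what let the deep feature extractor focus on learning the high-frequency residual rather than re-encoding the whole image.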
SwinIR
2x
SwinIR-M-x2 (classicalSR-DIV2K-s64w8)
by JingyunLiang
SwinIR: Image Restoration Using Swin Transformer
SwinIR
2x
SwinIR-M-x2 (lightweightSR-DIV2K-s64w8)
by JingyunLiang
SwinIR: Image Restoration Using Swin Transformer
SwinIR
2x
SwinIR-M-x2-GAN (realSR-BSRGAN-DFO-s64w8)
by JingyunLiang
SwinIR: Image Restoration Using Swin Transformer
ESRGAN
4x


UniScale Restore
by Kim2091
UniScale_Restore has strong compression removal that helps with restoring heavily compressed or noisy images. It is intended to compete with BSRGAN. Trained with BSRGAN_Resize and Combo_Noise in traiNNer.
DAT
4x


4xFaceUpLDAT
by Helaman
Description: 4x photo upscaler for faces, trained on the FaceUp dataset. These models are an improvement over the previously released 4xFFHQDAT and are its successors. They are released together with the FaceUp dataset and an accompanying YouTube video.
This model comes in 4 different versions:
4xFaceUpDAT (for good quality input)
4xFaceUpLDAT (for lower quality input, can additionally denoise)
4xFaceUpSharpDAT (for good quality input; produces sharper output; trained without USM but with sharpened input images)
4xFaceUpSharpLDAT (for lower quality input; produces sharper output; trained without USM but with sharpened input images; can additionally denoise)
I recommend trying out 4xFaceUpDAT
DAT
4x


4xFaceUpSharpLDAT
by Helaman
Description: 4x photo upscaler for faces, trained on the FaceUp dataset. These models are an improvement over the previously released 4xFFHQDAT and are its successors. They are released together with the FaceUp dataset and an accompanying YouTube video.
This model comes in 4 different versions:
4xFaceUpDAT (for good quality input)
4xFaceUpLDAT (for lower quality input, can additionally denoise)
4xFaceUpSharpDAT (for good quality input; produces sharper output; trained without USM but with sharpened input images)
4xFaceUpSharpLDAT (for lower quality input; produces sharper output; trained without USM but with sharpened input images; can additionally denoise)
I recommend trying out 4xFaceUpDAT
DAT
4x


4xFFHQLDAT
by Helaman
Since the above 4xFFHQDAT model is not able to handle the noise present in low-quality input images, I made a small variant/finetune of it, the 4xFFHQLDAT model. This model may come in handy if your input image is of bad quality or otherwise not suited to the previous model. I basically made it in response to an input image posted in the upscaling-results channel as a request for this kind of upscale (since 4xFFHQDAT could not handle the noise); see the Imgsli1 example below for the result.
DAT
4x


4xReal_SSDIR_DAT_GAN
by Helaman
4x photo upscaler trained on the SSDIR_Sharp dataset with OTF degradations, using the same settings as Real_HAT_GAN.
These OTF values are very high in my opinion; this was an experiment to see what happens when using the same settings as Real_HAT_GAN (hence the name).
This model denoises very strongly and smooths out a lot, so many details will be lost, but the effect may be beneficial to someone.
Compact
2x


MLP StarSample V1.0
by .derpy.
This is a model for the restoration of My Little Pony: Friendship is Magic; however, it also works decently well on similar art.
It was trained at 2x on ground-truth 3840x2160 HRs and 1920x1080 LRs of varying compression, so it can upscale from 1080p to 2160p with great detail retention; however, it may create noticeable artifacting when inspected closely, such as areas of randomly coloured pixels along edges. At 1x or 1.5x (upscaled 2x and then downscaled back down) it performs extremely well, almost perfectly in fact, at correcting colours, removing compression, and crisping up lines, and this is how the model is intended to be used (hence the "SS" in its name, for "supersampling").
**Github Release**
**Showcase:** https://slow.pics/s/1ixqCSjy
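The supersampling workflow described above can be sketched as: upscale 2x with the model, then downscale the result back to the target size. In the toy version below, a nearest-neighbour upscale stands in for the actual SR model and a 2x2 box filter performs the downscale; with the real model the restored output would of course differ from the input instead of reproducing it.

```python
def upscale_2x(img):
    """Placeholder for the SR model: nearest-neighbour 2x on a 2D grid."""
    out = []
    for row in img:
        doubled = [v for v in row for _ in range(2)]
        out.append(doubled)
        out.append(list(doubled))
    return out

def downscale_2x(img):
    """Average each 2x2 block back down to one pixel (box filter)."""
    out = []
    for y in range(0, len(img), 2):
        row = []
        for x in range(0, len(img[0]), 2):
            total = img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]
            row.append(total / 4)
        out.append(row)
    return out

src = [[0, 255], [255, 0]]
restored_1x = downscale_2x(upscale_2x(src))  # the 1x "supersampled" pass
```

For the 1.5x case, the same idea applies with a 2x model pass followed by a 0.75x resize in an image editor or resampling library.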
SPAN
2x

AniSD Suite (Multiple Models)
by Sirosky
## AniSD Suite (15 models)
**Scale:** 2x (1x for AniSD DB)
**Architecture:** SPAN / Compact / SwinIR Small / CRAFT / DAT2 / RPLKSR
**Dataset:** Anime frames. Credits to @.kuronoe. and @pwnsweet (EVA dataset) for their contributions to the dataset!
**Dataset Size:** ~7,000 - ~13,000
AniSD is a suite of 15 (as of the time of writing) specialized SISR models trained to restore and upscale standard-definition digital anime from the ~2000s onwards, including both WEB and DVD releases. Faithfulness to the source and natural-looking output are the guiding principles behind the training of the AniSD models. This means avoiding oversharpened output (which can look especially absurd on standard-definition sources), minimizing upscaling artifacts, retaining the natural detail of the source and, of course, fixing the standard range of issues found in many DVD/WEB releases (chroma issues, compression, haloing/ringing, blur, dot crawl, banding, etc.). Refer to the infographic above for a quick breakdown of the available models, and to the Github release for further information.
ESRGAN
4x


AnimeSharp
by Kim2091
Interpolation between 4x-UltraSharp and 4x-TextSharp-v0.5. Works amazingly on anime. It also upscales text, but it's far better with anime content. I rebranded this model on 2/10/22 to 4x-AnimeSharp from 4x-TextSharpV1.
Pretrained model: Interpolation between 4x-UltraSharp and 4x-TextSharp-v0.5
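Checkpoint interpolation, as used to produce AnimeSharp, is a per-parameter weighted average of two state dicts with identical keys. The sketch below shows the arithmetic on plain Python floats; with real `.pth` files the values would be tensors and the same key-by-key blend applies. The toy dicts and the `alpha=0.5` blend ratio are illustrative assumptions, not the actual AnimeSharp recipe.

```python
def interpolate_state_dicts(sd_a, sd_b, alpha=0.5):
    """Blend two checkpoints: alpha * A + (1 - alpha) * B, key by key."""
    if sd_a.keys() != sd_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

# Toy stand-ins for the 4x-UltraSharp and 4x-TextSharp-v0.5 weights:
ultra = {"conv1.weight": 1.0, "conv1.bias": 0.0}
text_sharp = {"conv1.weight": 3.0, "conv1.bias": 2.0}
blended = interpolate_state_dicts(ultra, text_sharp, alpha=0.5)
```

Interpolation only works between models with the same architecture and parameter shapes, which is why both parents here are 4x ESRGAN models.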
MoSR
2x


2xAoMR_mosr
by Helaman
A 2x MoSR upscaling model for game textures.
Link to the Github release with more info on the process
2xAoMR_mosr
Scale: 2
Architecture: MoSR
Architecture Option: mosr
Author: Philip Hofmann
License: CC-BY-4.0
Subject: Game Textures
Input Type: Images
Release Date: 21.09.2024 (dd.mm.yyyy)
Dataset: Game Textures from Age of Mythology: Retold
Dataset Size: 13'847
OTF (on the fly augmentations): No
Pretrained Model: 4xNomos2_hq_mosr
Iterations: 510'000
Batch Size: 4
Patch Size: 64
## Description:
In short: A 2x game texture mosr upscaling model, trained on and for (but not limited to) Age of Mythology: Retold textures.
Since I have been playing Age of Mythology: Retold (as a casual player), I thought it would be interesting to train a single-image super-resolution model on (and for) game textures of AoMR, though this model should be usable for other game textures as well.
This is a 2x model: since the biggest texture images are already 4096x4096, going 4x on those would be overkill (and there are already 4x game texture upscaling models, so this model can be used for similar cases where 4x is not needed).
Model Showcase:
Slowpics