Name: 2x 2xLiveActionV1_SPAN
Author: jcj83429

## Ani4K v2  

 Ani4K v2, as the successor to the original Ani4K, retains its predecessor's fantastic detail retention, depth of field preservation and faithfulness to the original source. As the name suggests, the model is targeted at modern anime, ranging from high-quality Bluray to crappy WEB releases, for upscaling either to 2K or 4K.

An **UltraCompact** version of the model is available on the [Github release](https://github.com/Sirosky/Upscale-Hub/releases/tag/Ani4K-v2). The UltraCompact version is faster, without a perceptible hit to quality in most cases.

📌 [More comparisons](https://slow.pics/c/1HykdyT5)

****

**FAQ**

- *How does v2 differ from v1?* I'm so glad you asked! A shortcoming of the v1 models was that they really struggled on poorly mastered sources, which are unfortunately still very common even with modern anime. v2 is far more capable of dealing with such sources.
- *How does Ani4K v2 differ from JaNai v3? Which one should I pick?* JaNai v3 is a fantastic model, and shares many of the fundamental training objectives behind Ani4K (DOF preservation, faithfulness, etc.). I'd say that the primary difference is one of training philosophy: JaNai seeks to render the source as if it were originally mastered in 4K, whereas Ani4K seeks to produce an upscale that is as close as possible to the source (while cleaning up any issues). Long story short, test both models and see which you prefer.
- *What model versions are there?* Ani4K comes in Compact and UltraCompact flavors. Compact is of course a standard option. UltraCompact provides a noticeable performance uplift, without too much impact on quality. I ultimately did not train a SuperUltraCompact variant as I felt the hit to the model quality was far too significant.

Ani4K v2 Compact

## AniSD Suite (15 models)

**Scale:** 2x (1x for AniSD DB)  
**Architecture:** SPAN / Compact / SwinIR Small / CRAFT / DAT2  / RPLKSR

**Dataset:** Anime frames. Credits to @.kuronoe. and @pwnsweet (EVA dataset) for their contributions to the dataset!  
**Dataset Size:** ~7,000 - ~13,000  

AniSD is a suite of 15 (as of time of writing) specialized SISR models trained to restore and upscale standard definition digital anime from the ~2000s onwards, including both WEB and DVD releases. Faithfulness to the source and natural-looking output are the guiding principles behind the training of the AniSD models. This means avoiding oversharpened output (which can look especially absurd on standard definition sources), minimizing upscaling artifacts, retaining the natural detail of the source and, of course, fixing the standard range of issues found in many DVD/WEB releases (chroma issues, compression, haloing/ringing, blur, dotcrawl, banding, etc.). Refer to the infographic above for a quick breakdown of the available models, and refer to the [Github release](https://github.com/Sirosky/Upscale-Hub/releases/tag/AniSD-RealPLKSR) for further information.

AniSD Suite (Multiple Models)

2x general purpose anime upscaler, with a focus on cleaning up scuffed sources while enhancing texture detail. While the model was trained on DVDs, it still works well on modern anime. The model is trained to deal with noise, compression artifacts, blur, bleeding, haloing and scuffed line art. Apparently, it also learned to deal with some dotcrawl.

AniScale

For upscaling old anime. Helps to denoise, recover lines and dehalo.

BIGOLDIES

[Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021)](https://github.com/cszn/BSRGAN)

BSRGANx2

Purpose: Digital Animation

Meant as a versatile model for upscaling high detail digital anime and cartoons. Has debanding, MPEG-2 correction, and halo reduction. Trained to handle both 4:3 and 16:9 DVD material with equal efficacy. Will retain a lot of textures except for the really high freq stuff.

Digitoon Lite

I trained Skr's great LD-Anime model on compact architecture. It upscales while fixing numerous video problems, including: noise/grain, compression artifacts, rainbows, dot crawl, halos and color bleed. This compact version may look slightly worse than Skr's original model, but runs significantly faster and also retains the correct colors better than the original model did.

LD-Anime Compact

Purpose: Denoise/Dehalo

Denoise, dehalo, derainbow old anime

LD-Anime_Skr v1.0

SPAN model for live action film and digital video. The main goal is to fix/reduce common video quality problems while maintaining fidelity. I tried the existing video-focused models and they all denoise or cause colour shifts so I decided to train my own.

The model is trained with compression (JPEG, MPEG-4 ASP, H264, VP9, H265), chroma subsampling, blurriness from multiple scaling, uneven horizontal and vertical resolution, oversharpening halos, bad deinterlacing jaggies, and onscreen text. It is not trained to remove noise at all so it preserves details in the source well. To prevent colour/brightness shifts, I used consistency loss in neosr. I had to modify consistency loss to use a stronger blur so it doesn't interfere with the halo removal.
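
As a rough illustration of the idea behind a blur-based consistency loss, the sketch below blurs both the model output and the ground truth and compares only the blurred results. Colour and brightness shifts survive the blur and are penalised, while fine detail (and thin halos) mostly averages out, more so with a stronger blur. This is not neosr's implementation: the function names, the box blur, and the radius values are assumptions chosen for the example.

```python
import numpy as np


def box_blur(img: np.ndarray, radius: int) -> np.ndarray:
    """Blur an HxWxC image with a (2*radius+1)-wide box filter, edge-padded."""
    img = img.astype(np.float64)
    if radius == 0:
        return img
    k = 2 * radius + 1
    h, w = img.shape[:2]
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    # Separable box filter: average k vertically shifted copies, then k horizontal ones.
    vertical = sum(padded[i:i + h] for i in range(k)) / k
    return sum(vertical[:, j:j + w] for j in range(k)) / k


def blurred_consistency_loss(output: np.ndarray, target: np.ndarray, radius: int = 4) -> float:
    """Mean absolute difference between the blurred output and the blurred target."""
    diff = box_blur(output, radius) - box_blur(target, radius)
    return float(np.mean(np.abs(diff)))


# Toy data (hypothetical): a flat grey target, a brightness-shifted copy, and a
# copy with fine checkerboard detail that barely changes the local average.
target = np.full((32, 32, 3), 0.5)
shifted = target + 0.1
yy, xx = np.mgrid[0:32, 0:32]
checker = (0.1 * (-1.0) ** (yy + xx))[..., None]
detailed = target + checker

print(blurred_consistency_loss(shifted, target))             # brightness shift is penalised
print(blurred_consistency_loss(detailed, target))            # fine detail is mostly ignored
print(blurred_consistency_loss(detailed, target, radius=0))  # without blur, detail is penalised too
```

The radius is the knob that matters here: a larger radius makes the loss blind to progressively wider structures, which is one plausible reading of why a stronger blur would keep the loss from interfering with halo removal.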

Limitations:
1. The model has limited ability to see details through heavy grain, but light to moderate grain is fine.
2. The model still does not handle bad deinterlacing perfectly, especially if the source is vertically resized. Fixing bad deinterlacing is not the main goal so it is what it is. Sources that are line-doubled throughout should be descaled back to half height first for best results.
3. The model sometimes oversharpens a little. This is probably because the training data has some oversharpened images.
4. This model generally cannot handle VHS degradation.
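
For the line-doubled case in limitation 2, here is a minimal sketch of reducing a frame back to half height before running the model. It assumes the frame is a NumPy array of shape (height, width, channels) and that every source line was duplicated exactly (rows 0 and 1 identical, rows 2 and 3 identical, and so on). Sources where the doubled lines were interpolated or resized afterwards will likely need a proper descaling tool instead. Both helper names are hypothetical.

```python
import numpy as np


def is_line_doubled(frame: np.ndarray, tol: float = 0.0) -> bool:
    """True if every even row matches the odd row directly below it within tol."""
    if frame.shape[0] % 2 != 0:
        return False
    even = frame[0::2].astype(np.float64)
    odd = frame[1::2].astype(np.float64)
    return bool(np.max(np.abs(even - odd)) <= tol)


def halve_line_doubled(frame: np.ndarray) -> np.ndarray:
    """Keep one row from each duplicated pair (rows 0, 2, 4, ...)."""
    if frame.shape[0] % 2 != 0:
        raise ValueError("expected an even number of rows")
    return frame[0::2]


# Toy example: a 3-row "field" that has been line-doubled to 6 rows.
field = np.arange(12, dtype=np.uint8).reshape(3, 4, 1)
doubled = np.repeat(field, 2, axis=0)

if is_line_doubled(doubled):
    restored = halve_line_doubled(doubled)
    print(restored.shape)
```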

More comparisons: https://slow.pics/c/DtDN7gaq

The training config and image degradation scripts used to create training data can be found in https://github.com/jcj83429/upscaling/tree/9332e7d5b07747ff347e5abdc43f8144364de9f7/2xLiveActionV1_SPAN

2xLiveActionV1_SPAN

This is a model for the restoration of My Little Pony: Friendship is Magic, but it also works decently well on all similar art.

It was trained at 2x on ground-truth 3840x2160 HRs and 1920x1080 LRs of varying compression, so it is able to upscale from 1080p to 2160p. Its detail retention at 2x is great, but close inspection may reveal noticeable artifacting, such as areas of randomly coloured pixels along edges. At 1x or 1.5x (upscaled 2x and then downscaled) it performs extremely well, almost perfectly in fact, at correcting colours, removing compression, and crisping up lines. This is the way the model is intended to be used (hence its name abbreviating to "SS", as in "supersampling").
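
The supersampling workflow (upscale 2x, then downscale) can be sketched as follows for the 1x case. The real model is replaced here by a placeholder nearest-neighbour upscaler so the example runs on its own; `nearest_2x`, `downscale_by_2` and `supersample_1x` are hypothetical names, and the 2x2 box average is just one simple choice of downscaling filter. A 1.5x result would need a general-purpose resizer instead.

```python
from typing import Callable

import numpy as np


def downscale_by_2(img: np.ndarray) -> np.ndarray:
    """Halve an HxWxC image by averaging each 2x2 block of pixels."""
    h, w, c = img.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def supersample_1x(frame: np.ndarray, upscale_2x: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    """Run a 2x upscaler, then bring the result back to the input resolution."""
    upscaled = upscale_2x(frame).astype(np.float64)
    return downscale_by_2(upscaled)


def nearest_2x(img: np.ndarray) -> np.ndarray:
    """Placeholder for the actual 2x model: plain nearest-neighbour upscaling."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


frame = np.random.default_rng(0).random((4, 6, 3))
result = supersample_1x(frame, nearest_2x)
print(result.shape)  # same resolution as the input frame
```

With the placeholder upscaler the round trip simply returns the input; with a real model in its place, the downscale step averages away some of the edge artifacts while keeping the cleanup.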

**[Github Release](https://github.com/Derpiesaurus/models/releases/tag/v1.0)**   
**Showcase:** https://slow.pics/s/1ixqCSjy

MLP StarSample V1.0

Made to restore screentones and remove compression artifacts in manga images with widths between 650 and 900 pixels (can be more than 900).

2x Manga Ora

Purpose: Compression Removal, Noise Reduction, Line Correction, MPEG2 / LD Artifact Removal

This is my first model release. The model grew out of a personal project that I have been pursuing for some time now, and it aims to solve the problems I ran into there. It upscales low-resolution hand-drawn animation from the 1970s to 2000. Colors are retained with effective noise control. Details and textures are maintained to a good degree, considering the source is animation. Color spills are also corrected, depending on the colors; shades of white and yellow have been difficult. It also makes the lines slightly sharper and thinner, which could be a plus depending on your source. The model is also temporally stable across my tests, with few observable issues.

**Showcase:** 
Images - https://imgsli.com/MjYwNzY1/12/13
Video Sample -  https://t.ly/Jp7-w vs Upscale - https://t.ly/PdsKs

2x Pooh V4

Purpose: Upscale text in very low quality to normal quality.

The upscale model is specifically designed to enhance lower-quality text images, improving their clarity and readability by upscaling them by 2x. It excels at processing moderately sized text, effectively transforming it into high-quality, legible scans. However, the model may encounter challenges when dealing with very small text, as its performance is optimized for text of a certain minimum size. For best results, input images should contain text that is not excessively small.

Text2HD v.1

Purpose: VHS Restoration

An advanced model for VHS recordings, designed to enhance video quality by reducing artifacts such as haloing, ghosting, and noise patterns. Optimized primarily for PAL resolution (NTSC might work well too).

VHS2HD

[SwinIR: Image Restoration Using Swin Transformer](https://github.com/JingyunLiang/SwinIR)


Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14~0.45dB, while the total number of parameters can be reduced by up to 67%.

SwinIR-M-x2 (classicalSR-DF2K-s64w8)

SwinIR-M-x2 (classicalSR-DIV2K-s64w8)

SwinIR-M-x2 (lightweightSR-DIV2K-s64w8)

SwinIR-M-x2-GAN (realSR-BSRGAN-DFO-s64w8)

[Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) aims at developing Practical Algorithms for General Image/Video Restoration.

We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.

RealESRGAN_x2Plus

[Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) aims at developing Practical Algorithms for General Image/Video Restoration.

We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.

We add small models that are optimized for anime videos :-)

RealESRGANv2 AnimeVideo xs-x2

BSRGAN

A 4x photo upscale DAT model trained with otf (resize, jpg, small blur) on the LSDIR dataset.

4xLSDIRDAT

Interpolation of 4xLSDIRplusC and 4xLSDIRplusR to handle jpg compression and a little bit of noise/blur
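
Model interpolation of this kind is usually a per-parameter linear blend of two checkpoints that share the same architecture. A minimal sketch follows, using plain dicts of NumPy arrays in place of real checkpoints. The function name, the toy parameter keys and the 0.5 blend factor are assumptions for illustration; the ratio actually used for 4xLSDIRplus is not stated here.

```python
from typing import Dict

import numpy as np

Weights = Dict[str, np.ndarray]


def interpolate_weights(a: Weights, b: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * a + alpha * b for every parameter tensor."""
    if a.keys() != b.keys():
        raise ValueError("checkpoints must have identical parameter names")
    blended = {}
    for name in a:
        if a[name].shape != b[name].shape:
            raise ValueError(f"shape mismatch for {name}")
        blended[name] = (1.0 - alpha) * a[name] + alpha * b[name]
    return blended


# Toy stand-ins for two checkpoints with the same layout (hypothetical keys).
model_c = {"conv.weight": np.zeros((2, 2)), "conv.bias": np.array([0.0, 2.0])}
model_r = {"conv.weight": np.ones((2, 2)), "conv.bias": np.array([4.0, 2.0])}

blend = interpolate_weights(model_c, model_r, alpha=0.5)
print(blend["conv.bias"])
```

An alpha of 0 returns the first model unchanged and an alpha of 1 returns the second, so the factor controls how much of each model's behaviour carries over.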

4xLSDIRplus

The RealESRGAN_x4plus finetuned with the big LSDIR dataset (84,991 images / 165 GB), with jpg compression and noise and blur

4xLSDIRplusR

A 4x photo upscaler with otf jpg compression, blur and resize, trained on musl's Nomos8k_sfw dataset for realistic SR, this time based on the [DAT arch](https://github.com/zhengchen1999/DAT), as a finetune on the official 4x DAT model.


The 295 MB file is the pth file, which can be run with the [DAT repo GitHub code](https://github.com/zhengchen1999/DAT). The 85.8 MB file is an onnx conversion.


All files can be found in [this google drive folder](https://drive.google.com/drive/folders/1b2vQHxlFQrVW22osIhQbDk98sdXzzFkx). If the above onnx file is not working, you can try the other conversions in the onnx subfolder.


Examples:

[Imgsli1](https://imgsli.com/MTk4Mjg1) (generated with onnx file)

[Imgsli2](https://imgsli.com/MTk4Mjg2) (generated with onnx file)

[Imgsli](https://imgsli.com/MTk4Mjk5) (generated with the test script of the DAT repo on the three test images in dataset/single, with the pth file)

| Field | Value |
| --- | --- |
| Architecture | SPAN |
| Scale | 2x |
| Size | 48nf |
| Color Mode | RGB |
| License | CC-BY-NC-SA-4.0 (Private use, Distribution, Modifications, Credit required, Same License, State Changes, No Liability & Warranty) |
| Date | 2025-05-19 |
| Dataset | nomosv2 |
| Dataset size | 36000 |
| Training iterations | 490000 |
| Training epochs | 271 |
| Training batch size | 20 |
| Training HR size | 128 |
| Training OTF | No |
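
As a rough sanity check on the training figures above, the number of epochs can be estimated as iterations times batch size divided by dataset size. This assumes a single GPU, no gradient accumulation, and that the listed dataset size is exact; none of these is confirmed here.

```python
# Values copied from the listed training details.
iterations = 490_000
batch_size = 20
dataset_size = 36_000

samples_seen = iterations * batch_size
estimated_epochs = samples_seen / dataset_size

print(f"{samples_seen:,} samples seen, about {estimated_epochs:.1f} epochs")
```

The estimate comes out at roughly 272 epochs, close to the listed 271. The small gap could come from a rounded dataset size or from how the trainer counts epochs.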

2xLiveActionV1_SPAN
