PyTorch implementation of video upscaling using Real-ESRGAN.

2026-02-28 12:25:57 +01:00
parent 72fe4e04d0
commit 0420add695
6 changed files with 342 additions and 94 deletions


@@ -2,7 +2,8 @@
This project uses:
- `ffmpeg` / `ffprobe` for demux/remux and frame handling
- `realesrgan-ncnn-vulkan` for GPU upscaling (Vulkan backend, works well on NVIDIA)
- `Real-ESRGAN (PyTorch + CUDA)` as default upscaler backend on NVIDIA GPUs
- optional legacy `realesrgan-ncnn-vulkan` backend
- `upscale_video.py` as Python controller/orchestrator
## 1) System setup (Ubuntu)
@@ -25,7 +26,19 @@ sudo ubuntu-drivers autoinstall
sudo reboot
```
## 2) Install Real-ESRGAN binary and models
## 2) Install Real-ESRGAN backend
### Default (recommended): PyTorch + CUDA
Inside project venv:
```bash
python -m pip install -r requirements.txt
python -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
```
The script automatically downloads model weights on first run into `~/.cache/realesrgan`.
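To see which weights have already been fetched, a minimal sketch along these lines could list the files in the cache. The helper name `cached_weights` is hypothetical and not part of `upscale_video.py`; it only assumes that weights are stored as `.pth` files directly inside the cache directory.

```python
from pathlib import Path


def cached_weights(weights_dir="~/.cache/realesrgan"):
    """Return the sorted names of .pth files in weights_dir (hypothetical helper)."""
    directory = Path(weights_dir).expanduser()
    if not directory.is_dir():
        # Nothing downloaded yet, or a different weights directory is in use.
        return []
    return sorted(p.name for p in directory.glob("*.pth"))


if __name__ == "__main__":
    print(cached_weights())
```

An empty list before the first run is expected, since the weights are only downloaded on demand.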
### Optional legacy: ncnn-vulkan binary
### a) Download and install binary
@@ -79,16 +92,22 @@ wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrga
unzip -j realesrgan-ncnn-vulkan-20220424-ubuntu.zip "realesrgan-ncnn-vulkan/models/*" -d models/
```
## 3) Install Python dependencies (optional but recommended)
## 3) Create and use the Python environment
For a cleaner progress bar during upscaling:
Recommended (pyenv + venv, avoids PEP 668/system-pip issues):
```bash
pip install tqdm
# or
pip install -r requirements.txt
cd /home/admin_n/python/video-upscaling
pyenv install -s 3.11.14
pyenv shell 3.11.14
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```
The script works without tqdm but will show a nicer single-line progress bar if it's installed.
The script works without `tqdm`, but with dependencies installed you get a clean single-line progress bar.
Note: installing on Python `3.14` currently fails with build errors in `basicsr`/`realesrgan`. Use Python `3.11.x`.
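A quick way to catch a wrong interpreter before installing anything is a version check like the sketch below. The function name `is_recommended_python` is hypothetical; it simply encodes the 3.11.x recommendation from the note and is not something the script is known to do itself.

```python
import sys


def is_recommended_python(version_info=sys.version_info):
    """Return True for Python 3.11.x, the series recommended in this README."""
    return tuple(version_info[:2]) == (3, 11)


if __name__ == "__main__":
    if not is_recommended_python():
        running = sys.version.split()[0]
        print(f"Warning: running Python {running}; 3.11.x is recommended.")
```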
## 4) Run the Python controller
@@ -97,13 +116,13 @@ From this project directory:
python3 upscale_video.py \
-i input.mp4 \
-o output_upscaled.mp4 \
--model realesr-animevideov3 \
--model-path ~/tools/realesrgan/models \
--backend pytorch \
--model realesrgan-x4plus \
--scale 2
```
**Important**: If you get "find_blob_index_by_name" errors, the model files are missing.
Use `--model-path` to point to your models directory (see section 2b above).
The PyTorch backend uses `.pth` model weights. You can pass a custom weight file via `--model-path /path/model.pth`.
For the legacy ncnn backend, pass `--backend ncnn` plus your existing ncnn binary/model setup.
By default, temporary working files are created on `/mnt/winsteam`.
Override if needed with `--temp-root /some/other/path`.
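Because the temp workspace holds the extracted and processed frames, it may be worth checking free space on the chosen path before a long run. The following is a minimal sketch using only the standard library; the helper names and the idea of a required-space threshold are assumptions for illustration, not behavior of `upscale_video.py`.

```python
import shutil


def free_gib(path):
    """Return the free space on the filesystem containing path, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path, needed_gib):
    """Return True if at least needed_gib GiB are free at path (hypothetical check)."""
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    temp_root = "/mnt/winsteam"  # default temp root named in this README
    print(f"{temp_root}: {free_gib(temp_root):.1f} GiB free")
```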
@@ -114,14 +133,17 @@ To force a specific Vulkan GPU, pass e.g. `--gpu-id 0`.
### Useful options
- `--model realesr-animevideov3` for animation/anime-like sources
- `--model realesrgan-x4plus` for natural/live-action footage
- `--model-path ~/tools/realesrgan/models` to specify model directory
- `--backend pytorch|ncnn` to choose the upscaler backend (default `pytorch`)
- `--model-path /path/to/model.pth` for custom PyTorch weight file
- `--weights-dir ~/.cache/realesrgan` where auto-downloaded PyTorch weights are stored
- `--scale 2|3|4`
- `--tile-size 128` (or 256) if you hit VRAM limits
- `--jobs 2:2:2` to tune throughput
- `--jobs 2:2:2` to tune throughput (ncnn backend only)
- `--crf 14` for higher output quality (bigger file)
- `--keep-temp` to keep extracted and processed frame directories
- `--temp-root /mnt/winsteam` for temp workspace location
- `--gpu-id auto` (or `--gpu-id 0`, `--gpu-id 1`, etc.)
- `--fp32` for PyTorch FP32 inference (default is FP16 on CUDA)
- `--test-seconds 60` to process only first N seconds for validation
- `--pre-vf "hqdn3d=1.5:1.5:6:6"` to denoise/deblock before upscaling
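When calling the controller from another Python program, the options above can be assembled into an argument list. This is a sketch only: `build_command` is a hypothetical helper, and it uses just the flags listed in this README (`-i`, `-o`, `--backend`, `--model`, `--scale`, `--tile-size`, `--test-seconds`).

```python
import shlex


def build_command(input_path, output_path, backend="pytorch",
                  model="realesrgan-x4plus", scale=2,
                  tile_size=None, test_seconds=None):
    """Build an argv list for upscale_video.py (hypothetical helper)."""
    cmd = ["python3", "upscale_video.py",
           "-i", str(input_path), "-o", str(output_path),
           "--backend", backend, "--model", model, "--scale", str(scale)]
    if tile_size is not None:
        # Smaller tiles reduce VRAM use.
        cmd += ["--tile-size", str(tile_size)]
    if test_seconds is not None:
        # Process only the first N seconds for a quick validation run.
        cmd += ["--test-seconds", str(test_seconds)]
    return cmd


if __name__ == "__main__":
    cmd = build_command("input.mp4", "preview.mp4", tile_size=128, test_seconds=60)
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems with paths that contain spaces.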
@@ -148,9 +170,9 @@ Start with:
If you see memory errors, lower memory pressure using:
- `--tile-size 128`
- `6-jobs 1:2:2`
- `--jobs 1:2:2`
## 5) Optional quality upgrades
## 6) Optional quality upgrades
For best final quality, you can output with HEVC:
```bash
@@ -159,7 +181,7 @@ python3 upscale_video.py -i input.mp4 -o output_hevc.mp4 --codec libx265 --crf 1
## 7) GPU ID mapping (`nvidia-smi` vs Vulkan `-g`)
Real-ESRGAN uses Vulkan GPU IDs (`-g`), which may not match `nvidia-smi` index order.
This section is mainly relevant for the legacy `ncnn` backend. The PyTorch backend usually follows CUDA GPU indexing.
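For the PyTorch backend, the CUDA indices can be inspected directly from Python. The sketch below lists the devices as PyTorch enumerates them and returns an empty list when `torch` or CUDA is unavailable. The helper name `list_cuda_devices` is hypothetical, and whether `--gpu-id` maps one-to-one onto these indices is an assumption about the script.

```python
import importlib.util


def list_cuda_devices():
    """Return [(index, name), ...] as seen by PyTorch, or [] without torch/CUDA."""
    if importlib.util.find_spec("torch") is None:
        return []
    import torch

    if not torch.cuda.is_available():
        return []
    return [(i, torch.cuda.get_device_name(i))
            for i in range(torch.cuda.device_count())]


if __name__ == "__main__":
    for index, name in list_cuda_devices():
        print(f"cuda:{index} {name}")
```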
Check NVIDIA GPUs:
```bash