PyTorch implementation of video upscaling using Real-ESRGAN.

2026-02-28 12:25:57 +01:00
parent 72fe4e04d0
commit 0420add695
6 changed files with 342 additions and 94 deletions
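
The README change below makes PyTorch + CUDA the default backend, so it helps to confirm that the venv's `torch` build actually sees the GPU before running the controller. The following is a minimal sketch and not part of this commit: `describe_cuda_devices` is a hypothetical helper name, and it relies only on `torch.cuda.is_available()`, `torch.cuda.device_count()`, `torch.cuda.get_device_name()` and `torch.version.cuda`.

```python
import importlib.util


def describe_cuda_devices():
    """Return one line per CUDA device PyTorch can see, or a short reason if none."""
    # Hypothetical helper, not part of upscale_video.py.
    # torch is looked up first so the check does not crash in a bare venv.
    if importlib.util.find_spec("torch") is None:
        return ["torch is not installed in this environment"]

    import torch

    if not torch.cuda.is_available():
        return ["torch is installed, but no CUDA device is available"]

    # torch.version.cuda is the CUDA version the installed wheel was built for.
    lines = [f"CUDA version of this torch build: {torch.version.cuda}"]
    for index in range(torch.cuda.device_count()):
        lines.append(f"cuda:{index} -> {torch.cuda.get_device_name(index)}")
    return lines


if __name__ == "__main__":
    print("\n".join(describe_cuda_devices()))
```

The updated README says the PyTorch backend usually follows CUDA GPU indexing, so the `cuda:N` indices printed here are probably the ones to compare against `--gpu-id`. The diff does not confirm that mapping, so treat it as an assumption.
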
--- a/README.md
+++ b/README.md
@@ -2,7 +2,8 @@

 This project uses:
 - `ffmpeg` / `ffprobe` for demux/remux and frame handling
-- `realesrgan-ncnn-vulkan` for GPU upscaling (Vulkan backend, works well on NVIDIA)
+- `Real-ESRGAN (PyTorch + CUDA)` as default upscaler backend on NVIDIA GPUs
+- optional legacy `realesrgan-ncnn-vulkan` backend
 - `upscale_video.py` as Python controller/orchestrator

 ## 1) System setup (Ubuntu)
@@ -25,7 +26,19 @@ sudo ubuntu-drivers autoinstall
 sudo reboot
 ```

-## 2) Install Real-ESRGAN binary and models
+## 2) Install Real-ESRGAN backend
+
+### Default (recommended): PyTorch + CUDA
+
+Inside project venv:
+```bash
+python -m pip install -r requirements.txt
+python -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
+```
+
+The script automatically downloads model weights on first run into `~/.cache/realesrgan`.
+
+### Optional legacy: ncnn-vulkan binary

 ### a) Download and install binary

@@ -79,16 +92,22 @@ wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrga
 unzip -j realesrgan-ncnn-vulkan-20220424-ubuntu.zip "realesrgan-ncnn-vulkan/models/*" -d models/
 ```

-## 3) Install Python dependencies (optional but recommended)
+## 3) Create and use the Python environment

-For a cleaner progress bar during upscaling:
+Recommended (pyenv + venv, avoids PEP668/system-pip issues):
 ```bash
-pip install tqdm
-# or
-pip install -r requirements.txt
+cd /home/admin_n/python/video-upscaling
+pyenv install -s 3.11.14
+pyenv shell 3.11.14
+python -m venv .venv
+source .venv/bin/activate
+python -m pip install --upgrade pip
+python -m pip install -r requirements.txt
 ```

-The script works without tqdm but will show a nicer single-line progress bar if it's installed.
+The script works without `tqdm`, but with dependencies installed you get a clean single-line progress bar.
+
+Note: Python `3.14` currently fails with `basicsr/realesrgan` build errors. Use Python `3.11.x`.

 ## 4) Run the Python controller

@@ -97,13 +116,13 @@ From this project directory:
 python3 upscale_video.py \
  -i input.mp4 \
  -o output_upscaled.mp4 \
-  --model realesr-animevideov3 \
-  --model-path ~/tools/realesrgan/models \
+  --backend pytorch \
+  --model realesrgan-x4plus \
  --scale 2
 ```

-**Important**: If you get "find_blob_index_by_name" errors, the model files are missing.
-Use `--model-path` to point to your models directory (see section 2b above).
+PyTorch backend uses `.pth` model weights. You can pass a custom weight file via `--model-path /path/model.pth`.
+For legacy ncnn backend, pass `--backend ncnn` plus your existing ncnn binary/model setup.

 By default, temporary working files are created on `/mnt/winsteam`.
 Override if needed with `--temp-root /some/other/path`.
@@ -114,14 +133,17 @@ To force a specific Vulkan GPU, pass e.g. `--gpu-id 0`.
 ### Useful options
 - `--model realesr-animevideov3` for animation/anime-like sources
 - `--model realesrgan-x4plus` for natural/live-action footage
-- `--model-path ~/tools/realesrgan/models` to specify model directory
+- `--backend pytorch|ncnn` choose upscaler backend (default `pytorch`)
+- `--model-path /path/to/model.pth` for custom PyTorch weight file
+- `--weights-dir ~/.cache/realesrgan` where auto-downloaded PyTorch weights are stored
 - `--scale 2|3|4`
 - `--tile-size 128` (or 256) if you hit VRAM limits
-- `--jobs 2:2:2` to tune throughput
+- `--jobs 2:2:2` to tune throughput (ncnn backend only)
 - `--crf 14` for higher output quality (bigger file)
 - `--keep-temp` to keep extracted and processed frame directories
 - `--temp-root /mnt/winsteam` for temp workspace location
 - `--gpu-id auto` (or `--gpu-id 0`, `--gpu-id 1`, etc.)
+- `--fp32` for PyTorch FP32 inference (default is FP16 on CUDA)
 - `--test-seconds 60` to process only first N seconds for validation
 - `--pre-vf "hqdn3d=1.5:1.5:6:6"` to denoise/deblock before upscaling

@@ -148,9 +170,9 @@ Start with:

 If you see memory errors, lower memory pressure using:
 - `--tile-size 128`
-- `6-jobs 1:2:2`
+- `--jobs 1:2:2`

-## 5) Optional quality upgrades
+## 6) Optional quality upgrades

 For best final quality, you can output with HEVC:
 ```bash
@@ -159,7 +181,7 @@ python3 upscale_video.py -i input.mp4 -o output_hevc.mp4 --codec libx265 --crf 1

 ## 7) GPU ID mapping (`nvidia-smi` vs Vulkan `-g`)

-Real-ESRGAN uses Vulkan GPU IDs (`-g`), which may not match `nvidia-smi` index order.
+This section is mainly relevant for legacy `ncnn` backend. PyTorch backend usually follows CUDA GPU indexing.

 Check NVIDIA GPUs:
 ```bash