# Local AI Video Upscaling on Ubuntu (RTX 4070 Super)

This project uses:

- `ffmpeg` / `ffprobe` for demux/remux and frame handling
- Real-ESRGAN (PyTorch + CUDA) as the default upscaler backend on NVIDIA GPUs
- optional legacy `realesrgan-ncnn-vulkan` backend
- `upscale_video.py` as the Python controller/orchestrator

## 1) System setup (Ubuntu)

### NVIDIA driver + Vulkan check

```bash
nvidia-smi
vulkaninfo | head
```

If tools are missing:

```bash
sudo apt update
sudo apt install -y ffmpeg vulkan-tools mesa-vulkan-drivers unzip wget
```

Install or update the NVIDIA driver with Ubuntu tooling if needed:

```bash
sudo ubuntu-drivers autoinstall
sudo reboot
```

## 2) Install Real-ESRGAN backend

### Default (recommended): PyTorch + CUDA

Inside the project venv:

```bash
python -m pip install -r requirements.txt
python -m pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
```

The script automatically downloads model weights on first run into `~/.cache/realesrgan`.

### Optional legacy: ncnn-vulkan binary

### a) Download and install the binary

Download the `realesrgan-ncnn-vulkan` Linux release from the official Real-ESRGAN releases page, extract it, and add the binary folder to your `PATH`. Example:

```bash
mkdir -p ~/tools && cd ~/tools
# Replace <RELEASE_ZIP_URL> with the latest Linux release zip from the official Real-ESRGAN releases
wget -O realesrgan.zip "<RELEASE_ZIP_URL>"
unzip realesrgan.zip -d realesrgan
REAL_ESRGAN_DIR="$(find "$HOME/tools/realesrgan" -maxdepth 2 -type f -name realesrgan-ncnn-vulkan -printf '%h\n' | head -n 1)"
echo "Found binary dir: $REAL_ESRGAN_DIR"
echo "export PATH=\"$REAL_ESRGAN_DIR:\$PATH\"" >> ~/.bashrc
source ~/.bashrc
command -v realesrgan-ncnn-vulkan
realesrgan-ncnn-vulkan -h
```

If the command is still not found, locate the binary and test it with an absolute path:

```bash
find "$HOME/tools/realesrgan" -maxdepth 3 -type f -name realesrgan-ncnn-vulkan
```

### b) Download model files

The binary needs model files to work.
Download them from the official Real-ESRGAN repository. Note: the `.pth` files below are PyTorch weights — they work with `--backend pytorch` (for example via `--model-path`), but the ncnn-vulkan binary cannot load them:

```bash
cd ~/tools/realesrgan
mkdir -p models
cd models

# Download the common PyTorch (.pth) model weights
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-animevideov3.pth
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth
```

**Important**: The ncnn-vulkan binary needs converted `.param` and `.bin` model files, not `.pth` weights. Get them from the complete model pack shipped inside the release zip:

```bash
cd ~/tools/realesrgan
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip
unzip -j realesrgan-ncnn-vulkan-20220424-ubuntu.zip "realesrgan-ncnn-vulkan/models/*" -d models/
```

## 3) Create and use the Python environment

Recommended (pyenv + venv; avoids PEP 668 / system-pip issues):

```bash
cd /home/admin_n/python/video-upscaling
pyenv install -s 3.11.14
pyenv shell 3.11.14
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```

The script works without `tqdm`, but with dependencies installed you get a clean single-line progress bar.

Note: Python `3.14` currently fails with `basicsr`/`realesrgan` build errors. Use Python `3.11.x`.
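Since the controller is version-sensitive, a small guard can fail early instead of dying mid-build. A minimal sketch — the `python_version_ok` helper is illustrative, not part of `upscale_video.py`:

```python
import sys

def python_version_ok(version_info=None):
    """True only for the Python 3.11 series, which builds
    basicsr/realesrgan cleanly; 3.14 currently fails (see note above)."""
    vi = sys.version_info if version_info is None else version_info
    return vi[0] == 3 and vi[1] == 11
```

Calling `python_version_ok()` with no argument checks the running interpreter; passing a tuple such as `(3, 14, 0)` lets you test the logic directly.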
## 4) Run the Python controller

From this project directory:

```bash
python3 upscale_video.py \
  -i input.mp4 \
  -o output_upscaled.mp4 \
  --backend pytorch \
  --model realesrgan-x4plus \
  --scale 2
```

The PyTorch backend uses `.pth` model weights. You can pass a custom weight file via `--model-path /path/model.pth`. For the legacy ncnn backend, pass `--backend ncnn` plus your existing ncnn binary/model setup.

By default, temporary working files are created on `/mnt/winsteam`. Override if needed with `--temp-root /some/other/path`.

By default, GPU selection uses `--gpu-id auto`. To force a specific Vulkan GPU, pass e.g. `--gpu-id 0`.

### Useful options

- `--model realesr-animevideov3` for animation/anime-like sources
- `--model realesrgan-x4plus` for natural/live-action footage
- `--backend pytorch|ncnn` to choose the upscaler backend (default `pytorch`)
- `--model-path /path/to/model.pth` for a custom PyTorch weight file
- `--weights-dir ~/.cache/realesrgan` where auto-downloaded PyTorch weights are stored
- `--scale 2|3|4`
- `--tile-size 128` (or 256) if you hit VRAM limits
- `--jobs 2:2:2` to tune throughput (ncnn backend only)
- `--crf 14` for higher output quality (bigger file)
- `--keep-temp` to keep the extracted and processed frame directories
- `--temp-root /mnt/winsteam` for the temp workspace location
- `--gpu-id auto` (or `--gpu-id 0`, `--gpu-id 1`, etc.)
- `--fp32` for PyTorch FP32 inference (default is FP16 on CUDA)
- `--test-seconds 60` to process only the first N seconds for validation
- `--pre-vf "hqdn3d=1.5:1.5:6:6"` to denoise/deblock before upscaling

During upscaling, the script prints a live status line every ~2 seconds:

- processed/total frames
- percentage
- current average fps
- ETA (remaining time)

**Note**: Audio is automatically re-encoded to AAC 192 kbps for maximum compatibility.

**Progress display**: Real-ESRGAN's verbose output is suppressed. The script shows clean progress with `tqdm` (if installed) or simple periodic updates otherwise.
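The live status line above boils down to simple arithmetic over the frame count and elapsed wall time. A hedged sketch of that calculation — the function name is illustrative, and `upscale_video.py` may compute it differently:

```python
def progress_stats(done_frames, total_frames, elapsed_s):
    """Return (percent, avg_fps, eta_seconds) for a periodic status line."""
    percent = 100.0 * done_frames / total_frames
    avg_fps = done_frames / elapsed_s if elapsed_s > 0 else 0.0
    remaining = total_frames - done_frames
    # ETA assumes the average rate so far holds for the remaining frames
    eta_s = remaining / avg_fps if avg_fps > 0 else float("inf")
    return percent, avg_fps, eta_s
```

For example, 500 of 1000 frames after 100 seconds gives 50.0 %, 5.0 fps, and an ETA of 100 seconds.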
**Aspect ratio handling**: Input frames are normalized to square pixels before upscaling (`SAR=1`). For anamorphic sources (for example 720x576 PAL 16:9), this avoids “squeezed” frame geometry. The conversion uses non-cropping width expansion, so source frame content is preserved.

## 5) Typical tuning for RTX 4070 Super

Start with:

- `--scale 2`
- `--jobs 2:2:2`
- `--tile-size 0`

If you see memory errors, lower memory pressure with:

- `--tile-size 128`
- `--jobs 1:2:2`

## 6) Optional quality upgrades

For best final quality, you can output HEVC:

```bash
python3 upscale_video.py -i input.mp4 -o output_hevc.mp4 --codec libx265 --crf 18
```

## 7) GPU ID mapping (`nvidia-smi` vs Vulkan `-g`)

This section is mainly relevant for the legacy `ncnn` backend. The PyTorch backend usually follows CUDA GPU indexing.

Check NVIDIA GPUs:

```bash
nvidia-smi --query-gpu=index,name,uuid,pci.bus_id --format=csv,noheader
```

Check Vulkan devices:

```bash
vulkaninfo --summary
```

Match by GPU name or UUID:

- `nvidia-smi` UUID format: `GPU-xxxxxxxx-...`
- `vulkaninfo` UUID format: `xxxxxxxx-...` (same value without the `GPU-` prefix)

Example from this setup:

- Vulkan `GPU0` = `NVIDIA GeForce RTX 4070 SUPER`
- `nvidia-smi` index `1` = `NVIDIA GeForce RTX 4070 SUPER`

So for the RTX 4070 SUPER here, use Vulkan ID `-g 0`. With this script, that is:

```bash
python3 upscale_video.py -i input.mp4 -o output.mp4 --gpu-id 0
```

Quick test run example (first 60 seconds only):

```bash
python3 upscale_video.py -i input.mp4 -o output_test.mp4 --model realesrgan-x4plus --model-path ~/tools/realesrgan/models --scale 2 --gpu-id 0 --test-seconds 60
```

---

If you want, this can be extended with:

- batch-folder processing
- automatic model selection by content type
- optional frame interpolation (RIFE) for smoother motion
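The `nvidia-smi`-to-`vulkaninfo` UUID correspondence from section 7 can also be checked programmatically. A minimal sketch — pure string handling, no GPU required, and the helper name is illustrative:

```python
def same_gpu(nvidia_smi_uuid: str, vulkan_uuid: str) -> bool:
    """nvidia-smi prints 'GPU-xxxxxxxx-...'; vulkaninfo prints the same
    value without the 'GPU-' prefix. Normalize both and compare."""
    def norm(uuid: str) -> str:
        uuid = uuid.strip().lower()
        return uuid[len("gpu-"):] if uuid.startswith("gpu-") else uuid
    return norm(nvidia_smi_uuid) == norm(vulkan_uuid)
```

Feed it one UUID from each tool's output; a `True` result means both indices refer to the same physical card.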