feat(conversions): end-to-end PPT/PPTX/ODP -> PDF pipeline with RQ worker + Gotenberg
DB/model
- Add `Conversion` model + `ConversionStatus` enum (pending, processing, ready, failed)
- Alembic migrations: create `conversions` table, indexes, unique `(source_event_media_id, target_format, file_hash)`, and NOT NULL on `file_hash`

API
- Enqueue on upload (ppt|pptx|odp) in `routes/eventmedia.py`: compute sha256, upsert `Conversion`, enqueue job
- New routes:
  - POST /api/conversions/<media_id>/pdf - ensure/enqueue conversion
  - GET /api/conversions/<media_id>/status - latest status/details
  - GET /api/files/converted/<path> - serve converted PDFs
- Register conversions blueprint in wsgi

Worker
- `server/worker.py`: `convert_event_media_to_pdf`
  - Calls Gotenberg /forms/libreoffice/convert, writes to `server/media/converted/`
  - Updates `Conversion` status, timestamps, error messages
- Fix media root resolution to /server/media
- Prefer function enqueue over string path; expose `server.worker` in package init for RQ string compatibility

Queue/infra
- `server/task_queue.py`: RQ queue helper (REDIS_URL, default redis://redis:6379/0)
- docker-compose:
  - Add redis and gotenberg services
  - Add worker service (rq worker conversions)
  - Pass REDIS_URL and GOTENBERG_URL to server/worker
  - Mount shared media volume in prod for API/worker parity
- docker-compose.override:
  - Add dev redis/gotenberg/worker services
  - Ensure PYTHONPATH + working_dir allow importing `server.worker`
  - Use rq CLI instead of `python -m rq` for worker
  - Dashboard dev: run as appropriate user/root and pre-create/chown caches to avoid EACCES

Dashboard dev UX
- Vite: set cacheDir .vite to avoid EACCES in node_modules
- Disable Node inspector by default to avoid port conflicts

Docs
- Update copilot-instructions.md with conversion system: flow, services, env vars, endpoints, storage paths, and data model
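The upload-time hashing mentioned under "API" can be sketched as follows; `sha256_of_file` is a hypothetical helper name, not necessarily what `routes/eventmedia.py` actually uses:

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large decks never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The hex digest fits a 64-character `file_hash` column and doubles as the idempotency key in the unique `(source_event_media_id, target_format, file_hash)` constraint.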
15 .github/copilot-instructions.md (vendored)
@@ -23,12 +23,25 @@ Use this as your shared context when proposing changes. Keep edits minimal and m
 - Per-client group assignment (retained): `infoscreen/{uuid}/group_id` via `server/mqtt_helper.py`.
 - Screenshots: server-side folders `server/received_screenshots/` and `server/screenshots/`; Nginx exposes `/screenshots/{uuid}.jpg` via `server/wsgi.py` route.
+- Presentation conversion (PPT/PPTX/ODP → PDF):
+  - Trigger: on upload in `server/routes/eventmedia.py` for media types `ppt|pptx|odp` (compute sha256, upsert `Conversion`, enqueue job).
+  - Worker: RQ worker runs `server.worker.convert_event_media_to_pdf`, calls Gotenberg LibreOffice endpoint, writes to `server/media/converted/`.
+  - Services: Redis (queue) and Gotenberg added in compose; worker service consumes the `conversions` queue.
+  - Env: `REDIS_URL` (default `redis://redis:6379/0`), `GOTENBERG_URL` (default `http://gotenberg:3000`).
+  - Endpoints: `POST /api/conversions/<media_id>/pdf` (ensure/enqueue), `GET /api/conversions/<media_id>/status`, `GET /api/files/converted/<path>` (serve PDFs).
+  - Storage: originals under `server/media/…`, outputs under `server/media/converted/` (prod compose mounts a shared volume for this path).
 
 ## Data model highlights (see `models/models.py`)
 - Enums: `EventType` (presentation, website, video, message, webuntis), `MediaType` (file/website types), and `AcademicPeriodType` (schuljahr, semester, trimester).
 - Tables: `clients`, `client_groups`, `events`, `event_media`, `users`, `academic_periods`, `school_holidays`.
 - Academic periods: `academic_periods` table supports educational institution cycles (school years, semesters). Events and media can be optionally linked via `academic_period_id` (nullable for backward compatibility).
 - Times are stored as timezone-aware; treat comparisons in UTC (see scheduler and routes/events).
+- Conversions:
+  - Enum `ConversionStatus`: `pending`, `processing`, `ready`, `failed`.
+  - Table `conversions`: `id`, `source_event_media_id` (FK→`event_media.id` ondelete CASCADE), `target_format`, `target_path`, `status`, `file_hash` (sha256), `started_at`, `completed_at`, `error_message`.
+  - Indexes: `(source_event_media_id, target_format)`, `(status, target_format)`; Unique: `(source_event_media_id, target_format, file_hash)`.
 
 ## API patterns
 - Blueprints live in `server/routes/*` and are registered in `server/wsgi.py` with `/api/...` prefixes.
 - Session usage: instantiate `Session()` per request, commit when mutating, and always `session.close()` before returning.
@@ -51,6 +64,8 @@ Use this as your shared context when proposing changes. Keep edits minimal and m
 - Holidays present in the current view (count)
 - Period label (display_name or name) with a badge indicating whether any holidays exist in that period (overlap check)
+
+Note: Syncfusion usage in the dashboard is already documented above; if a UI for conversion status/downloads is added later, link its routes and components here.
 
 ## Local development
 - Compose: development is `docker-compose.yml` + `docker-compose.override.yml`.
 - API (dev): `server/Dockerfile.dev` with debugpy on 5678, Flask app `wsgi:app` on :8000.
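The env defaults listed above can be resolved with ordinary fallbacks; `queue_settings` is an illustrative name, a sketch rather than the actual `server/task_queue.py` code:

```python
import os
from urllib.parse import urlparse


def queue_settings(env=None):
    """Resolve queue/converter endpoints using the documented defaults."""
    env = os.environ if env is None else env
    redis_url = env.get("REDIS_URL", "redis://redis:6379/0")
    gotenberg_url = env.get("GOTENBERG_URL", "http://gotenberg:3000")
    # Fail fast on malformed values instead of mid-job in the worker.
    if urlparse(redis_url).scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported REDIS_URL: {redis_url}")
    if urlparse(gotenberg_url).scheme not in ("http", "https"):
        raise ValueError(f"unsupported GOTENBERG_URL: {gotenberg_url}")
    return redis_url, gotenberg_url
```

The defaults match the compose service names (`redis`, `gotenberg`), so containers work without any extra configuration.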
12 Makefile
@@ -25,10 +25,12 @@ help:
 	@echo " up-prod - Start prod stack (docker-compose.prod.yml)"
 	@echo " down-prod - Stop prod stack"
 	@echo " health - Quick health checks"
+	@echo " fix-perms - Recursively chown workspace to current user"

 # ---------- Development stack ----------
 .PHONY: up
-yup: ## Start dev stack
+up: ## Start dev stack
 	$(COMPOSE) up -d --build

 .PHONY: down
@@ -80,3 +82,11 @@ health: ## Quick health checks
 	@echo "Dashboard (dev):" && curl -fsS http://localhost:5173/ || true
 	@echo "MQTT TCP 1883:" && nc -z localhost 1883 && echo OK || echo FAIL
 	@echo "MQTT WS 9001:" && nc -z localhost 9001 && echo OK || echo FAIL
+
+# ---------- Permissions ----------
+.PHONY: fix-perms
+fix-perms:
+	@echo "Fixing ownership to current user recursively (may prompt for sudo password)..."
+	sudo chown -R $$(id -u):$$(id -g) .
+	@echo "Done. Consider adding UID and GID to your .env to prevent future root-owned files:"
+	@echo "  echo UID=$$(id -u) >> .env && echo GID=$$(id -g) >> .env"
@@ -4,6 +4,7 @@ import react from '@vitejs/plugin-react';
 
 // https://vite.dev/config/
 export default defineConfig({
+  cacheDir: './.vite',
   plugins: [react()],
   resolve: {
     // 🔧 FIXED: remove the problematic aliases entirely
417 deployment.md
@@ -1,417 +0,0 @@
# Infoscreen Deployment Guide

Complete guide for deploying the Infoscreen system on an Ubuntu server with the GitHub Container Registry.

## 📋 Overview

- **Phase 0**: Docker installation (optional)
- **Phase 1**: Build images and push them to the registry
- **Phase 2**: Ubuntu server installation
- **Phase 3**: System configuration and startup

---

## 🐳 Phase 0: Docker installation (optional)

If Docker is not installed yet, pick one of the following options:

### Option A: Ubuntu repository (quick)

```bash
# Standard Ubuntu Docker packages
sudo apt update
sudo apt install docker.io docker-compose-plugin -y
sudo systemctl enable docker
sudo systemctl start docker
```

### Option B: Official Docker installation (recommended)

```bash
# Remove old Docker versions
sudo apt remove docker docker-engine docker.io containerd runc -y

# Install dependencies
sudo apt update
sudo apt install ca-certificates curl gnupg lsb-release -y

# Add the Docker GPG key
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

# Add the Docker repository
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

# Install Docker (latest version)
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin -y

# Enable and start Docker
sudo systemctl enable docker
sudo systemctl start docker

# Add the user to the docker group
sudo usermod -aG docker $USER

# Log out for the group change to take effect
exit
# Log back in via SSH
```

### Test the Docker installation

```bash
# Run a test container
docker run hello-world

# Check the Docker version
docker --version
docker compose version
```

---

## 🏗️ Phase 1: Build and push images (development machine)

### 1. GitHub Container Registry login

```bash
# Create a GitHub personal access token with the write:packages scope
echo $GITHUB_TOKEN | docker login ghcr.io -u robbstarkaustria --password-stdin

# Or interactively:
docker login ghcr.io
# Username: robbstarkaustria
# Password: [GITHUB_TOKEN]
```

### 2. Build and tag images

```bash
cd /workspace

# Build the server image
docker build -f server/Dockerfile -t ghcr.io/robbstarkaustria/infoscreen-api:latest .

# Build the dashboard image
docker build -f dashboard/Dockerfile -t ghcr.io/robbstarkaustria/infoscreen-dashboard:latest .

# Build the listener image (if present)
docker build -f listener/Dockerfile -t ghcr.io/robbstarkaustria/infoscreen-listener:latest .

# Build the scheduler image (if present)
docker build -f scheduler/Dockerfile -t ghcr.io/robbstarkaustria/infoscreen-scheduler:latest .
```

### 3. Push images to the registry

```bash
# Push all images
docker push ghcr.io/robbstarkaustria/infoscreen-api:latest
docker push ghcr.io/robbstarkaustria/infoscreen-dashboard:latest
docker push ghcr.io/robbstarkaustria/infoscreen-listener:latest
docker push ghcr.io/robbstarkaustria/infoscreen-scheduler:latest

# Check the status
docker images | grep ghcr.io
```

---

## 🖥️ Phase 2: Ubuntu server installation

### 4. Prepare the Ubuntu server

```bash
sudo apt update && sudo apt upgrade -y

# Install basic tools
sudo apt install git curl wget -y

# Install Docker (see Phase 0)
```

### 5. Transfer the deployment files

```bash
# Create the deployment folder
mkdir -p ~/infoscreen-deployment
cd ~/infoscreen-deployment

# Copy files from the dev system (via SCP)
scp user@dev-machine:/workspace/docker-compose.prod.yml .
scp user@dev-machine:/workspace/.env .
scp user@dev-machine:/workspace/nginx.conf .
scp -r user@dev-machine:/workspace/certs ./
scp -r user@dev-machine:/workspace/mosquitto ./

# Alternative: use a deployment package
# On the dev machine (/workspace):
# tar -czf infoscreen-deployment.tar.gz docker-compose.prod.yml .env nginx.conf certs/ mosquitto/
# scp infoscreen-deployment.tar.gz user@server:~/
# On the server: tar -xzf infoscreen-deployment.tar.gz
```

### 6. Prepare the Mosquitto configuration

```bash
# If the mosquitto folder is not complete yet:
mkdir -p mosquitto/{config,data,log}

# Create the Mosquitto configuration (if it was not transferred)
cat > mosquitto/config/mosquitto.conf << 'EOF'
# -----------------------------
# Network configuration
# -----------------------------
listener 1883
allow_anonymous true
# password_file /mosquitto/config/passwd

# WebSocket (optional)
listener 9001
protocol websockets

# -----------------------------
# Persistence & paths
# -----------------------------
persistence true
persistence_location /mosquitto/data/

log_dest file /mosquitto/log/mosquitto.log
EOF

# Set permissions for Mosquitto
sudo chown -R 1883:1883 mosquitto/data mosquitto/log
chmod 755 mosquitto/config mosquitto/data mosquitto/log
```

### 7. Adjust the environment variables

```bash
# Adjust .env for the production environment
nano .env

# Important adjustments:
# VITE_API_URL=https://YOUR_SERVER_HOST/api  # For the dashboard build (production)
# DB_HOST=db                                 # Always 'db' inside containers
# DB_CONN=mysql+pymysql://${DB_USER}:${DB_PASSWORD}@db/${DB_NAME}
# Change all passwords for production
```

Notes:
- A `.env.example` template lives in the repo. Copy it as a starting point: `cp .env.example .env`.
- For local development, `server/database.py` loads the `.env` when `ENV=development` is set.
- In production, Compose/the containers manage the variables; no automatic `.env` loading in code is needed.

---

## 🚀 Phase 3: System startup and configuration

### 8. Pull images from the registry

```bash
# GitHub Container Registry login (if the repository is private)
echo $GITHUB_TOKEN | docker login ghcr.io -u robbstarkaustria --password-stdin

# Pull the images
docker compose -f docker-compose.prod.yml pull
```

### 9. Start the system

```bash
# Start the containers
docker compose -f docker-compose.prod.yml up -d

# Check the status
docker compose ps
docker compose logs -f
```

### 10. Configure the firewall

```bash
sudo ufw enable
sudo ufw allow ssh
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw allow 1883/tcp  # MQTT
sudo ufw allow 9001/tcp  # MQTT WebSocket
sudo ufw status
```

### 11. Validate the installation

```bash
# Health checks
curl http://localhost/api/health
curl https://localhost -k  # -k for self-signed certificates

# Container status
docker compose ps

# Show logs when troubleshooting
docker compose logs server
docker compose logs dashboard
docker compose logs mqtt
```

---

## 🧪 Quickstart (development)

Quick start of the development environment with automatic proxies and hot reload.

```bash
# In the repository root
# 1) Create .env from the template (locally, if it does not exist yet)
cp -n .env.example .env

# 2) Start the dev stack (uses docker-compose.yml + docker-compose.override.yml)
docker compose up -d --build

# 3) Status & logs
docker compose ps
docker compose logs -f server
docker compose logs -f dashboard
docker compose logs -f mqtt

# 4) Stop the stack
docker compose down
```

Reachable endpoints (dev):
- Dashboard (Vite): http://localhost:5173
- API (Flask dev): http://localhost:8000/api
- API health: http://localhost:8000/health
- Screenshots: http://localhost:8000/screenshots/<uuid>.jpg
- MQTT: localhost:1883 (WebSocket: localhost:9001)

Notes:
- `ENV=development` loads `.env` automatically in `server/database.py`.
- The Vite proxy routes `/api` and `/screenshots` directly to the API in dev (see `dashboard/vite.config.ts`).

### 12. Automatic startup (optional)

```bash
# Create a systemd service
sudo tee /etc/systemd/system/infoscreen.service > /dev/null << 'EOF'
[Unit]
Description=Infoscreen Application
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/home/$USER/infoscreen-deployment
ExecStart=/usr/bin/docker compose -f docker-compose.prod.yml up -d
ExecStop=/usr/bin/docker compose -f docker-compose.prod.yml down
TimeoutStartSec=300

[Install]
WantedBy=multi-user.target
EOF

# Enable the service
sudo systemctl enable infoscreen.service
sudo systemctl start infoscreen.service
```

---

## 🌐 Accessing the application

After a successful deployment the application is reachable at the following URLs:

- **HTTPS dashboard**: `https://YOUR_SERVER_IP`
- **HTTP dashboard**: `http://YOUR_SERVER_IP` (redirects to HTTPS)
- **API**: `http://YOUR_SERVER_IP/api/`
- **MQTT**: `YOUR_SERVER_IP:1883`
- **MQTT WebSocket**: `YOUR_SERVER_IP:9001`

---

## 🔧 Troubleshooting

### Check the container status

```bash
# Show all containers
docker compose ps

# Show logs for a specific service
docker compose logs -f [service-name]

# Restart a single container
docker compose restart [service-name]
```

### Restart the system

```bash
# Full restart
docker compose down
docker compose up -d

# Re-pull the images
docker compose pull
docker compose up -d
```

### Common problems

| Problem | Solution |
|---------|----------|
| Container does not start | Check `docker compose logs [service]` |
| Ports already in use | Check `sudo netstat -tulpn \| grep :80` |
| No permission | Add the user to the docker group |
| DB connection fails | Check the environment variables in `.env` |
| Mosquitto does not start | Set folder permissions to `1883:1883` |

---

## 📊 Docker version comparison

| Aspect | Ubuntu repository | Official installation |
|--------|-------------------|-----------------------|
| **Installation** | ✅ Quick (1 command) | ⚠️ Several steps |
| **Version** | ⚠️ Often older | ✅ Latest version |
| **Updates** | ✅ Via apt | ✅ Via apt (after setup) |
| **Stability** | ✅ Well tested | ✅ Up to date |
| **Features** | ⚠️ Possibly limited | ✅ All features |

**Recommendation:** Use the official Docker installation for production.

---

## 📝 Maintenance

### Regular updates

```bash
# Update the images
docker compose pull
docker compose up -d

# System updates
sudo apt update && sudo apt upgrade -y
```

### Backup

```bash
# Back up container data
docker compose down
sudo tar -czf infoscreen-backup-$(date +%Y%m%d).tar.gz mosquitto/data/ certs/

# Restore a backup
sudo tar -xzf infoscreen-backup-YYYYMMDD.tar.gz
docker compose up -d
```

---

**The Infoscreen system is now fully … via GitHub
@@ -73,7 +73,11 @@ services:
     networks:
       - infoscreen-net
     healthcheck:
-      test: ["CMD-SHELL", "mosquitto_pub -h localhost -t test -m 'health' || exit 1"]
+      test:
+        [
+          "CMD-SHELL",
+          "mosquitto_pub -h localhost -t test -m 'health' || exit 1",
+        ]
       interval: 30s
       timeout: 5s
       retries: 3
@@ -98,10 +102,14 @@ services:
       MQTT_BROKER_URL: ${MQTT_BROKER_URL}
       MQTT_USER: ${MQTT_USER}
       MQTT_PASSWORD: ${MQTT_PASSWORD}
+      REDIS_URL: "${REDIS_URL:-redis://redis:6379/0}"
+      GOTENBERG_URL: "${GOTENBERG_URL:-http://gotenberg:3000}"
     ports:
       - "8000:8000"
     networks:
       - infoscreen-net
+    volumes:
+      - media-data:/app/server/media
     healthcheck:
       test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
       interval: 30s
@@ -140,6 +148,7 @@ services:

   scheduler:
     build:
+      context: .
       dockerfile: scheduler/Dockerfile
     image: infoscreen-scheduler:latest
     container_name: infoscreen-scheduler
@@ -157,6 +166,41 @@ services:
     networks:
       - infoscreen-net

+  redis:
+    image: redis:7-alpine
+    container_name: infoscreen-redis
+    restart: unless-stopped
+    networks:
+      - infoscreen-net
+
+  gotenberg:
+    image: gotenberg/gotenberg:8
+    container_name: infoscreen-gotenberg
+    restart: unless-stopped
+    networks:
+      - infoscreen-net
+
+  worker:
+    build:
+      context: .
+      dockerfile: server/Dockerfile
+    image: infoscreen-worker:latest
+    container_name: infoscreen-worker
+    restart: unless-stopped
+    depends_on:
+      - redis
+      - gotenberg
+      - db
+    environment:
+      DB_CONN: "mysql+pymysql://${DB_USER}:${DB_PASSWORD}@db/${DB_NAME}"
+      REDIS_URL: "${REDIS_URL:-redis://redis:6379/0}"
+      GOTENBERG_URL: "${GOTENBERG_URL:-http://gotenberg:3000}"
+      PYTHONPATH: /app
+    command: ["rq", "worker", "conversions"]
+    networks:
+      - infoscreen-net
+
 volumes:
   server-pip-cache:
   db-data:
+  media-data:
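The worker service above posts source files to Gotenberg's LibreOffice route. A small sketch of how the request URL and output name might be derived; the helper name and the output-naming scheme are assumptions, not the actual `server/worker.py` code:

```python
from pathlib import PurePosixPath


def gotenberg_convert_request(gotenberg_url: str, source_name: str):
    """Return (endpoint URL, output file name) for one conversion job.

    The actual POST would send the source as multipart form data, e.g.
    requests.post(url, files={"files": (source_name, fh)}).
    """
    url = gotenberg_url.rstrip("/") + "/forms/libreoffice/convert"
    output_name = PurePosixPath(source_name).stem + ".pdf"
    return url, output_name
```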
@@ -227,3 +227,45 @@ class SchoolHoliday(Base):
             "source_file_name": self.source_file_name,
             "imported_at": self.imported_at.isoformat() if self.imported_at else None,
         }
+
+
+# --- Conversions: Track PPT/PPTX/ODP -> PDF processing state ---
+
+
+class ConversionStatus(enum.Enum):
+    pending = "pending"
+    processing = "processing"
+    ready = "ready"
+    failed = "failed"
+
+
+class Conversion(Base):
+    __tablename__ = 'conversions'
+
+    id = Column(Integer, primary_key=True, autoincrement=True)
+    # Source media to be converted
+    source_event_media_id = Column(
+        Integer,
+        ForeignKey('event_media.id', ondelete='CASCADE'),
+        nullable=False,
+        index=True,
+    )
+    target_format = Column(String(10), nullable=False, index=True)  # e.g. 'pdf'
+    # relative to server/media
+    target_path = Column(String(512), nullable=True)
+    status = Column(Enum(ConversionStatus), nullable=False,
+                    default=ConversionStatus.pending)
+    file_hash = Column(String(64), nullable=False)  # sha256 of source file
+    started_at = Column(TIMESTAMP(timezone=True), nullable=True)
+    completed_at = Column(TIMESTAMP(timezone=True), nullable=True)
+    error_message = Column(Text, nullable=True)
+
+    __table_args__ = (
+        # Fast lookup per media/format
+        Index('ix_conv_source_target', 'source_event_media_id', 'target_format'),
+        # Operational filtering
+        Index('ix_conv_status_target', 'status', 'target_format'),
+        # Idempotency: same source + target + file content should be unique
+        UniqueConstraint('source_event_media_id', 'target_format',
+                         'file_hash', name='uq_conv_source_target_hash'),
+    )
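The `ConversionStatus` lifecycle implied by the worker (pending → processing → ready/failed) can be made explicit. The transition table below is a sketch; in particular, the failed → pending retry edge is an assumption and not something the model itself enforces:

```python
import enum


class ConversionStatus(enum.Enum):
    pending = "pending"
    processing = "processing"
    ready = "ready"
    failed = "failed"


# The worker picks up a job (pending -> processing), then records the outcome;
# re-enqueueing a failed job (failed -> pending) is an assumed retry path.
_ALLOWED = {
    ConversionStatus.pending: {ConversionStatus.processing},
    ConversionStatus.processing: {ConversionStatus.ready, ConversionStatus.failed},
    ConversionStatus.ready: set(),
    ConversionStatus.failed: {ConversionStatus.pending},
}


def can_transition(old: ConversionStatus, new: ConversionStatus) -> bool:
    """True if moving from `old` to `new` is a legal lifecycle step."""
    return new in _ALLOWED[old]
```

A guard like this keeps a crashed-and-restarted worker from overwriting a `ready` row with a stale `processing` update.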
477 pptx_conversion_guide.md (new file)
@@ -0,0 +1,477 @@
# Recommended Implementation: PPTX-to-PDF Conversion System

## Architecture Overview

**Asynchronous server-side conversion with database tracking**

```
User Upload → API saves PPTX + DB entry → Job in Queue
                         ↓
Client requests → API checks DB status → PDF ready? → Download PDF
                                       → Pending?   → "Please wait"
                                       → Failed?    → Retry/Error
```

## 1. Database Schema

```sql
CREATE TABLE media_files (
    id UUID PRIMARY KEY,
    filename VARCHAR(255),
    original_path VARCHAR(512),
    file_type VARCHAR(10),
    mime_type VARCHAR(100),
    uploaded_at TIMESTAMP,
    updated_at TIMESTAMP
);

CREATE TABLE conversions (
    id UUID PRIMARY KEY,
    source_file_id UUID REFERENCES media_files(id) ON DELETE CASCADE,
    target_format VARCHAR(10),  -- 'pdf'
    target_path VARCHAR(512),   -- Path to generated PDF
    status VARCHAR(20),         -- 'pending', 'processing', 'ready', 'failed'
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    error_message TEXT,
    file_hash VARCHAR(64)       -- Hash of PPTX for cache invalidation
);

CREATE INDEX idx_conversions_source ON conversions(source_file_id, target_format);
```
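The `file_hash` column exists for cache invalidation. A minimal sketch of the decision it enables; the function name and the retry-on-failure choice are assumptions, not part of the schema:

```python
def needs_reconversion(stored_hash, current_hash, status):
    """Decide whether a source file must be (re)converted."""
    if stored_hash is None:           # never converted before
        return True
    if stored_hash != current_hash:   # source PPTX was replaced
        return True
    return status == "failed"         # retry failures; keep 'ready' results
```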
## 2. Components

### **API Server (existing)**

- Accepts uploads
- Creates DB entries
- Enqueues jobs
- Delivers status and files

### **Background Worker (new)**

- Runs as a separate process in the **same container** as the API
- Processes conversion jobs from the queue
- Can run multiple worker instances in parallel
- Technology: Python RQ, Celery, or similar

### **Message Queue**

- Redis (recommended to start with: simple, fast)
- Alternative: RabbitMQ for more features

### **Redis Container (new)**

- Separate container for Redis
- Handles the job queue
- Minimal resource footprint
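The producer/consumer split described above can be illustrated with the standard library alone. The real system uses RQ with Redis across containers; this in-process thread is only a sketch of the pattern:

```python
import queue
import threading

jobs = queue.Queue()
results = {}


def worker():
    while True:
        conversion_id = jobs.get()
        if conversion_id is None:   # sentinel: shut the worker down
            break
        # A real worker would shell out to LibreOffice / call Gotenberg here.
        results[conversion_id] = "ready"
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()
for cid in (1, 2, 3):               # "API" side: enqueue and return immediately
    jobs.put(cid)
jobs.join()                         # only the demo waits; the API would not
jobs.put(None)
thread.join()
```

The API-side `put` returns at once, which is exactly why the upload endpoint below can respond before the conversion finishes.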
## 3. Detailed Workflow

### **Upload Process:**

```python
@app.post("/upload")
async def upload_file(file):
    # 1. Save PPTX
    file_path = save_to_disk(file)

    # 2. DB entry for original file
    file_record = db.create_media_file({
        'filename': file.filename,
        'original_path': file_path,
        'file_type': 'pptx'
    })

    # 3. Create conversion record
    conversion = db.create_conversion({
        'source_file_id': file_record.id,
        'target_format': 'pdf',
        'status': 'pending',
        'file_hash': calculate_hash(file_path)
    })

    # 4. Enqueue job (asynchronous!)
    queue.enqueue(convert_to_pdf, conversion.id)

    # 5. Return immediately to the user
    return {
        'file_id': file_record.id,
        'status': 'uploaded',
        'conversion_status': 'pending'
    }
```

### **Worker Process:**

```python
def convert_to_pdf(conversion_id):
    conversion = db.get_conversion(conversion_id)
    source_file = db.get_media_file(conversion.source_file_id)

    # Status update: processing
    db.update_conversion(conversion_id, {
        'status': 'processing',
        'started_at': now()
    })

    try:
        # LibreOffice conversion
        pdf_path = f"/data/converted/{conversion.id}.pdf"
        subprocess.run([
            'libreoffice',
            '--headless',
            '--convert-to', 'pdf',
            '--outdir', '/data/converted/',
            source_file.original_path
        ], check=True)

        # LibreOffice names the output after the source file's stem, so move
        # it to the stable name keyed by the conversion id
        produced = '/data/converted/' + Path(source_file.original_path).stem + '.pdf'
        os.rename(produced, pdf_path)

        # Success
        db.update_conversion(conversion_id, {
            'status': 'ready',
            'target_path': pdf_path,
            'completed_at': now()
        })

    except Exception as e:
        # Error
        db.update_conversion(conversion_id, {
            'status': 'failed',
            'error_message': str(e),
            'completed_at': now()
        })
```

### **Client Download:**

```python
|
||||||
|
@app.get("/files/{file_id}/display")
|
||||||
|
async def get_display_file(file_id):
|
||||||
|
file = db.get_media_file(file_id)
|
||||||
|
|
||||||
|
# Only for PPTX: check PDF conversion
|
||||||
|
if file.file_type == 'pptx':
|
||||||
|
conversion = db.get_latest_conversion(file.id, target_format='pdf')
|
||||||
|
|
||||||
|
if not conversion:
|
||||||
|
# Shouldn't happen, but just to be safe
|
||||||
|
trigger_new_conversion(file.id)
|
||||||
|
return {'status': 'pending', 'message': 'Conversion is being created'}
|
||||||
|
|
||||||
|
if conversion.status == 'ready':
|
||||||
|
return FileResponse(conversion.target_path)
|
||||||
|
|
||||||
|
elif conversion.status == 'failed':
|
||||||
|
# Optional: Auto-retry
|
||||||
|
trigger_new_conversion(file.id)
|
||||||
|
return {'status': 'failed', 'error': conversion.error_message}
|
||||||
|
|
||||||
|
else: # pending or processing
|
||||||
|
return {'status': conversion.status, 'message': 'Please wait...'}
|
||||||
|
|
||||||
|
# Serve other file types directly
|
||||||
|
return FileResponse(file.original_path)
|
||||||
|
```
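
`trigger_new_conversion` is referenced above but never shown. A minimal sketch under the same assumptions as the rest of this guide (the `db` and `queue` helpers are the illustrative ones used throughout; they are passed in explicitly here only to keep the sketch self-contained):

```python
def trigger_new_conversion(file_id, db, queue, calculate_hash):
    """Create a fresh 'pending' conversion row and enqueue the job."""
    file = db.get_media_file(file_id)
    conversion = db.create_conversion({
        'source_file_id': file.id,
        'target_format': 'pdf',
        'status': 'pending',
        'file_hash': calculate_hash(file.original_path),
    })
    # Enqueue by dotted path so the worker can import the function itself
    queue.enqueue('worker.convert_to_pdf', conversion.id)
    return conversion
```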

## 4. Docker Setup

```yaml
version: '3.8'

services:
  # Your API server
  api:
    build: ./api
    command: uvicorn main:app --host 0.0.0.0 --port 8000
    ports:
      - "8000:8000"
    volumes:
      - ./data/uploads:/data/uploads
      - ./data/converted:/data/converted
    environment:
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/infoscreen
    depends_on:
      - redis
      - postgres
    restart: unless-stopped

  # Worker (same codebase as the API, different command)
  worker:
    build: ./api  # Same build as the API!
    command: python worker.py  # or: rq worker
    volumes:
      - ./data/uploads:/data/uploads
      - ./data/converted:/data/converted
    environment:
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/infoscreen
    depends_on:
      - redis
      - postgres
    restart: unless-stopped
    # Optional: multiple workers
    deploy:
      replicas: 2

  # Redis - separate container
  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
    # Optional: persistent configuration
    command: redis-server --appendonly yes
    restart: unless-stopped

  # Your existing Postgres
  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=infoscreen
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres-data:/var/lib/postgresql/data
    restart: unless-stopped

  # Optional: Redis Commander (UI for debugging)
  redis-commander:
    image: rediscommander/redis-commander
    environment:
      - REDIS_HOSTS=local:redis:6379
    ports:
      - "8081:8081"
    depends_on:
      - redis

volumes:
  redis-data:
  postgres-data:
```

## 5. Container Communication

Containers communicate via **Docker's internal network**:

```python
# In your API/worker code:
import redis

# Connection to Redis
redis_client = redis.from_url('redis://redis:6379')
#                                      ^^^^^
#                 Container name = hostname in the Docker network
```

Docker automatically creates DNS entries, so `redis` resolves to the Redis container.

## 6. Client Behavior (Pi5)

```python
# On the Pi5 client
def display_file(file_id):
    response = api.get(f"/files/{file_id}/display")

    if response.content_type == 'application/pdf':
        # PDF is ready
        download_and_display(response)
        subprocess.run(['impressive', downloaded_pdf])

    elif response.json()['status'] in ['pending', 'processing']:
        # Wait and retry
        show_loading_screen("Presentation is being prepared...")
        time.sleep(5)
        display_file(file_id)  # Retry

    else:
        # Error
        show_error_screen("Error loading presentation")
```

## 7. Additional Features

### **Cache Invalidation on PPTX Update:**

```python
@app.put("/files/{file_id}")
async def update_file(file_id, new_file):
    # Delete old conversions
    db.mark_conversions_as_obsolete(file_id)

    # Update the file
    update_media_file(file_id, new_file)

    # Trigger a new conversion
    trigger_conversion(file_id, 'pdf')
```

### **Status API for Monitoring:**

```python
@app.get("/admin/conversions/status")
async def get_conversion_stats():
    return {
        'pending': db.count(status='pending'),
        'processing': db.count(status='processing'),
        'failed': db.count(status='failed'),
        'avg_duration_seconds': db.avg_duration()
    }
```

### **Cleanup Job (Cronjob):**

```python
def cleanup_old_conversions():
    # Remove PDFs of deleted files
    db.delete_orphaned_conversions()

    # Clean up old failed conversions
    db.delete_old_failed_conversions(older_than_days=7)
```

## 8. Redis Container Details

### **Why Separate Container?**

✅ **Separation of Concerns**: Each service has its own responsibility
✅ **Independent Lifecycle Management**: Redis can be restarted/updated independently
✅ **Better Scaling**: Redis can be moved to different hardware
✅ **Easier Backup**: Redis data can be backed up separately
✅ **Standard Docker Pattern**: Microservices architecture

### **Resource Usage:**
- RAM: ~10-50 MB for your use case
- CPU: Minimal
- Disk: Only for persistence (optional)

For 10 clients with occasional PPTX uploads, this is absolutely no problem.

## 9. Advantages of This Solution

✅ **Scalable**: Workers can be scaled horizontally
✅ **Performant**: Clients don't wait for conversion
✅ **Robust**: Status tracking and error handling
✅ **Maintainable**: Clear separation of responsibilities
✅ **Transparent**: Status queryable at any time
✅ **Efficient**: One-time conversion per file
✅ **Future-proof**: Easily extensible to other formats
✅ **Professional**: Industry-standard architecture

## 10. Migration Path

### **Phase 1 (MVP):**
- 1 worker process in the API container
- Redis for the queue (separate container)
- Basic DB schema
- Simple retry logic
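
The "simple retry logic" can start as a plain wrapper around the conversion call before graduating to the queue library's built-in retries (e.g. RQ's `Retry`). A minimal sketch; the function name and defaults are assumptions:

```python
import time

def with_retries(fn, attempts=3, delay_seconds=5):
    """Call fn(); on failure, wait and try again, up to `attempts` times."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise last_error
```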

### **Phase 2 (as needed):**
- Multiple worker instances
- Dedicated conversion service container
- Monitoring & alerting
- Prioritization logic
- Advanced caching strategies

**Start simple, scale when needed!**

## 11. Key Decisions Summary

| Aspect | Decision | Reason |
|--------|----------|--------|
| **Conversion Location** | Server-side | One conversion per file, consistent results |
| **Conversion Timing** | Asynchronous (on upload) | No client waiting time, predictable performance |
| **Data Storage** | Database-tracked | Status visibility, robust error handling |
| **Queue System** | Redis (separate container) | Standard pattern, scalable, maintainable |
| **Worker Architecture** | Background process in API container | Simple start, easy to separate later |

## 12. File Flow Diagram

```
┌─────────────┐
│ User Upload │
│   (PPTX)    │
└──────┬──────┘
       │
       ▼
┌──────────────────┐
│ API Server       │
│ 1. Save PPTX     │
│ 2. Create DB rec │
│ 3. Enqueue job   │
└──────┬───────────┘
       │
       ▼
┌──────────────────┐
│ Redis Queue      │◄─────┐
└──────┬───────────┘      │
       │                  │
       ▼                  │
┌──────────────────┐      │
│ Worker Process   │      │
│ 1. Get job       │      │
│ 2. Convert PPTX  │      │
│ 3. Update DB     │      │
└──────┬───────────┘      │
       │                  │
       ▼                  │
┌──────────────────┐      │
│ PDF Storage      │      │
└──────┬───────────┘      │
       │                  │
       ▼                  │
┌──────────────────┐      │
│ Client Requests  │      │
│ 1. Check DB      │      │
│ 2. Download PDF  │      │
│ 3. Display       │──────┘
└──────────────────┘
  (via impressive)
```

## 13. Implementation Checklist

### Database Setup
- [ ] Create `media_files` table
- [ ] Create `conversions` table
- [ ] Add indexes for performance
- [ ] Set up foreign key constraints

### API Changes
- [ ] Modify upload endpoint to create DB records
- [ ] Add conversion job enqueueing
- [ ] Implement file download endpoint with status checking
- [ ] Add status API for monitoring
- [ ] Implement cache invalidation on file update

### Worker Setup
- [ ] Create worker script/module
- [ ] Implement LibreOffice conversion logic
- [ ] Add error handling and retry logic
- [ ] Set up logging and monitoring

### Docker Configuration
- [ ] Add Redis container to docker-compose.yml
- [ ] Configure worker container
- [ ] Set up volume mounts for file storage
- [ ] Configure environment variables
- [ ] Set up container dependencies

### Client Updates
- [ ] Modify client to check conversion status
- [ ] Implement retry logic for pending conversions
- [ ] Add loading/waiting screens
- [ ] Implement error handling

### Testing
- [ ] Test upload → conversion → download flow
- [ ] Test multiple concurrent conversions
- [ ] Test error handling (corrupted PPTX, etc.)
- [ ] Test cache invalidation on file update
- [ ] Load test with multiple clients

### Monitoring & Operations
- [ ] Set up logging for conversions
- [ ] Implement cleanup job for old files
- [ ] Add metrics for conversion times
- [ ] Set up alerts for failed conversions
- [ ] Document backup procedures

---

**This architecture provides a solid foundation that's simple to start with but scales professionally as your needs grow!**

---

**File: `pptx_conversion_guide_gotenberg.md`** (new file, 815 lines)

# Recommended Implementation: PPTX-to-PDF Conversion System with Gotenberg

## Architecture Overview

**Asynchronous server-side conversion using Gotenberg with shared storage**

```
User Upload → API saves PPTX → Job in Queue → Worker calls Gotenberg API
                                                        ↓
                                      Gotenberg converts via shared volume
                                                        ↓
Client requests → API checks DB status → PDF ready? → Download PDF from shared storage
                                       → Pending?   → "Please wait"
                                       → Failed?    → Retry/Error
```

## 1. Database Schema

```sql
CREATE TABLE media_files (
    id UUID PRIMARY KEY,
    filename VARCHAR(255),
    original_path VARCHAR(512),
    file_type VARCHAR(10),
    mime_type VARCHAR(100),
    uploaded_at TIMESTAMP,
    updated_at TIMESTAMP
);

CREATE TABLE conversions (
    id UUID PRIMARY KEY,
    source_file_id UUID REFERENCES media_files(id) ON DELETE CASCADE,
    target_format VARCHAR(10),   -- 'pdf'
    target_path VARCHAR(512),    -- Path to generated PDF
    status VARCHAR(20),          -- 'pending', 'processing', 'ready', 'failed'
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    error_message TEXT,
    file_hash VARCHAR(64)        -- Hash of the PPTX for cache invalidation
);

CREATE INDEX idx_conversions_source ON conversions(source_file_id, target_format);
```

## 2. Components

### **API Server (existing)**
- Accepts uploads
- Creates DB entries
- Enqueues jobs
- Delivers status and files

### **Background Worker (new)**
- Runs as a separate process in the **same container** as the API
- Processes conversion jobs from the queue
- Calls the Gotenberg API for conversion
- Updates the database with results
- Technology: Python RQ, Celery, or similar

### **Gotenberg Container (new)**
- Dedicated conversion service
- HTTP API for document conversion
- Handles LibreOffice conversions internally
- Accesses files via shared volume

### **Message Queue**
- Redis (recommended for the start: simple, fast)
- Alternative: RabbitMQ for more features

### **Redis Container (separate)**
- Handles the job queue
- Minimal resource footprint

### **Shared Storage**
- Docker volume mounted into all containers that need file access
- API, Worker, and Gotenberg all access the same files
- Simplifies file exchange between services

## 3. Detailed Workflow

### **Upload Process:**

```python
@app.post("/upload")
async def upload_file(file):
    # 1. Save PPTX to the shared volume
    file_path = save_to_disk(file)  # e.g., /shared/uploads/abc123.pptx

    # 2. DB entry for the original file
    file_record = db.create_media_file({
        'filename': file.filename,
        'original_path': file_path,
        'file_type': 'pptx'
    })

    # 3. Create conversion record
    conversion = db.create_conversion({
        'source_file_id': file_record.id,
        'target_format': 'pdf',
        'status': 'pending',
        'file_hash': calculate_hash(file_path)
    })

    # 4. Enqueue the job (asynchronous!)
    queue.enqueue(convert_to_pdf_via_gotenberg, conversion.id)

    # 5. Return immediately to the user
    return {
        'file_id': file_record.id,
        'status': 'uploaded',
        'conversion_status': 'pending'
    }
```

### **Worker Process (calls Gotenberg):**

```python
import os

import requests

GOTENBERG_URL = os.getenv('GOTENBERG_URL', 'http://gotenberg:3000')

def convert_to_pdf_via_gotenberg(conversion_id):
    conversion = db.get_conversion(conversion_id)
    source_file = db.get_media_file(conversion.source_file_id)

    # Status update: processing
    db.update_conversion(conversion_id, {
        'status': 'processing',
        'started_at': now()
    })

    try:
        # Prepare the output path
        pdf_filename = f"{conversion.id}.pdf"
        pdf_path = f"/shared/converted/{pdf_filename}"

        # Call the Gotenberg API; the PPTX is uploaded in the request body
        with open(source_file.original_path, 'rb') as f:
            files = {
                'files': (os.path.basename(source_file.original_path), f)
            }

            response = requests.post(
                f'{GOTENBERG_URL}/forms/libreoffice/convert',
                files=files,
                timeout=300  # 5-minute timeout
            )
        response.raise_for_status()

        # Save the PDF to the shared volume
        with open(pdf_path, 'wb') as pdf_file:
            pdf_file.write(response.content)

        # Success
        db.update_conversion(conversion_id, {
            'status': 'ready',
            'target_path': pdf_path,
            'completed_at': now()
        })

    except requests.exceptions.Timeout:
        db.update_conversion(conversion_id, {
            'status': 'failed',
            'error_message': 'Conversion timeout after 5 minutes',
            'completed_at': now()
        })
    except requests.exceptions.RequestException as e:
        db.update_conversion(conversion_id, {
            'status': 'failed',
            'error_message': f'Gotenberg API error: {str(e)}',
            'completed_at': now()
        })
    except Exception as e:
        db.update_conversion(conversion_id, {
            'status': 'failed',
            'error_message': str(e),
            'completed_at': now()
        })
```

### **Alternative: Direct File Access via Shared Volume**

Note that Gotenberg's API does not read arbitrary paths from disk, so "direct file access" is limited to the output side: the input PPTX always travels to Gotenberg over HTTP, but the worker writes the resulting PDF onto the shared volume, where the API (and thus the clients) can serve it without any further copying:

```python
def convert_to_pdf_via_gotenberg_shared(conversion_id):
    conversion = db.get_conversion(conversion_id)
    source_file = db.get_media_file(conversion.source_file_id)

    db.update_conversion(conversion_id, {
        'status': 'processing',
        'started_at': now()
    })

    try:
        pdf_filename = f"{conversion.id}.pdf"
        pdf_path = f"/shared/converted/{pdf_filename}"

        # The PPTX is still sent to Gotenberg over HTTP...
        with open(source_file.original_path, 'rb') as f:
            files = {'files': f}

            response = requests.post(
                f'{GOTENBERG_URL}/forms/libreoffice/convert',
                files=files,
                timeout=300
            )
        response.raise_for_status()

        # ...but the result lands on the shared volume for all services to read
        with open(pdf_path, 'wb') as pdf_file:
            pdf_file.write(response.content)

        db.update_conversion(conversion_id, {
            'status': 'ready',
            'target_path': pdf_path,
            'completed_at': now()
        })

    except Exception as e:
        db.update_conversion(conversion_id, {
            'status': 'failed',
            'error_message': str(e),
            'completed_at': now()
        })
```

### **Client Download:**

```python
@app.get("/files/{file_id}/display")
async def get_display_file(file_id):
    file = db.get_media_file(file_id)

    # Only for PPTX: check the PDF conversion
    if file.file_type == 'pptx':
        conversion = db.get_latest_conversion(file.id, target_format='pdf')

        if not conversion:
            # Shouldn't happen, but just to be safe
            trigger_new_conversion(file.id)
            return {'status': 'pending', 'message': 'Conversion is being created'}

        if conversion.status == 'ready':
            # Serve the PDF from shared storage
            return FileResponse(conversion.target_path)

        elif conversion.status == 'failed':
            # Optional: auto-retry
            trigger_new_conversion(file.id)
            return {'status': 'failed', 'error': conversion.error_message}

        else:  # pending or processing
            return {'status': conversion.status, 'message': 'Please wait...'}

    # Serve other file types directly
    return FileResponse(file.original_path)
```

## 4. Docker Setup

```yaml
version: '3.8'

services:
  # Your API server
  api:
    build: ./api
    command: uvicorn main:app --host 0.0.0.0 --port 8000
    ports:
      - "8000:8000"
    volumes:
      - shared-storage:/shared  # Shared volume
    environment:
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/infoscreen
      - GOTENBERG_URL=http://gotenberg:3000
    depends_on:
      - redis
      - postgres
      - gotenberg
    restart: unless-stopped

  # Worker (same codebase as the API, different command)
  worker:
    build: ./api  # Same build as the API!
    command: python worker.py  # or: rq worker
    volumes:
      - shared-storage:/shared  # Shared volume
    environment:
      - REDIS_URL=redis://redis:6379
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/infoscreen
      - GOTENBERG_URL=http://gotenberg:3000
    depends_on:
      - redis
      - postgres
      - gotenberg
    restart: unless-stopped
    # Optional: multiple workers
    deploy:
      replicas: 2

  # Gotenberg - document conversion service
  gotenberg:
    image: gotenberg/gotenberg:8
    # Gotenberg doesn't need the shared volume, since files are sent via HTTP,
    # but mount it if the worker should share output paths with it
    volumes:
      - shared-storage:/shared
    # Gotenberg is configured via CLI flags rather than environment variables
    command: gotenberg --api-timeout=300s --log-level=info
    restart: unless-stopped
    # Resource limits (optional but recommended)
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 512M

  # Redis - separate container
  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes
    restart: unless-stopped

  # Your existing Postgres
  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=infoscreen
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres-data:/var/lib/postgresql/data
    restart: unless-stopped

  # Optional: Redis Commander (UI for debugging)
  redis-commander:
    image: rediscommander/redis-commander
    environment:
      - REDIS_HOSTS=local:redis:6379
    ports:
      - "8081:8081"
    depends_on:
      - redis

volumes:
  shared-storage:  # New: shared storage for all file operations
  redis-data:
  postgres-data:
```

## 5. Storage Structure

```
/shared/
├── uploads/        # Original uploaded files (PPTX, etc.)
│   ├── abc123.pptx
│   ├── def456.pptx
│   └── ...
└── converted/      # Converted PDF files
    ├── uuid-1.pdf
    ├── uuid-2.pdf
    └── ...
```

## 6. Gotenberg Integration Details

### **Gotenberg API Endpoints:**

Gotenberg provides various conversion endpoints:

```
# LibreOffice conversion (for PPTX, DOCX, ODT, etc.)
POST http://gotenberg:3000/forms/libreoffice/convert

# HTML to PDF
POST http://gotenberg:3000/forms/chromium/convert/html

# Markdown to PDF
POST http://gotenberg:3000/forms/chromium/convert/markdown

# Merge PDFs
POST http://gotenberg:3000/forms/pdfengines/merge
```

### **Example Conversion Request:**

```python
import os

import requests

def convert_with_gotenberg(input_file_path, output_file_path):
    """Convert a document using Gotenberg."""
    with open(input_file_path, 'rb') as f:
        files = {
            'files': (os.path.basename(input_file_path), f,
                      'application/vnd.openxmlformats-officedocument.presentationml.presentation')
        }

        # Optional: add conversion parameters
        data = {
            'landscape': 'false',      # Portrait mode
            'nativePageRanges': '1-',  # All pages
        }

        response = requests.post(
            'http://gotenberg:3000/forms/libreoffice/convert',
            files=files,
            data=data,
            timeout=300
        )

    if response.status_code == 200:
        with open(output_file_path, 'wb') as out:
            out.write(response.content)
        return True
    else:
        raise Exception(f"Gotenberg error: {response.status_code} - {response.text}")
```

### **Advanced Options:**

```python
# With custom PDF properties
data = {
    'landscape': 'false',
    'nativePageRanges': '1-10',  # Only the first 10 pages
    'pdfFormat': 'PDF/A-1a',     # PDF/A format
    'exportFormFields': 'false',
}

# With password protection
data = {
    'userPassword': 'secret123',
    'ownerPassword': 'admin456',
}
```
|
||||||
|
|
||||||
|
## 7. Client Behavior (Pi5)
|
||||||
|
|
||||||
|
```python
|
||||||
|
# On the Pi5 client
|
||||||
|
def display_file(file_id):
|
||||||
|
response = api.get(f"/files/{file_id}/display")
|
||||||
|
|
||||||
|
if response.content_type == 'application/pdf':
|
||||||
|
# PDF is ready
|
||||||
|
download_and_display(response)
|
||||||
|
subprocess.run(['impressive', downloaded_pdf])
|
||||||
|
|
||||||
|
elif response.json()['status'] in ['pending', 'processing']:
|
||||||
|
# Wait and retry
|
||||||
|
show_loading_screen("Presentation is being prepared...")
|
||||||
|
time.sleep(5)
|
||||||
|
display_file(file_id) # Retry
|
||||||
|
|
||||||
|
else:
|
||||||
|
# Error
|
||||||
|
show_error_screen("Error loading presentation")
|
||||||
|
```
|
||||||
|
|
||||||
|
## 8. Additional Features
|
||||||
|
|
||||||
|
### **Cache Invalidation on PPTX Update:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
@app.put("/files/{file_id}")
|
||||||
|
async def update_file(file_id, new_file):
|
||||||
|
# Delete old conversions and PDFs
|
||||||
|
conversions = db.get_conversions_for_file(file_id)
|
||||||
|
for conv in conversions:
|
||||||
|
if conv.target_path and os.path.exists(conv.target_path):
|
||||||
|
os.remove(conv.target_path)
|
||||||
|
|
||||||
|
db.mark_conversions_as_obsolete(file_id)
|
||||||
|
|
||||||
|
# Update file
|
||||||
|
update_media_file(file_id, new_file)
|
||||||
|
|
||||||
|
# Trigger new conversion
|
||||||
|
trigger_conversion(file_id, 'pdf')
|
||||||
|
```
|
||||||
|
|
||||||
|
### **Status API for Monitoring:**
|
||||||
|
|
||||||
|
```python
|
||||||
|
@app.get("/admin/conversions/status")
|
||||||
|
async def get_conversion_stats():
|
||||||
|
return {
|
||||||
|
'pending': db.count(status='pending'),
|
||||||
|
'processing': db.count(status='processing'),
|
||||||
|
'failed': db.count(status='failed'),
|
||||||
|
'avg_duration_seconds': db.avg_duration(),
|
||||||
|
'gotenberg_health': check_gotenberg_health()
|
||||||
|
}
|
||||||
|
|
||||||
|
def check_gotenberg_health():
|
||||||
|
try:
|
||||||
|
response = requests.get(
|
||||||
|
f'{GOTENBERG_URL}/health',
|
||||||
|
timeout=5
|
||||||
|
)
|
||||||
|
return response.status_code == 200
|
||||||
|
except:
|
||||||
|
return False
|
||||||
|
```

### **Cleanup Job (Cronjob):**

```python
def cleanup_old_conversions():
    # Remove PDFs from deleted files
    orphaned = db.get_orphaned_conversions()
    for conv in orphaned:
        if conv.target_path and os.path.exists(conv.target_path):
            os.remove(conv.target_path)
        db.delete_conversion(conv.id)

    # Clean up old failed conversions
    old_failed = db.get_old_failed_conversions(older_than_days=7)
    for conv in old_failed:
        db.delete_conversion(conv.id)
```

## 9. Advantages of Using Gotenberg

✅ **Specialized Service**: Optimized specifically for document conversion
✅ **No LibreOffice Management**: Gotenberg handles the LibreOffice lifecycle internally
✅ **Better Resource Management**: Isolated conversion process
✅ **HTTP API**: Clean, standard interface
✅ **Production Ready**: Battle-tested, actively maintained
✅ **Multiple Formats**: Supports PPTX, DOCX, ODT, HTML, Markdown, etc.
✅ **PDF Features**: Merge, encrypt, watermark PDFs
✅ **Health Checks**: Built-in health endpoint
✅ **Horizontal Scaling**: Can run multiple Gotenberg instances
✅ **Memory Safe**: Automatic cleanup and restart on issues

## 10. Migration Path

### **Phase 1 (MVP):**
- 1 worker process in API container
- Redis for queue (separate container)
- Gotenberg for conversion (separate container)
- Basic DB schema
- Shared volume for file exchange
- Simple retry logic

### **Phase 2 (as needed):**
- Multiple worker instances
- Multiple Gotenberg instances (load balancing)
- Monitoring & alerting
- Prioritization logic
- Advanced caching strategies
- PDF optimization/compression

**Start simple, scale when needed!**

## 11. Key Decisions Summary

| Aspect | Decision | Reason |
|--------|----------|--------|
| **Conversion Location** | Server-side (Gotenberg) | One conversion per file, consistent results |
| **Conversion Service** | Dedicated Gotenberg container | Specialized, production-ready, better isolation |
| **Conversion Timing** | Asynchronous (on upload) | No client waiting time, predictable performance |
| **Data Storage** | Database-tracked | Status visibility, robust error handling |
| **File Exchange** | Shared Docker volume | Simple, efficient, no network overhead |
| **Queue System** | Redis (separate container) | Standard pattern, scalable, maintainable |
| **Worker Architecture** | Background process in API container | Simple start, easy to separate later |

## 12. File Flow Diagram

```
┌─────────────┐
│ User Upload │
│   (PPTX)    │
└──────┬──────┘
       │
       ▼
┌──────────────────────┐
│ API Server           │
│ 1. Save to /shared   │
│ 2. Create DB record  │
│ 3. Enqueue job       │
└──────┬───────────────┘
       │
       ▼
┌──────────────────┐
│ Redis Queue      │
└──────┬───────────┘
       │
       ▼
┌──────────────────────┐
│ Worker Process       │
│ 1. Get job           │
│ 2. Call Gotenberg    │
│ 3. Update DB         │
└──────┬───────────────┘
       │
       ▼
┌──────────────────────┐
│ Gotenberg            │
│ 1. Read from /shared │
│ 2. Convert PPTX      │
│ 3. Return PDF        │
└──────┬───────────────┘
       │
       ▼
┌──────────────────────┐
│ Worker saves PDF     │
│ to /shared/converted │
└──────┬───────────────┘
       │
       ▼
┌──────────────────────┐
│ Client Requests      │
│ 1. Check DB          │
│ 2. Download PDF      │
│ 3. Display           │
│    (via impressive)  │
└──────────────────────┘
```
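The last step in the diagram — the client checking conversion status and then downloading the PDF — can be sketched as a small polling loop. This is a minimal sketch, not the project's actual client code: the payload shape matches the status endpoint described in this document, while the callables `fetch_status`/`fetch_pdf` are injected placeholders for the HTTP calls.

```python
import time


def wait_for_pdf(media_id, fetch_status, fetch_pdf,
                 poll_interval=2.0, max_attempts=150):
    """Poll conversion status until ready, then download the PDF.

    fetch_status(media_id) -> dict shaped like the
        /api/conversions/<id>/status payload ('status', 'target_path', ...)
    fetch_pdf(target_path)  -> bytes from /api/files/converted/<path>
    Both callables are injected so the loop stays transport-agnostic.
    """
    for attempt in range(max_attempts):
        info = fetch_status(media_id)
        if info["status"] == "ready":
            # target_path is relative to the media root, e.g. 'converted/deck.pdf'
            return fetch_pdf(info["target_path"])
        if info["status"] == "failed":
            raise RuntimeError(info.get("error_message") or "conversion failed")
        # still 'pending' or 'processing': wait and poll again
        if attempt < max_attempts - 1:
            time.sleep(poll_interval)
    raise TimeoutError(f"media {media_id}: conversion not ready in time")
```

In a real client the two callables would wrap `requests.get` against the API; keeping them injectable also makes the loop trivial to unit-test with fakes.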

## 13. Implementation Checklist

### Database Setup
- [ ] Create `media_files` table
- [ ] Create `conversions` table
- [ ] Add indexes for performance
- [ ] Set up foreign key constraints

### Storage Setup
- [ ] Create shared Docker volume
- [ ] Set up directory structure (/shared/uploads, /shared/converted)
- [ ] Configure proper permissions

### API Changes
- [ ] Modify upload endpoint to save to shared storage
- [ ] Create DB records for uploads
- [ ] Add conversion job enqueueing
- [ ] Implement file download endpoint with status checking
- [ ] Add status API for monitoring
- [ ] Implement cache invalidation on file update

### Worker Setup
- [ ] Create worker script/module
- [ ] Implement Gotenberg API calls
- [ ] Add error handling and retry logic
- [ ] Set up logging and monitoring
- [ ] Handle timeouts and failures

### Docker Configuration
- [ ] Add Gotenberg container to docker-compose.yml
- [ ] Add Redis container to docker-compose.yml
- [ ] Configure worker container
- [ ] Set up shared volume mounts
- [ ] Configure environment variables
- [ ] Set up container dependencies
- [ ] Configure resource limits for Gotenberg

### Client Updates
- [ ] Modify client to check conversion status
- [ ] Implement retry logic for pending conversions
- [ ] Add loading/waiting screens
- [ ] Implement error handling

### Testing
- [ ] Test upload → conversion → download flow
- [ ] Test multiple concurrent conversions
- [ ] Test error handling (corrupted PPTX, etc.)
- [ ] Test Gotenberg timeout handling
- [ ] Test cache invalidation on file update
- [ ] Load test with multiple clients
- [ ] Test Gotenberg health checks

### Monitoring & Operations
- [ ] Set up logging for conversions
- [ ] Monitor Gotenberg health endpoint
- [ ] Implement cleanup job for old files
- [ ] Add metrics for conversion times
- [ ] Set up alerts for failed conversions
- [ ] Monitor shared storage disk usage
- [ ] Document backup procedures

### Security
- [ ] Validate file types before conversion
- [ ] Set file size limits
- [ ] Sanitize filenames
- [ ] Implement rate limiting
- [ ] Secure inter-container communication

## 14. Gotenberg Configuration Options

### **Environment Variables:**

```yaml
gotenberg:
  image: gotenberg/gotenberg:8
  environment:
    # API Configuration
    - GOTENBERG_API_TIMEOUT=300s
    - GOTENBERG_API_PORT=3000

    # Logging
    - GOTENBERG_LOG_LEVEL=info  # debug, info, warn, error

    # LibreOffice
    - GOTENBERG_LIBREOFFICE_DISABLE_ROUTES=false
    - GOTENBERG_LIBREOFFICE_AUTO_START=true

    # Chromium (if needed for HTML/Markdown)
    - GOTENBERG_CHROMIUM_DISABLE_ROUTES=true  # Disable if not needed

    # Resource limits
    - GOTENBERG_LIBREOFFICE_MAX_QUEUE_SIZE=100
```

### **Custom Gotenberg Configuration:**

For advanced configurations, create a `gotenberg.yml`:

```yaml
api:
  timeout: 300s
  port: 3000

libreoffice:
  autoStart: true
  maxQueueSize: 100

chromium:
  disableRoutes: true
```

Mount it in docker-compose:

```yaml
gotenberg:
  image: gotenberg/gotenberg:8
  volumes:
    - ./gotenberg.yml:/etc/gotenberg/config.yml:ro
    - shared-storage:/shared
```

## 15. Troubleshooting

### **Common Issues:**

**Gotenberg timeout:**
```python
# Increase timeout for large files
response = requests.post(
    f'{GOTENBERG_URL}/forms/libreoffice/convert',
    files=files,
    timeout=600  # 10 minutes for large PPTX
)
```

**Memory issues:**
```yaml
# Increase Gotenberg memory limit
gotenberg:
  deploy:
    resources:
      limits:
        memory: 4G
```

**File permission issues:**
```bash
# Ensure proper permissions on shared volume
chmod -R 755 /shared
chown -R 1000:1000 /shared
```

**Gotenberg not responding:**
```python
# Check health before conversion
def ensure_gotenberg_healthy():
    try:
        response = requests.get(f'{GOTENBERG_URL}/health', timeout=5)
        if response.status_code != 200:
            raise Exception("Gotenberg unhealthy")
    except Exception as e:
        logger.error(f"Gotenberg health check failed: {e}")
        raise
```

---

**This architecture provides a production-ready, scalable solution using Gotenberg as a specialized conversion service with efficient file sharing via Docker volumes!**

## 16. Best Practices Specific to Infoscreen

- Idempotency by content: Always compute a SHA‑256 of the uploaded source and include it in the unique key (source_event_media_id, target_format, file_hash). This prevents duplicate work for identical content and auto-busts the cache on change.
- Strict MIME/type validation: Accept only .ppt, .pptx, .odp for conversion. Reject unknown types early. Consider reading the first bytes (magic) for extra safety.
- Bounded retries with jitter: Retry conversions on transient HTTP 5xx or timeouts up to N times with exponential backoff. Do not retry on 4xx or clear user errors.
- Output naming: Derive deterministic output paths under media/converted/, e.g., <basename>.pdf. Ensure no path traversal and sanitize names.
- Timeouts and size limits: Enforce a server-side max upload size and a per-job conversion timeout (e.g., 10 minutes). Return clear errors for oversized/long-running files.
- Isolation and quotas: Set CPU/memory limits for Gotenberg; consider a concurrency cap per worker to avoid DB starvation.
- Health probes before work: Check Gotenberg /health prior to enqueue spikes; fail fast to avoid queue pile-ups when Gotenberg is down.
- Observability: Log job IDs, file hashes, durations, and sizes. Expose a small /api/conversions/status summary for operational visibility.
- Cleanup policy: Periodically delete orphaned conversions (media deleted) and failed jobs older than X days. Keep successful PDFs aligned with DB rows.
- Security: Never trust client paths; always resolve relative to the known media root. Do not expose the shared volume directly; serve via API only.
- Backpressure: If queue length exceeds a threshold, surface 503/"try later" on new uploads or pause enqueue to protect the system.
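The bounded-retries rule above can be sketched as follows. This is a minimal sketch, not the worker's actual retry code: the exception split (retry only transient failures such as timeouts and HTTP 5xx, never 4xx/user errors) mirrors the bullet, while the attempt count, base delay, and "full jitter" strategy are illustrative assumptions.

```python
import random
import time


class TransientError(Exception):
    """Raised for retryable failures (timeouts, HTTP 5xx)."""


def retry_with_backoff(task, max_attempts=4, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep, rng=random.random):
    """Run task(); on TransientError, retry with exponential backoff + jitter.

    Non-transient exceptions (clear user errors, 4xx responses mapped by the
    caller) propagate immediately and are never retried.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientError:
            if attempt == max_attempts:
                raise  # bounded: give up after max_attempts
            # full jitter: sleep a random amount up to the exponential cap
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng() * cap)
```

The `sleep` and `rng` parameters exist only so the backoff schedule can be tested deterministically; production callers use the defaults.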
@@ -0,0 +1,8 @@
"""Server package initializer.

Expose submodules required by external importers (e.g., RQ string paths).
"""

# Ensure 'server.worker' is available as an attribute of the 'server' package
# so that RQ can resolve 'server.worker.convert_event_media_to_pdf'.
from . import worker  # noqa: F401
@@ -0,0 +1,28 @@
"""merge heads after conversions

Revision ID: 2b627d0885c3
Revises: 5b3c1a2f8d10, 8d1df7199cb7
Create Date: 2025-10-06 20:27:53.974926

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '2b627d0885c3'
down_revision: Union[str, Sequence[str], None] = ('5b3c1a2f8d10', '8d1df7199cb7')
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    """Upgrade schema."""
    pass


def downgrade() -> None:
    """Downgrade schema."""
    pass
@@ -0,0 +1,53 @@
"""Add conversions table

Revision ID: 5b3c1a2f8d10
Revises: e6eaede720aa
Create Date: 2025-10-06 12:00:00.000000

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '5b3c1a2f8d10'
down_revision: Union[str, None] = 'e6eaede720aa'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    op.create_table(
        'conversions',
        sa.Column('id', sa.Integer(), primary_key=True, autoincrement=True),
        sa.Column('source_event_media_id', sa.Integer(), nullable=False),
        sa.Column('target_format', sa.String(length=10), nullable=False),
        sa.Column('target_path', sa.String(length=512), nullable=True),
        sa.Column('status', sa.Enum('pending', 'processing', 'ready', 'failed', name='conversionstatus'),
                  nullable=False, server_default='pending'),
        sa.Column('file_hash', sa.String(length=64), nullable=True),
        sa.Column('started_at', sa.TIMESTAMP(timezone=True), nullable=True),
        sa.Column('completed_at', sa.TIMESTAMP(timezone=True), nullable=True),
        sa.Column('error_message', sa.Text(), nullable=True),
        sa.ForeignKeyConstraint(['source_event_media_id'], ['event_media.id'],
                                name='fk_conversions_event_media', ondelete='CASCADE'),
    )

    op.create_index('ix_conv_source_event_media_id', 'conversions', ['source_event_media_id'])
    op.create_index('ix_conversions_target_format', 'conversions', ['target_format'])
    op.create_index('ix_conv_status_target', 'conversions', ['status', 'target_format'])
    op.create_index('ix_conv_source_target', 'conversions', ['source_event_media_id', 'target_format'])

    op.create_unique_constraint('uq_conv_source_target_hash', 'conversions',
                                ['source_event_media_id', 'target_format', 'file_hash'])


def downgrade() -> None:
    op.drop_constraint('uq_conv_source_target_hash', 'conversions', type_='unique')
    op.drop_index('ix_conv_source_target', table_name='conversions')
    op.drop_index('ix_conv_status_target', table_name='conversions')
    op.drop_index('ix_conversions_target_format', table_name='conversions')
    op.drop_index('ix_conv_source_event_media_id', table_name='conversions')
    op.drop_table('conversions')
@@ -0,0 +1,40 @@
"""Make conversions.file_hash NOT NULL

Revision ID: b5a6c3d4e7f8
Revises: 2b627d0885c3
Create Date: 2025-10-06 21:05:00.000000

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = "b5a6c3d4e7f8"
down_revision: Union[str, None] = "2b627d0885c3"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # Ensure no NULLs remain before altering nullability
    op.execute("UPDATE conversions SET file_hash = '' WHERE file_hash IS NULL")
    op.alter_column(
        "conversions",
        "file_hash",
        existing_type=sa.String(length=64),
        nullable=False,
        existing_nullable=True,
    )


def downgrade() -> None:
    op.alter_column(
        "conversions",
        "file_hash",
        existing_type=sa.String(length=64),
        nullable=True,
        existing_nullable=False,
    )
@@ -6,3 +6,6 @@ python-dotenv>=1.1.0
SQLAlchemy>=2.0.41
flask
gunicorn
redis>=5.0.1
rq>=1.16.2
requests>=2.32.3

server/routes/conversions.py (new file, 94 lines)
@@ -0,0 +1,94 @@
from flask import Blueprint, jsonify, request
from server.database import Session
from models.models import Conversion, ConversionStatus, EventMedia, MediaType
from server.task_queue import get_queue
from server.worker import convert_event_media_to_pdf
from datetime import datetime, timezone
import hashlib
import os

conversions_bp = Blueprint("conversions", __name__,
                           url_prefix="/api/conversions")


def sha256_file(abs_path: str) -> str:
    h = hashlib.sha256()
    with open(abs_path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


@conversions_bp.route("/<int:media_id>/pdf", methods=["POST"])
def ensure_conversion(media_id: int):
    session = Session()
    try:
        media = session.query(EventMedia).get(media_id)
        if not media or not media.file_path:
            return jsonify({"error": "Media not found or no file"}), 404

        # Only enqueue for office presentation formats
        if media.media_type not in {MediaType.ppt, MediaType.pptx, MediaType.odp}:
            return jsonify({"message": "No conversion required for this media_type"}), 200

        # Compute file hash
        base_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
        media_root = os.path.join(base_dir, "media")
        abs_source = os.path.join(media_root, media.file_path)
        file_hash = sha256_file(abs_source)

        # Find or create conversion row
        conv = (
            session.query(Conversion)
            .filter_by(
                source_event_media_id=media.id,
                target_format="pdf",
                file_hash=file_hash,
            )
            .one_or_none()
        )
        if not conv:
            conv = Conversion(
                source_event_media_id=media.id,
                target_format="pdf",
                status=ConversionStatus.pending,
                file_hash=file_hash,
            )
            session.add(conv)
            session.commit()

        # Enqueue if not already processing/ready
        if conv.status in {ConversionStatus.pending, ConversionStatus.failed}:
            q = get_queue()
            job = q.enqueue(convert_event_media_to_pdf, conv.id)
            return jsonify({"id": conv.id, "status": conv.status.value, "job_id": job.get_id()}), 202
        else:
            return jsonify({"id": conv.id, "status": conv.status.value, "target_path": conv.target_path}), 200
    finally:
        session.close()


@conversions_bp.route("/<int:media_id>/status", methods=["GET"])
def conversion_status(media_id: int):
    session = Session()
    try:
        conv = (
            session.query(Conversion)
            .filter_by(source_event_media_id=media_id, target_format="pdf")
            .order_by(Conversion.id.desc())
            .first()
        )
        if not conv:
            return jsonify({"status": "missing"}), 404
        return jsonify(
            {
                "id": conv.id,
                "status": conv.status.value,
                "target_path": conv.target_path,
                "started_at": conv.started_at.isoformat() if conv.started_at else None,
                "completed_at": conv.completed_at.isoformat() if conv.completed_at else None,
                "error_message": conv.error_message,
            }
        )
    finally:
        session.close()
@@ -1,7 +1,10 @@
from re import A
from flask import Blueprint, request, jsonify, send_from_directory
from server.database import Session
from models.models import EventMedia, MediaType, Conversion, ConversionStatus
from server.task_queue import get_queue
from server.worker import convert_event_media_to_pdf
import hashlib
import os

eventmedia_bp = Blueprint('eventmedia', __name__, url_prefix='/api/eventmedia')
@@ -135,6 +138,41 @@ def filemanager_upload():
    )
    session.add(media)
    session.commit()

    # Enqueue conversion for office presentation types
    if media_type in {MediaType.ppt, MediaType.pptx, MediaType.odp}:
        # compute file hash
        h = hashlib.sha256()
        with open(file_path, 'rb') as f:
            for chunk in iter(lambda: f.read(8192), b""):
                h.update(chunk)
        file_hash = h.hexdigest()

        # upsert Conversion row
        conv = (
            session.query(Conversion)
            .filter_by(
                source_event_media_id=media.id,
                target_format='pdf',
                file_hash=file_hash,
            )
            .one_or_none()
        )
        if not conv:
            conv = Conversion(
                source_event_media_id=media.id,
                target_format='pdf',
                status=ConversionStatus.pending,
                file_hash=file_hash,
            )
            session.add(conv)
            session.commit()

        if conv.status in {ConversionStatus.pending, ConversionStatus.failed}:
            q = get_queue()
            q.enqueue(convert_event_media_to_pdf, conv.id)

    session.commit()
    return jsonify({'success': True})

# --- FileManager: Download ---
@@ -55,3 +55,14 @@ def download_media_file(media_id: int, filename: str):
    served_name = os.path.basename(abs_path)
    session.close()
    return send_from_directory(directory, served_name, as_attachment=True)


@files_bp.route("/converted/<path:relpath>", methods=["GET"])
def download_converted(relpath: str):
    """Serve converted files (e.g., PDFs) relative to media/converted."""
    # normpath collapses '..' segments so the prefix check cannot be bypassed
    abs_path = os.path.normpath(os.path.join(MEDIA_ROOT, relpath))
    if not abs_path.startswith(MEDIA_ROOT):
        return jsonify({"error": "Invalid path"}), 400
    if not os.path.isfile(abs_path):
        return jsonify({"error": "File not found"}), 404
    return send_from_directory(os.path.dirname(abs_path), os.path.basename(abs_path), as_attachment=True)
server/rq_worker.py (new file, 15 lines)
@@ -0,0 +1,15 @@
import os
from rq import Worker
from server.task_queue import get_queue, get_redis_url
import redis


def main():
    conn = redis.from_url(get_redis_url())
    # Single queue named 'conversions'
    w = Worker([get_queue().name], connection=conn)
    w.work(with_scheduler=True)


if __name__ == "__main__":
    main()

server/task_queue.py (new file, 14 lines)
@@ -0,0 +1,14 @@
import os
import redis
from rq import Queue


def get_redis_url() -> str:
    # Default to local Redis service name in compose network
    return os.getenv("REDIS_URL", "redis://redis:6379/0")


def get_queue(name: str = "conversions") -> Queue:
    conn = redis.from_url(get_redis_url())
    # 10 minutes default
    return Queue(name, connection=conn, default_timeout=600)

server/worker.py (new file, 94 lines)
@@ -0,0 +1,94 @@
import os
import traceback
from datetime import datetime, timezone

import requests
from sqlalchemy.orm import Session as SASession

from server.database import Session
from models.models import Conversion, ConversionStatus, EventMedia, MediaType

GOTENBERG_URL = os.getenv("GOTENBERG_URL", "http://gotenberg:3000")


def _now():
    return datetime.now(timezone.utc)


def convert_event_media_to_pdf(conversion_id: int):
    """
    Job entry point: convert a single EventMedia to PDF using Gotenberg.

    Steps:
    - Load conversion + source media
    - Set status=processing, started_at
    - POST to Gotenberg /forms/libreoffice/convert with the source file bytes
    - Save response bytes to target_path
    - Set status=ready, completed_at, target_path
    - On error: set status=failed, error_message
    """
    session: SASession = Session()
    try:
        conv: Conversion = session.query(Conversion).get(conversion_id)
        if not conv:
            return

        media: EventMedia = session.query(
            EventMedia).get(conv.source_event_media_id)
        if not media or not media.file_path:
            conv.status = ConversionStatus.failed
            conv.error_message = "Source media or file_path missing"
            conv.completed_at = _now()
            session.commit()
            return

        conv.status = ConversionStatus.processing
        conv.started_at = _now()
        session.commit()

        # Get the server directory (where this worker.py file is located)
        server_dir = os.path.dirname(os.path.abspath(__file__))
        media_root = os.path.join(server_dir, "media")
        abs_source = os.path.join(media_root, media.file_path)
        # Output target under media/converted
        converted_dir = os.path.join(media_root, "converted")
        os.makedirs(converted_dir, exist_ok=True)
        filename_wo_ext = os.path.splitext(
            os.path.basename(media.file_path))[0]
        pdf_name = f"{filename_wo_ext}.pdf"
        abs_target = os.path.join(converted_dir, pdf_name)

        # Send to Gotenberg
        with open(abs_source, "rb") as f:
            files = {"files": (os.path.basename(abs_source), f)}
            resp = requests.post(
                f"{GOTENBERG_URL}/forms/libreoffice/convert",
                files=files,
                timeout=600,
            )
        resp.raise_for_status()

        with open(abs_target, "wb") as out:
            out.write(resp.content)

        conv.status = ConversionStatus.ready
        # Store relative path under media/
        conv.target_path = os.path.relpath(abs_target, media_root)
        conv.completed_at = _now()
        session.commit()
    except requests.exceptions.Timeout:
        conv = session.query(Conversion).get(conversion_id)
        if conv:
            conv.status = ConversionStatus.failed
            conv.error_message = "Conversion timeout"
            conv.completed_at = _now()
            session.commit()
    except Exception as e:
        conv = session.query(Conversion).get(conversion_id)
        if conv:
            conv.status = ConversionStatus.failed
            conv.error_message = f"{e}\n{traceback.format_exc()}"
            conv.completed_at = _now()
            session.commit()
    finally:
        session.close()
@@ -2,6 +2,7 @@
from server.routes.eventmedia import eventmedia_bp
from server.routes.files import files_bp
from server.routes.events import events_bp
from server.routes.conversions import conversions_bp
from server.routes.holidays import holidays_bp
from server.routes.academic_periods import academic_periods_bp
from server.routes.groups import groups_bp
@@ -24,6 +25,7 @@ app.register_blueprint(eventmedia_bp)
app.register_blueprint(files_bp)
app.register_blueprint(holidays_bp)
app.register_blueprint(academic_periods_bp)
app.register_blueprint(conversions_bp)


@app.route("/health")