5 Commits

Author SHA1 Message Date
a58e9d3fca feat(listener): migrate dashboard MQTT payload to v2-only grouped schema
- Replace _extract_image_and_timestamp() with v2-only _extract_dashboard_payload_fields()
- Add _classify_dashboard_payload() + parse metrics (v2_success, parse_failures)
- Add soft _validate_v2_required_fields() for warning-only field checks
- Remove legacy fallback after soak confirmed legacy_fallback=0
- Fix: forward msg.payload directly to handle_screenshot() to avoid re-wrap bug
- Add 33 parser tests in listener/test_listener_parser.py
- Add MQTT_PAYLOAD_MIGRATION_GUIDE.md documenting the 10-step migration process
- Update README.md and copilot-instructions.md to reflect v2-only schema
2026-03-30 14:18:34 +00:00
90ccbdf920 fix(dashboard): restore event visibility and fix lint errors in App.tsx
Appointments: no longer hide existing events on holiday dates
Resources: load all overlapping events per group, include inactive/past events, and reload on date/view navigation
App.tsx: replace any types in password input handlers with typed event shapes
2026-03-30 09:51:22 +00:00
24cdf07279 feat(monitoring): add priority screenshot pipeline with screenshot_type + docs cleanup
Implement end-to-end support for typed screenshots and priority rendering in monitoring.

Added
- Accept and forward screenshot_type from MQTT screenshot/dashboard payloads
  (periodic, event_start, event_stop)
- Extend screenshot upload handling to persist typed screenshots and metadata
- Add dedicated priority screenshot serving endpoint with fallback behavior
- Extend monitoring overview with priority screenshot fields and summary count
- Add configurable PRIORITY_SCREENSHOT_TTL_SECONDS window for active priority state

Fixed
- Ensure screenshot cache-busting updates reliably via screenshot hash updates
- Preserve normal periodic screenshot flow while introducing event_start/event_stop priority path

Improved
- Monitoring dashboard now displays screenshot type badges
- Adaptive polling: faster refresh while priority screenshots are active
- Priority screenshot presentation is surfaced immediately to operators

Docs
- Update README and copilot-instructions to match new screenshot_type behavior,
  priority endpoint, TTL config, monitoring fields, and retention model
- Remove redundant/duplicate documentation blocks and improve troubleshooting section clarity
2026-03-29 13:13:13 +00:00
9c330f984f feat(monitoring): complete monitoring pipeline and fix presentation flag persistence
add superadmin monitoring dashboard with protected route, menu entry, and monitoring data client
add monitoring overview API endpoint and improve log serialization/aggregation for dashboard use
extend listener health/log handling with robust status/event/timestamp normalization and screenshot payload extraction
improve screenshot persistence and retrieval (timestamp-aware uploads, latest screenshot endpoint fallback)
fix page_progress and auto_progress persistence/serialization across create, update, and detached occurrence flows
align technical and project docs to reflect implemented monitoring and no-version-bump backend changes
add documentation sync log entry and include minor compose env indentation cleanup
2026-03-24 11:18:33 +00:00
3107d0f671 feat(monitoring): add server-side client logging and health infrastructure
- add Alembic migration c1d2e3f4g5h6 for client monitoring:
  - create client_logs table with FK to clients.uuid and performance indexes
  - extend clients with process/health tracking fields
- extend data model with ClientLog, LogLevel, ProcessStatus, and ScreenHealthStatus
- enhance listener MQTT handling:
  - subscribe to logs and health topics
  - persist client logs from infoscreen/{uuid}/logs/{level}
  - process health payloads and enrich heartbeat-derived client state
- add monitoring API blueprint server/routes/client_logs.py:
  - GET /api/client-logs/<uuid>/logs
  - GET /api/client-logs/summary
  - GET /api/client-logs/recent-errors
  - GET /api/client-logs/test
- register client_logs blueprint in server/wsgi.py
- align compose/dev runtime for listener live-code execution
- add client-side implementation docs:
  - CLIENT_MONITORING_SPECIFICATION.md
  - CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md
- update TECH-CHANGELOG.md and copilot-instructions.md:
  - document monitoring changes
  - codify post-release technical-notes/no-version-bump convention
2026-03-10 07:33:38 +00:00
25 changed files with 5281 additions and 119 deletions

View File

@@ -34,6 +34,7 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- `dashboard/src/settings.tsx` — settings UI (nested tabs; system defaults for presentations and videos)
- `dashboard/src/ressourcen.tsx` — timeline view showing all groups' active events in parallel
- `dashboard/src/ressourcen.css` — timeline and resource view styling
- `dashboard/src/monitoring.tsx` — superadmin-only monitoring dashboard for client health, screenshots, and logs
@@ -50,11 +51,32 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
### Screenshot retention
- Screenshots sent via dashboard MQTT are stored in `server/screenshots/`.
- For each client, only the latest and last 20 timestamped screenshots are kept; older files are deleted automatically on each upload.
- Screenshot payloads support `screenshot_type` with values `periodic`, `event_start`, `event_stop`.
- `periodic` is the normal heartbeat/dashboard screenshot path; `event_start` and `event_stop` are high-priority screenshots for monitoring.
- For each client, the API keeps `{uuid}.jpg` as latest and the last 20 timestamped screenshots (`{uuid}_..._{type}.jpg`), deleting older timestamped files automatically.
- For high-priority screenshots, the API additionally maintains `{uuid}_priority.jpg` and metadata in `{uuid}_meta.json` (`latest_screenshot_type`, `last_priority_*`).
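A minimal sketch of this retention rule (hypothetical helper; the real logic lives in the screenshot upload handling):

```python
# Hypothetical sketch of the retention rule above, assuming timestamped
# filenames sort chronologically; the real logic lives in the upload route.
import glob
import os

KEEP_TIMESTAMPED = 20

def prune_screenshots(folder: str, uuid: str) -> None:
    # {uuid}.jpg (latest) does not match the glob and is never pruned here.
    timestamped = sorted(
        f for f in glob.glob(os.path.join(folder, f"{uuid}_*.jpg"))
        if not f.endswith("_priority.jpg")  # never prune the priority image
    )
    for old in timestamped[:-KEEP_TIMESTAMPED]:
        os.remove(old)
```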
## Recent changes since last commit
### Latest (January 2026)
### Latest (March 2026)
- **Monitoring System Completion (no version bump)**:
- End-to-end monitoring pipeline completed: MQTT logs/health → listener persistence → monitoring APIs → superadmin dashboard
- API now serves aggregated monitoring via `GET /api/client-logs/monitoring-overview` and system-wide recent errors via `GET /api/client-logs/recent-errors`
- Monitoring dashboard (`dashboard/src/monitoring.tsx`) is active and displays client health states, screenshots, process metadata, and recent log activity
- **Screenshot Priority Pipeline (no version bump)**:
- Listener forwards `screenshot_type` from MQTT screenshot/dashboard payloads to `POST /api/clients/<uuid>/screenshot`.
- API stores typed screenshots, tracks latest/priority metadata, and serves priority images via `GET /screenshots/<uuid>/priority`.
- Monitoring overview exposes screenshot priority state (`latestScreenshotType`, `priorityScreenshotType`, `priorityScreenshotReceivedAt`, `hasActivePriorityScreenshot`) and `summary.activePriorityScreenshots`.
- Monitoring UI shows screenshot type badges and switches to faster refresh while priority screenshots are active.
- **MQTT Dashboard Payload v2 Cutover (no version bump)**:
- Dashboard payload parsing in `listener/listener.py` is now v2-only (`message`, `content`, `runtime`, `metadata`).
- Legacy top-level dashboard fallback was removed after migration soak (`legacy_fallback=0`).
- Listener observability summarizes parser health using `v2_success` and `parse_failures` counters.
- **Presentation Flags Persistence Fix**:
- Fixed persistence for presentation `page_progress` and `auto_progress` to ensure values are reliably stored and returned across create/update paths and detached occurrences
### Earlier (January 2026)
- **Ressourcen Page (Timeline View)**:
- New 'Ressourcen' page with parallel timeline view showing active events for all room groups
@@ -119,15 +141,17 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
## Service boundaries & data flow
- Database connection string is passed as `DB_CONN` (mysql+pymysql) to Python services.
- API builds its engine in `server/database.py` (loads `.env` only in development).
- Scheduler loads `DB_CONN` in `scheduler/db_utils.py`. Recurring events are expanded for the next 7 days, and event exceptions (skipped dates, detached occurrences) are respected. Only recurring events with recurrence_end in the future remain active. The scheduler publishes only events that are active at the current time and clears retained topics (publishes `[]`) for groups without active events. Time comparisons are UTC and naive timestamps are normalized.
- Listener also creates its own engine for writes to `clients`.
- Scheduler queries a future window (default: 7 days) to expand recurring events using RFC 5545 rules, applies event exceptions (skipped dates, detached occurrences), and publishes only events that are active at the current time (UTC). When a group has no active events, the scheduler clears its retained topic by publishing an empty list. Time comparisons are UTC; naive timestamps are normalized. Logging is concise; conversion lookups are cached and logged only once per media item.
- MQTT topics (paho-mqtt v2, use Callback API v2):
- Discovery: `infoscreen/discovery` (JSON includes `uuid`, hw/ip data). ACK to `infoscreen/{uuid}/discovery_ack`. See `listener/listener.py`.
- Heartbeat: `infoscreen/{uuid}/heartbeat` updates `Client.last_alive` (UTC).
- Heartbeat: `infoscreen/{uuid}/heartbeat` updates `Client.last_alive` (UTC); enhanced payload includes `current_process`, `process_pid`, `process_status`, `current_event_id`.
- Event lists (retained): `infoscreen/events/{group_id}` from `scheduler/scheduler.py`.
- Per-client group assignment (retained): `infoscreen/{uuid}/group_id` via `server/mqtt_helper.py`.
- Screenshots: server-side folders `server/received_screenshots/` and `server/screenshots/`; Nginx exposes `/screenshots/{uuid}.jpg` via `server/wsgi.py` route.
- Client logs: `infoscreen/{uuid}/logs/{error|warn|info}` with JSON payload (timestamp, message, context); QoS 1 for ERROR/WARN, QoS 0 for INFO.
- Client health: `infoscreen/{uuid}/health` with metrics (expected_state, actual_state, health_metrics); QoS 0, published every 5 seconds.
- Dashboard screenshots: `infoscreen/{uuid}/dashboard` uses grouped v2 payload blocks (`message`, `content`, `runtime`, `metadata`); listener reads screenshot data from `content.screenshot` and capture type from `metadata.capture.type` (example payload after this list).
- Screenshots: server-side folder `server/screenshots/`; API serves `/screenshots/{uuid}.jpg` (latest) and `/screenshots/{uuid}/priority` (active high-priority fallback to latest).
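A minimal illustrative v2 dashboard payload (the block names and `content.screenshot` / `metadata.capture.type` come from the notes above; the remaining keys are assumptions):

```json
{
  "message": { "type": "dashboard" },
  "content": { "screenshot": "<base64-jpeg>" },
  "runtime": { "uptime_seconds": 3600 },
  "metadata": { "capture": { "type": "periodic" } }
}
```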
- Dev Container guidance: If extensions reappear inside the container, remove UI-only extensions from `devcontainer.json` `extensions` and map them in `remote.extensionKind` as `"ui"`.
@@ -146,6 +170,11 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- `locked_until`: TIMESTAMP placeholder for account lockout (infrastructure in place, not yet enforced)
- `deactivated_at`, `deactivated_by`: Soft-delete audit trail (FK self-reference); soft deactivation is the default, hard delete superadmin-only
- Role hierarchy (privilege escalation enforced): `user` < `editor` < `admin` < `superadmin`
- Client monitoring (migration: `c1d2e3f4g5h6_add_client_monitoring.py`):
- `ClientLog` model: Centralized log storage with fields (id, client_uuid, timestamp, level, message, context, created_at); FK to clients.uuid (CASCADE)
- `Client` model extended: 7 health monitoring fields (`current_event_id`, `current_process`, `process_status`, `process_pid`, `last_screenshot_analyzed`, `screen_health_status`, `last_screenshot_hash`)
- Enums: `LogLevel` (ERROR, WARN, INFO, DEBUG), `ProcessStatus` (running, crashed, starting, stopped), `ScreenHealthStatus` (OK, BLACK, FROZEN, UNKNOWN)
- Indexes: (client_uuid, timestamp DESC), (level, timestamp DESC), (created_at DESC) for performance
- System settings: `system_settings` keyvalue store via `SystemSetting` for global configuration (e.g., WebUntis/Vertretungsplan supplement-table). Managed through routes in `server/routes/system_settings.py`.
- Presentation defaults (system-wide):
- `presentation_interval` (seconds, default "10")
@@ -189,6 +218,12 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- `PUT /api/users/<id>/password` — admin password reset (requires backend check to reject self-reset for consistency)
- `DELETE /api/users/<id>` — hard delete (superadmin only, with self-deletion check)
- Auth routes (`server/routes/auth.py`): Enhanced to track login events (sets `last_login_at`, resets `failed_login_attempts` on success; increments `failed_login_attempts` and `last_failed_login_at` on failure). Self-service password change via `PUT /api/auth/change-password` requires current password verification.
- Client logs (`server/routes/client_logs.py`): Centralized log retrieval for monitoring:
- `GET /api/client-logs/<uuid>/logs` Query client logs with filters (level, limit, since); admin_or_higher
- `GET /api/client-logs/summary` Log counts by level per client (last 24h); admin_or_higher
- `GET /api/client-logs/recent-errors` System-wide error monitoring; admin_or_higher
- `GET /api/client-logs/monitoring-overview` Includes screenshot priority fields per client plus `summary.activePriorityScreenshots`; superadmin_only
- `GET /api/client-logs/test` Infrastructure validation (no auth); returns recent logs with counts
Documentation maintenance: keep this file aligned with real patterns; update when routes/session/UTC rules change. Avoid long prose; link exact paths.
@@ -246,6 +281,13 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- API client in `dashboard/src/apiUsers.ts` for all user operations (listUsers, getUser, createUser, updateUser, resetUserPassword, deleteUser)
- Menu visibility: "Benutzer" menu item only visible to admin+ (role-gated in App.tsx)
- Monitoring page (`dashboard/src/monitoring.tsx`):
- Superadmin-only dashboard for client monitoring and diagnostics; menu item is hidden for lower roles and the route redirects non-superadmins.
- Uses `GET /api/client-logs/monitoring-overview` for aggregated live status, `GET /api/client-logs/recent-errors` for system-wide errors, and `GET /api/client-logs/<uuid>/logs` for per-client details.
- Shows per-client status (`healthy`, `warning`, `critical`, `offline`) based on heartbeat freshness, process state, screen state, and recent log counts.
- Displays latest screenshot preview and active priority screenshot (`/screenshots/{uuid}/priority` when active), screenshot type badges, current process metadata, and recent ERROR/WARN activity.
- Uses adaptive refresh: normal interval in steady state, faster polling while `activePriorityScreenshots > 0`.
- Settings page (`dashboard/src/settings.tsx`):
- Structure: Syncfusion TabComponent with role-gated tabs
- 📅 Academic Calendar (all users)
@@ -323,6 +365,7 @@ Note: Syncfusion usage in the dashboard is already documented above; if a UI for
- VITE_API_URL — Dashboard build-time base URL (prod); in dev the Vite proxy serves `/api` to `server:8000`.
- HEARTBEAT_GRACE_PERIOD_DEV / HEARTBEAT_GRACE_PERIOD_PROD — Groups "alive" window (defaults 180s dev / 170s prod). Clients send heartbeats every ~65s; grace periods allow 2 missed heartbeats plus safety margin.
- REFRESH_SECONDS — Optional scheduler republish interval; `0` disables periodic refresh.
- PRIORITY_SCREENSHOT_TTL_SECONDS — Optional monitoring priority window in seconds (default `120`); controls when event screenshots are considered active priority.
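A sketch of how this TTL can gate the active-priority check (hypothetical helper; the real check lives in the monitoring overview code):

```python
# Hypothetical sketch: a priority screenshot is "active" while younger than the TTL.
import os
from datetime import datetime, timezone

TTL = int(os.environ.get("PRIORITY_SCREENSHOT_TTL_SECONDS", "120"))

def has_active_priority(received_at: datetime) -> bool:
    age = (datetime.now(timezone.utc) - received_at).total_seconds()
    return age < TTL
```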
## Conventions & gotchas
- **Datetime Handling**:
@@ -332,7 +375,6 @@ Note: Syncfusion usage in the dashboard is already documented above; if a UI for
- Frontend **must** append 'Z' before parsing: `const utcStr = dateStr.endsWith('Z') ? dateStr : dateStr + 'Z'; new Date(utcStr);`
- Display in local timezone using `toLocaleTimeString('de-DE', { hour: '2-digit', minute: '2-digit' })`
- When sending to API, use `date.toISOString()` which includes 'Z' and is UTC
- Frontend must append `Z` to API strings before parsing; backend compares in UTC and returns ISO without `Z`.
- **JSON Naming Convention**:
- Backend uses snake_case internally (Python convention)
- API returns camelCase JSON (web standard): `startTime`, `endTime`, `groupId`, etc.
@@ -364,7 +406,8 @@ Docs maintenance guardrails (solo-friendly): Update this file alongside code cha
## Quick examples
- Add client description persists to DB and publishes group via MQTT: see `PUT /api/clients/<uuid>/description` in `routes/clients.py`.
- Bulk group assignment emits retained messages for each client: `PUT /api/clients/group`.
- Listener heartbeat path: `infoscreen/<uuid>/heartbeat` → sets `clients.last_alive`.
- Listener heartbeat path: `infoscreen/<uuid>/heartbeat` → sets `clients.last_alive` and captures process health data.
- Client monitoring flow: Client publishes to `infoscreen/{uuid}/logs/error` and `infoscreen/{uuid}/health` → listener stores/updates monitoring state → API serves `/api/client-logs/monitoring-overview`, `/api/client-logs/recent-errors`, and `/api/client-logs/<uuid>/logs` → superadmin monitoring dashboard displays live status.
## Scheduler payloads: presentation extras
- Presentation event payloads now include `page_progress` and `auto_progress` in addition to `slide_interval` and media files. These are sourced from per-event fields in the database (with system defaults applied on event creation).
@@ -393,3 +436,14 @@ Questions or unclear areas? Tell us if you need: exact devcontainer debugging st
- Breaking changes must be prefixed with `BREAKING:`
- Keep ≤ 8–10 bullets; summarize or group micro-changes
- JSON hygiene: valid JSON, no trailing commas, don't edit historical entries except typos
## Versioning Convention (Tech vs UI)
- Use one unified app version across technical and user-facing release notes.
- `dashboard/public/program-info.json` is user-facing and should list only user-visible changes.
- `TECH-CHANGELOG.md` can include deeper technical details for the same released version.
- If server/infrastructure work is implemented but not yet released or not user-visible, document it under the latest released section as:
- `Backend technical work (post-release notes; no version bump)`
- Do not create a new version header in `TECH-CHANGELOG.md` for internal milestones alone.
- Bump version numbers when a release is actually cut/deployed (or when user-facing release notes are published), not for intermediate backend-only steps.
- When UI integration lands later, include the user-visible part in the next release version and reference prior post-release technical groundwork when useful.

View File

@@ -98,3 +98,6 @@ exit 0 # warn only; do not block commit
- MQTT workers: `listener/listener.py`, `scheduler/scheduler.py`, `server/mqtt_helper.py`
- Frontend: `dashboard/vite.config.ts`, `dashboard/package.json`, `dashboard/src/*`
- Dev/Prod docs: `deployment.md`, `.env.example`
## Documentation sync log
- 2026-03-24: Synced docs for completed monitoring rollout and presentation flag persistence fix (`page_progress` / `auto_progress`). Updated `.github/copilot-instructions.md`, `README.md`, `TECH-CHANGELOG.md`, `DEV-CHANGELOG.md`, and `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` without a user-version bump.

View File

@@ -0,0 +1,757 @@
# 🚀 Client Monitoring Implementation Guide
**Phase-based implementation guide for basic monitoring during the development phase**
---
## ✅ Phase 1: Server-Side Database Foundation
**Status:** ✅ COMPLETE
**Dependencies:** None - Already implemented
**Time estimate:** Completed
### ✅ Step 1.1: Database Migration
**File:** `server/alembic/versions/c1d2e3f4g5h6_add_client_monitoring.py`
**What it does:**
- Creates `client_logs` table for centralized logging
- Adds health monitoring columns to `clients` table
- Creates indexes for efficient querying
**To apply:**
```bash
cd /workspace/server
alembic upgrade head
```
### ✅ Step 1.2: Update Data Models
**File:** `models/models.py`
**What was added:**
- New enums: `LogLevel`, `ProcessStatus`, `ScreenHealthStatus`
- Updated `Client` model with health tracking fields
- New `ClientLog` model for log storage
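A condensed sketch of these additions (column types abridged and the project's existing declarative `Base` assumed; `models/models.py` is authoritative):

```python
# Abridged sketch; types/options simplified, existing declarative Base assumed.
import enum
from sqlalchemy import Column, DateTime, Enum, ForeignKey, Integer, String, Text

class LogLevel(enum.Enum):
    ERROR = "ERROR"
    WARN = "WARN"
    INFO = "INFO"
    DEBUG = "DEBUG"

class ProcessStatus(enum.Enum):
    running = "running"
    crashed = "crashed"
    starting = "starting"
    stopped = "stopped"

class ScreenHealthStatus(enum.Enum):
    OK = "OK"
    BLACK = "BLACK"
    FROZEN = "FROZEN"
    UNKNOWN = "UNKNOWN"

class ClientLog(Base):
    __tablename__ = "client_logs"
    id = Column(Integer, primary_key=True)
    client_uuid = Column(String(36), ForeignKey("clients.uuid", ondelete="CASCADE"))
    timestamp = Column(DateTime)
    level = Column(Enum(LogLevel))
    message = Column(Text)
    context = Column(Text)  # JSON-encoded structured context
    created_at = Column(DateTime)
```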
---
## 🔧 Phase 2: Server-Side Backend Logic
**Status:** ✅ COMPLETE
**Dependencies:** Phase 1 complete
**Time estimate:** 2-3 hours
### Step 2.1: Extend MQTT Listener
**File:** `listener/listener.py`
**What to add:**
```python
# Add new topic subscriptions in on_connect():
client.subscribe("infoscreen/+/logs/error")
client.subscribe("infoscreen/+/logs/warn")
client.subscribe("infoscreen/+/logs/info")  # Dev mode only
client.subscribe("infoscreen/+/health")

# Add new handler in on_message():
def handle_log_message(uuid, level, payload):
    """Store client log in database"""
    from models.models import ClientLog, LogLevel
    from server.database import Session
    from datetime import datetime, timezone
    import json

    session = Session()
    try:
        # Payload timestamps arrive as ISO 8601 strings; parse them for the DateTime column
        ts = payload.get('timestamp')
        if isinstance(ts, str):
            ts = datetime.fromisoformat(ts.replace('Z', '+00:00'))
        else:
            ts = datetime.now(timezone.utc)
        log_entry = ClientLog(
            client_uuid=uuid,
            timestamp=ts,
            level=LogLevel[level],
            message=payload.get('message', ''),
            context=json.dumps(payload.get('context', {}))
        )
        session.add(log_entry)
        session.commit()
        print(f"[LOG] {uuid} {level}: {payload.get('message', '')}")
    except Exception as e:
        print(f"Error saving log: {e}")
        session.rollback()
    finally:
        session.close()

def handle_health_message(uuid, payload):
    """Update client health status"""
    from models.models import Client, ProcessStatus
    from server.database import Session

    session = Session()
    try:
        client = session.query(Client).filter_by(uuid=uuid).first()
        if client:
            client.current_event_id = payload.get('expected_state', {}).get('event_id')
            client.current_process = payload.get('actual_state', {}).get('process')
            status_str = payload.get('actual_state', {}).get('status')
            if status_str:
                client.process_status = ProcessStatus[status_str]
            client.process_pid = payload.get('actual_state', {}).get('pid')
            session.commit()
    except Exception as e:
        print(f"Error updating health: {e}")
        session.rollback()
    finally:
        session.close()
```
**Topic routing logic:**
```python
# In on_message callback, add routing:
if topic.endswith('/logs/error'):
    handle_log_message(uuid, 'ERROR', payload)
elif topic.endswith('/logs/warn'):
    handle_log_message(uuid, 'WARN', payload)
elif topic.endswith('/logs/info'):
    handle_log_message(uuid, 'INFO', payload)
elif topic.endswith('/health'):
    handle_health_message(uuid, payload)
```
### Step 2.2: Create API Routes
**File:** `server/routes/client_logs.py` (NEW)
```python
from flask import Blueprint, jsonify, request
from server.database import Session
from server.permissions import admin_or_higher
from models.models import ClientLog, Client
from sqlalchemy import desc
import json

client_logs_bp = Blueprint("client_logs", __name__, url_prefix="/api/client-logs")

@client_logs_bp.route("/<uuid>/logs", methods=["GET"])
@admin_or_higher
def get_client_logs(uuid):
    """
    Get logs for a specific client.

    Query params:
      - level: ERROR, WARN, INFO, DEBUG (optional)
      - limit: number of entries (default 50, max 500)
      - since: ISO timestamp (optional)
    """
    session = Session()
    try:
        level = request.args.get('level')
        limit = min(int(request.args.get('limit', 50)), 500)
        since = request.args.get('since')
        query = session.query(ClientLog).filter_by(client_uuid=uuid)
        if level:
            from models.models import LogLevel
            query = query.filter_by(level=LogLevel[level])
        if since:
            from datetime import datetime
            since_dt = datetime.fromisoformat(since.replace('Z', '+00:00'))
            query = query.filter(ClientLog.timestamp >= since_dt)
        logs = query.order_by(desc(ClientLog.timestamp)).limit(limit).all()
        result = []
        for log in logs:
            result.append({
                "id": log.id,
                "timestamp": log.timestamp.isoformat() if log.timestamp else None,
                "level": log.level.value if log.level else None,
                "message": log.message,
                "context": json.loads(log.context) if log.context else {}
            })
        session.close()
        return jsonify({"logs": result, "count": len(result)})
    except Exception as e:
        session.close()
        return jsonify({"error": str(e)}), 500

@client_logs_bp.route("/summary", methods=["GET"])
@admin_or_higher
def get_logs_summary():
    """Get summary of errors/warnings across all clients (last 24 hours)"""
    session = Session()
    try:
        from sqlalchemy import func
        from datetime import datetime, timedelta

        since = datetime.utcnow() - timedelta(hours=24)
        stats = session.query(
            ClientLog.client_uuid,
            ClientLog.level,
            func.count(ClientLog.id).label('count')
        ).filter(
            ClientLog.timestamp >= since
        ).group_by(
            ClientLog.client_uuid,
            ClientLog.level
        ).all()
        result = {}
        for stat in stats:
            uuid = stat.client_uuid
            if uuid not in result:
                result[uuid] = {"ERROR": 0, "WARN": 0, "INFO": 0}
            result[uuid][stat.level.value] = stat.count
        session.close()
        return jsonify({"summary": result, "period_hours": 24})
    except Exception as e:
        session.close()
        return jsonify({"error": str(e)}), 500
```
**Register in `server/wsgi.py`:**
```python
from server.routes.client_logs import client_logs_bp
app.register_blueprint(client_logs_bp)
```
### Step 2.3: Add Health Data to Heartbeat Handler
**File:** `listener/listener.py` (extend existing heartbeat handler)
```python
# Modify existing heartbeat handler to capture health data
# (json, datetime/timezone, Session, Client, ProcessStatus and
# extract_uuid_from_topic are assumed to already exist in the listener)
def on_message(client, userdata, message):
    topic = message.topic
    # Existing heartbeat logic...
    if '/heartbeat' in topic:
        uuid = extract_uuid_from_topic(topic)
        try:
            payload = json.loads(message.payload.decode())
            session = Session()
            client_obj = session.query(Client).filter_by(uuid=uuid).first()
            if client_obj:
                # Update last_alive (existing)
                client_obj.last_alive = datetime.now(timezone.utc)
                # NEW: Update health data if present in heartbeat
                if 'process_status' in payload:
                    client_obj.process_status = ProcessStatus[payload['process_status']]
                if 'current_process' in payload:
                    client_obj.current_process = payload['current_process']
                if 'process_pid' in payload:
                    client_obj.process_pid = payload['process_pid']
                if 'current_event_id' in payload:
                    client_obj.current_event_id = payload['current_event_id']
            session.commit()
            session.close()
        except Exception as e:
            print(f"Error processing heartbeat: {e}")
```
---
## 🖥️ Phase 3: Client-Side Implementation
**Status:** ✅ COMPLETE
**Dependencies:** Phase 2 complete
**Time estimate:** 3-4 hours
### Step 3.1: Create Client Watchdog Script
**File:** `client/watchdog.py` (NEW - on client device)
```python
#!/usr/bin/env python3
"""
Client-side process watchdog
Monitors VLC, Chromium, PDF viewer and reports health to server
"""
import psutil
import paho.mqtt.client as mqtt
import json
import time
from datetime import datetime, timezone
import sys
import os


class MediaWatchdog:
    def __init__(self, client_uuid, mqtt_broker, mqtt_port=1883):
        self.uuid = client_uuid
        # paho-mqtt v2 requires an explicit callback API version (repo convention: v2)
        self.mqtt_client = mqtt.Client(mqtt.CallbackAPIVersion.VERSION2)
        self.mqtt_client.connect(mqtt_broker, mqtt_port, 60)
        self.mqtt_client.loop_start()
        self.current_process = None
        self.current_event_id = None
        self.restart_attempts = 0
        self.MAX_RESTARTS = 3

    def send_log(self, level, message, context=None):
        """Send log message to server via MQTT"""
        topic = f"infoscreen/{self.uuid}/logs/{level.lower()}"
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "message": message,
            "context": context or {}
        }
        self.mqtt_client.publish(topic, json.dumps(payload), qos=1)
        print(f"[{level}] {message}")

    def send_health(self, process_name, pid, status, event_id=None):
        """Send health status to server"""
        topic = f"infoscreen/{self.uuid}/health"
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "expected_state": {
                "event_id": event_id
            },
            "actual_state": {
                "process": process_name,
                "pid": pid,
                "status": status  # 'running', 'crashed', 'starting', 'stopped'
            }
        }
        self.mqtt_client.publish(topic, json.dumps(payload), qos=1, retain=False)

    def is_process_running(self, process_name):
        """Check if a process is running; return its PID or None"""
        for proc in psutil.process_iter(['name', 'pid']):
            try:
                if process_name.lower() in proc.info['name'].lower():
                    return proc.info['pid']
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass
        return None

    def monitor_loop(self):
        """Main monitoring loop"""
        print(f"Watchdog started for client {self.uuid}")
        self.send_log("INFO", "Watchdog service started", {"uuid": self.uuid})
        while True:
            try:
                # Check expected process (would be set by main event handler)
                if self.current_process:
                    pid = self.is_process_running(self.current_process)
                    if pid:
                        # Process is running
                        self.send_health(
                            self.current_process,
                            pid,
                            "running",
                            self.current_event_id
                        )
                        self.restart_attempts = 0  # Reset on success
                    else:
                        # Process crashed
                        self.send_log(
                            "ERROR",
                            f"Process {self.current_process} crashed or stopped",
                            {
                                "event_id": self.current_event_id,
                                "process": self.current_process,
                                "restart_attempt": self.restart_attempts
                            }
                        )
                        if self.restart_attempts < self.MAX_RESTARTS:
                            self.send_log("WARN", f"Attempting restart ({self.restart_attempts + 1}/{self.MAX_RESTARTS})")
                            self.restart_attempts += 1
                            # TODO: Implement restart logic (call event handler)
                        else:
                            self.send_log("ERROR", "Max restart attempts exceeded", {
                                "event_id": self.current_event_id
                            })
                time.sleep(5)  # Check every 5 seconds
            except KeyboardInterrupt:
                print("Watchdog stopped by user")
                break
            except Exception as e:
                self.send_log("ERROR", f"Watchdog error: {str(e)}", {
                    "exception": str(e),
                    "traceback": str(sys.exc_info())
                })
                time.sleep(10)  # Wait longer on error


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python watchdog.py <client_uuid> <mqtt_broker>")
        sys.exit(1)
    uuid = sys.argv[1]
    broker = sys.argv[2]
    watchdog = MediaWatchdog(uuid, broker)
    watchdog.monitor_loop()
```
### Step 3.2: Integrate with Existing Event Handler
**File:** `client/event_handler.py` (modify existing)
```python
# When starting a new event, notify watchdog
def play_event(event_data):
    event_type = event_data.get('event_type')
    event_id = event_data.get('id')
    if event_type == 'video':
        process_name = 'vlc'
        # Start VLC...
    elif event_type == 'website':
        process_name = 'chromium'
        # Start Chromium...
    elif event_type == 'presentation':
        process_name = 'pdf_viewer'  # or your PDF tool
        # Start PDF viewer...
    # Notify watchdog about expected process
    watchdog.current_process = process_name
    watchdog.current_event_id = event_id
    watchdog.restart_attempts = 0
```
### Step 3.3: Enhanced Heartbeat Payload
**File:** `client/heartbeat.py` (modify existing)
```python
# Modify existing heartbeat to include process status
def send_heartbeat(mqtt_client, uuid):
    # Get current process status
    current_process = None
    process_pid = None
    process_status = "stopped"
    # Check if expected process is running
    if watchdog.current_process:
        pid = watchdog.is_process_running(watchdog.current_process)
        if pid:
            current_process = watchdog.current_process
            process_pid = pid
            process_status = "running"
    payload = {
        "uuid": uuid,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Existing fields...
        # NEW health fields:
        "current_process": current_process,
        "process_pid": process_pid,
        "process_status": process_status,
        "current_event_id": watchdog.current_event_id
    }
    mqtt_client.publish(f"infoscreen/{uuid}/heartbeat", json.dumps(payload))
```
---
## 🎨 Phase 4: Dashboard UI Integration
**Status:** ✅ COMPLETE
**Dependencies:** Phases 2 & 3 complete
**Time estimate:** 2-3 hours
### Step 4.1: Create Log Viewer Component
**File:** `dashboard/src/ClientLogs.tsx` (NEW)
```typescript
import React from 'react';
import { GridComponent, ColumnsDirective, ColumnDirective, Page, Inject } from '@syncfusion/ej2-react-grids';

interface LogEntry {
  id: number;
  timestamp: string;
  level: 'ERROR' | 'WARN' | 'INFO' | 'DEBUG';
  message: string;
  context: any;
}

interface ClientLogsProps {
  clientUuid: string;
}

export const ClientLogs: React.FC<ClientLogsProps> = ({ clientUuid }) => {
  const [logs, setLogs] = React.useState<LogEntry[]>([]);
  const [loading, setLoading] = React.useState(false);

  const loadLogs = async (level?: string) => {
    setLoading(true);
    try {
      const params = new URLSearchParams({ limit: '50' });
      if (level) params.append('level', level);
      const response = await fetch(`/api/client-logs/${clientUuid}/logs?${params}`);
      const data = await response.json();
      setLogs(data.logs);
    } catch (err) {
      console.error('Failed to load logs:', err);
    } finally {
      setLoading(false);
    }
  };

  React.useEffect(() => {
    loadLogs();
    const interval = setInterval(() => loadLogs(), 30000); // Refresh every 30s
    return () => clearInterval(interval);
  }, [clientUuid]);

  const levelTemplate = (props: any) => {
    const colors = {
      ERROR: 'text-red-600 bg-red-100',
      WARN: 'text-yellow-600 bg-yellow-100',
      INFO: 'text-blue-600 bg-blue-100',
      DEBUG: 'text-gray-600 bg-gray-100'
    };
    return (
      <span className={`px-2 py-1 rounded ${colors[props.level as keyof typeof colors]}`}>
        {props.level}
      </span>
    );
  };

  return (
    <div>
      <div className="mb-4 flex gap-2">
        <button onClick={() => loadLogs()} className="e-btn e-primary">All</button>
        <button onClick={() => loadLogs('ERROR')} className="e-btn e-danger">Errors</button>
        <button onClick={() => loadLogs('WARN')} className="e-btn e-warning">Warnings</button>
        <button onClick={() => loadLogs('INFO')} className="e-btn e-info">Info</button>
      </div>
      <GridComponent
        dataSource={logs}
        allowPaging={true}
        pageSettings={{ pageSize: 20 }}
      >
        <ColumnsDirective>
          <ColumnDirective field='timestamp' headerText='Time' width='180' format='yMd HH:mm:ss' />
          <ColumnDirective field='level' headerText='Level' width='100' template={levelTemplate} />
          <ColumnDirective field='message' headerText='Message' width='400' />
        </ColumnsDirective>
        <Inject services={[Page]} />
      </GridComponent>
    </div>
  );
};
```
### Step 4.2: Add Health Indicators to Client Cards
**File:** `dashboard/src/clients.tsx` (modify existing)
```typescript
// Add health indicator to client card
const getHealthBadge = (client: Client) => {
  if (!client.process_status) {
    return <span className="badge badge-secondary">Unknown</span>;
  }
  const badges = {
    running: <span className="badge badge-success">Running</span>,
    crashed: <span className="badge badge-danger">Crashed</span>,
    starting: <span className="badge badge-warning">Starting</span>,
    stopped: <span className="badge badge-secondary">Stopped</span>
  };
  return badges[client.process_status] || null;
};

// In client card render:
<div className="client-card">
  <h3>{client.hostname || client.uuid}</h3>
  <div>Status: {getHealthBadge(client)}</div>
  <div>Process: {client.current_process || 'None'}</div>
  <div>Event ID: {client.current_event_id || 'None'}</div>
  <button onClick={() => showLogs(client.uuid)}>View Logs</button>
</div>
```
### Step 4.3: Add System Health Dashboard (Superadmin)
**File:** `dashboard/src/SystemMonitor.tsx` (NEW)
```typescript
import React from 'react';
import { ClientLogs } from './ClientLogs';

export const SystemMonitor: React.FC = () => {
  const [summary, setSummary] = React.useState<any>({});

  const loadSummary = async () => {
    const response = await fetch('/api/client-logs/summary');
    const data = await response.json();
    setSummary(data.summary);
  };

  React.useEffect(() => {
    loadSummary();
    const interval = setInterval(loadSummary, 30000);
    return () => clearInterval(interval);
  }, []);

  return (
    <div className="system-monitor">
      <h2>System Health Monitor (Superadmin)</h2>
      <div className="alert-panel">
        <h3>Active Issues</h3>
        {Object.entries(summary).map(([uuid, stats]: [string, any]) => (
          stats.ERROR > 0 || stats.WARN > 5 ? (
            <div key={uuid} className="alert">
              🔴 {uuid}: {stats.ERROR} errors, {stats.WARN} warnings (24h)
            </div>
          ) : null
        ))}
      </div>
      {/* Real-time log stream */}
      <div className="log-stream">
        <h3>Recent Logs (All Clients)</h3>
        {/* Implement real-time log aggregation */}
      </div>
    </div>
  );
};
```
---
## 🧪 Phase 5: Testing & Validation
**Status:** ✅ COMPLETE
**Dependencies:** All previous phases
**Time estimate:** 1-2 hours
### Step 5.1: Server-Side Tests
```bash
# Test database migration
cd /workspace/server
alembic upgrade head
alembic downgrade -1
alembic upgrade head
# Test API endpoints
curl -X GET "http://localhost:8000/api/client-logs/<uuid>/logs?limit=10"
curl -X GET "http://localhost:8000/api/client-logs/summary"
```
### Step 5.2: Client-Side Tests
```bash
# On client device
python3 watchdog.py <your-uuid> <mqtt-broker-ip>
# Simulate process crash
pkill vlc # Should trigger error log and restart attempt
# Check MQTT messages
mosquitto_sub -h <broker> -t "infoscreen/+/logs/#" -v
mosquitto_sub -h <broker> -t "infoscreen/+/health" -v
```
### Step 5.3: Dashboard Tests
1. Open dashboard and navigate to Clients page
2. Verify health indicators show correct status
3. Click "View Logs" and verify logs appear
4. Navigate to System Monitor (superadmin)
5. Verify summary statistics are correct
---
## 📝 Configuration Summary
### Environment Variables
**Server (docker-compose.yml):**
```yaml
- LOG_RETENTION_DAYS=90 # How long to keep logs
- DEBUG_MODE=true # Enable INFO level logging via MQTT
```
**Client:**
```bash
export MQTT_BROKER="your-server-ip"
export CLIENT_UUID="abc-123-def"
export WATCHDOG_ENABLED=true
```
### MQTT Topics Reference
| Topic Pattern | Direction | Purpose |
|--------------|-----------|---------|
| `infoscreen/{uuid}/logs/error` | Client → Server | Error messages |
| `infoscreen/{uuid}/logs/warn` | Client → Server | Warning messages |
| `infoscreen/{uuid}/logs/info` | Client → Server | Info (dev only) |
| `infoscreen/{uuid}/health` | Client → Server | Health metrics |
| `infoscreen/{uuid}/heartbeat` | Client → Server | Enhanced heartbeat |
### Database Tables
**client_logs:**
- Stores all centralized logs
- Indexed by client_uuid, timestamp, level
- Auto-cleanup after 90 days (recommended)
**clients (extended):**
- `current_event_id`: Which event should be playing
- `current_process`: Expected process name
- `process_status`: running/crashed/starting/stopped
- `process_pid`: Process ID
- `screen_health_status`: OK/BLACK/FROZEN/UNKNOWN
- `last_screenshot_analyzed`: Last analysis time
- `last_screenshot_hash`: For frozen detection
---
## 🎯 Next Steps After Implementation
1. **Deploy Phase 1-2** to staging environment
2. **Test with 1-2 pilot clients** before full rollout
3. **Monitor traffic & performance** (should be minimal)
4. **Fine-tune log levels** based on actual noise
5. **Add alerting** (email/Slack when errors > threshold)
6. **Implement screenshot analysis** (Phase 2 enhancement)
7. **Add trending/analytics** (which clients are least reliable)
---
## 🚨 Troubleshooting
**Logs not appearing in database:**
- Check MQTT broker logs: `docker logs infoscreen-mqtt`
- Verify listener subscriptions: Check `listener/listener.py` logs
- Test MQTT manually: `mosquitto_pub -h broker -t "infoscreen/test/logs/error" -m '{"message":"test"}'`
**High database growth:**
- Check log_retention cleanup cronjob
- Reduce INFO level logging frequency
- Add sampling (log every 10th occurrence instead of all)
**Client watchdog not detecting crashes:**
- Verify psutil can see processes: `ps aux | grep vlc`
- Check permissions (may need sudo for some process checks)
- Increase monitor loop frequency for faster detection
---
## ✅ Completion Checklist
- [x] Phase 1: Database migration applied
- [x] Phase 2: Listener extended for log topics
- [x] Phase 2: API endpoints created and tested
- [x] Phase 3: Client watchdog implemented
- [x] Phase 3: Enhanced heartbeat deployed
- [x] Phase 4: Dashboard log viewer working
- [x] Phase 4: Health indicators visible
- [x] Phase 5: End-to-end testing complete
- [x] Documentation updated with new features
- [x] Production deployment plan created
---
**Last Updated:** 2026-03-24
**Author:** GitHub Copilot
**For:** Infoscreen 2025 Project

View File

@@ -0,0 +1,979 @@
# Client-Side Monitoring Specification
**Version:** 1.0
**Date:** 2026-03-10
**For:** Infoscreen Client Implementation
**Server Endpoint:** `192.168.43.201:8000` (or your production server)
**MQTT Broker:** `192.168.43.201:1883` (or your production MQTT broker)
---
## 1. Overview
Each infoscreen client must implement health monitoring and logging capabilities to report status to the central server via MQTT.
### 1.1 Goals
- **Detect failures:** Process crashes, frozen screens, content mismatches
- **Provide visibility:** Real-time health status visible on server dashboard
- **Enable remote diagnosis:** Centralized log storage for debugging
- **Auto-recovery:** Attempt automatic restart on failure
### 1.2 Architecture
```
┌─────────────────────────────────────────┐
│ Infoscreen Client │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Media Player │ │ Watchdog │ │
│ │ (VLC/Chrome) │◄───│ Monitor │ │
│ └──────────────┘ └──────┬───────┘ │
│ │ │
│ ┌──────────────┐ │ │
│ │ Event Mgr │ │ │
│ │ (receives │ │ │
│ │ schedule) │◄───────────┘ │
│ └──────┬───────┘ │
│ │ │
│ ┌──────▼───────────────────────┐ │
│ │ MQTT Client │ │
│ │ - Heartbeat (every 60s) │ │
│ │ - Logs (error/warn/info) │ │
│ │ - Health metrics (every 5s) │ │
│ └──────┬────────────────────────┘ │
└─────────┼──────────────────────────────┘
│ MQTT over TCP
┌─────────────┐
│ MQTT Broker │
│ (server) │
└─────────────┘
```
### 1.3 Current Compatibility Notes
- The server now accepts both the original specification payloads and the currently implemented Phase 3 client payloads.
- `infoscreen/{uuid}/health` may currently contain a reduced payload with only `expected_state.event_id` and `actual_state.process|pid|status`. Additional `health_metrics` fields from this specification remain recommended.
- `event_id` is still specified as an integer. For compatibility with the current Phase 3 client, the server also tolerates string values such as `event_123` and extracts the numeric suffix where possible (see the sketch after this list).
- If the client sends `process_health` inside `infoscreen/{uuid}/dashboard`, the server treats it as a fallback source for `current_process`, `process_pid`, `process_status`, and `current_event_id`.
- Long term, the preferred client payload remains the structure in this specification so the server can surface richer monitoring data such as screen state and resource metrics.
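For the `event_123` tolerance mentioned above, the server-side normalization can be sketched as follows (hypothetical helper name):

```python
import re

def normalize_event_id(value):
    """Accept an int (42) or a string like 'event_123'; return int or None."""
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        match = re.search(r"(\d+)$", value)
        if match:
            return int(match.group(1))
    return None
```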
---
## 2. MQTT Protocol Specification
### 2.1 Connection Parameters
```
Broker: 192.168.43.201 (or DNS hostname)
Port: 1883 (standard MQTT)
Protocol: MQTT v3.1.1
Client ID: "infoscreen-{client_uuid}"
Clean Session: false (retain subscriptions)
Keep Alive: 60 seconds
Username/Password: (if configured on broker)
```
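A connection sketch matching these parameters with paho-mqtt v2 (the repo's convention is Callback API v2):

```python
import paho.mqtt.client as mqtt

client = mqtt.Client(
    mqtt.CallbackAPIVersion.VERSION2,
    client_id="infoscreen-9b8d1856-ff34-4864-a726-12de072d0f77",
    clean_session=False,  # keep subscriptions across reconnects (MQTT v3.1.1)
)
client.connect("192.168.43.201", 1883, keepalive=60)
client.loop_start()
```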
### 2.2 QoS Levels
- **Heartbeat:** QoS 0 (fire and forget, high frequency)
- **Logs (ERROR/WARN):** QoS 1 (at least once delivery, important)
- **Logs (INFO):** QoS 0 (optional, high volume)
- **Health metrics:** QoS 0 (frequent, latest value matters)
---
## 3. Topic Structure & Payload Formats
### 3.1 Log Messages
#### Topic Pattern:
```
infoscreen/{client_uuid}/logs/{level}
```
Where `{level}` is one of: `error`, `warn`, `info`
#### Payload Format (JSON):
```json
{
  "timestamp": "2026-03-10T07:30:00Z",
  "message": "Human-readable error description",
  "context": {
    "event_id": 42,
    "process": "vlc",
    "error_code": "NETWORK_TIMEOUT",
    "additional_key": "any relevant data"
  }
}
```
#### Field Specifications:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `timestamp` | string (ISO 8601 UTC) | Yes | When the event occurred. Use `YYYY-MM-DDTHH:MM:SSZ` format |
| `message` | string | Yes | Human-readable description of the event (max 1000 chars) |
| `context` | object | No | Additional structured data (will be stored as JSON) |
#### Example Topics:
```
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/error
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/warn
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/info
```
#### When to Send Logs:
**ERROR (Always send):**
- Process crashed (VLC/Chromium/PDF viewer terminated unexpectedly)
- Content failed to load (404, network timeout, corrupt file)
- Hardware failure detected (display off, audio device missing)
- Exception caught in main event loop
- Maximum restart attempts exceeded
**WARN (Always send):**
- Process restarted automatically (after crash)
- High resource usage (CPU >80%, RAM >90%)
- Slow performance (frame drops, lag)
- Non-critical failures (screenshot capture failed, cache full)
- Fallback content displayed (primary source unavailable)
**INFO (Send in development, optional in production):**
- Process started successfully
- Event transition (switched from video to presentation)
- Content loaded successfully
- Watchdog service started/stopped
---
### 3.2 Health Metrics
#### Topic Pattern:
```
infoscreen/{client_uuid}/health
```
#### Payload Format (JSON):
```json
{
  "timestamp": "2026-03-10T07:30:00Z",
  "expected_state": {
    "event_id": 42,
    "event_type": "video",
    "media_file": "presentation.mp4",
    "started_at": "2026-03-10T07:15:00Z"
  },
  "actual_state": {
    "process": "vlc",
    "pid": 1234,
    "status": "running",
    "uptime_seconds": 900,
    "position": 45.3,
    "duration": 180.0
  },
  "health_metrics": {
    "screen_on": true,
    "last_frame_update": "2026-03-10T07:29:58Z",
    "frames_dropped": 2,
    "network_errors": 0,
    "cpu_percent": 15.3,
    "memory_mb": 234
  }
}
```
#### Field Specifications:
**expected_state:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `event_id` | integer | Yes | Current event ID from scheduler |
| `event_type` | string | Yes | `presentation`, `video`, `website`, `webuntis`, `message` |
| `media_file` | string | No | Filename or URL of current content |
| `started_at` | string (ISO 8601) | Yes | When this event started playing |
**actual_state:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `process` | string | Yes | `vlc`, `chromium`, `pdf_viewer`, `none` |
| `pid` | integer | No | Process ID (if running) |
| `status` | string | Yes | `running`, `crashed`, `starting`, `stopped` |
| `uptime_seconds` | integer | No | How long process has been running |
| `position` | float | No | Current playback position (seconds, for video/audio) |
| `duration` | float | No | Total content duration (seconds) |
**health_metrics:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `screen_on` | boolean | Yes | Is display powered on? |
| `last_frame_update` | string (ISO 8601) | No | Last time screen content changed |
| `frames_dropped` | integer | No | Video frames dropped (performance indicator) |
| `network_errors` | integer | No | Count of network errors in last interval |
| `cpu_percent` | float | No | CPU usage (0-100) |
| `memory_mb` | integer | No | RAM usage in megabytes |
#### Sending Frequency:
- **Normal operation:** Every 5 seconds
- **During startup/transition:** Every 1 second
- **After error:** Immediately + every 2 seconds until recovered
---
### 3.3 Enhanced Heartbeat
The existing heartbeat topic should be enhanced to include process status.
#### Topic Pattern:
```
infoscreen/{client_uuid}/heartbeat
```
#### Enhanced Payload Format (JSON):
```json
{
  "uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
  "timestamp": "2026-03-10T07:30:00Z",
  "current_process": "vlc",
  "process_pid": 1234,
  "process_status": "running",
  "current_event_id": 42
}
```
#### New Fields (add to existing heartbeat):
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `current_process` | string | No | Name of active media player process |
| `process_pid` | integer | No | Process ID |
| `process_status` | string | No | `running`, `crashed`, `starting`, `stopped` |
| `current_event_id` | integer | No | Event ID currently being displayed |
#### Sending Frequency:
- Keep existing: **Every 60 seconds**
- Include new fields if available
---
## 4. Process Monitoring Requirements
### 4.1 Processes to Monitor
| Media Type | Process Name | How to Detect |
|------------|--------------|---------------|
| Video | `vlc` | `ps aux \| grep vlc` or `pgrep vlc` |
| Website/WebUntis | `chromium` or `chromium-browser` | `pgrep chromium` |
| PDF Presentation | `evince`, `okular`, or custom viewer | `pgrep {viewer_name}` |
### 4.2 Monitoring Checks (Every 5 seconds)
#### Check 1: Process Alive
```
Goal: Verify expected process is running
Method:
- Get list of running processes (psutil or `ps`)
- Check if expected process name exists
- Match PID if known
Result:
- If missing → status = "crashed"
- If found → status = "running"
Action on crash:
- Send ERROR log immediately
- Attempt restart (max 3 attempts)
- Send WARN log on each restart
- If max restarts exceeded → send ERROR log, display fallback
```
#### Check 2: Process Responsive
```
Goal: Detect frozen processes
Method:
- For VLC: Query HTTP interface (status.json)
- For Chromium: Use DevTools Protocol (CDP)
- For custom viewers: Check last screen update time
Result:
- If same frame >30 seconds → likely frozen
- If playback position not advancing → frozen
Action on freeze:
- Send WARN log
- Force refresh (reload page, seek video, next slide)
- If refresh fails → restart process
```
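A sketch of the VLC branch of this check, polled from the 5-second monitor loop (the single-sample stall test is a simplification of the ">30 seconds" rule above):

```python
import requests

_last_position = None  # playback position seen on the previous poll

def vlc_position_stalled(password: str = "vlc_password") -> bool:
    """True if VLC reports 'playing' but the position did not advance."""
    global _last_position
    status = requests.get(
        "http://127.0.0.1:8080/requests/status.json",
        auth=("", password),  # matches `curl --user ":vlc_password"`
        timeout=2,
    ).json()
    position = status.get("time")
    stalled = status.get("state") == "playing" and position == _last_position
    _last_position = position
    return stalled
```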
#### Check 3: Content Match
```
Goal: Verify correct content is displayed
Method:
- Compare expected event_id with actual media/URL
- Check scheduled time window (is event still active?)
Result:
- Mismatch → content error
Action:
- Send WARN log
- Reload correct event from scheduler
```
---
## 5. Process Control Interface Requirements
### 5.1 VLC Control
**Requirement:** Enable VLC HTTP interface for monitoring
**Launch Command:**
```bash
vlc --intf http --http-host 127.0.0.1 --http-port 8080 --http-password "vlc_password" \
--fullscreen --loop /path/to/video.mp4
```
**Status Query:**
```bash
curl http://127.0.0.1:8080/requests/status.json --user ":vlc_password"
```
**Response Fields to Monitor:**
```json
{
  "state": "playing",   // "playing", "paused", "stopped"
  "position": 0.25,     // 0.0-1.0 (25% through)
  "time": 45,           // seconds into playback
  "length": 180,        // total duration in seconds
  "volume": 256         // 0-512
}
```
---
### 5.2 Chromium Control
**Requirement:** Enable Chrome DevTools Protocol (CDP)
**Launch Command:**
```bash
chromium --remote-debugging-port=9222 --kiosk --app=https://example.com
```
**Status Query:**
```bash
curl http://127.0.0.1:9222/json
```
**Response Fields to Monitor:**
```json
[
  {
    "url": "https://example.com",
    "title": "Page Title",
    "type": "page"
  }
]
```
**Advanced:** Use CDP WebSocket for events (page load, navigation, errors)
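Combining the page list above with Check 3 (content match) can be sketched as:

```python
import requests

def chromium_current_url(debug_port: int = 9222):
    """Return the URL of the first open page, or None if CDP is unreachable."""
    try:
        pages = requests.get(f"http://127.0.0.1:{debug_port}/json", timeout=2).json()
    except requests.RequestException:
        return None
    for page in pages:
        if page.get("type") == "page":
            return page.get("url")
    return None

# Usage: if chromium_current_url() != expected_url -> send WARN log and reload
```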
---
### 5.3 PDF Viewer (Custom or Standard)
**Option A: Standard Viewer (e.g., Evince)**
- No built-in API
- Monitor via process check + screenshot comparison
**Option B: Custom Python Viewer**
- Implement REST API for status queries
- Track: current page, total pages, last transition time
---
## 6. Watchdog Service Architecture
### 6.1 Service Components
**Component 1: Process Monitor Thread**
```
Responsibilities:
- Check process alive every 5 seconds
- Detect crashes and frozen processes
- Attempt automatic restart
- Send health metrics via MQTT
State Machine:
IDLE → STARTING → RUNNING → (if crash) → RESTARTING → RUNNING
→ (if max restarts) → FAILED
```
**Component 2: MQTT Publisher Thread**
```
Responsibilities:
- Maintain MQTT connection
- Send heartbeat every 60 seconds
- Send logs on-demand (queued from other components)
- Send health metrics every 5 seconds
- Reconnect on connection loss
```
**Component 3: Event Manager Integration**
```
Responsibilities:
- Receive event schedule from server
- Notify watchdog of expected process/content
- Launch media player processes
- Handle event transitions
```
### 6.2 Service Lifecycle
**On Startup:**
1. Load configuration (client UUID, MQTT broker, etc.)
2. Connect to MQTT broker
3. Send INFO log: "Watchdog service started"
4. Wait for first event from scheduler
**During Operation:**
1. Monitor loop runs every 5 seconds
2. Check expected vs actual process state
3. Send health metrics
4. Handle failures (log + restart)
**On Shutdown:**
1. Send INFO log: "Watchdog service stopping"
2. Gracefully stop monitored processes
3. Disconnect from MQTT
4. Exit cleanly
---
## 7. Auto-Recovery Logic
### 7.1 Restart Strategy
**Step 1: Detect Failure**
```
Trigger: Process not found in process list
Action:
- Log ERROR: "Process {name} crashed"
- Increment restart counter
- Check if within retry limit (max 3)
```
**Step 2: Attempt Restart**
```
If restart_attempts < MAX_RESTARTS:
- Log WARN: "Attempting restart ({attempt}/{MAX_RESTARTS})"
- Kill any zombie processes
- Wait 2 seconds (cooldown)
- Launch process with same parameters
- Wait 5 seconds for startup
- Verify process is running
- If success: reset restart counter, log INFO
- If fail: increment counter, repeat
```
**Step 3: Permanent Failure**
```
If restart_attempts >= MAX_RESTARTS:
- Log ERROR: "Max restart attempts exceeded, failing over"
- Display fallback content (static image with error message)
- Send notification to server (separate alert topic, optional)
- Wait for manual intervention or scheduler event change
```
### 7.2 Restart Cooldown
**Purpose:** Prevent rapid restart loops that waste resources
**Implementation:**
```
After each restart attempt:
- Wait 2 seconds before next restart
- After 3 failures: wait 30 seconds before trying again
- Reset counter on successful run >5 minutes
```
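A sketch of this schedule (`restart_process` and `is_running` stand in for the event handler's launch and process-check logic):

```python
import time

MAX_RESTARTS = 3

def restart_with_cooldown(restart_process, is_running) -> bool:
    """Apply the cooldown schedule above; True once the process runs again."""
    for attempt in range(1, MAX_RESTARTS + 1):
        time.sleep(2)    # cooldown between attempts
        restart_process()
        time.sleep(5)    # allow startup time
        if is_running():
            return True  # caller resets the counter after a stable run
    time.sleep(30)       # longer back-off after repeated failures
    return False
```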
---
## 8. Resource Monitoring
### 8.1 System Metrics to Track
**CPU Usage:**
```
Method: Read /proc/stat or use psutil.cpu_percent()
Frequency: Every 5 seconds
Threshold: Warn if >80% for >60 seconds
```
**Memory Usage:**
```
Method: Read /proc/meminfo or use psutil.virtual_memory()
Frequency: Every 5 seconds
Threshold: Warn if >90% for >30 seconds
```
**Display Status:**
```
Method: Check DPMS state or xset query
Frequency: Every 30 seconds
Threshold: Error if display off (unexpected)
```
**Network Connectivity:**
```
Method: Ping server or check MQTT connection
Frequency: Every 60 seconds
Threshold: Warn if no server connectivity
```
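A psutil-based sketch of the CPU/RAM checks (thresholds from above; the sustained-duration tracking and display/network checks are left out):

```python
import psutil

def sample_system_metrics() -> dict:
    """Collect the CPU/RAM values reported in the health payload."""
    mem = psutil.virtual_memory()
    return {
        # First cpu_percent() call after startup may return 0.0
        "cpu_percent": psutil.cpu_percent(interval=None),
        "memory_mb": int(mem.used / (1024 * 1024)),
        "memory_percent": mem.percent,
    }

def resource_warnings(metrics: dict) -> list:
    warnings = []
    if metrics["cpu_percent"] > 80:
        warnings.append(f"High CPU usage: {metrics['cpu_percent']:.0f}%")
    if metrics["memory_percent"] > 90:
        warnings.append(f"High memory usage: {metrics['memory_percent']:.0f}%")
    return warnings
```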
---
## 9. Development vs Production Mode
### 9.1 Development Mode
**Enable via:** Environment variable `DEBUG=true` or `ENV=development`
**Behavior:**
- Send INFO level logs
- More verbose logging to console
- Shorter monitoring intervals (faster feedback)
- Screenshot capture every 30 seconds
- No rate limiting on logs
### 9.2 Production Mode
**Enable via:** `ENV=production`
**Behavior:**
- Send only ERROR and WARN logs
- Minimal console output
- Standard monitoring intervals
- Screenshot capture every 60 seconds
- Rate limiting: max 10 logs per minute per level
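The production rate limit (max 10 logs per minute per level) can be sketched as:

```python
import time
from collections import defaultdict, deque

MAX_LOGS_PER_MINUTE = 10
_recent = defaultdict(deque)  # level -> send times within the last minute

def allow_log(level: str) -> bool:
    """True if another log of this level may be sent within the current minute."""
    now = time.monotonic()
    window = _recent[level]
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_LOGS_PER_MINUTE:
        return False
    window.append(now)
    return True
```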
---
## 10. Configuration File Format
### 10.1 Recommended Config: JSON
**File:** `/etc/infoscreen/config.json` or `~/.config/infoscreen/config.json`
```json
{
  "client": {
    "uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
    "hostname": "infoscreen-room-101"
  },
  "mqtt": {
    "broker": "192.168.43.201",
    "port": 1883,
    "username": "",
    "password": "",
    "keepalive": 60
  },
  "monitoring": {
    "enabled": true,
    "health_interval_seconds": 5,
    "heartbeat_interval_seconds": 60,
    "max_restart_attempts": 3,
    "restart_cooldown_seconds": 2
  },
  "logging": {
    "level": "INFO",
    "send_info_logs": false,
    "console_output": true,
    "local_log_file": "/var/log/infoscreen/watchdog.log"
  },
  "processes": {
    "vlc": {
      "http_port": 8080,
      "http_password": "vlc_password"
    },
    "chromium": {
      "debug_port": 9222
    }
  }
}
```
---
## 11. Error Scenarios & Expected Behavior
### Scenario 1: VLC Crashes Mid-Video
```
1. Watchdog detects: process_status = "crashed"
2. Send ERROR log: "VLC process crashed"
3. Attempt 1: Restart VLC with same video, seek to last position
4. If success: Send INFO log "VLC restarted successfully"
5. If fail: Repeat 2 more times
6. After 3 failures: Send ERROR "Max restarts exceeded", show fallback
```
### Scenario 2: Network Timeout Loading Website
```
1. Chromium fails to load page (CDP reports error)
2. Send WARN log: "Page load timeout"
3. Attempt reload (Chromium refresh)
4. If success after 10s: Continue monitoring
5. If timeout again: Send ERROR, try restarting Chromium
```
### Scenario 3: Display Powers Off (Hardware)
```
1. DPMS check detects display off
2. Send ERROR log: "Display powered off"
3. Attempt to wake display (xset dpms force on)
4. If success: Send INFO log
5. If fail: Hardware issue, alert admin
```
### Scenario 4: High CPU Usage
```
1. CPU >80% for 60 seconds
2. Send WARN log: "High CPU usage: 85%"
3. Check if expected (e.g., video playback is normal)
4. If unexpected: investigate process causing it
5. If critical (>95%): consider restarting offending process
```
---
## 12. Testing & Validation
### 12.1 Manual Tests (During Development)
**Test 1: Process Crash Simulation**
```bash
# Start video, then kill VLC manually
killall vlc
# Expected: ERROR log sent, automatic restart within 5 seconds
```
**Test 2: MQTT Connectivity**
```bash
# Subscribe to all client topics on server
mosquitto_sub -h 192.168.43.201 -t "infoscreen/{uuid}/#" -v
# Expected: See heartbeat every 60s, health every 5s
```
**Test 3: Log Levels**
```bash
# Trigger error condition and verify log appears in database
curl http://192.168.43.201:8000/api/client-logs/test
# Expected: See new log entry with correct level/message
```
### 12.2 Acceptance Criteria
**Client must:**
1. Send heartbeat every 60 seconds without gaps
2. Send ERROR log within 5 seconds of process crash
3. Attempt automatic restart (max 3 times)
4. Report health metrics every 5 seconds
5. Survive MQTT broker restart (reconnect automatically)
6. Survive network interruption (buffer logs, send when reconnected)
7. Use correct timestamp format (ISO 8601 UTC)
8. Only send logs for real client UUID (FK constraint)
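Criterion 6 implies an offline buffer: queue log payloads while disconnected and flush them on reconnect. A sketch (class name and buffer size are illustrative; uses the paho Callback API v2 `on_connect` signature):
```python
import json
from collections import deque

class BufferedLogSender:
    """Buffer logs while MQTT is disconnected; flush on reconnect."""

    def __init__(self, mqtt_client, client_uuid: str, max_buffered: int = 500):
        self.client = mqtt_client
        self.uuid = client_uuid
        self.buffer = deque(maxlen=max_buffered)  # oldest entries drop if offline too long
        self.client.on_connect = self._on_connect

    def send(self, level: str, payload: dict) -> None:
        topic = f"infoscreen/{self.uuid}/logs/{level}"
        if self.client.is_connected():
            self.client.publish(topic, json.dumps(payload), qos=1)
        else:
            self.buffer.append((topic, payload))

    def _on_connect(self, client, userdata, flags, reason_code, properties):
        # Flush everything buffered while offline; timestamps were set at event time
        while self.buffer:
            topic, payload = self.buffer.popleft()
            client.publish(topic, json.dumps(payload), qos=1)
```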
---
## 13. Python Libraries (Recommended)
**For process monitoring:**
- `psutil` - Cross-platform process and system utilities
**For MQTT:**
- `paho-mqtt` - Official MQTT client (use v2.x with Callback API v2)
**For VLC control:**
- `requests` - HTTP client for status queries
**For Chromium control:**
- `websocket-client` or `pychrome` - Chrome DevTools Protocol
**For datetime:**
- `datetime` (stdlib) - Use `datetime.now(timezone.utc).isoformat()`
**Example requirements.txt:**
```
paho-mqtt>=2.0.0
psutil>=5.9.0
requests>=2.31.0
python-dateutil>=2.8.0
```
---
## 14. Security Considerations
### 14.1 MQTT Security
- If broker requires auth, store credentials in config file with restricted permissions (`chmod 600`)
- Consider TLS/SSL for MQTT (port 8883) if on untrusted network
- Use unique client ID to prevent impersonation
### 14.2 Process Control APIs
- VLC HTTP password should be random, not default
- Chromium debug port should bind to `127.0.0.1` only (not `0.0.0.0`)
- Restrict file system access for media player processes
### 14.3 Log Content
- **Do not log:** Passwords, API keys, personal data
- **Sanitize:** File paths (strip user directories), URLs (remove query params with tokens)
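A sketch of the two sanitizers (helper names are illustrative):
```python
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit

def sanitize_path(path: str) -> str:
    """Strip the user directory from paths before logging."""
    p = Path(path)
    try:
        return "~/" + str(p.relative_to(Path.home()))
    except ValueError:  # path is not under the home directory
        return str(p)

def sanitize_url(url: str) -> str:
    """Drop the query string, which may carry tokens or session IDs."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```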
---
## 15. Performance Targets
| Metric | Target | Acceptable | Critical |
|--------|--------|------------|----------|
| Health check interval | 5s | 10s | 30s |
| Crash detection time | <5s | <10s | <30s |
| Restart time | <10s | <20s | <60s |
| MQTT publish latency | <100ms | <500ms | <2s |
| CPU usage (watchdog) | <2% | <5% | <10% |
| RAM usage (watchdog) | <50MB | <100MB | <200MB |
| Log message size | <1KB | <10KB | <100KB |
---
## 16. Troubleshooting Guide (For Client Development)
### Issue: Logs not appearing in server database
**Check:**
1. Is MQTT broker reachable? (`mosquitto_pub` test from client)
2. Is client UUID correct and exists in `clients` table?
3. Is timestamp format correct (ISO 8601 with 'Z')?
4. Check server listener logs for errors
### Issue: Health metrics not updating
**Check:**
1. Is health loop running? (check watchdog service status)
2. Is MQTT connected? (check connection status in logs)
3. Is payload JSON valid? (use JSON validator)
### Issue: Process restarts in loop
**Check:**
1. Is media file/URL accessible?
2. Is process command correct? (test manually)
3. Check process exit code (crash reason)
4. Increase restart cooldown to avoid rapid loops
---
## 17. Complete Message Flow Diagram
```
┌─────────────────────────────────────────────────────────┐
│ Infoscreen Client │
│ │
│ Event Occurs: │
│ - Process crashed │
│ - High CPU usage │
│ - Content loaded │
│ │
│ ┌────────────────┐ │
│ │ Decision Logic │ │
│ │ - Is it ERROR?│ │
│ │ - Is it WARN? │ │
│ │ - Is it INFO? │ │
│ └────────┬───────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────┐ │
│ │ Build JSON Payload │ │
│ │ { │ │
│ │ "timestamp": "...", │ │
│ │ "message": "...", │ │
│ │ "context": {...} │ │
│ │ } │ │
│ └────────┬───────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────┐ │
│ │ MQTT Publish │ │
│ │ Topic: infoscreen/{uuid}/logs/error │
│ │ QoS: 1 │ │
│ └────────┬───────────────────────┘ │
└───────────┼──────────────────────────────────────────┘
            │ TCP/IP (MQTT Protocol)
     ┌──────────────┐
     │ MQTT Broker  │
     │ (Mosquitto)  │
     └──────┬───────┘
            │ Topic: infoscreen/+/logs/#
     ┌──────────────────────────────┐
     │ Listener Service             │
     │ (Python)                     │
     │                              │
     │ - Parse JSON                 │
     │ - Validate UUID              │
     │ - Store in database          │
     └──────┬───────────────────────┘
     ┌──────────────────────────────┐
     │ MariaDB Database             │
     │                              │
     │ Table: client_logs           │
     │  - client_uuid               │
     │  - timestamp                 │
     │  - level                     │
     │  - message                   │
     │  - context (JSON)            │
     └──────┬───────────────────────┘
            │ SQL Query
     ┌──────────────────────────────┐
     │ API Server (Flask)           │
     │                              │
     │ GET /api/client-logs/{uuid}/logs
     │ GET /api/client-logs/summary │
     └──────┬───────────────────────┘
            │ HTTP/JSON
     ┌──────────────────────────────┐
     │ Dashboard (React)            │
     │                              │
     │ - Display logs               │
     │ - Filter by level            │
     │ - Show health status         │
     └──────────────────────────────┘
```
---
## 18. Quick Reference Card
### MQTT Topics Summary
```
infoscreen/{uuid}/logs/error → Critical failures
infoscreen/{uuid}/logs/warn → Non-critical issues
infoscreen/{uuid}/logs/info → Informational (dev mode)
infoscreen/{uuid}/health → Health metrics (every 5s)
infoscreen/{uuid}/heartbeat → Enhanced heartbeat (every 60s)
```
### JSON Timestamp Format
```python
from datetime import datetime, timezone
timestamp = datetime.now(timezone.utc).isoformat()
# Output: "2026-03-10T07:30:00+00:00" or "2026-03-10T07:30:00Z"
```
### Process Status Values
```
"running" - Process is alive and responding
"crashed" - Process terminated unexpectedly
"starting" - Process is launching (startup phase)
"stopped" - Process intentionally stopped
```
### Restart Logic
```
Max attempts: 3
Cooldown: 2 seconds between attempts
Reset: After 5 minutes of successful operation
```
---
## 19. Contact & Support
**Server API Documentation:**
- Base URL: `http://192.168.43.201:8000`
- Health check: `GET /health`
- Test logs: `GET /api/client-logs/test` (no auth)
- Full API docs: See `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` on server
**MQTT Broker:**
- Host: `192.168.43.201`
- Port: `1883` (standard), `9001` (WebSocket)
- Test tool: `mosquitto_pub` / `mosquitto_sub`
**Database Schema:**
- Table: `client_logs`
- Foreign Key: `client_uuid` → `clients.uuid` (ON DELETE CASCADE)
- Constraint: UUID must exist in clients table before logging
**Server-Side Logs:**
```bash
# View listener logs (processes MQTT messages)
docker compose logs -f listener
# View server logs (API requests)
docker compose logs -f server
```
---
## 20. Appendix: Example Implementations
### A. Minimal Python Watchdog (Pseudocode)
```python
import time
import json
import subprocess
import psutil
import paho.mqtt.client as mqtt
from datetime import datetime, timezone

class MinimalWatchdog:
    def __init__(self, client_uuid, mqtt_broker):
        self.uuid = client_uuid
        self.mqtt_client = mqtt.Client(callback_api_version=mqtt.CallbackAPIVersion.VERSION2)
        self.mqtt_client.connect(mqtt_broker, 1883, 60)
        self.mqtt_client.loop_start()
        self.expected_process = None
        self.restart_attempts = 0
        self.MAX_RESTARTS = 3

    def send_log(self, level, message, context=None):
        topic = f"infoscreen/{self.uuid}/logs/{level}"
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "message": message,
            "context": context or {}
        }
        self.mqtt_client.publish(topic, json.dumps(payload), qos=1)

    def is_process_running(self, process_name):
        for proc in psutil.process_iter(['name']):
            # proc.info['name'] can be None for some processes
            if proc.info['name'] and process_name in proc.info['name']:
                return True
        return False

    def restart_process(self):
        # Relaunch command is deployment-specific; a bare binary is shown as a stub
        self.restart_attempts += 1
        subprocess.Popen([self.expected_process])
        self.send_log("info", f"Restart attempt {self.restart_attempts}/{self.MAX_RESTARTS}")

    def monitor_loop(self):
        while True:
            if self.expected_process and not self.is_process_running(self.expected_process):
                self.send_log("error", f"{self.expected_process} crashed")
                if self.restart_attempts < self.MAX_RESTARTS:
                    self.restart_process()
                else:
                    self.send_log("error", "Max restarts exceeded")
            time.sleep(5)

# Usage:
watchdog = MinimalWatchdog("9b8d1856-ff34-4864-a726-12de072d0f77", "192.168.43.201")
watchdog.expected_process = "vlc"
watchdog.monitor_loop()
```
---
**END OF SPECIFICATION**
Questions? Refer to:
- `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` (server repo)
- Server API: `http://192.168.43.201:8000/api/client-logs/test`
- MQTT test: `mosquitto_sub -h 192.168.43.201 -t infoscreen/#`

View File

@@ -5,6 +5,10 @@ This changelog tracks all changes made in the development workspace, including i
---
## Unreleased (development workspace)
- Monitoring system completion: End-to-end monitoring pipeline is active (MQTT logs/health → listener persistence → monitoring APIs → superadmin dashboard).
- Monitoring API: Added/active endpoints `GET /api/client-logs/monitoring-overview` and `GET /api/client-logs/recent-errors`; per-client logs via `GET /api/client-logs/<uuid>/logs`.
- Dashboard monitoring UI: Superadmin monitoring page is integrated and displays client health status, screenshots, process metadata, and recent error activity.
- Bugfix: Presentation flags `page_progress` and `auto_progress` now persist reliably across create/update and detached-occurrence flows.
- Frontend (Settings → Events): Added Presentations defaults (slideshow interval, page-progress, auto-progress) with load/save via `/api/system-settings`; UI uses Syncfusion controls.
- Backend defaults: Seeded `presentation_interval` ("10"), `presentation_page_progress` ("true"), `presentation_auto_progress` ("true") in `server/init_defaults.py` when missing.
- Data model: Added per-event fields `page_progress` and `auto_progress` on `Event`; Alembic migration applied successfully.

View File

@@ -0,0 +1,194 @@
# MQTT Payload Migration Guide
## Purpose
This guide describes a practical migration from the current dashboard screenshot payload to a grouped schema, with client-side implementation first and server-side migration second.
## Scope
- Environment: development and alpha systems (no production installs)
- Message topic: infoscreen/<client_id>/dashboard
- Capture types to preserve: periodic, event_start, event_stop
## Target Schema (v2)
The canonical message should be grouped into four logical blocks in this order:
1. message
2. content
3. runtime
4. metadata
Example shape:
```json
{
  "message": {
    "client_id": "<uuid>",
    "status": "alive"
  },
  "content": {
    "screenshot": {
      "filename": "latest.jpg",
      "data": "<base64>",
      "timestamp": "2026-03-30T10:15:41.123456+00:00",
      "size": 183245
    }
  },
  "runtime": {
    "system_info": {
      "hostname": "pi-display-01",
      "ip": "192.168.1.42",
      "uptime": 123456.7
    },
    "process_health": {
      "event_id": "evt-123",
      "event_type": "presentation",
      "current_process": "impressive",
      "process_pid": 4123,
      "process_status": "running",
      "restart_count": 0
    }
  },
  "metadata": {
    "schema_version": "2.0",
    "producer": "simclient",
    "published_at": "2026-03-30T10:15:42.004321+00:00",
    "capture": {
      "type": "periodic",
      "captured_at": "2026-03-30T10:15:41.123456+00:00",
      "age_s": 0.9,
      "triggered": false,
      "send_immediately": false
    },
    "transport": {
      "qos": 0,
      "publisher": "simclient"
    }
  }
}
```
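A minimal sketch of the single canonical builder called for in step 3 below, which produces this shape; argument names are illustrative, and the caller is assumed to pass pre-assembled sub-dicts:
```python
from datetime import datetime, timezone

def build_payload_v2(client_id, status, screenshot, system_info, process_health, capture):
    """One canonical assembly point for the grouped v2 dashboard payload."""
    return {
        "message": {"client_id": client_id, "status": status},
        "content": {"screenshot": screenshot},
        "runtime": {"system_info": system_info, "process_health": process_health},
        "metadata": {
            "schema_version": "2.0",
            "producer": "simclient",
            "published_at": datetime.now(timezone.utc).isoformat(),
            "capture": capture,  # {"type": "periodic"|"event_start"|"event_stop", ...}
            "transport": {"qos": 0, "publisher": "simclient"},
        },
    }
```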
## Step-by-Step: Client-Side First
1. Create a migration branch.
- Example: feature/payload-v2
2. Freeze a baseline sample from MQTT.
- Capture one payload via mosquitto_sub and store it for comparison.
3. Implement one canonical payload builder.
- Centralize JSON assembly in one function only.
- Do not duplicate payload construction across code paths.
4. Add versioned metadata.
- Set metadata.schema_version = "2.0".
- Add metadata.producer = "simclient".
- Add metadata.published_at in UTC ISO format.
5. Map existing data into grouped blocks.
- client_id/status -> message
- screenshot object -> content.screenshot
- system_info/process_health -> runtime
- capture mode and freshness -> metadata.capture
6. Preserve existing capture semantics.
- Keep type values unchanged: periodic, event_start, event_stop.
- Keep UTC ISO timestamps.
- Keep screenshot encoding and size behavior unchanged.
7. Optional short-term compatibility mode (recommended for one sprint).
- Either:
- Keep current legacy fields in parallel, or
- Add a legacy block with old field names.
- Goal: prevent immediate server breakage while parser updates are merged.
8. Improve publish logs for verification.
- Log schema_version, metadata.capture.type, metadata.capture.age_s.
9. Validate all three capture paths end-to-end.
- periodic capture
- event_start trigger capture
- event_stop trigger capture
10. Lock the client contract.
- Save one validated JSON sample per capture type.
- Use those samples in server parser tests.
## Step-by-Step: Server-Side Migration
1. Add support for grouped v2 parsing.
- Parse from message/content/runtime/metadata first.
2. Add fallback parser for legacy payload (temporary).
- If grouped keys are absent, parse old top-level keys.
3. Normalize to one internal server model.
- Convert both parser paths into one DTO/entity used by dashboard logic.
4. Validate required fields.
- Required:
- message.client_id
- message.status
- metadata.schema_version
- metadata.capture.type
- Optional:
- runtime.process_health
- content.screenshot (may be absent when no screenshot is available)
5. Update dashboard consumers.
- Read grouped fields from internal model (not raw old keys).
6. Add migration observability.
- Counters:
- v2 parse success
- legacy fallback usage
- parse failures
- Warning log for unknown schema_version.
7. Run mixed-format integration tests.
- New client -> new server
- Legacy client -> new server (fallback path)
8. Cut over to v2 preferred.
- Keep fallback for short soak period only.
9. Remove fallback and legacy assumptions.
- After stability window, remove old parser path.
10. Final cleanup.
- Keep one schema doc and test fixtures.
- Remove temporary compatibility switches.
## Legacy to v2 Field Mapping
| Legacy field | v2 field |
|---|---|
| client_id | message.client_id |
| status | message.status |
| screenshot | content.screenshot |
| screenshot_type | metadata.capture.type |
| screenshot_age_s | metadata.capture.age_s |
| timestamp | metadata.published_at |
| system_info | runtime.system_info |
| process_health | runtime.process_health |
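A minimal sketch of the server-side normalization from steps 1-3, driven by this mapping table; the function and internal key names are illustrative:
```python
def normalize_dashboard_payload(payload: dict) -> dict:
    """Normalize v2 (grouped) or legacy (flat) payloads into one internal dict."""
    if "message" in payload and "metadata" in payload:
        meta = payload.get("metadata", {})
        capture = meta.get("capture", {})
        return {
            "client_id": payload["message"].get("client_id"),
            "status": payload["message"].get("status"),
            "screenshot": payload.get("content", {}).get("screenshot"),
            "capture_type": capture.get("type"),
            "capture_age_s": capture.get("age_s"),
            "published_at": meta.get("published_at"),
            "system_info": payload.get("runtime", {}).get("system_info"),
            "process_health": payload.get("runtime", {}).get("process_health"),
        }
    # Temporary legacy fallback: old top-level keys (see mapping table above)
    return {
        "client_id": payload.get("client_id"),
        "status": payload.get("status"),
        "screenshot": payload.get("screenshot"),
        "capture_type": payload.get("screenshot_type"),
        "capture_age_s": payload.get("screenshot_age_s"),
        "published_at": payload.get("timestamp"),
        "system_info": payload.get("system_info"),
        "process_health": payload.get("process_health"),
    }
```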
## Acceptance Criteria
1. All capture types parse and display correctly.
- periodic
- event_start
- event_stop
2. Screenshot payload integrity is unchanged.
- filename, data, timestamp, size remain valid.
3. Metadata is centrally visible at message end.
- schema_version, capture metadata, transport metadata all inside metadata.
4. No regression in dashboard update timing.
- Triggered screenshots still publish quickly.
## Suggested Timeline (Dev Only)
1. Day 1: client v2 payload implementation + local tests
2. Day 2: server v2 parser + fallback
3. Day 3-5: soak in dev, monitor parse metrics
4. Day 6+: remove fallback and finalize v2-only

View File

@@ -0,0 +1,533 @@
# Phase 3: Client-Side Monitoring Implementation
**Status**: ✅ COMPLETE
**Date**: March 11, 2026
**Architecture**: Two-process design with health-state bridge
---
## Overview
This document describes the **Phase 3** client-side monitoring implementation integrated into the existing infoscreen-dev codebase. The implementation adds:
1. **Health-state tracking** for all display processes (Impressive, Chromium, VLC)
2. **Tiered logging**: Local rotating logs + selective MQTT transmission
3. **Process crash detection** with bounded restart attempts
4. **MQTT health/log topics** feeding the monitoring server
5. **Impressive-aware process mapping** (presentations → impressive, websites → chromium, videos → vlc)
---
## Architecture
### Two-Process Design
```
┌─────────────────────────────────────────────────────────┐
│ simclient.py (MQTT Client) │
│ - Discovers device, sends heartbeat │
│ - Downloads presentation files │
│ - Reads health state from display_manager │
│ - Publishes health/log messages to MQTT │
│ - Sends screenshots for dashboard │
└────────┬────────────────────────────────────┬───────────┘
         │                                     │
         │ reads: current_process_health.json │
         │                                     │
         │ writes: current_event.json          │
         │                                     │
┌────────▼────────────────────────────────────▼───────────┐
│ display_manager.py (Display Control) │
│ - Monitors events and manages displays │
│ - Launches Impressive (presentations) │
│ - Launches Chromium (websites) │
│ - Launches VLC (videos) │
│ - Tracks process health and crashes │
│ - Detects and restarts crashed processes │
│ - Writes health state to JSON bridge │
│ - Captures screenshots to shared folder │
└─────────────────────────────────────────────────────────┘
```
---
## Implementation Details
### 1. Health State Tracking (display_manager.py)
**File**: `src/display_manager.py`
**New Class**: `ProcessHealthState`
Tracks process health and persists to JSON for simclient to read:
```python
class ProcessHealthState:
    """Track and persist process health state for monitoring integration.

    Attributes:
    - event_id: Currently active event identifier
    - event_type: presentation, website, video, or None
    - process_name: impressive, chromium-browser, vlc, or None
    - process_pid: Process ID or None for libvlc
    - status: running, crashed, starting, stopped
    - restart_count: Number of restart attempts
    - max_restarts: Maximum allowed restarts (3)
    """
```
Methods:
- `update_running()` - Mark process as started (logs to monitoring.log)
- `update_crashed()` - Mark process as crashed (warning to monitoring.log)
- `update_restart_attempt()` - Increment restart counter (logs attempt and checks max)
- `update_stopped()` - Mark process as stopped (info to monitoring.log)
- `save()` - Persist state to `src/current_process_health.json`
**New Health State File**: `src/current_process_health.json`
```json
{
"event_id": "event_123",
"event_type": "presentation",
"current_process": "impressive",
"process_pid": 1234,
"process_status": "running",
"restart_count": 0,
"timestamp": "2026-03-11T10:30:45.123456+00:00"
}
```
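The `save()` method could write the bridge file atomically so simclient never observes a half-written JSON; a sketch of that idea (the actual implementation may differ):
```python
import json
import os
import tempfile
from datetime import datetime, timezone

HEALTH_STATE_PATH = "src/current_process_health.json"

def save_health_state(state: dict) -> None:
    """Write the bridge file atomically via a temp file + rename."""
    state = dict(state, timestamp=datetime.now(timezone.utc).isoformat())
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(HEALTH_STATE_PATH) or ".")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, HEALTH_STATE_PATH)  # atomic on POSIX
    except BaseException:
        os.unlink(tmp_path)
        raise
```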
### 2. Monitoring Logger (both files)
**Local Rotating Logs**: 5 files × 5 MB each = 25 MB max per device
**display_manager.py**:
```python
import logging
from logging.handlers import RotatingFileHandler

MONITORING_LOG_PATH = "logs/monitoring.log"
monitoring_logger = logging.getLogger("monitoring")
monitoring_handler = RotatingFileHandler(MONITORING_LOG_PATH, maxBytes=5 * 1024 * 1024, backupCount=5)
monitoring_logger.addHandler(monitoring_handler)
```
**simclient.py**:
- Shares same `logs/monitoring.log` file
- Both processes write to monitoring logger for health events
- Local logs are never transmitted; the rotating files stay on the device for technician inspection
**Log Filtering** (tiered strategy):
- **ERROR**: Local + MQTT (published to `infoscreen/{uuid}/logs/error`)
- **WARN**: Local + MQTT (published to `infoscreen/{uuid}/logs/warn`)
- **INFO**: Local only (unless `DEBUG_MODE=1`)
- **DEBUG**: Local only (always)
### 3. Process Mapping with Impressive Support
**display_manager.py** - When starting processes:
| Event Type | Process Name | Health Status |
|-----------|--------------|---------------|
| presentation | `impressive` | tracked with PID |
| website/webpage/webuntis | `chromium` or `chromium-browser` | tracked with PID |
| video | `vlc` | tracked (may have no PID if using libvlc) |
**Per-Process Updates**:
- Presentation: `health.update_running('event_id', 'presentation', 'impressive', pid)`
- Website: `health.update_running('event_id', 'website', browser_name, pid)`
- Video: `health.update_running('event_id', 'video', 'vlc', pid or None)`
### 4. Crash Detection and Restart Logic
**display_manager.py** - `process_events()` method:
```
If process not running AND same event_id:
├─ Check exit code
├─ If presentation with exit code 0: Normal completion (no restart)
├─ Else: Mark crashed
│ ├─ health.update_crashed()
│ └─ health.update_restart_attempt()
│ ├─ If restart_count > max_restarts: Give up
│ └─ Else: Restart display (loop back to start_display_for_event)
└─ Log to monitoring.log at each step
```
**Restart Logic**:
- Max 3 restart attempts per event
- Restarts only if same event still active
- Graceful exit (code 0) for Impressive auto-quit presentations is treated as normal
- All crashes logged to monitoring.log with context
### 5. MQTT Health and Log Topics
**simclient.py** - New functions:
**`read_health_state()`**
- Reads `src/current_process_health.json` written by display_manager
- Returns dict or None if no active process
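A sketch of how this read could look; the staleness cutoff is an added assumption, not confirmed behavior:
```python
import json
import os
import time

HEALTH_STATE_PATH = "src/current_process_health.json"

def read_health_state(max_age_s: float = 120.0):
    """Read the bridge file; return None if it is missing, invalid, or stale."""
    try:
        if time.time() - os.path.getmtime(HEALTH_STATE_PATH) > max_age_s:
            return None  # display_manager stopped updating; treat as no active process
        with open(HEALTH_STATE_PATH, encoding="utf-8") as f:
            return json.load(f)
    except (OSError, ValueError):
        return None
```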
**`publish_health_message(client, client_id)`**
- Topic: `infoscreen/{uuid}/health`
- QoS: 1 (reliable)
- Payload:
```json
{
  "timestamp": "2026-03-11T10:30:45.123456+00:00",
  "expected_state": {
    "event_id": "event_123"
  },
  "actual_state": {
    "process": "impressive",
    "pid": 1234,
    "status": "running"
  }
}
```
**`publish_log_message(client, client_id, level, message, context)`**
- Topics: `infoscreen/{uuid}/logs/error` or `infoscreen/{uuid}/logs/warn`
- QoS: 1 (reliable)
- Log level filtering (only ERROR/WARN sent unless DEBUG_MODE=1)
- Payload:
```json
{
  "timestamp": "2026-03-11T10:30:45.123456+00:00",
  "message": "Process started: event_id=123 event_type=presentation process=impressive pid=1234",
  "context": {
    "event_id": "event_123",
    "process": "impressive",
    "event_type": "presentation"
  }
}
```
**Enhanced Dashboard Heartbeat**:
- Topic: `infoscreen/{uuid}/dashboard`
- Now includes `process_health` block with event_id, process name, status, restart count
### 6. Integration Points
**Existing Features Preserved**:
- ✅ Impressive PDF presentations with auto-advance and loop
- ✅ Chromium website display with auto-scroll injection
- ✅ VLC video playback (python-vlc preferred, binary fallback)
- ✅ Screenshot capture and transmission
- ✅ HDMI-CEC TV control
- ✅ Two-process architecture
**New Integration Points**:
| File | Function | Change |
|------|----------|--------|
| display_manager.py | `__init__()` | Initialize `ProcessHealthState()` |
| display_manager.py | `start_presentation()` | Call `health.update_running()` with impressive |
| display_manager.py | `start_video()` | Call `health.update_running()` with vlc |
| display_manager.py | `start_webpage()` | Call `health.update_running()` with chromium |
| display_manager.py | `process_events()` | Detect crashes, call `health.update_crashed()` and `update_restart_attempt()` |
| display_manager.py | `stop_current_display()` | Call `health.update_stopped()` |
| simclient.py | `screenshot_service_thread()` | (No changes to interval) |
| simclient.py | Main heartbeat loop | Call `publish_health_message()` after successful heartbeat |
| simclient.py | `send_screenshot_heartbeat()` | Read health state and include in dashboard payload |
---
## Logging Hierarchy
### Local Rotating Files (5 × 5 MB)
**`logs/display_manager.log`** (existing - updated):
- Display event processing
- Process lifecycle (start/stop)
- HDMI-CEC operations
- Presentation status
- Video/website startup
**`logs/simclient.log`** (existing - updated):
- MQTT connection/reconnection
- Discovery and heartbeat
- File downloads
- Group membership changes
- Dashboard payload info
**`logs/monitoring.log`** (NEW):
- Process health events (start, crash, restart, stop)
- Both display_manager and simclient write here
- Centralized health tracking
- Technician-focused: "What happened to the processes?"
```
# Example monitoring.log entries:
2026-03-11 10:30:45 [INFO] Process started: event_id=event_123 event_type=presentation process=impressive pid=1234
2026-03-11 10:35:20 [WARNING] Process crashed: event_id=event_123 event_type=presentation process=impressive restart_count=0/3
2026-03-11 10:35:20 [WARNING] Restarting process: attempt 1/3 for impressive
2026-03-11 10:35:25 [INFO] Process started: event_id=event_123 event_type=presentation process=impressive pid=1245
```
### MQTT Transmission (Selective)
**Always sent** (when error occurs):
- `infoscreen/{uuid}/logs/error` - Critical failures
- `infoscreen/{uuid}/logs/warn` - Restarts, crashes, missing binaries
**Development mode only** (if DEBUG_MODE=1):
- `infoscreen/{uuid}/logs/info` - Event start/stop, process running status
**Never sent**:
- DEBUG messages (local-only debug details)
- INFO messages in production
---
## Environment Variables
No new required variables. Existing configuration supports monitoring:
```bash
# Existing (unchanged):
ENV=development|production
DEBUG_MODE=0|1 # Enables INFO logs to MQTT
LOG_LEVEL=DEBUG|INFO|WARNING|ERROR # Local log verbosity
HEARTBEAT_INTERVAL=5|60 # seconds
SCREENSHOT_INTERVAL=30|300 # seconds (display_manager_screenshot_capture)
# Recommended for monitoring:
SCREENSHOT_CAPTURE_INTERVAL=30 # How often display_manager captures screenshots
SCREENSHOT_MAX_WIDTH=800 # Downscale for bandwidth
SCREENSHOT_JPEG_QUALITY=70 # Balance quality/size
# File server (if different from MQTT broker):
FILE_SERVER_HOST=192.168.1.100
FILE_SERVER_PORT=8000
FILE_SERVER_SCHEME=http
```
---
## Testing Validation
### System-Level Test Sequence
**1. Start Services**:
```bash
# Terminal 1: Display Manager
./scripts/start-display-manager.sh
# Terminal 2: MQTT Client
./scripts/start-dev.sh
# Terminal 3: Monitor logs
tail -f logs/monitoring.log
```
**2. Trigger Each Event Type**:
```bash
# Via test menu or MQTT publish:
./scripts/test-display-manager.sh # Options 1-3 trigger events
```
**3. Verify Health State File**:
```bash
# Check health state gets written immediately
cat src/current_process_health.json
# Should show: event_id, event_type, current_process (impressive/chromium/vlc), process_status=running
```
**4. Check MQTT Topics**:
```bash
# Monitor health messages:
mosquitto_sub -h localhost -t "infoscreen/+/health" -v
# Monitor log messages:
mosquitto_sub -h localhost -t "infoscreen/+/logs/#" -v
# Monitor dashboard heartbeat:
mosquitto_sub -h localhost -t "infoscreen/+/dashboard" -v | head -c 500 && echo "..."
```
**5. Simulate Process Crash**:
```bash
# Find impressive/chromium/vlc PID:
ps aux | grep -E 'impressive|chromium|vlc'
# Kill process:
kill -9 <pid>
# Watch monitoring.log for crash detection and restart
tail -f logs/monitoring.log
# Should see: [WARNING] Process crashed... [WARNING] Restarting process...
```
**6. Verify Server Integration**:
```bash
# Server receives health messages:
sqlite3 infoscreen.db "SELECT process_status, current_process, restart_count FROM clients WHERE uuid='...';"
# Should show latest status from health message
# Server receives logs:
sqlite3 infoscreen.db "SELECT level, message FROM client_logs WHERE client_uuid='...' ORDER BY timestamp DESC LIMIT 10;"
# Should show ERROR/WARN entries from crashes/restarts
```
---
## Troubleshooting
### Health State File Not Created
**Symptom**: `src/current_process_health.json` missing
**Causes**:
- No event active (file only created when display starts)
- display_manager not running
**Check**:
```bash
ps aux | grep display_manager
tail -f logs/display_manager.log | grep "Process started\|Process stopped"
```
### MQTT Health Messages Not Arriving
**Symptom**: No health messages on `infoscreen/{uuid}/health` topic
**Causes**:
- simclient not reading health state file
- MQTT connection dropped
- Health update function not called
**Check**:
```bash
# Check health file exists and is recent:
ls -l src/current_process_health.json
stat src/current_process_health.json | grep Modify
# Monitor simclient logs:
tail -f logs/simclient.log | grep -E "Health|heartbeat|publish"
# Verify MQTT connection:
mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
```
### Restart Loop (Process Keeps Crashing)
**Symptom**: monitoring.log shows repeated crashes and restarts
**Check**:
```bash
# Read last log lines of the process (stored by display_manager):
tail -f logs/impressive.out.log # for presentations
tail -f logs/browser.out.log # for websites
tail -f logs/video_player.out.log # for videos
```
**Common Causes**:
- Missing binary (impressive not installed, chromium not found, vlc not available)
- Corrupt presentation file
- Invalid URL for website
- Insufficient permissions for screenshots
### Log Messages Not Reaching Server
**Symptom**: client_logs table in server DB is empty
**Causes**:
- Log level filtering: INFO messages in production are local-only
- Logs only published on ERROR/WARN
- MQTT publish failing silently
**Check**:
```bash
# Force DEBUG_MODE to see all logs:
export DEBUG_MODE=1
export LOG_LEVEL=DEBUG
# Restart simclient and trigger event
# Monitor local logs first:
tail -f logs/monitoring.log | grep -i error
```
---
## Performance Considerations
**Bandwidth per Client**:
- Health message: ~200 bytes per heartbeat interval (every 5-60s)
- Screenshot heartbeat: ~50-100 KB (every 30-300s)
- Log messages: ~100-500 bytes per crash/error (rare)
- **Total**: ~0.5-2 MB/day per device (very minimal)
**Disk Space on Client**:
- Monitoring logs: 5 files × 5 MB = 25 MB max
- Display manager logs: 5 files × 2 MB = 10 MB max
- MQTT client logs: 5 files × 2 MB = 10 MB max
- Screenshots: 20 files × 50-100 KB = 1-2 MB max
- **Total**: ~50 MB max (typical for Raspberry Pi USB/SSD)
**Rotation Strategy**:
- Old files automatically deleted when size limit reached
- Technician can SSH and `tail -f` any time
- No database overhead (file-based rotation is minimal CPU)
---
## Integration with Server (Phase 2)
The client implementation sends data to the server's Phase 2 endpoints:
**Expected Server Implementation** (from CLIENT_MONITORING_SETUP.md):
1. **MQTT Listener** receives and stores:
- `infoscreen/{uuid}/logs/error`, `/logs/warn`, `/logs/info`
- `infoscreen/{uuid}/health` messages
- Updates `clients` table with health fields
2. **Database Tables**:
- `clients.process_status`: running/crashed/starting/stopped
- `clients.current_process`: impressive/chromium/vlc/None
- `clients.process_pid`: PID value
- `clients.current_event_id`: Active event
- `client_logs`: table stores logs with level/message/context
3. **API Endpoints**:
- `GET /api/client-logs/{uuid}/logs?level=ERROR&limit=50`
- `GET /api/client-logs/summary` (errors/warnings across all clients)
---
## Summary of Changes
### Files Modified
1. **`src/display_manager.py`**:
- Added `psutil` import for future process monitoring
- Added `ProcessHealthState` class (60 lines)
- Added monitoring logger setup (8 lines)
- Added `health.update_running()` calls in `start_presentation()`, `start_video()`, `start_webpage()`
- Added crash detection and restart logic in `process_events()`
- Added `health.update_stopped()` in `stop_current_display()`
2. **`src/simclient.py`**:
- Added `timezone` import
- Added monitoring logger setup (8 lines)
- Added `read_health_state()` function
- Added `publish_health_message()` function
- Added `publish_log_message()` function (with level filtering)
- Updated `send_screenshot_heartbeat()` to include health data
- Updated heartbeat loop to call `publish_health_message()`
### Files Created
1. **`src/current_process_health.json`** (at runtime):
- Bridge file between display_manager and simclient
- Shared volume compatible (works in container setup)
2. **`logs/monitoring.log`** (at runtime):
- New rotating log file (5 × 5MB)
- Health events from both processes
---
## Next Steps
1. **Deploy to test client** and run validation sequence above
2. **Deploy server Phase 2** (if not yet done) to receive health/log messages
3. **Verify database updates** in server-side `clients` and `client_logs` tables
4. **Test dashboard UI** (Phase 4) to display health indicators
5. **Configure alerting** (email/Slack) for ERROR level messages
---
**Implementation Date**: March 11, 2026
**Part of**: Infoscreen 2025 Client Monitoring System
**Status**: Production Ready (with server Phase 2 integration)

View File

@@ -39,6 +39,7 @@ A comprehensive multi-service digital signage solution for educational instituti
Data flow summary:
- Listener: consumes discovery and heartbeat messages from the MQTT Broker and updates the API Server (client registration/heartbeats).
- Listener screenshot flow: consumes `infoscreen/{uuid}/screenshot` and `infoscreen/{uuid}/dashboard`. Dashboard messages use grouped v2 schema (`message`, `content`, `runtime`, `metadata`); screenshot data is read from `content.screenshot`, capture type from `metadata.capture.type`, and forwarded to `POST /api/clients/{uuid}/screenshot`.
- Scheduler: reads events from the API Server and publishes only currently active content to the MQTT Broker (retained topics per group). When a group has no active events, the scheduler clears its retained topic by publishing an empty list. All time comparisons are done in UTC; any naive timestamps are normalized.
- Clients: send discovery/heartbeat via the MQTT Broker (handled by the Listener) and receive content from the Scheduler via MQTT.
- Worker: receives conversion commands directly from the API Server and reports results/status back to the API (no MQTT involved).
@@ -225,17 +226,15 @@ For detailed deployment instructions, see:
## Recent changes since last commit
- Video / Streaming support: Added end-to-end support for video events. The API and dashboard now allow creating `video` events referencing uploaded media. The server exposes a range-capable streaming endpoint at `/api/eventmedia/stream/<media_id>/<filename>` so clients can seek during playback.
- Scheduler metadata: Scheduler now performs a best-effort HEAD probe for video stream URLs and includes basic metadata in the retained MQTT payload: `mime_type`, `size` (bytes) and `accept_ranges` (bool). Placeholders for richer metadata (`duration`, `resolution`, `bitrate`, `qualities`, `thumbnails`, `checksum`) are emitted as null/empty until a background worker fills them.
- Dashboard & uploads: The dashboard's FileManager upload limits were increased (to support Full-HD uploads) and client-side validation enforces a maximum video length (10 minutes). The event modal exposes playback flags (`autoplay`, `loop`, `volume`, `muted`) and initializes them from system defaults for new events.
- DB model & API: `Event` includes `muted` in addition to `autoplay`, `loop`, and `volume`; endpoints accept, persist, and return these fields for video events. Events reference uploaded media via `event_media_id`.
- Settings UI: Settings page refactored to nested tabs; added Events → Videos defaults (autoplay, loop, volume, mute) backed by system settings keys (`video_autoplay`, `video_loop`, `video_volume`, `video_muted`).
- Academic Calendar UI: Merged “School Holidays Import” and “List” into a single “📥 Import & Liste” tab; nested tab selection is persisted with controlled `selectedItem` state to avoid jumps.
- Monitoring system: End-to-end monitoring is now implemented. The listener ingests `logs/*` and `health` MQTT topics, the API exposes monitoring endpoints (`/api/client-logs/monitoring-overview`, `/api/client-logs/recent-errors`, `/api/client-logs/<uuid>/logs`), and the superadmin dashboard page shows live client status, screenshots, and recent errors.
- Screenshot priority flow: Screenshot payloads now support `screenshot_type` (`periodic`, `event_start`, `event_stop`). `event_start` and `event_stop` are treated as high-priority screenshots; the API stores typed screenshots, maintains priority metadata, and serves active priority screenshots through `/screenshots/{uuid}/priority`.
- MQTT dashboard payload v2 cutover: Listener parsing is now v2-only for dashboard JSON payloads (`message/content/runtime/metadata`). Legacy top-level dashboard fallback has been removed after migration completion; parser observability tracks `v2_success` and `parse_failures`.
- Presentation persistence fix: Fixed persistence of presentation flags so `page_progress` and `auto_progress` are reliably stored and returned for create/update flows and detached occurrences.
- Additional improvements: Video/streaming, scheduler metadata, settings defaults, and UI refinements remain documented in the detailed sections below.
These changes are designed to be safe if metadata extraction or probes fail — clients should still attempt playback using the provided `url` and fall back to requesting/resolving richer metadata when available.
See `MQTT_EVENT_PAYLOAD_GUIDE.md` for details.
- `infoscreen/{uuid}/group_id` - Client group assignment
## 🧩 Developer Environment Notes (Dev Container)
- Extensions: UI-only `Dev Containers` runs on the host UI; not installed inside the container to avoid reinstallation loops. See `/.devcontainer/devcontainer.json` (`remote.extensionKind`).
@@ -345,8 +344,9 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
- `POST /api/conversions/{media_id}/pdf` - Request conversion
- `GET /api/conversions/{media_id}/status` - Check conversion status
- `GET /api/eventmedia/stream/<media_id>/<filename>` - Stream media with byte-range support (206) for seeking
- `POST /api/clients/{uuid}/screenshot` - Upload screenshot for client (base64 JPEG)
- **Screenshot retention:** Only the latest and last 20 timestamped screenshots per client are stored on the server. Older screenshots are automatically deleted.
- `POST /api/clients/{uuid}/screenshot` - Upload screenshot for client (base64 JPEG, optional `timestamp`, optional `screenshot_type` = `periodic|event_start|event_stop`)
- **Screenshot retention:** The API stores `{uuid}.jpg` as latest plus the last 20 timestamped screenshots per client; older timestamped files are deleted automatically.
- **Priority screenshots:** For `event_start`/`event_stop`, the API also keeps `{uuid}_priority.jpg` and metadata (`{uuid}_meta.json`) used by monitoring priority selection.
### System Settings
- `GET /api/system-settings` - List all system settings (admin+)
@@ -380,7 +380,11 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
### Health & Monitoring
- `GET /health` - Service health check
- `GET /api/screenshots/{uuid}.jpg` - Client screenshots
- `GET /screenshots/{uuid}.jpg` - Latest client screenshot
- `GET /screenshots/{uuid}/priority` - Active high-priority screenshot (falls back to latest)
- `GET /api/client-logs/monitoring-overview` - Aggregated monitoring overview for dashboard (superadmin)
- `GET /api/client-logs/recent-errors` - Recent error feed across clients (admin+)
- `GET /api/client-logs/{uuid}/logs` - Filtered per-client logs (admin+)
## 🎨 Frontend Features
@@ -444,6 +448,11 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
- Real-time event status: shows currently running events with type, title, and time window
- Filters out unassigned groups for focused view
- Resource-based Syncfusion timeline scheduler with resize and drag-drop support
- **Monitoring**: Superadmin-only monitoring dashboard
- Live client health states (`healthy`, `warning`, `critical`, `offline`) from heartbeat/process/log data
- Latest screenshot preview with screenshot-type badges (`periodic`, `event_start`, `event_stop`) and process metadata per client
- Active priority screenshots are surfaced immediately and polled faster while priority items are active
- System-wide recent error stream and per-client log drill-down
- **Program info**: Version, build info, tech stack and paginated changelog (reads `dashboard/public/program-info.json`)
## 🔒 Security & Authentication
@@ -474,7 +483,8 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
- MQTT: Pub/sub functionality test
- Dashboard: Nginx availability
- **Scheduler**: Logging is concise; conversion lookups are cached and logged only once per media.
- Monitoring API: `/api/client-logs/monitoring-overview` and `/api/client-logs/recent-errors` for live diagnostics
- Monitoring overview includes screenshot priority state (`latestScreenshotType`, `priorityScreenshotType`, `priorityScreenshotReceivedAt`, `hasActivePriorityScreenshot`) and `summary.activePriorityScreenshots`
### Logging Strategy
- **Development**: Docker Compose logs with service prefixes
@@ -549,7 +559,6 @@ docker exec -it infoscreen-db mysqladmin ping
# Restart dependent services
```
**MQTT communication issues**
**Vite import-analysis errors (Syncfusion splitbuttons)**
```bash
# Symptom
@@ -565,6 +574,8 @@ docker compose rm -sf dashboard
docker volume rm <project>_dashboard-node-modules <project>_dashboard-vite-cache || true
docker compose up -d --build dashboard
```
**MQTT communication issues**
```bash
# Test MQTT broker
mosquitto_pub -h localhost -t test -m "hello"

View File

@@ -56,6 +56,58 @@ Notes for integrators:
- CSS follows modern Material 3 color-function notation (`rgb(r g b / alpha%)`)
- Syncfusion ScheduleComponent requires TimelineViews, Resize, and DragAndDrop modules injected
Backend technical work (post-release notes; no version bump):
- 📊 **Client Monitoring Infrastructure (Server-Side) (2026-03-10)**:
- Database schema: New Alembic migration `c1d2e3f4g5h6_add_client_monitoring.py` (idempotent) adds:
- `client_logs` table: Stores centralized logs with columns (id, client_uuid, timestamp, level, message, context, created_at)
- Foreign key: `client_logs.client_uuid` → `clients.uuid` (ON DELETE CASCADE)
- Health monitoring columns added to `clients` table: `current_event_id`, `current_process`, `process_status`, `process_pid`, `last_screenshot_analyzed`, `screen_health_status`, `last_screenshot_hash`
- Indexes for performance: (client_uuid, timestamp DESC), (level, timestamp DESC), (created_at DESC)
- Data models (`models/models.py`):
- New enums: `LogLevel` (ERROR, WARN, INFO, DEBUG), `ProcessStatus` (running, crashed, starting, stopped), `ScreenHealthStatus` (OK, BLACK, FROZEN, UNKNOWN)
- New model: `ClientLog` with foreign key to `Client` (CASCADE on delete)
- Extended `Client` model with 7 health monitoring fields
- MQTT listener extensions (`listener/listener.py`):
- New topic subscriptions: `infoscreen/+/logs/error`, `infoscreen/+/logs/warn`, `infoscreen/+/logs/info`, `infoscreen/+/health`
- Log handler: Parses JSON payloads, creates `ClientLog` entries, validates client UUID exists (FK constraint)
- Health handler: Updates client state from MQTT health messages
- Enhanced heartbeat handler: Captures `process_status`, `current_process`, `process_pid`, `current_event_id` from payload
- API endpoints (`server/routes/client_logs.py`):
- `GET /api/client-logs/<uuid>/logs`: Retrieve client logs with filters (level, limit, since); authenticated (admin_or_higher)
- `GET /api/client-logs/summary`: Get log counts by level per client for the last 24h; authenticated (admin_or_higher)
- `GET /api/client-logs/monitoring-overview`: Aggregated monitoring overview for dashboard clients/statuses; authenticated (admin_or_higher)
- `GET /api/client-logs/recent-errors`: System-wide error monitoring; authenticated (admin_or_higher)
- `GET /api/client-logs/test`: Infrastructure validation endpoint (no auth required)
- Blueprint registered in `server/wsgi.py` as `client_logs_bp`
- Dev environment fix: Updated `docker-compose.override.yml` listener service to use `working_dir: /workspace` and direct command path for live code reload
- 🖥️ **Monitoring Dashboard Integration (2026-03-24)**:
- Frontend monitoring dashboard (`dashboard/src/monitoring.tsx`) is active and wired to monitoring APIs
- Superadmin-only route/menu integration completed in `dashboard/src/App.tsx`
- Added dashboard monitoring API client (`dashboard/src/apiClientMonitoring.ts`) for overview and recent errors
- 🐛 **Presentation Flags Persistence Fix (2026-03-24)**:
- Fixed persistence for presentation flags `page_progress` and `auto_progress` across create/update and detached-occurrence flows
- API serialization now reliably returns stored values for presentation behavior fields
- 📡 **MQTT Protocol Extensions**:
- New log topics: `infoscreen/{uuid}/logs/{error|warn|info}` with JSON payload (timestamp, message, context)
- New health topic: `infoscreen/{uuid}/health` with metrics (expected_state, actual_state, health_metrics)
- Enhanced heartbeat: `infoscreen/{uuid}/heartbeat` now includes `current_process`, `process_pid`, `process_status`, `current_event_id`
- QoS levels: ERROR/WARN logs use QoS 1 (at least once), INFO/health use QoS 0 (fire and forget)
- 📖 **Documentation**:
- New file: `CLIENT_MONITORING_SPECIFICATION.md`, a comprehensive 20-section technical spec for client-side implementation (MQTT protocol, process monitoring, auto-recovery, payload formats, testing guide)
- New file: `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md`, a 5-phase implementation guide (database, backend, client watchdog, dashboard UI, testing)
- Updated `.github/copilot-instructions.md`: Added MQTT topics section, client monitoring integration notes
- **Validation**:
- End-to-end testing completed: MQTT message → listener → database → API confirmed working
- Test flow: Published message to `infoscreen/{real-uuid}/logs/error` → listener logs showed receipt → database stored entry → test API returned log data
- Known client UUIDs validated: 9b8d1856-ff34-4864-a726-12de072d0f77, 7f65c615-5827-4ada-9ac8-4727c2e8ee55, bdbfff95-0b2b-4265-8cc7-b0284509540a
Notes for integrators:
- Tiered logging strategy: ERROR/WARN always centralized (QoS 1), INFO dev-only (QoS 0), DEBUG local-only
- Monitoring dashboard is implemented and consumes `/api/client-logs/monitoring-overview`, `/api/client-logs/recent-errors`, and `/api/client-logs/<uuid>/logs`
- Foreign key constraint prevents logging for non-existent clients (data integrity enforced)
- Migration is idempotent and can be safely rerun after interruption
- Use `GET /api/client-logs/test` for quick infrastructure validation without authentication
## 2025.1.0-beta.1 (TBD)
- 🔐 **User Management & Role-Based Access Control**:
- Backend: Implemented comprehensive user management API (`server/routes/users.py`) with 6 endpoints (GET, POST, PUT, DELETE users + password reset).

View File

@@ -1,5 +1,5 @@
import React, { useState } from 'react';
import { BrowserRouter as Router, Routes, Route, Link, Outlet, useNavigate } from 'react-router-dom';
import { BrowserRouter as Router, Routes, Route, Link, Outlet, useNavigate, Navigate } from 'react-router-dom';
import { SidebarComponent } from '@syncfusion/ej2-react-navigations';
import { ButtonComponent } from '@syncfusion/ej2-react-buttons';
import { DropDownButtonComponent } from '@syncfusion/ej2-react-splitbuttons';
@@ -19,6 +19,7 @@ import {
Settings,
Monitor,
MonitorDotIcon,
Activity,
LogOut,
Wrench,
Info,
@@ -31,6 +32,7 @@ const sidebarItems = [
{ name: 'Ressourcen', path: '/ressourcen', icon: Boxes, minRole: 'editor' },
{ name: 'Raumgruppen', path: '/infoscr_groups', icon: MonitorDotIcon, minRole: 'admin' },
{ name: 'Infoscreen-Clients', path: '/clients', icon: Monitor, minRole: 'admin' },
{ name: 'Monitor-Dashboard', path: '/monitoring', icon: Activity, minRole: 'superadmin' },
{ name: 'Erweiterungsmodus', path: '/setup', icon: Wrench, minRole: 'admin' },
{ name: 'Medien', path: '/medien', icon: Image, minRole: 'editor' },
{ name: 'Benutzer', path: '/benutzer', icon: User, minRole: 'admin' },
@@ -49,6 +51,7 @@ import Benutzer from './users';
import Einstellungen from './settings';
import SetupMode from './SetupMode';
import Programminfo from './programminfo';
import MonitoringDashboard from './monitoring';
import Logout from './logout';
import Login from './login';
import { useAuth } from './useAuth';
@@ -436,7 +439,7 @@ const Layout: React.FC = () => {
type="password"
placeholder="Aktuelles Passwort"
value={pwdCurrent}
input={(e: any) => setPwdCurrent(e.value)}
input={(e: { value?: string }) => setPwdCurrent(e.value ?? '')}
disabled={pwdBusy}
/>
</div>
@@ -446,7 +449,7 @@ const Layout: React.FC = () => {
type="password"
placeholder="Mindestens 6 Zeichen"
value={pwdNew}
input={(e: any) => setPwdNew(e.value)}
input={(e: { value?: string }) => setPwdNew(e.value ?? '')}
disabled={pwdBusy}
/>
</div>
@@ -456,7 +459,7 @@ const Layout: React.FC = () => {
type="password"
placeholder="Wiederholen"
value={pwdConfirm}
input={(e: any) => setPwdConfirm(e.value)}
input={(e: { value?: string }) => setPwdConfirm(e.value ?? '')}
disabled={pwdBusy}
/>
</div>
@@ -480,6 +483,14 @@ const App: React.FC = () => {
return <>{children}</>;
};
const RequireSuperadmin: React.FC<{ children: React.ReactNode }> = ({ children }) => {
const { isAuthenticated, loading, user } = useAuth();
if (loading) return <div style={{ padding: 24 }}>Lade ...</div>;
if (!isAuthenticated) return <Login />;
if (user?.role !== 'superadmin') return <Navigate to="/" replace />;
return <>{children}</>;
};
return (
<ToastProvider>
<Routes>
@@ -499,6 +510,14 @@ const App: React.FC = () => {
<Route path="benutzer" element={<Benutzer />} />
<Route path="einstellungen" element={<Einstellungen />} />
<Route path="clients" element={<Infoscreens />} />
<Route
path="monitoring"
element={
<RequireSuperadmin>
<MonitoringDashboard />
</RequireSuperadmin>
}
/>
<Route path="setup" element={<SetupMode />} />
<Route path="programminfo" element={<Programminfo />} />
</Route>

View File

@@ -0,0 +1,111 @@
export interface MonitoringLogEntry {
  id: number;
  timestamp: string | null;
  level: 'ERROR' | 'WARN' | 'INFO' | 'DEBUG' | null;
  message: string;
  context: Record<string, unknown>;
  client_uuid?: string;
}
export interface MonitoringClient {
  uuid: string;
  hostname?: string | null;
  description?: string | null;
  ip?: string | null;
  model?: string | null;
  groupId?: number | null;
  groupName?: string | null;
  registrationTime?: string | null;
  lastAlive?: string | null;
  isAlive: boolean;
  status: 'healthy' | 'warning' | 'critical' | 'offline';
  currentEventId?: number | null;
  currentProcess?: string | null;
  processStatus?: string | null;
  processPid?: number | null;
  screenHealthStatus?: string | null;
  lastScreenshotAnalyzed?: string | null;
  lastScreenshotHash?: string | null;
  latestScreenshotType?: 'periodic' | 'event_start' | 'event_stop' | null;
  priorityScreenshotType?: 'event_start' | 'event_stop' | null;
  priorityScreenshotReceivedAt?: string | null;
  hasActivePriorityScreenshot?: boolean;
  screenshotUrl: string;
  logCounts24h: {
    error: number;
    warn: number;
    info: number;
    debug: number;
  };
  latestLog?: MonitoringLogEntry | null;
  latestError?: MonitoringLogEntry | null;
}
export interface MonitoringOverview {
  summary: {
    totalClients: number;
    onlineClients: number;
    offlineClients: number;
    healthyClients: number;
    warningClients: number;
    criticalClients: number;
    errorLogs: number;
    warnLogs: number;
    activePriorityScreenshots: number;
  };
  periodHours: number;
  gracePeriodSeconds: number;
  since: string;
  timestamp: string;
  clients: MonitoringClient[];
}
export interface ClientLogsResponse {
  client_uuid: string;
  logs: MonitoringLogEntry[];
  count: number;
  limit: number;
}
async function parseJsonResponse<T>(response: Response, fallbackMessage: string): Promise<T> {
  const data = await response.json();
  if (!response.ok) {
    throw new Error(data.error || fallbackMessage);
  }
  return data as T;
}
export async function fetchMonitoringOverview(hours = 24): Promise<MonitoringOverview> {
  const response = await fetch(`/api/client-logs/monitoring-overview?hours=${hours}`, {
    credentials: 'include',
  });
  return parseJsonResponse<MonitoringOverview>(response, 'Fehler beim Laden der Monitoring-Übersicht');
}
export async function fetchRecentClientErrors(limit = 20): Promise<MonitoringLogEntry[]> {
  const response = await fetch(`/api/client-logs/recent-errors?limit=${limit}`, {
    credentials: 'include',
  });
  const data = await parseJsonResponse<{ errors: MonitoringLogEntry[] }>(
    response,
    'Fehler beim Laden der letzten Fehler'
  );
  return data.errors;
}
export async function fetchClientMonitoringLogs(
  uuid: string,
  options: { level?: string; limit?: number } = {}
): Promise<MonitoringLogEntry[]> {
  const params = new URLSearchParams();
  if (options.level && options.level !== 'ALL') {
    params.set('level', options.level);
  }
  params.set('limit', String(options.limit ?? 100));
  const response = await fetch(`/api/client-logs/${uuid}/logs?${params.toString()}`, {
    credentials: 'include',
  });
  const data = await parseJsonResponse<ClientLogsResponse>(response, 'Fehler beim Laden der Client-Logs');
  return data.logs;
}

View File

@@ -523,28 +523,10 @@ const Appointments: React.FC = () => {
}, [holidays, allowScheduleOnHolidays]);
const dataSource = useMemo(() => {
// Filter: Events with SkipHolidays=true (from internal Event type) are never shown on holidays
const filteredEvents = events.filter(ev => {
if (ev.SkipHolidays) {
// If event falls within a holiday, hide it
const s = ev.StartTime instanceof Date ? ev.StartTime : new Date(ev.StartTime);
const e = ev.EndTime instanceof Date ? ev.EndTime : new Date(ev.EndTime);
for (const h of holidays) {
const hs = new Date(h.start_date + 'T00:00:00');
const he = new Date(h.end_date + 'T23:59:59');
if (
(s >= hs && s <= he) ||
(e >= hs && e <= he) ||
(s <= hs && e >= he)
) {
return false;
}
}
}
return true;
});
return [...filteredEvents, ...holidayDisplayEvents, ...holidayBlockEvents];
}, [events, holidayDisplayEvents, holidayBlockEvents, holidays]);
// Existing events should always be visible; holiday skipping for recurring events
// is handled via RecurrenceException from the backend.
return [...events, ...holidayDisplayEvents, ...holidayBlockEvents];
}, [events, holidayDisplayEvents, holidayBlockEvents]);
// Removed dataSource logging
@@ -1227,37 +1209,6 @@ const Appointments: React.FC = () => {
}
}}
eventRendered={(args: EventRenderedArgs) => {
// Always hide events that skip holidays when they fall on holidays, regardless of toggle
if (args.data) {
const ev = args.data as unknown as Partial<Event>;
if (ev.SkipHolidays && !args.data.isHoliday) {
const s =
args.data.StartTime instanceof Date
? args.data.StartTime
: new Date(args.data.StartTime);
const e =
args.data.EndTime instanceof Date ? args.data.EndTime : new Date(args.data.EndTime);
if (isWithinHolidayRange(s, e)) {
args.cancel = true;
return;
}
}
}
// Blende Nicht-Ferien-Events aus, falls sie in Ferien fallen und Terminieren nicht erlaubt ist
// Hide events on holidays if not allowed
if (!allowScheduleOnHolidays && args.data && !args.data.isHoliday) {
const s =
args.data.StartTime instanceof Date
? args.data.StartTime
: new Date(args.data.StartTime);
const e =
args.data.EndTime instanceof Date ? args.data.EndTime : new Date(args.data.EndTime);
if (isWithinHolidayRange(s, e)) {
args.cancel = true;
return;
}
}
if (selectedGroupId && args.data && args.data.Id) {
const groupColor = getGroupColor(selectedGroupId, groups);

View File

@@ -0,0 +1,373 @@
.monitoring-page {
display: flex;
flex-direction: column;
gap: 1.25rem;
padding: 0.5rem 0.25rem 1rem;
}
.monitoring-header-row {
display: flex;
justify-content: space-between;
align-items: flex-start;
gap: 1rem;
flex-wrap: wrap;
}
.monitoring-title {
margin: 0;
font-size: 1.75rem;
font-weight: 700;
color: #5c4318;
}
.monitoring-subtitle {
margin: 0.35rem 0 0;
color: #6b7280;
max-width: 60ch;
}
.monitoring-toolbar {
display: flex;
align-items: end;
gap: 0.75rem;
flex-wrap: wrap;
}
.monitoring-toolbar-field {
display: flex;
flex-direction: column;
gap: 0.35rem;
min-width: 190px;
}
.monitoring-toolbar-field-compact {
min-width: 160px;
}
.monitoring-toolbar-field label {
font-size: 0.875rem;
font-weight: 600;
color: #5b4b32;
}
.monitoring-meta-row {
display: flex;
gap: 1rem;
flex-wrap: wrap;
color: #6b7280;
font-size: 0.92rem;
}
.monitoring-summary-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
gap: 1rem;
}
.monitoring-metric-card {
overflow: hidden;
}
.monitoring-metric-content {
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.monitoring-metric-title {
font-size: 0.9rem;
font-weight: 600;
color: #6b7280;
}
.monitoring-metric-value {
font-size: 2rem;
font-weight: 700;
color: #1f2937;
line-height: 1;
}
.monitoring-metric-subtitle {
font-size: 0.85rem;
color: #64748b;
}
.monitoring-main-grid {
display: grid;
grid-template-columns: minmax(0, 2fr) minmax(320px, 1fr);
gap: 1rem;
align-items: start;
}
.monitoring-sidebar-column {
display: flex;
flex-direction: column;
gap: 1rem;
}
.monitoring-panel {
background: #fff;
border: 1px solid #e5e7eb;
border-radius: 16px;
padding: 1.1rem;
box-shadow: 0 12px 40px rgb(120 89 28 / 8%);
}
.monitoring-clients-panel {
min-width: 0;
}
.monitoring-panel-header {
display: flex;
justify-content: space-between;
align-items: center;
gap: 0.75rem;
margin-bottom: 0.85rem;
}
.monitoring-panel-header-stacked {
align-items: end;
flex-wrap: wrap;
}
.monitoring-panel-header h3 {
margin: 0;
font-size: 1.1rem;
font-weight: 700;
}
.monitoring-panel-header span {
color: #6b7280;
font-size: 0.9rem;
}
.monitoring-detail-card .e-card-content {
padding-top: 0;
}
.monitoring-detail-list {
display: flex;
flex-direction: column;
gap: 0.75rem;
}
.monitoring-detail-row {
display: flex;
justify-content: space-between;
gap: 1rem;
align-items: flex-start;
border-bottom: 1px solid #f1f5f9;
padding-bottom: 0.55rem;
}
.monitoring-detail-row span {
color: #64748b;
font-size: 0.9rem;
}
.monitoring-detail-row strong {
text-align: right;
color: #111827;
}
.monitoring-status-badge {
display: inline-flex;
align-items: center;
justify-content: center;
padding: 0.22rem 0.6rem;
border-radius: 999px;
font-weight: 700;
font-size: 0.78rem;
letter-spacing: 0.01em;
}
.monitoring-screenshot {
width: 100%;
border-radius: 12px;
border: 1px solid #e5e7eb;
background: linear-gradient(135deg, #f8fafc, #e2e8f0);
min-height: 180px;
object-fit: cover;
}
.monitoring-screenshot-meta {
margin-top: 0.55rem;
font-size: 0.88rem;
color: #64748b;
display: flex;
flex-direction: column;
gap: 0.35rem;
}
.monitoring-shot-type {
display: inline-flex;
align-items: center;
border-radius: 999px;
padding: 0.15rem 0.55rem;
font-size: 0.78rem;
font-weight: 700;
}
.monitoring-shot-type-periodic {
background: #e2e8f0;
color: #334155;
}
.monitoring-shot-type-event {
background: #ffedd5;
color: #9a3412;
}
.monitoring-shot-type-active {
box-shadow: 0 0 0 2px #fdba74;
}
.monitoring-error-box {
display: flex;
flex-direction: column;
gap: 0.5rem;
padding: 0.85rem;
border-radius: 12px;
background: linear-gradient(135deg, #fff1f2, #fee2e2);
border: 1px solid #fecdd3;
}
.monitoring-error-time {
color: #9f1239;
font-size: 0.85rem;
font-weight: 600;
}
.monitoring-error-message {
color: #4c0519;
font-weight: 600;
}
.monitoring-mono {
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, 'Liberation Mono', 'Courier New', monospace;
font-size: 0.85rem;
}
.monitoring-log-detail-row {
display: flex;
justify-content: space-between;
gap: 1rem;
align-items: flex-start;
border-bottom: 1px solid #f1f5f9;
padding-bottom: 0.55rem;
}
.monitoring-log-detail-row span {
color: #64748b;
font-size: 0.9rem;
}
.monitoring-log-detail-row strong {
text-align: right;
color: #111827;
}
.monitoring-log-context {
margin: 0;
background: #f8fafc;
border: 1px solid #e2e8f0;
border-radius: 10px;
padding: 0.75rem;
white-space: pre-wrap;
overflow-wrap: anywhere;
max-height: 280px;
overflow: auto;
font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, 'Liberation Mono', 'Courier New', monospace;
font-size: 0.84rem;
color: #0f172a;
}
.monitoring-log-dialog-content {
display: flex;
flex-direction: column;
gap: 1rem;
padding: 0.9rem 1rem 0.55rem;
}
.monitoring-log-dialog-body {
min-height: 340px;
display: flex;
flex-direction: column;
justify-content: space-between;
}
.monitoring-log-dialog-actions {
margin-top: 0.5rem;
padding: 0 1rem 0.9rem;
display: flex;
justify-content: flex-end;
}
.monitoring-log-context-title {
font-weight: 600;
margin-bottom: 0.55rem;
}
.monitoring-log-dialog-content .monitoring-log-detail-row {
padding: 0.1rem 0 0.75rem;
}
.monitoring-log-dialog-content .monitoring-log-context {
padding: 0.95rem;
border-radius: 12px;
}
.monitoring-lower-grid {
display: grid;
grid-template-columns: repeat(2, minmax(0, 1fr));
gap: 1rem;
}
@media (width <= 1200px) {
.monitoring-main-grid,
.monitoring-lower-grid {
grid-template-columns: 1fr;
}
}
@media (width <= 720px) {
.monitoring-page {
padding: 0.25rem 0 0.75rem;
}
.monitoring-title {
font-size: 1.5rem;
}
.monitoring-header-row,
.monitoring-panel-header,
.monitoring-detail-row,
.monitoring-log-detail-row {
flex-direction: column;
align-items: flex-start;
}
.monitoring-detail-row strong,
.monitoring-log-detail-row strong {
text-align: left;
}
.monitoring-toolbar,
.monitoring-toolbar-field,
.monitoring-toolbar-field-compact {
width: 100%;
}
.monitoring-log-dialog-content {
padding: 0.4rem 0.2rem 0.1rem;
gap: 0.75rem;
}
.monitoring-log-dialog-body {
min-height: 300px;
}
.monitoring-log-dialog-actions {
padding: 0 0.2rem 0.4rem;
}
}

View File: MonitoringDashboard.tsx

@@ -0,0 +1,573 @@
import React from 'react';
import {
fetchClientMonitoringLogs,
fetchMonitoringOverview,
fetchRecentClientErrors,
type MonitoringClient,
type MonitoringLogEntry,
type MonitoringOverview,
} from './apiClientMonitoring';
import { useAuth } from './useAuth';
import { ButtonComponent } from '@syncfusion/ej2-react-buttons';
import { DropDownListComponent } from '@syncfusion/ej2-react-dropdowns';
import {
GridComponent,
ColumnsDirective,
ColumnDirective,
Inject,
Page,
Search,
Sort,
Toolbar,
} from '@syncfusion/ej2-react-grids';
import { MessageComponent } from '@syncfusion/ej2-react-notifications';
import { DialogComponent } from '@syncfusion/ej2-react-popups';
import './monitoring.css';
const REFRESH_INTERVAL_MS = 15000;
const PRIORITY_REFRESH_INTERVAL_MS = 3000;
const hourOptions = [
{ text: 'Letzte 6 Stunden', value: 6 },
{ text: 'Letzte 24 Stunden', value: 24 },
{ text: 'Letzte 72 Stunden', value: 72 },
{ text: 'Letzte 168 Stunden', value: 168 },
];
const logLevelOptions = [
{ text: 'Alle Logs', value: 'ALL' },
{ text: 'ERROR', value: 'ERROR' },
{ text: 'WARN', value: 'WARN' },
{ text: 'INFO', value: 'INFO' },
{ text: 'DEBUG', value: 'DEBUG' },
];
const statusPalette: Record<string, { label: string; color: string; background: string }> = {
healthy: { label: 'Stabil', color: '#166534', background: '#dcfce7' },
warning: { label: 'Warnung', color: '#92400e', background: '#fef3c7' },
critical: { label: 'Kritisch', color: '#991b1b', background: '#fee2e2' },
offline: { label: 'Offline', color: '#334155', background: '#e2e8f0' },
};
function parseUtcDate(value?: string | null): Date | null {
if (!value) return null;
const trimmed = value.trim();
if (!trimmed) return null;
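// Timestamps without an explicit timezone offset are treated as UTC by appending 'Z'.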
const hasTimezone = /[zZ]$|[+-]\d{2}:?\d{2}$/.test(trimmed);
const utcValue = hasTimezone ? trimmed : `${trimmed}Z`;
const parsed = new Date(utcValue);
if (Number.isNaN(parsed.getTime())) return null;
return parsed;
}
function formatTimestamp(value?: string | null): string {
if (!value) return 'Keine Daten';
const date = parseUtcDate(value);
if (!date) return value;
return date.toLocaleString('de-DE');
}
function formatRelative(value?: string | null): string {
if (!value) return 'Keine Daten';
const date = parseUtcDate(value);
if (!date) return 'Unbekannt';
const diffMs = Date.now() - date.getTime();
const diffMinutes = Math.floor(diffMs / 60000);
const diffHours = Math.floor(diffMinutes / 60);
const diffDays = Math.floor(diffHours / 24);
if (diffMinutes < 1) return 'gerade eben';
if (diffMinutes < 60) return `vor ${diffMinutes} Min.`;
if (diffHours < 24) return `vor ${diffHours} Std.`;
return `vor ${diffDays} Tag${diffDays === 1 ? '' : 'en'}`;
}
function statusBadge(status: string) {
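// Render a colored status pill; unknown statuses fall back to the offline palette.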
const palette = statusPalette[status] || statusPalette.offline;
return (
<span
className="monitoring-status-badge"
style={{ color: palette.color, backgroundColor: palette.background }}
>
{palette.label}
</span>
);
}
function screenshotTypeBadge(type?: string | null, hasPriority = false) {
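// Map screenshot_type to a badge; unknown or missing types fall back to the periodic style.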
const normalized = (type || 'periodic').toLowerCase();
const map: Record<string, { label: string; className: string }> = {
periodic: { label: 'Periodisch', className: 'monitoring-shot-type-periodic' },
event_start: { label: 'Event-Start', className: 'monitoring-shot-type-event' },
event_stop: { label: 'Event-Stopp', className: 'monitoring-shot-type-event' },
};
const info = map[normalized] || map.periodic;
const classes = `monitoring-shot-type ${info.className}${hasPriority ? ' monitoring-shot-type-active' : ''}`;
return <span className={classes}>{info.label}</span>;
}
function renderMetricCard(title: string, value: number, subtitle: string, accent: string) {
return (
<div className="e-card monitoring-metric-card" style={{ borderTop: `4px solid ${accent}` }}>
<div className="e-card-content monitoring-metric-content">
<div className="monitoring-metric-title">{title}</div>
<div className="monitoring-metric-value">{value}</div>
<div className="monitoring-metric-subtitle">{subtitle}</div>
</div>
</div>
);
}
function renderContext(context?: Record<string, unknown>): string {
if (!context || Object.keys(context).length === 0) {
return 'Kein Kontext vorhanden';
}
try {
return JSON.stringify(context, null, 2);
} catch {
return 'Kontext konnte nicht formatiert werden';
}
}
function buildScreenshotUrl(client: MonitoringClient, overviewTimestamp?: string | null): string {
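// Cache-busting: append a version param derived from the newest screenshot hash or timestamp so the browser refetches when the image changes.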
const refreshKey = client.lastScreenshotHash || client.lastScreenshotAnalyzed || overviewTimestamp;
if (!refreshKey) {
return client.screenshotUrl;
}
const separator = client.screenshotUrl.includes('?') ? '&' : '?';
return `${client.screenshotUrl}${separator}v=${encodeURIComponent(refreshKey)}`;
}
const MonitoringDashboard: React.FC = () => {
const { user } = useAuth();
const [hours, setHours] = React.useState<number>(24);
const [logLevel, setLogLevel] = React.useState<string>('ALL');
const [overview, setOverview] = React.useState<MonitoringOverview | null>(null);
const [recentErrors, setRecentErrors] = React.useState<MonitoringLogEntry[]>([]);
const [clientLogs, setClientLogs] = React.useState<MonitoringLogEntry[]>([]);
const [selectedClientUuid, setSelectedClientUuid] = React.useState<string | null>(null);
const [loading, setLoading] = React.useState<boolean>(true);
const [error, setError] = React.useState<string | null>(null);
const [logsLoading, setLogsLoading] = React.useState<boolean>(false);
const [screenshotErrored, setScreenshotErrored] = React.useState<boolean>(false);
const selectedClientUuidRef = React.useRef<string | null>(null);
const [selectedLogEntry, setSelectedLogEntry] = React.useState<MonitoringLogEntry | null>(null);
const selectedClient = React.useMemo<MonitoringClient | null>(() => {
if (!overview || !selectedClientUuid) return null;
return overview.clients.find(client => client.uuid === selectedClientUuid) || null;
}, [overview, selectedClientUuid]);
const selectedClientScreenshotUrl = React.useMemo<string | null>(() => {
if (!selectedClient) return null;
return buildScreenshotUrl(selectedClient, overview?.timestamp || null);
}, [selectedClient, overview?.timestamp]);
React.useEffect(() => {
selectedClientUuidRef.current = selectedClientUuid;
}, [selectedClientUuid]);
const loadOverview = React.useCallback(async (requestedHours: number, preserveSelection = true) => {
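// preserveSelection keeps the current client selected across refreshes; otherwise the first client in the new overview is selected.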
setLoading(true);
setError(null);
try {
const [overviewData, errorsData] = await Promise.all([
fetchMonitoringOverview(requestedHours),
fetchRecentClientErrors(25),
]);
setOverview(overviewData);
setRecentErrors(errorsData);
const currentSelection = selectedClientUuidRef.current;
const nextSelectedUuid =
preserveSelection && currentSelection && overviewData.clients.some(client => client.uuid === currentSelection)
? currentSelection
: overviewData.clients[0]?.uuid || null;
setSelectedClientUuid(nextSelectedUuid);
setScreenshotErrored(false);
} catch (loadError) {
setError(loadError instanceof Error ? loadError.message : 'Monitoring-Daten konnten nicht geladen werden');
} finally {
setLoading(false);
}
}, []);
React.useEffect(() => {
loadOverview(hours, false);
}, [hours, loadOverview]);
React.useEffect(() => {
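// Adaptive polling: refresh every 3 s while a priority screenshot is active, otherwise every 15 s.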
const hasActivePriorityScreenshots = (overview?.summary.activePriorityScreenshots || 0) > 0;
const intervalMs = hasActivePriorityScreenshots ? PRIORITY_REFRESH_INTERVAL_MS : REFRESH_INTERVAL_MS;
const intervalId = window.setInterval(() => {
loadOverview(hours);
}, intervalMs);
return () => window.clearInterval(intervalId);
}, [hours, loadOverview, overview?.summary.activePriorityScreenshots]);
React.useEffect(() => {
if (!selectedClientUuid) {
setClientLogs([]);
return;
}
let active = true;
const loadLogs = async () => {
setLogsLoading(true);
try {
const logs = await fetchClientMonitoringLogs(selectedClientUuid, { level: logLevel, limit: 100 });
if (active) {
setClientLogs(logs);
}
} catch (loadError) {
if (active) {
setClientLogs([]);
setError(loadError instanceof Error ? loadError.message : 'Client-Logs konnten nicht geladen werden');
}
} finally {
if (active) {
setLogsLoading(false);
}
}
};
loadLogs();
return () => {
active = false;
};
}, [selectedClientUuid, logLevel]);
React.useEffect(() => {
setScreenshotErrored(false);
}, [selectedClientUuid]);
if (!user || user.role !== 'superadmin') {
return (
<MessageComponent severity="Error" content="Dieses Monitoring-Dashboard ist nur für Superadministratoren sichtbar." />
);
}
const clientGridData = (overview?.clients || []).map(client => ({
...client,
displayName: client.description || client.hostname || client.uuid,
lastAliveDisplay: formatTimestamp(client.lastAlive),
currentProcessDisplay: client.currentProcess || 'kein Prozess',
processStatusDisplay: client.processStatus || 'unbekannt',
errorCount: client.logCounts24h.error,
warnCount: client.logCounts24h.warn,
}));
return (
<div className="monitoring-page">
<div className="monitoring-header-row">
<div>
<h2 className="monitoring-title">Monitor-Dashboard</h2>
<p className="monitoring-subtitle">
Live-Zustand der Infoscreen-Clients, Prozessstatus und zentrale Fehlerprotokolle.
</p>
</div>
<div className="monitoring-toolbar">
<div className="monitoring-toolbar-field">
<label>Zeitraum</label>
<DropDownListComponent
dataSource={hourOptions}
fields={{ text: 'text', value: 'value' }}
value={hours}
change={(args: { value: number }) => setHours(Number(args.value))}
/>
</div>
<ButtonComponent cssClass="e-primary" onClick={() => loadOverview(hours)} disabled={loading}>
Aktualisieren
</ButtonComponent>
</div>
</div>
{error && <MessageComponent severity="Error" content={error} />}
{overview && (
<div className="monitoring-meta-row">
<span>Stand: {formatTimestamp(overview.timestamp)}</span>
<span>Alive-Fenster: {overview.gracePeriodSeconds} Sekunden</span>
<span>Betrachtungszeitraum: {overview.periodHours} Stunden</span>
</div>
)}
<div className="monitoring-summary-grid">
{renderMetricCard('Clients gesamt', overview?.summary.totalClients || 0, 'Registrierte Displays', '#7c3aed')}
{renderMetricCard('Online', overview?.summary.onlineClients || 0, 'Heartbeat innerhalb der Grace-Periode', '#15803d')}
{renderMetricCard('Warnungen', overview?.summary.warningClients || 0, 'Warn-Logs oder Übergangszustände', '#d97706')}
{renderMetricCard('Kritisch', overview?.summary.criticalClients || 0, 'Crashs oder Fehler-Logs', '#dc2626')}
{renderMetricCard('Offline', overview?.summary.offlineClients || 0, 'Keine frischen Signale', '#475569')}
{renderMetricCard('Prioritäts-Screens', overview?.summary.activePriorityScreenshots || 0, 'Event-Start/Stop aktiv', '#ea580c')}
{renderMetricCard('Fehler-Logs', overview?.summary.errorLogs || 0, 'Im gewählten Zeitraum', '#b91c1c')}
</div>
{loading && !overview ? (
<MessageComponent severity="Info" content="Monitoring-Daten werden geladen ..." />
) : (
<div className="monitoring-main-grid">
<div className="monitoring-panel monitoring-clients-panel">
<div className="monitoring-panel-header">
<h3>Client-Zustand</h3>
<span>{overview?.clients.length || 0} Einträge</span>
</div>
<GridComponent
dataSource={clientGridData}
allowPaging={true}
pageSettings={{ pageSize: 10 }}
allowSorting={true}
toolbar={['Search']}
height={460}
rowSelected={(args: { data: MonitoringClient }) => {
setSelectedClientUuid(args.data.uuid);
}}
>
<ColumnsDirective>
<ColumnDirective
field="status"
headerText="Status"
width="120"
template={(props: MonitoringClient) => statusBadge(props.status)}
/>
<ColumnDirective field="displayName" headerText="Client" width="190" />
<ColumnDirective field="groupName" headerText="Gruppe" width="150" />
<ColumnDirective field="currentProcessDisplay" headerText="Prozess" width="130" />
<ColumnDirective field="processStatusDisplay" headerText="Prozessstatus" width="130" />
<ColumnDirective field="errorCount" headerText="ERROR" textAlign="Right" width="90" />
<ColumnDirective field="warnCount" headerText="WARN" textAlign="Right" width="90" />
<ColumnDirective field="lastAliveDisplay" headerText="Letztes Signal" width="170" />
</ColumnsDirective>
<Inject services={[Page, Search, Sort, Toolbar]} />
</GridComponent>
</div>
<div className="monitoring-sidebar-column">
<div className="e-card monitoring-detail-card">
<div className="e-card-header">
<div className="e-card-header-caption">
<div className="e-card-title">Aktiver Client</div>
</div>
</div>
<div className="e-card-content">
{selectedClient ? (
<div className="monitoring-detail-list">
<div className="monitoring-detail-row">
<span>Name</span>
<strong>{selectedClient.description || selectedClient.hostname || selectedClient.uuid}</strong>
</div>
<div className="monitoring-detail-row">
<span>Status</span>
<strong>{statusBadge(selectedClient.status)}</strong>
</div>
<div className="monitoring-detail-row">
<span>UUID</span>
<strong className="monitoring-mono">{selectedClient.uuid}</strong>
</div>
<div className="monitoring-detail-row">
<span>Raumgruppe</span>
<strong>{selectedClient.groupName || 'Nicht zugeordnet'}</strong>
</div>
<div className="monitoring-detail-row">
<span>Prozess</span>
<strong>{selectedClient.currentProcess || 'kein Prozess'}</strong>
</div>
<div className="monitoring-detail-row">
<span>PID</span>
<strong>{selectedClient.processPid || 'keine PID'}</strong>
</div>
<div className="monitoring-detail-row">
<span>Event-ID</span>
<strong>{selectedClient.currentEventId || 'keine Zuordnung'}</strong>
</div>
<div className="monitoring-detail-row">
<span>Letztes Signal</span>
<strong>{formatRelative(selectedClient.lastAlive)}</strong>
</div>
<div className="monitoring-detail-row">
<span>Bildschirmstatus</span>
<strong>{selectedClient.screenHealthStatus || 'UNKNOWN'}</strong>
</div>
<div className="monitoring-detail-row">
<span>Letzte Analyse</span>
<strong>{formatTimestamp(selectedClient.lastScreenshotAnalyzed)}</strong>
</div>
<div className="monitoring-detail-row">
<span>Screenshot-Typ</span>
<strong>
{screenshotTypeBadge(
selectedClient.latestScreenshotType,
!!selectedClient.hasActivePriorityScreenshot
)}
</strong>
</div>
{selectedClient.priorityScreenshotReceivedAt && (
<div className="monitoring-detail-row">
<span>Priorität empfangen</span>
<strong>{formatTimestamp(selectedClient.priorityScreenshotReceivedAt)}</strong>
</div>
)}
</div>
) : (
<MessageComponent severity="Info" content="Wählen Sie links einen Client aus." />
)}
</div>
</div>
<div className="e-card monitoring-detail-card">
<div className="e-card-header">
<div className="e-card-header-caption">
<div className="e-card-title">Der letzte Screenshot</div>
</div>
</div>
<div className="e-card-content">
{selectedClient ? (
<>
{screenshotErrored ? (
<MessageComponent severity="Warning" content="Für diesen Client liegt noch kein Screenshot vor." />
) : (
<img
src={selectedClientScreenshotUrl || selectedClient.screenshotUrl}
alt={`Screenshot ${selectedClient.uuid}`}
className="monitoring-screenshot"
onError={() => setScreenshotErrored(true)}
/>
)}
<div className="monitoring-screenshot-meta">
<span>Empfangen: {formatTimestamp(selectedClient.lastScreenshotAnalyzed)}</span>
<span>
Typ:{' '}
{screenshotTypeBadge(
selectedClient.latestScreenshotType,
!!selectedClient.hasActivePriorityScreenshot
)}
</span>
</div>
</>
) : (
<MessageComponent severity="Info" content="Kein Client ausgewählt." />
)}
</div>
</div>
<div className="e-card monitoring-detail-card">
<div className="e-card-header">
<div className="e-card-header-caption">
<div className="e-card-title">Letzter Fehler</div>
</div>
</div>
<div className="e-card-content">
{selectedClient?.latestError ? (
<div className="monitoring-error-box">
<div className="monitoring-error-time">{formatTimestamp(selectedClient.latestError.timestamp)}</div>
<div className="monitoring-error-message">{selectedClient.latestError.message}</div>
</div>
) : (
<MessageComponent severity="Success" content="Kein ERROR-Log für den ausgewählten Client gefunden." />
)}
</div>
</div>
</div>
</div>
)}
<div className="monitoring-lower-grid">
<div className="monitoring-panel">
<div className="monitoring-panel-header monitoring-panel-header-stacked">
<div>
<h3>Client-Logs</h3>
<span>{selectedClient ? `Client ${selectedClient.uuid}` : 'Kein Client ausgewählt'}</span>
</div>
<div className="monitoring-toolbar-field monitoring-toolbar-field-compact">
<label>Level</label>
<DropDownListComponent
dataSource={logLevelOptions}
fields={{ text: 'text', value: 'value' }}
value={logLevel}
change={(args: { value: string }) => setLogLevel(String(args.value))}
/>
</div>
</div>
{logsLoading && <MessageComponent severity="Info" content="Client-Logs werden geladen ..." />}
<GridComponent
dataSource={clientLogs}
allowPaging={true}
pageSettings={{ pageSize: 8 }}
allowSorting={true}
height={320}
rowSelected={(args: { data: MonitoringLogEntry }) => {
setSelectedLogEntry(args.data);
}}
>
<ColumnsDirective>
<ColumnDirective field="timestamp" headerText="Zeit" width="180" template={(props: MonitoringLogEntry) => formatTimestamp(props.timestamp)} />
<ColumnDirective field="level" headerText="Level" width="90" />
<ColumnDirective field="message" headerText="Nachricht" width="360" />
</ColumnsDirective>
<Inject services={[Page, Sort]} />
</GridComponent>
</div>
<div className="monitoring-panel">
<div className="monitoring-panel-header">
<h3>Letzte Fehler systemweit</h3>
<span>{recentErrors.length} Einträge</span>
</div>
<GridComponent dataSource={recentErrors} allowPaging={true} pageSettings={{ pageSize: 8 }} allowSorting={true} height={320}>
<ColumnsDirective>
<ColumnDirective field="timestamp" headerText="Zeit" width="180" template={(props: MonitoringLogEntry) => formatTimestamp(props.timestamp)} />
<ColumnDirective field="client_uuid" headerText="Client" width="220" />
<ColumnDirective field="message" headerText="Nachricht" width="360" />
</ColumnsDirective>
<Inject services={[Page, Sort]} />
</GridComponent>
</div>
</div>
<DialogComponent
isModal={true}
visible={!!selectedLogEntry}
width="860px"
minHeight="420px"
header="Log-Details"
animationSettings={{ effect: 'None' }}
buttons={[]}
showCloseIcon={true}
close={() => setSelectedLogEntry(null)}
>
{selectedLogEntry && (
<div className="monitoring-log-dialog-body">
<div className="monitoring-log-dialog-content">
<div className="monitoring-log-detail-row">
<span>Zeit</span>
<strong>{formatTimestamp(selectedLogEntry.timestamp)}</strong>
</div>
<div className="monitoring-log-detail-row">
<span>Level</span>
<strong>{selectedLogEntry.level || 'Unbekannt'}</strong>
</div>
<div className="monitoring-log-detail-row">
<span>Nachricht</span>
<strong style={{ whiteSpace: 'normal', textAlign: 'left' }}>{selectedLogEntry.message}</strong>
</div>
<div>
<div className="monitoring-log-context-title">Kontext</div>
<pre className="monitoring-log-context">{renderContext(selectedLogEntry.context)}</pre>
</div>
</div>
<div className="monitoring-log-dialog-actions">
<ButtonComponent onClick={() => setSelectedLogEntry(null)}>Schließen</ButtonComponent>
</div>
</div>
)}
</DialogComponent>
</div>
);
};
export default MonitoringDashboard;

View File: Ressourcen.tsx

@@ -33,7 +33,7 @@ const Ressourcen: React.FC = () => {
const [groupOrder, setGroupOrder] = useState<number[]>([]);
const [showOrderPanel, setShowOrderPanel] = useState<boolean>(false);
const [timelineView] = useState<TimelineView>('day');
const [viewDate] = useState<Date>(() => {
const [viewDate, setViewDate] = useState<Date>(() => {
const now = new Date();
now.setHours(0, 0, 0, 0);
return now;
@@ -110,23 +110,31 @@ const Ressourcen: React.FC = () => {
for (const group of groups) {
try {
console.log(`[Ressourcen] Fetching events for group "${group.name}" (ID: ${group.id})`);
const apiEvents = await fetchEvents(group.id.toString(), false, {
const apiEvents = await fetchEvents(group.id.toString(), true, {
start,
end,
});
console.log(`[Ressourcen] Got ${apiEvents?.length || 0} events for group "${group.name}"`);
if (Array.isArray(apiEvents) && apiEvents.length > 0) {
const event = apiEvents[0];
const eventTitle = event.subject || event.title || 'Unnamed Event';
const eventType = event.type || event.event_type || 'other';
const eventStart = event.startTime || event.start;
const eventEnd = event.endTime || event.end;
for (const event of apiEvents) {
const eventTitle = event.subject || event.title || 'Unnamed Event';
const eventType = event.type || event.event_type || 'other';
const eventStart = event.startTime || event.start;
const eventEnd = event.endTime || event.end;
if (!eventStart || !eventEnd) {
continue;
}
if (eventStart && eventEnd) {
const parsedStart = parseUTCDate(eventStart);
const parsedEnd = parseUTCDate(eventEnd);
// Keep only events that overlap the visible range.
if (parsedEnd < start || parsedStart > end) {
continue;
}
// Capitalize first letter of event type
const formattedType = eventType.charAt(0).toUpperCase() + eventType.slice(1);
@@ -138,7 +146,6 @@ const Ressourcen: React.FC = () => {
ResourceId: group.id,
EventType: eventType,
});
console.log(`[Ressourcen] Group "${group.name}" has event: ${eventTitle}`);
}
}
} catch (error) {
@@ -324,6 +331,16 @@ const Ressourcen: React.FC = () => {
group={{ resources: ['Groups'], allowGroupEdit: false }}
timeScale={{ interval: 60, slotCount: 1 }}
rowAutoHeight={false}
actionComplete={(args) => {
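// Keep viewDate in sync with toolbar navigation so events reload for the visible range.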
if (args.requestType === 'dateNavigate' || args.requestType === 'viewNavigate') {
const selected = scheduleRef.current?.selectedDate;
if (selected) {
const normalized = new Date(selected);
normalized.setHours(0, 0, 0, 0);
setViewDate(normalized);
}
}
}}
>
<ViewsDirective>
<ViewDirective option="TimelineDay" displayName="Tag"></ViewDirective>

View File: docker-compose.yml

@@ -18,8 +18,9 @@ services:
environment:
- DB_CONN=mysql+pymysql://${DB_USER}:${DB_PASSWORD}@db/${DB_NAME}
- DB_URL=mysql+pymysql://${DB_USER}:${DB_PASSWORD}@db/${DB_NAME}
- ENV=${ENV:-development}
- FLASK_SECRET_KEY=${FLASK_SECRET_KEY:-dev-secret-key-change-in-production}
- API_BASE_URL=http://server:8000
- ENV=${ENV:-development}
- FLASK_SECRET_KEY=${FLASK_SECRET_KEY:-dev-secret-key-change-in-production}
- DEFAULT_SUPERADMIN_USERNAME=${DEFAULT_SUPERADMIN_USERNAME:-superadmin}
- DEFAULT_SUPERADMIN_PASSWORD=${DEFAULT_SUPERADMIN_PASSWORD}
# 🔧 ENTFERNT: Volume-Mount ist nur für die Entwicklung

View File: listener/listener.py

@@ -3,15 +3,17 @@ import json
import logging
import datetime
import base64
import re
import requests
import paho.mqtt.client as mqtt
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from models.models import Client
from models.models import Client, ClientLog, LogLevel, ProcessStatus, ScreenHealthStatus
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s [%(levelname)s] %(message)s')
# Load .env in development
if os.getenv("ENV", "development") == "development":
# Load .env only when Docker has not already configured us: compose always sets API_BASE_URL, so its absence means we are running outside a container.
_api_already_set = bool(os.environ.get("API_BASE_URL"))
if not _api_already_set and os.getenv("ENV", "development") == "development":
try:
from dotenv import load_dotenv
load_dotenv(".env")
@@ -30,6 +32,288 @@ Session = sessionmaker(bind=engine)
# API configuration
API_BASE_URL = os.getenv("API_BASE_URL", "http://server:8000")
# Dashboard payload migration observability
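# Emit an aggregate metrics log line every N classified dashboard payloads (a non-positive value disables it).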
DASHBOARD_METRICS_LOG_EVERY = int(os.getenv("DASHBOARD_METRICS_LOG_EVERY", "5"))
DASHBOARD_PARSE_METRICS = {
"v2_success": 0,
"parse_failures": 0,
}
def normalize_process_status(value):
if value is None:
return None
if isinstance(value, ProcessStatus):
return value
normalized = str(value).strip().lower()
if not normalized:
return None
try:
return ProcessStatus(normalized)
except ValueError:
return None
def normalize_event_id(value):
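# Accept ints, digit strings, or identifiers with a numeric suffix such as "evt-7".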
if value is None or isinstance(value, bool):
return None
if isinstance(value, int):
return value
if isinstance(value, float):
return int(value)
normalized = str(value).strip()
if not normalized:
return None
if normalized.isdigit():
return int(normalized)
match = re.search(r"(\d+)$", normalized)
if match:
return int(match.group(1))
return None
def parse_timestamp(value):
if not value:
return None
if isinstance(value, (int, float)):
try:
ts_value = float(value)
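# Heuristic: epoch values above ~1e12 are milliseconds; convert them to seconds.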
if ts_value > 1e12:
ts_value = ts_value / 1000.0
return datetime.datetime.fromtimestamp(ts_value, datetime.UTC)
except (TypeError, ValueError, OverflowError):
return None
try:
value_str = str(value).strip()
if value_str.isdigit():
ts_value = float(value_str)
if ts_value > 1e12:
ts_value = ts_value / 1000.0
return datetime.datetime.fromtimestamp(ts_value, datetime.UTC)
parsed = datetime.datetime.fromisoformat(value_str.replace('Z', '+00:00'))
if parsed.tzinfo is None:
return parsed.replace(tzinfo=datetime.UTC)
return parsed.astimezone(datetime.UTC)
except ValueError:
return None
def infer_screen_health_status(payload_data):
explicit = payload_data.get('screen_health_status')
if explicit:
try:
return ScreenHealthStatus[str(explicit).strip().upper()]
except KeyError:
pass
metrics = payload_data.get('health_metrics') or {}
if metrics.get('screen_on') is False:
return ScreenHealthStatus.BLACK
last_frame_update = parse_timestamp(metrics.get('last_frame_update'))
if last_frame_update:
age_seconds = (datetime.datetime.now(datetime.UTC) - last_frame_update).total_seconds()
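# No fresh frame for more than 30 seconds counts as a frozen screen.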
if age_seconds > 30:
return ScreenHealthStatus.FROZEN
return ScreenHealthStatus.OK
return None
def apply_monitoring_update(client_obj, *, event_id=None, process_name=None, process_pid=None,
process_status=None, last_seen=None, screen_health_status=None,
last_screenshot_analyzed=None):
if last_seen:
client_obj.last_alive = last_seen
normalized_event_id = normalize_event_id(event_id)
if normalized_event_id is not None:
client_obj.current_event_id = normalized_event_id
if process_name is not None:
client_obj.current_process = process_name
if process_pid is not None:
client_obj.process_pid = process_pid
normalized_status = normalize_process_status(process_status)
if normalized_status is not None:
client_obj.process_status = normalized_status
if screen_health_status is not None:
client_obj.screen_health_status = screen_health_status
if last_screenshot_analyzed is not None:
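# Only advance the analysis timestamp; ignore screenshots that arrive out of order.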
existing = client_obj.last_screenshot_analyzed
if existing is not None and existing.tzinfo is None:
existing = existing.replace(tzinfo=datetime.UTC)
candidate = last_screenshot_analyzed
if candidate.tzinfo is None:
candidate = candidate.replace(tzinfo=datetime.UTC)
if existing is None or candidate >= existing:
client_obj.last_screenshot_analyzed = candidate
def _normalize_screenshot_type(raw_type):
if raw_type is None:
return None
normalized = str(raw_type).strip().lower()
if normalized in ("periodic", "event_start", "event_stop"):
return normalized
return None
def _classify_dashboard_payload(data):
"""
Classify dashboard payload into migration categories for observability.
"""
if not isinstance(data, dict):
return "parse_failures", None
message_obj = data.get("message") if isinstance(data.get("message"), dict) else None
content_obj = data.get("content") if isinstance(data.get("content"), dict) else None
metadata_obj = data.get("metadata") if isinstance(data.get("metadata"), dict) else None
schema_version = metadata_obj.get("schema_version") if metadata_obj else None
# v2 detection: grouped blocks available with metadata.
if message_obj is not None and content_obj is not None and metadata_obj is not None:
return "v2_success", schema_version
return "parse_failures", schema_version
def _record_dashboard_parse_metric(mode, uuid, schema_version=None, reason=None):
if mode not in DASHBOARD_PARSE_METRICS:
mode = "parse_failures"
DASHBOARD_PARSE_METRICS[mode] += 1
total = sum(DASHBOARD_PARSE_METRICS.values())
if mode == "v2_success":
if schema_version is None:
logging.warning(f"Dashboard payload from {uuid}: missing metadata.schema_version for grouped payload")
else:
version_text = str(schema_version).strip()
if not version_text.startswith("2"):
logging.warning(f"Dashboard payload from {uuid}: unknown schema_version={version_text}")
if mode == "parse_failures":
if reason:
logging.warning(f"Dashboard payload parse failure for {uuid}: {reason}")
else:
logging.warning(f"Dashboard payload parse failure for {uuid}")
if DASHBOARD_METRICS_LOG_EVERY > 0 and total % DASHBOARD_METRICS_LOG_EVERY == 0:
logging.info(
"Dashboard payload metrics: "
f"total={total}, "
f"v2_success={DASHBOARD_PARSE_METRICS['v2_success']}, "
f"parse_failures={DASHBOARD_PARSE_METRICS['parse_failures']}"
)
def _validate_v2_required_fields(data, uuid):
"""
Soft validation of required v2 fields for grouped dashboard payloads.
Logs a WARNING for each missing field. Never drops the message.
"""
message_obj = data.get("message") if isinstance(data.get("message"), dict) else {}
metadata_obj = data.get("metadata") if isinstance(data.get("metadata"), dict) else {}
capture_obj = metadata_obj.get("capture") if isinstance(metadata_obj.get("capture"), dict) else {}
missing = []
if not message_obj.get("client_id"):
missing.append("message.client_id")
if not message_obj.get("status"):
missing.append("message.status")
if not metadata_obj.get("schema_version"):
missing.append("metadata.schema_version")
if not capture_obj.get("type"):
missing.append("metadata.capture.type")
if missing:
logging.warning(
f"Dashboard v2 payload from {uuid} missing required fields: {', '.join(missing)}"
)
def _extract_dashboard_payload_fields(data):
"""
Parse dashboard payload fields from the grouped v2 schema only.
"""
if not isinstance(data, dict):
return {
"image": None,
"timestamp": None,
"screenshot_type": None,
"status": None,
"process_health": {},
}
# v2 grouped payload blocks
message_obj = data.get("message") if isinstance(data.get("message"), dict) else None
content_obj = data.get("content") if isinstance(data.get("content"), dict) else None
runtime_obj = data.get("runtime") if isinstance(data.get("runtime"), dict) else None
metadata_obj = data.get("metadata") if isinstance(data.get("metadata"), dict) else None
screenshot_obj = None
if isinstance(content_obj, dict) and isinstance(content_obj.get("screenshot"), dict):
screenshot_obj = content_obj.get("screenshot")
capture_obj = metadata_obj.get("capture") if metadata_obj and isinstance(metadata_obj.get("capture"), dict) else None
# Screenshot type comes from v2 metadata.capture.type.
screenshot_type = _normalize_screenshot_type(capture_obj.get("type") if capture_obj else None)
# Image from v2 content.screenshot.
image_value = None
for container in (screenshot_obj,):
if not isinstance(container, dict):
continue
for key in ("data", "image"):
value = container.get(key)
if isinstance(value, str) and value:
image_value = value
break
if image_value is not None:
break
# Timestamp precedence: v2 screenshot.timestamp -> capture.captured_at -> metadata.published_at
timestamp_value = None
timestamp_candidates = [
screenshot_obj.get("timestamp") if screenshot_obj else None,
capture_obj.get("captured_at") if capture_obj else None,
metadata_obj.get("published_at") if metadata_obj else None,
]
for value in timestamp_candidates:
if value is not None:
timestamp_value = value
break
# Monitoring fields from v2 message/runtime.
status_value = (message_obj or {}).get("status")
process_health = (runtime_obj or {}).get("process_health")
if not isinstance(process_health, dict):
process_health = {}
return {
"image": image_value,
"timestamp": timestamp_value,
"screenshot_type": screenshot_type,
"status": status_value,
"process_health": process_health,
}
def handle_screenshot(uuid, payload):
"""
@@ -40,13 +324,21 @@ def handle_screenshot(uuid, payload):
# Try to parse as JSON first
try:
data = json.loads(payload.decode())
if "image" in data:
extracted = _extract_dashboard_payload_fields(data)
image_b64 = extracted["image"]
timestamp_value = extracted["timestamp"]
screenshot_type = extracted["screenshot_type"]
if image_b64:
# Payload is JSON with base64 image
api_payload = {"image": data["image"]}
api_payload = {"image": image_b64}
if timestamp_value is not None:
api_payload["timestamp"] = timestamp_value
if screenshot_type:
api_payload["screenshot_type"] = screenshot_type
headers = {"Content-Type": "application/json"}
logging.debug(f"Forwarding base64 screenshot from {uuid} to API")
else:
logging.warning(f"Screenshot JSON from {uuid} missing 'image' field")
logging.warning(f"Screenshot JSON from {uuid} missing image/data field")
return
except (json.JSONDecodeError, UnicodeDecodeError):
# Payload is raw binary image data - encode to base64 for API
@@ -78,7 +370,14 @@ def on_connect(client, userdata, flags, reasonCode, properties):
client.subscribe("infoscreen/+/heartbeat")
client.subscribe("infoscreen/+/screenshot")
client.subscribe("infoscreen/+/dashboard")
logging.info(f"MQTT connected (reasonCode: {reasonCode}); (re)subscribed to discovery, heartbeats, screenshots, and dashboards")
# Subscribe to monitoring topics
client.subscribe("infoscreen/+/logs/error")
client.subscribe("infoscreen/+/logs/warn")
client.subscribe("infoscreen/+/logs/info")
client.subscribe("infoscreen/+/health")
logging.info(f"MQTT connected (reasonCode: {reasonCode}); (re)subscribed to discovery, heartbeats, screenshots, dashboards, logs, and health")
except Exception as e:
logging.error(f"Subscribe failed on connect: {e}")
@@ -94,24 +393,37 @@ def on_message(client, userdata, msg):
try:
payload_text = msg.payload.decode()
data = json.loads(payload_text)
shot = data.get("screenshot")
if isinstance(shot, dict):
# Prefer 'data' field (base64) inside screenshot object
image_b64 = shot.get("data")
if image_b64:
logging.debug(f"Dashboard enthält Screenshot für {uuid}; Weiterleitung an API")
# Build a lightweight JSON with image field for API handler
api_payload = json.dumps({"image": image_b64}).encode("utf-8")
handle_screenshot(uuid, api_payload)
parse_mode, schema_version = _classify_dashboard_payload(data)
_record_dashboard_parse_metric(parse_mode, uuid, schema_version=schema_version)
if parse_mode == "v2_success":
_validate_v2_required_fields(data, uuid)
extracted = _extract_dashboard_payload_fields(data)
image_b64 = extracted["image"]
ts_value = extracted["timestamp"]
screenshot_type = extracted["screenshot_type"]
if image_b64:
logging.debug(f"Dashboard enthält Screenshot für {uuid}; Weiterleitung an API")
# Forward original v2 payload so handle_screenshot can parse grouped fields.
handle_screenshot(uuid, msg.payload)
# Update last_alive and process-health fields when the client reports status "alive"
if data.get("status") == "alive":
if extracted["status"] == "alive":
session = Session()
client_obj = session.query(Client).filter_by(uuid=uuid).first()
if client_obj:
client_obj.last_alive = datetime.datetime.now(datetime.UTC)
process_health = extracted["process_health"]
apply_monitoring_update(
client_obj,
last_seen=datetime.datetime.now(datetime.UTC),
event_id=process_health.get('event_id'),
process_name=process_health.get('current_process') or process_health.get('process'),
process_pid=process_health.get('process_pid') or process_health.get('pid'),
process_status=process_health.get('process_status') or process_health.get('status'),
)
session.commit()
session.close()
except Exception as e:
_record_dashboard_parse_metric("parse_failures", uuid, reason=str(e))
logging.error(f"Fehler beim Verarbeiten des Dashboard-Payloads von {uuid}: {e}")
return
@@ -124,15 +436,110 @@ def on_message(client, userdata, msg):
# Heartbeat-Handling
if topic.startswith("infoscreen/") and topic.endswith("/heartbeat"):
uuid = topic.split("/")[1]
try:
# Parse payload to get optional health data
payload_data = json.loads(msg.payload.decode())
except (json.JSONDecodeError, UnicodeDecodeError):
payload_data = {}
session = Session()
client_obj = session.query(Client).filter_by(uuid=uuid).first()
if client_obj:
client_obj.last_alive = datetime.datetime.now(datetime.UTC)
apply_monitoring_update(
client_obj,
last_seen=datetime.datetime.now(datetime.UTC),
event_id=payload_data.get('current_event_id'),
process_name=payload_data.get('current_process'),
process_pid=payload_data.get('process_pid'),
process_status=payload_data.get('process_status'),
)
session.commit()
logging.info(
f"Heartbeat von {uuid} empfangen, last_alive (UTC) aktualisiert.")
logging.info(f"Heartbeat von {uuid} empfangen, last_alive (UTC) aktualisiert.")
session.close()
return
# Log-Handling (ERROR, WARN, INFO)
if topic.startswith("infoscreen/") and "/logs/" in topic:
parts = topic.split("/")
if len(parts) >= 4:
uuid = parts[1]
level_str = parts[3].upper() # 'error', 'warn', 'info' -> 'ERROR', 'WARN', 'INFO'
try:
payload_data = json.loads(msg.payload.decode())
message = payload_data.get('message', '')
timestamp_str = payload_data.get('timestamp')
context = payload_data.get('context', {})
# Parse timestamp or use current time
if timestamp_str:
try:
log_timestamp = datetime.datetime.fromisoformat(timestamp_str.replace('Z', '+00:00'))
if log_timestamp.tzinfo is None:
log_timestamp = log_timestamp.replace(tzinfo=datetime.UTC)
except ValueError:
log_timestamp = datetime.datetime.now(datetime.UTC)
else:
log_timestamp = datetime.datetime.now(datetime.UTC)
# Store in database
session = Session()
try:
log_level = LogLevel[level_str]
log_entry = ClientLog(
client_uuid=uuid,
timestamp=log_timestamp,
level=log_level,
message=message,
context=json.dumps(context) if context else None
)
session.add(log_entry)
session.commit()
logging.info(f"[{level_str}] {uuid}: {message}")
except Exception as e:
logging.error(f"Error saving log from {uuid}: {e}")
session.rollback()
finally:
session.close()
except (json.JSONDecodeError, UnicodeDecodeError) as e:
logging.error(f"Could not parse log payload from {uuid}: {e}")
return
# Health-Handling
if topic.startswith("infoscreen/") and topic.endswith("/health"):
uuid = topic.split("/")[1]
try:
payload_data = json.loads(msg.payload.decode())
session = Session()
client_obj = session.query(Client).filter_by(uuid=uuid).first()
if client_obj:
# Expected state as scheduled for this client
expected = payload_data.get('expected_state', {})
# Actual state observed on the client
actual = payload_data.get('actual_state', {})
screen_health_status = infer_screen_health_status(payload_data)
apply_monitoring_update(
client_obj,
last_seen=datetime.datetime.now(datetime.UTC),
event_id=expected.get('event_id'),
process_name=actual.get('process'),
process_pid=actual.get('pid'),
process_status=actual.get('status'),
screen_health_status=screen_health_status,
last_screenshot_analyzed=parse_timestamp((payload_data.get('health_metrics') or {}).get('last_frame_update')),
)
session.commit()
logging.debug(f"Health update from {uuid}: {actual.get('process')} ({actual.get('status')})")
session.close()
except (json.JSONDecodeError, UnicodeDecodeError) as e:
logging.error(f"Could not parse health payload from {uuid}: {e}")
except Exception as e:
logging.error(f"Error processing health from {uuid}: {e}")
return
# Discovery-Handling
if topic == "infoscreen/discovery":

View File: listener/test_listener_parser.py

@@ -0,0 +1,378 @@
"""
Mixed-format unit tests for the dashboard payload parser.
Tests cover:
- Legacy top-level payload is rejected (v2-only mode)
- v2 grouped payload: periodic capture
- v2 grouped payload: event_start capture
- v2 grouped payload: event_stop capture
- Classification into v2_success / parse_failures
- Soft required-field validation (v2 only, never drops message)
- Edge cases: missing image, missing status, non-dict payload
"""
import sys
import os
import logging
import importlib.util
# listener/ has no __init__.py — load the module directly from its file path
os.environ.setdefault("DB_CONN", "sqlite:///:memory:") # prevent DB engine errors on import
_LISTENER_PATH = os.path.join(os.path.dirname(__file__), "listener.py")
_spec = importlib.util.spec_from_file_location("listener_module", _LISTENER_PATH)
_mod = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(_mod)
_extract_dashboard_payload_fields = _mod._extract_dashboard_payload_fields
_classify_dashboard_payload = _mod._classify_dashboard_payload
_validate_v2_required_fields = _mod._validate_v2_required_fields
_normalize_screenshot_type = _mod._normalize_screenshot_type
DASHBOARD_PARSE_METRICS = _mod.DASHBOARD_PARSE_METRICS
# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------
IMAGE_B64 = "aGVsbG8=" # base64("hello")
LEGACY_PAYLOAD = {
"client_id": "uuid-legacy",
"status": "alive",
"screenshot": {
"data": IMAGE_B64,
"timestamp": "2026-03-30T10:00:00+00:00",
},
"screenshot_type": "periodic",
"process_health": {
"current_process": "impressive",
"process_pid": 1234,
"process_status": "running",
"event_id": 42,
},
}
def _make_v2(capture_type):
return {
"message": {
"client_id": "uuid-v2",
"status": "alive",
},
"content": {
"screenshot": {
"filename": "latest.jpg",
"data": IMAGE_B64,
"timestamp": "2026-03-30T10:15:41.123456+00:00",
"size": 6,
}
},
"runtime": {
"system_info": {
"hostname": "pi-display-01",
"ip": "192.168.1.42",
"uptime": 12345.0,
},
"process_health": {
"event_id": "evt-7",
"event_type": "presentation",
"current_process": "impressive",
"process_pid": 4123,
"process_status": "running",
"restart_count": 0,
},
},
"metadata": {
"schema_version": "2.0",
"producer": "simclient",
"published_at": "2026-03-30T10:15:42.004321+00:00",
"capture": {
"type": capture_type,
"captured_at": "2026-03-30T10:15:41.123456+00:00",
"age_s": 0.9,
"triggered": capture_type != "periodic",
"send_immediately": capture_type != "periodic",
},
"transport": {"qos": 0, "publisher": "simclient"},
},
}
V2_PERIODIC = _make_v2("periodic")
V2_EVT_START = _make_v2("event_start")
V2_EVT_STOP = _make_v2("event_stop")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def assert_eq(label, actual, expected):
assert actual == expected, f"FAIL [{label}]: expected {expected!r}, got {actual!r}"
def assert_not_none(label, actual):
assert actual is not None, f"FAIL [{label}]: expected non-None, got None"
def assert_none(label, actual):
assert actual is None, f"FAIL [{label}]: expected None, got {actual!r}"
def assert_warns(label, fn, substring):
"""Assert that fn() emits a logging.WARNING containing substring."""
records = []
handler = logging.handlers_collector(records)
logger = logging.getLogger()
logger.addHandler(handler)
try:
fn()
finally:
logger.removeHandler(handler)
warnings = [r.getMessage() for r in records if r.levelno == logging.WARNING]
assert any(substring in w for w in warnings), (
f"FAIL [{label}]: no WARNING containing {substring!r} found in {warnings}"
)
class _CapturingHandler(logging.Handler):
def __init__(self, records):
super().__init__()
self._records = records
def emit(self, record):
self._records.append(record)
def capture_warnings(fn):
"""Run fn(), return list of WARNING message strings."""
records = []
handler = _CapturingHandler(records)
logger = logging.getLogger()
logger.addHandler(handler)
try:
fn()
finally:
logger.removeHandler(handler)
return [r.getMessage() for r in records if r.levelno == logging.WARNING]
# ---------------------------------------------------------------------------
# Tests: _normalize_screenshot_type
# ---------------------------------------------------------------------------
def test_normalize_known_types():
for t in ("periodic", "event_start", "event_stop"):
assert_eq(f"normalize_{t}", _normalize_screenshot_type(t), t)
assert_eq(f"normalize_{t}_upper", _normalize_screenshot_type(t.upper()), t)
def test_normalize_unknown_returns_none():
assert_none("normalize_unknown", _normalize_screenshot_type("unknown"))
assert_none("normalize_none", _normalize_screenshot_type(None))
assert_none("normalize_empty", _normalize_screenshot_type(""))
# ---------------------------------------------------------------------------
# Tests: _classify_dashboard_payload
# ---------------------------------------------------------------------------
def test_classify_legacy():
mode, ver = _classify_dashboard_payload(LEGACY_PAYLOAD)
assert_eq("classify_legacy_mode", mode, "parse_failures")
assert_none("classify_legacy_version", ver)
def test_classify_v2_periodic():
mode, ver = _classify_dashboard_payload(V2_PERIODIC)
assert_eq("classify_v2_periodic_mode", mode, "v2_success")
assert_eq("classify_v2_periodic_version", ver, "2.0")
def test_classify_v2_event_start():
mode, ver = _classify_dashboard_payload(V2_EVT_START)
assert_eq("classify_v2_event_start_mode", mode, "v2_success")
def test_classify_v2_event_stop():
mode, ver = _classify_dashboard_payload(V2_EVT_STOP)
assert_eq("classify_v2_event_stop_mode", mode, "v2_success")
def test_classify_non_dict():
mode, ver = _classify_dashboard_payload("not a dict")
assert_eq("classify_non_dict", mode, "parse_failures")
def test_classify_empty_dict():
mode, ver = _classify_dashboard_payload({})
assert_eq("classify_empty_dict", mode, "parse_failures")
# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — legacy payload rejected in v2-only mode
# ---------------------------------------------------------------------------
def test_legacy_image_not_extracted():
r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
assert_none("legacy_image", r["image"])
def test_legacy_screenshot_type_missing():
r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
assert_none("legacy_screenshot_type", r["screenshot_type"])
def test_legacy_status_missing():
r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
assert_none("legacy_status", r["status"])
def test_legacy_process_health_empty():
r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
assert_eq("legacy_process_health", r["process_health"], {})
def test_legacy_timestamp_missing():
r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
assert_none("legacy_timestamp", r["timestamp"])
# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — v2 periodic
# ---------------------------------------------------------------------------
def test_v2_periodic_image():
r = _extract_dashboard_payload_fields(V2_PERIODIC)
assert_eq("v2_periodic_image", r["image"], IMAGE_B64)
def test_v2_periodic_screenshot_type():
r = _extract_dashboard_payload_fields(V2_PERIODIC)
assert_eq("v2_periodic_type", r["screenshot_type"], "periodic")
def test_v2_periodic_status():
r = _extract_dashboard_payload_fields(V2_PERIODIC)
assert_eq("v2_periodic_status", r["status"], "alive")
def test_v2_periodic_process_health():
r = _extract_dashboard_payload_fields(V2_PERIODIC)
assert_eq("v2_periodic_pid", r["process_health"]["process_pid"], 4123)
assert_eq("v2_periodic_process", r["process_health"]["current_process"], "impressive")
def test_v2_periodic_timestamp_prefers_screenshot():
r = _extract_dashboard_payload_fields(V2_PERIODIC)
# screenshot.timestamp must take precedence over capture.captured_at / published_at
assert_eq("v2_periodic_ts", r["timestamp"], "2026-03-30T10:15:41.123456+00:00")
# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — v2 event_start
# ---------------------------------------------------------------------------
def test_v2_event_start_type():
r = _extract_dashboard_payload_fields(V2_EVT_START)
assert_eq("v2_event_start_type", r["screenshot_type"], "event_start")
def test_v2_event_start_image():
r = _extract_dashboard_payload_fields(V2_EVT_START)
assert_eq("v2_event_start_image", r["image"], IMAGE_B64)
# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — v2 event_stop
# ---------------------------------------------------------------------------
def test_v2_event_stop_type():
r = _extract_dashboard_payload_fields(V2_EVT_STOP)
assert_eq("v2_event_stop_type", r["screenshot_type"], "event_stop")
def test_v2_event_stop_image():
r = _extract_dashboard_payload_fields(V2_EVT_STOP)
assert_eq("v2_event_stop_image", r["image"], IMAGE_B64)
# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — edge cases
# ---------------------------------------------------------------------------
def test_non_dict_returns_nulls():
r = _extract_dashboard_payload_fields("bad")
assert_none("non_dict_image", r["image"])
assert_none("non_dict_type", r["screenshot_type"])
assert_none("non_dict_status", r["status"])
def test_missing_image_returns_none():
payload = {**V2_PERIODIC, "content": {"screenshot": {"timestamp": "2026-03-30T10:00:00+00:00"}}}
r = _extract_dashboard_payload_fields(payload)
assert_none("missing_image", r["image"])
def test_missing_process_health_returns_empty_dict():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["runtime"]["process_health"]
r = _extract_dashboard_payload_fields(payload)
assert_eq("missing_ph", r["process_health"], {})
def test_timestamp_fallback_to_captured_at_when_no_screenshot_ts():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["content"]["screenshot"]["timestamp"]
r = _extract_dashboard_payload_fields(payload)
assert_eq("ts_fallback_captured_at", r["timestamp"], "2026-03-30T10:15:41.123456+00:00")
def test_timestamp_fallback_to_published_at_when_no_capture_ts():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["content"]["screenshot"]["timestamp"]
del payload["metadata"]["capture"]["captured_at"]
r = _extract_dashboard_payload_fields(payload)
assert_eq("ts_fallback_published_at", r["timestamp"], "2026-03-30T10:15:42.004321+00:00")
# ---------------------------------------------------------------------------
# Tests: _validate_v2_required_fields (soft — never raises)
# ---------------------------------------------------------------------------
def test_v2_valid_payload_no_warnings():
warnings = capture_warnings(lambda: _validate_v2_required_fields(V2_PERIODIC, "uuid-v2"))
assert warnings == [], f"FAIL: unexpected warnings for valid payload: {warnings}"
def test_v2_missing_client_id_warns():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["message"]["client_id"]
warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
assert any("message.client_id" in w for w in warnings), f"FAIL: {warnings}"
def test_v2_missing_status_warns():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["message"]["status"]
warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
assert any("message.status" in w for w in warnings), f"FAIL: {warnings}"
def test_v2_missing_schema_version_warns():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["metadata"]["schema_version"]
warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
assert any("metadata.schema_version" in w for w in warnings), f"FAIL: {warnings}"
def test_v2_missing_capture_type_warns():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["metadata"]["capture"]["type"]
warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
assert any("metadata.capture.type" in w for w in warnings), f"FAIL: {warnings}"
def test_v2_multiple_missing_fields_all_reported():
import copy
payload = copy.deepcopy(V2_PERIODIC)
del payload["message"]["client_id"]
del payload["metadata"]["capture"]["type"]
warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
assert len(warnings) == 1, f"FAIL: expected 1 combined warning, got {warnings}"
assert "message.client_id" in warnings[0], f"FAIL: {warnings}"
assert "metadata.capture.type" in warnings[0], f"FAIL: {warnings}"
# ---------------------------------------------------------------------------
# Runner
# ---------------------------------------------------------------------------
def run_all():
tests = {k: v for k, v in globals().items() if k.startswith("test_") and callable(v)}
passed = failed = 0
for name, fn in sorted(tests.items()):
try:
fn()
print(f" PASS {name}")
passed += 1
except AssertionError as e:
print(f" FAIL {name}: {e}")
failed += 1
except Exception as e:
print(f" ERROR {name}: {type(e).__name__}: {e}")
failed += 1
print(f"\n{passed} passed, {failed} failed out of {passed + failed} tests")
return failed == 0
if __name__ == "__main__":
ok = run_all()
sys.exit(0 if ok else 1)

View File: models/models.py

@@ -21,6 +21,27 @@ class AcademicPeriodType(enum.Enum):
trimester = "trimester"
class LogLevel(enum.Enum):
ERROR = "ERROR"
WARN = "WARN"
INFO = "INFO"
DEBUG = "DEBUG"
class ProcessStatus(enum.Enum):
running = "running"
crashed = "crashed"
starting = "starting"
stopped = "stopped"
class ScreenHealthStatus(enum.Enum):
OK = "OK"
BLACK = "BLACK"
FROZEN = "FROZEN"
UNKNOWN = "UNKNOWN"
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True, autoincrement=True)
@@ -106,6 +127,31 @@ class Client(Base):
is_active = Column(Boolean, default=True, nullable=False)
group_id = Column(Integer, ForeignKey(
'client_groups.id'), nullable=False, default=1)
# Health monitoring fields
current_event_id = Column(Integer, nullable=True)
current_process = Column(String(50), nullable=True) # 'vlc', 'chromium', 'pdf_viewer'
process_status = Column(Enum(ProcessStatus), nullable=True)
process_pid = Column(Integer, nullable=True)
last_screenshot_analyzed = Column(TIMESTAMP(timezone=True), nullable=True)
screen_health_status = Column(Enum(ScreenHealthStatus), nullable=True, server_default='UNKNOWN')
last_screenshot_hash = Column(String(32), nullable=True)
class ClientLog(Base):
__tablename__ = 'client_logs'
id = Column(Integer, primary_key=True, autoincrement=True)
client_uuid = Column(String(36), ForeignKey('clients.uuid', ondelete='CASCADE'), nullable=False, index=True)
timestamp = Column(TIMESTAMP(timezone=True), nullable=False, index=True)
level = Column(Enum(LogLevel), nullable=False, index=True)
message = Column(Text, nullable=False)
context = Column(Text, nullable=True) # JSON stored as text
created_at = Column(TIMESTAMP(timezone=True), server_default=func.current_timestamp(), nullable=False)
__table_args__ = (
Index('ix_client_logs_client_timestamp', 'client_uuid', 'timestamp'),
Index('ix_client_logs_level_timestamp', 'level', 'timestamp'),
)
class EventType(enum.Enum):

View File

@@ -306,6 +306,7 @@ def format_event_with_media(event):
"autoplay": getattr(event, "autoplay", True),
"loop": getattr(event, "loop", False),
"volume": getattr(event, "volume", 0.8),
"muted": getattr(event, "muted", False),
# Best-effort metadata to help clients decide how to stream
"mime_type": mime_type,
"size": size,

View File

@@ -0,0 +1,84 @@
"""add client monitoring tables and columns
Revision ID: c1d2e3f4g5h6
Revises: 4f0b8a3e5c20
Create Date: 2026-03-09 21:08:38.000000
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = 'c1d2e3f4g5h6'
down_revision = '4f0b8a3e5c20'
branch_labels = None
depends_on = None
def upgrade():
bind = op.get_bind()
inspector = sa.inspect(bind)
# 1. Add health monitoring columns to clients table (safe on rerun)
existing_client_columns = {c['name'] for c in inspector.get_columns('clients')}
if 'current_event_id' not in existing_client_columns:
op.add_column('clients', sa.Column('current_event_id', sa.Integer(), nullable=True))
if 'current_process' not in existing_client_columns:
op.add_column('clients', sa.Column('current_process', sa.String(50), nullable=True))
if 'process_status' not in existing_client_columns:
op.add_column('clients', sa.Column('process_status', sa.Enum('running', 'crashed', 'starting', 'stopped', name='processstatus'), nullable=True))
if 'process_pid' not in existing_client_columns:
op.add_column('clients', sa.Column('process_pid', sa.Integer(), nullable=True))
if 'last_screenshot_analyzed' not in existing_client_columns:
op.add_column('clients', sa.Column('last_screenshot_analyzed', sa.TIMESTAMP(timezone=True), nullable=True))
if 'screen_health_status' not in existing_client_columns:
op.add_column('clients', sa.Column('screen_health_status', sa.Enum('OK', 'BLACK', 'FROZEN', 'UNKNOWN', name='screenhealthstatus'), nullable=True, server_default='UNKNOWN'))
if 'last_screenshot_hash' not in existing_client_columns:
op.add_column('clients', sa.Column('last_screenshot_hash', sa.String(32), nullable=True))
# 2. Create client_logs table (safe on rerun)
if not inspector.has_table('client_logs'):
op.create_table('client_logs',
sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
sa.Column('client_uuid', sa.String(36), nullable=False),
sa.Column('timestamp', sa.TIMESTAMP(timezone=True), nullable=False),
sa.Column('level', sa.Enum('ERROR', 'WARN', 'INFO', 'DEBUG', name='loglevel'), nullable=False),
sa.Column('message', sa.Text(), nullable=False),
sa.Column('context', sa.JSON(), nullable=True),
sa.Column('created_at', sa.TIMESTAMP(timezone=True), server_default=sa.func.current_timestamp(), nullable=False),
sa.PrimaryKeyConstraint('id'),
sa.ForeignKeyConstraint(['client_uuid'], ['clients.uuid'], ondelete='CASCADE'),
mysql_charset='utf8mb4',
mysql_collate='utf8mb4_unicode_ci',
mysql_engine='InnoDB'
)
# 3. Create indexes for efficient querying (safe on rerun)
# Re-inspect so the client_logs table created above is visible to the checks below
inspector = sa.inspect(bind)
client_log_indexes = {idx['name'] for idx in inspector.get_indexes('client_logs')} if inspector.has_table('client_logs') else set()
client_indexes = {idx['name'] for idx in inspector.get_indexes('clients')}
if 'ix_client_logs_client_timestamp' not in client_log_indexes:
op.create_index('ix_client_logs_client_timestamp', 'client_logs', ['client_uuid', 'timestamp'])
if 'ix_client_logs_level_timestamp' not in client_log_indexes:
op.create_index('ix_client_logs_level_timestamp', 'client_logs', ['level', 'timestamp'])
if 'ix_clients_process_status' not in client_indexes:
op.create_index('ix_clients_process_status', 'clients', ['process_status'])
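# Note: the inspector guards above make upgrade() safe to re-apply against a
# partially migrated database; nothing is created twice.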
def downgrade():
# Drop indexes
op.drop_index('ix_clients_process_status', table_name='clients')
op.drop_index('ix_client_logs_level_timestamp', table_name='client_logs')
op.drop_index('ix_client_logs_client_timestamp', table_name='client_logs')
# Drop table
op.drop_table('client_logs')
# Drop columns from clients
op.drop_column('clients', 'last_screenshot_hash')
op.drop_column('clients', 'screen_health_status')
op.drop_column('clients', 'last_screenshot_analyzed')
op.drop_column('clients', 'process_pid')
op.drop_column('clients', 'process_status')
op.drop_column('clients', 'current_process')
op.drop_column('clients', 'current_event_id')

View File

@@ -0,0 +1,491 @@
from flask import Blueprint, jsonify, request
from server.database import Session
from server.permissions import admin_or_higher, superadmin_only
from models.models import ClientLog, Client, ClientGroup, LogLevel
from sqlalchemy import desc, func
from datetime import datetime, timedelta, timezone
import json
import os
import glob
from server.serializers import dict_to_camel_case
client_logs_bp = Blueprint("client_logs", __name__, url_prefix="/api/client-logs")
PRIORITY_SCREENSHOT_TTL_SECONDS = int(os.environ.get("PRIORITY_SCREENSHOT_TTL_SECONDS", "120"))
def _grace_period_seconds():
env = os.environ.get("ENV", "production").lower()
if env in ("development", "dev"):
return int(os.environ.get("HEARTBEAT_GRACE_PERIOD_DEV", "180"))
return int(os.environ.get("HEARTBEAT_GRACE_PERIOD_PROD", "170"))
def _to_utc(dt):
if dt is None:
return None
if dt.tzinfo is None:
return dt.replace(tzinfo=timezone.utc)
return dt.astimezone(timezone.utc)
def _is_client_alive(last_alive, is_active):
if not last_alive or not is_active:
return False
return (datetime.now(timezone.utc) - _to_utc(last_alive)) <= timedelta(seconds=_grace_period_seconds())
def _safe_context(raw_context):
if not raw_context:
return {}
try:
return json.loads(raw_context)
except (TypeError, json.JSONDecodeError):
return {"raw": raw_context}
def _serialize_log_entry(log, include_client_uuid=False):
if not log:
return None
entry = {
"id": log.id,
"timestamp": log.timestamp.isoformat() if log.timestamp else None,
"level": log.level.value if log.level else None,
"message": log.message,
"context": _safe_context(log.context),
}
if include_client_uuid:
entry["client_uuid"] = log.client_uuid
return entry
def _determine_client_status(is_alive, process_status, screen_health_status, log_counts):
if not is_alive:
return "offline"
if process_status == "crashed" or screen_health_status in ("BLACK", "FROZEN"):
return "critical"
if log_counts.get("ERROR", 0) > 0:
return "critical"
if process_status in ("starting", "stopped") or log_counts.get("WARN", 0) > 0:
return "warning"
return "healthy"
def _infer_last_screenshot_ts(client_uuid):
screenshots_dir = os.path.join(os.path.dirname(__file__), "..", "screenshots")
candidate_files = []
latest_file = os.path.join(screenshots_dir, f"{client_uuid}.jpg")
if os.path.exists(latest_file):
candidate_files.append(latest_file)
candidate_files.extend(glob.glob(os.path.join(screenshots_dir, f"{client_uuid}_*.jpg")))
if not candidate_files:
return None
try:
newest_path = max(candidate_files, key=os.path.getmtime)
return datetime.fromtimestamp(os.path.getmtime(newest_path), timezone.utc)
except Exception:
return None
def _load_screenshot_metadata(client_uuid):
screenshots_dir = os.path.join(os.path.dirname(__file__), "..", "screenshots")
metadata_path = os.path.join(screenshots_dir, f"{client_uuid}_meta.json")
if not os.path.exists(metadata_path):
return {}
try:
with open(metadata_path, "r", encoding="utf-8") as metadata_file:
data = json.load(metadata_file)
return data if isinstance(data, dict) else {}
except Exception:
return {}
def _is_priority_screenshot_active(priority_received_at):
if not priority_received_at:
return False
try:
normalized = str(priority_received_at).replace("Z", "+00:00")
parsed = datetime.fromisoformat(normalized)
parsed_utc = _to_utc(parsed)
except Exception:
return False
return (datetime.now(timezone.utc) - parsed_utc) <= timedelta(seconds=PRIORITY_SCREENSHOT_TTL_SECONDS)
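# Example: with the default TTL of 120 seconds, a priority screenshot received
# at 12:00:00Z keeps the priority URL active until 12:02:00Z; afterwards the
# overview falls back to the regular /screenshots/<uuid> URL.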
@client_logs_bp.route("/test", methods=["GET"])
def test_client_logs():
"""Test endpoint to verify logging infrastructure (no auth required)"""
session = Session()
try:
# Count total logs
total_logs = session.query(func.count(ClientLog.id)).scalar()
# Count by level
error_count = session.query(func.count(ClientLog.id)).filter_by(level=LogLevel.ERROR).scalar()
warn_count = session.query(func.count(ClientLog.id)).filter_by(level=LogLevel.WARN).scalar()
info_count = session.query(func.count(ClientLog.id)).filter_by(level=LogLevel.INFO).scalar()
# Get last 5 logs
recent_logs = session.query(ClientLog).order_by(desc(ClientLog.timestamp)).limit(5).all()
recent = []
for log in recent_logs:
recent.append({
"client_uuid": log.client_uuid,
"level": log.level.value if log.level else None,
"message": log.message,
"timestamp": log.timestamp.isoformat() if log.timestamp else None
})
session.close()
return jsonify({
"status": "ok",
"infrastructure": "working",
"total_logs": total_logs,
"counts": {
"ERROR": error_count,
"WARN": warn_count,
"INFO": info_count
},
"recent_5": recent
})
except Exception as e:
session.close()
return jsonify({"status": "error", "message": str(e)}), 500
@client_logs_bp.route("/<uuid>/logs", methods=["GET"])
@admin_or_higher
def get_client_logs(uuid):
"""
Get logs for a specific client
Query params:
- level: ERROR, WARN, INFO, DEBUG (optional)
- limit: number of entries (default 50, max 500)
- since: ISO timestamp (optional)
Example: /api/client-logs/abc-123/logs?level=ERROR&limit=100
"""
session = Session()
try:
# Verify client exists
client = session.query(Client).filter_by(uuid=uuid).first()
if not client:
session.close()
return jsonify({"error": "Client not found"}), 404
# Parse query parameters
level_param = request.args.get('level')
limit = min(int(request.args.get('limit', 50)), 500)
since_param = request.args.get('since')
# Build query
query = session.query(ClientLog).filter_by(client_uuid=uuid)
# Filter by log level
if level_param:
try:
level_enum = LogLevel[level_param.upper()]
query = query.filter_by(level=level_enum)
except KeyError:
session.close()
return jsonify({"error": f"Invalid level: {level_param}. Must be ERROR, WARN, INFO, or DEBUG"}), 400
# Filter by timestamp
if since_param:
try:
# Handle both with and without 'Z' suffix
since_str = since_param.replace('Z', '+00:00')
since_dt = datetime.fromisoformat(since_str)
if since_dt.tzinfo is None:
since_dt = since_dt.replace(tzinfo=timezone.utc)
query = query.filter(ClientLog.timestamp >= since_dt)
except ValueError:
session.close()
return jsonify({"error": "Invalid timestamp format. Use ISO 8601"}), 400
# Execute query
logs = query.order_by(desc(ClientLog.timestamp)).limit(limit).all()
# Format results
result = []
for log in logs:
result.append(_serialize_log_entry(log))
session.close()
return jsonify({
"client_uuid": uuid,
"logs": result,
"count": len(result),
"limit": limit
})
except Exception as e:
session.close()
return jsonify({"error": f"Server error: {str(e)}"}), 500
@client_logs_bp.route("/summary", methods=["GET"])
@admin_or_higher
def get_logs_summary():
"""
Get summary of errors/warnings across all clients in last 24 hours
Returns count of ERROR, WARN, INFO logs per client
Example response:
{
"summary": {
"client-uuid-1": {"ERROR": 5, "WARN": 12, "INFO": 45},
"client-uuid-2": {"ERROR": 0, "WARN": 3, "INFO": 20}
},
"period_hours": 24,
"timestamp": "2026-03-09T21:00:00Z"
}
"""
session = Session()
try:
# Get hours parameter (default 24, max 168 = 1 week)
hours = min(int(request.args.get('hours', 24)), 168)
since = datetime.now(timezone.utc) - timedelta(hours=hours)
# Query log counts grouped by client and level
stats = session.query(
ClientLog.client_uuid,
ClientLog.level,
func.count(ClientLog.id).label('count')
).filter(
ClientLog.timestamp >= since
).group_by(
ClientLog.client_uuid,
ClientLog.level
).all()
# Build summary dictionary
summary = {}
for stat in stats:
uuid = stat.client_uuid
if uuid not in summary:
# Initialize all levels to 0
summary[uuid] = {
"ERROR": 0,
"WARN": 0,
"INFO": 0,
"DEBUG": 0
}
summary[uuid][stat.level.value] = stat.count
# Get client info for enrichment
clients = session.query(Client.uuid, Client.hostname, Client.description).all()
client_info = {c.uuid: {"hostname": c.hostname, "description": c.description} for c in clients}
# Enrich summary with client info
enriched_summary = {}
for uuid, counts in summary.items():
enriched_summary[uuid] = {
"counts": counts,
"info": client_info.get(uuid, {})
}
session.close()
return jsonify({
"summary": enriched_summary,
"period_hours": hours,
"since": since.isoformat(),
"timestamp": datetime.now(timezone.utc).isoformat()
})
except Exception as e:
session.close()
return jsonify({"error": f"Server error: {str(e)}"}), 500
@client_logs_bp.route("/monitoring-overview", methods=["GET"])
@superadmin_only
def get_monitoring_overview():
"""Return a dashboard-friendly monitoring overview for all clients."""
session = Session()
try:
hours = min(int(request.args.get("hours", 24)), 168)
since = datetime.now(timezone.utc) - timedelta(hours=hours)
clients = (
session.query(Client, ClientGroup.name.label("group_name"))
.outerjoin(ClientGroup, Client.group_id == ClientGroup.id)
.order_by(ClientGroup.name.asc(), Client.description.asc(), Client.hostname.asc(), Client.uuid.asc())
.all()
)
log_stats = (
session.query(
ClientLog.client_uuid,
ClientLog.level,
func.count(ClientLog.id).label("count"),
)
.filter(ClientLog.timestamp >= since)
.group_by(ClientLog.client_uuid, ClientLog.level)
.all()
)
counts_by_client = {}
for stat in log_stats:
if stat.client_uuid not in counts_by_client:
counts_by_client[stat.client_uuid] = {
"ERROR": 0,
"WARN": 0,
"INFO": 0,
"DEBUG": 0,
}
counts_by_client[stat.client_uuid][stat.level.value] = stat.count
clients_payload = []
summary_counts = {
"total_clients": 0,
"online_clients": 0,
"offline_clients": 0,
"healthy_clients": 0,
"warning_clients": 0,
"critical_clients": 0,
"error_logs": 0,
"warn_logs": 0,
"active_priority_screenshots": 0,
}
for client, group_name in clients:
log_counts = counts_by_client.get(
client.uuid,
{"ERROR": 0, "WARN": 0, "INFO": 0, "DEBUG": 0},
)
is_alive = _is_client_alive(client.last_alive, client.is_active)
process_status = client.process_status.value if client.process_status else None
screen_health_status = client.screen_health_status.value if client.screen_health_status else None
status = _determine_client_status(is_alive, process_status, screen_health_status, log_counts)
latest_log = (
session.query(ClientLog)
.filter_by(client_uuid=client.uuid)
.order_by(desc(ClientLog.timestamp))
.first()
)
latest_error = (
session.query(ClientLog)
.filter_by(client_uuid=client.uuid, level=LogLevel.ERROR)
.order_by(desc(ClientLog.timestamp))
.first()
)
screenshot_ts = client.last_screenshot_analyzed or _infer_last_screenshot_ts(client.uuid)
screenshot_meta = _load_screenshot_metadata(client.uuid)
latest_screenshot_type = screenshot_meta.get("latest_screenshot_type") or "periodic"
priority_screenshot_type = screenshot_meta.get("last_priority_screenshot_type")
priority_screenshot_received_at = screenshot_meta.get("last_priority_received_at")
has_active_priority = _is_priority_screenshot_active(priority_screenshot_received_at)
screenshot_url = f"/screenshots/{client.uuid}/priority" if has_active_priority else f"/screenshots/{client.uuid}"
clients_payload.append({
"uuid": client.uuid,
"hostname": client.hostname,
"description": client.description,
"ip": client.ip,
"model": client.model,
"group_id": client.group_id,
"group_name": group_name,
"registration_time": client.registration_time.isoformat() if client.registration_time else None,
"last_alive": client.last_alive.isoformat() if client.last_alive else None,
"is_alive": is_alive,
"status": status,
"current_event_id": client.current_event_id,
"current_process": client.current_process,
"process_status": process_status,
"process_pid": client.process_pid,
"screen_health_status": screen_health_status,
"last_screenshot_analyzed": screenshot_ts.isoformat() if screenshot_ts else None,
"last_screenshot_hash": client.last_screenshot_hash,
"latest_screenshot_type": latest_screenshot_type,
"priority_screenshot_type": priority_screenshot_type,
"priority_screenshot_received_at": priority_screenshot_received_at,
"has_active_priority_screenshot": has_active_priority,
"screenshot_url": screenshot_url,
"log_counts_24h": {
"error": log_counts["ERROR"],
"warn": log_counts["WARN"],
"info": log_counts["INFO"],
"debug": log_counts["DEBUG"],
},
"latest_log": _serialize_log_entry(latest_log),
"latest_error": _serialize_log_entry(latest_error),
})
summary_counts["total_clients"] += 1
summary_counts["error_logs"] += log_counts["ERROR"]
summary_counts["warn_logs"] += log_counts["WARN"]
if has_active_priority:
summary_counts["active_priority_screenshots"] += 1
if is_alive:
summary_counts["online_clients"] += 1
else:
summary_counts["offline_clients"] += 1
if status == "healthy":
summary_counts["healthy_clients"] += 1
elif status == "warning":
summary_counts["warning_clients"] += 1
elif status == "critical":
summary_counts["critical_clients"] += 1
payload = {
"summary": summary_counts,
"period_hours": hours,
"grace_period_seconds": _grace_period_seconds(),
"since": since.isoformat(),
"timestamp": datetime.now(timezone.utc).isoformat(),
"clients": clients_payload,
}
session.close()
return jsonify(dict_to_camel_case(payload))
except Exception as e:
session.close()
return jsonify({"error": f"Server error: {str(e)}"}), 500
@client_logs_bp.route("/recent-errors", methods=["GET"])
@admin_or_higher
def get_recent_errors():
"""
Get recent ERROR logs across all clients
Query params:
- limit: number of entries (default 20, max 100)
Useful for system-wide error monitoring
"""
session = Session()
try:
limit = min(int(request.args.get('limit', 20)), 100)
# Get recent errors from all clients
logs = session.query(ClientLog).filter_by(
level=LogLevel.ERROR
).order_by(
desc(ClientLog.timestamp)
).limit(limit).all()
result = []
for log in logs:
result.append(_serialize_log_entry(log, include_client_uuid=True))
session.close()
return jsonify({
"errors": result,
"count": len(result)
})
except Exception as e:
session.close()
return jsonify({"error": f"Server error: {str(e)}"}), 500

View File

@@ -4,10 +4,58 @@ from flask import Blueprint, request, jsonify
from server.permissions import admin_or_higher
from server.mqtt_helper import publish_client_group, delete_client_group_message, publish_multiple_client_groups
import sys
import os
import glob
import base64
import hashlib
import json
from datetime import datetime, timezone
sys.path.append('/workspace')
clients_bp = Blueprint("clients", __name__, url_prefix="/api/clients")
VALID_SCREENSHOT_TYPES = {"periodic", "event_start", "event_stop"}
def _normalize_screenshot_type(raw_type):
if raw_type is None:
return "periodic"
normalized = str(raw_type).strip().lower()
if normalized in VALID_SCREENSHOT_TYPES:
return normalized
return "periodic"
def _parse_screenshot_timestamp(raw_timestamp):
if raw_timestamp is None:
return None
try:
if isinstance(raw_timestamp, (int, float)):
ts_value = float(raw_timestamp)
if ts_value > 1e12:
ts_value = ts_value / 1000.0
return datetime.fromtimestamp(ts_value, timezone.utc)
if isinstance(raw_timestamp, str):
ts = raw_timestamp.strip()
if not ts:
return None
if ts.isdigit():
ts_value = float(ts)
if ts_value > 1e12:
ts_value = ts_value / 1000.0
return datetime.fromtimestamp(ts_value, timezone.utc)
ts_normalized = ts.replace("Z", "+00:00") if ts.endswith("Z") else ts
parsed = datetime.fromisoformat(ts_normalized)
if parsed.tzinfo is None:
return parsed.replace(tzinfo=timezone.utc)
return parsed.astimezone(timezone.utc)
except Exception:
return None
return None
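# Accepted shapes (all normalize to timezone-aware UTC datetimes):
#     _parse_screenshot_timestamp(1767000000)              # epoch seconds
#     _parse_screenshot_timestamp(1767000000000)           # epoch milliseconds (> 1e12)
#     _parse_screenshot_timestamp("2026-03-09T21:08:38Z")  # ISO 8601, 'Z' or offset
# Unparseable values return None; upload_screenshot then falls back to "now".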
@clients_bp.route("/sync-all-groups", methods=["POST"])
@admin_or_higher
@@ -281,24 +329,24 @@ def upload_screenshot(uuid):
Screenshots are stored as {uuid}.jpg in the screenshots folder.
Keeps last 20 screenshots per client (auto-cleanup).
"""
-import os
-import base64
-import glob
-from datetime import datetime
session = Session()
client = session.query(Client).filter_by(uuid=uuid).first()
if not client:
session.close()
return jsonify({"error": "Client nicht gefunden"}), 404
session.close()
try:
screenshot_timestamp = None
screenshot_type = "periodic"
# Handle JSON payload with base64-encoded image
if request.is_json:
data = request.get_json()
if "image" not in data:
return jsonify({"error": "Missing 'image' field in JSON payload"}), 400
screenshot_timestamp = _parse_screenshot_timestamp(data.get("timestamp"))
screenshot_type = _normalize_screenshot_type(data.get("screenshot_type") or data.get("screenshotType"))
# Decode base64 image
image_data = base64.b64decode(data["image"])
@@ -314,8 +362,9 @@ def upload_screenshot(uuid):
os.makedirs(screenshots_dir, exist_ok=True)
# Store screenshot with timestamp to track latest
-timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-filename = f"{uuid}_{timestamp}.jpg"
now_utc = screenshot_timestamp or datetime.now(timezone.utc)
timestamp = now_utc.strftime("%Y%m%d_%H%M%S_%f")
filename = f"{uuid}_{timestamp}_{screenshot_type}.jpg"
filepath = os.path.join(screenshots_dir, filename)
with open(filepath, "wb") as f:
@@ -326,9 +375,42 @@ def upload_screenshot(uuid):
with open(latest_filepath, "wb") as f:
f.write(image_data)
# Keep a dedicated copy for high-priority event screenshots.
if screenshot_type in ("event_start", "event_stop"):
priority_filepath = os.path.join(screenshots_dir, f"{uuid}_priority.jpg")
with open(priority_filepath, "wb") as f:
f.write(image_data)
metadata_path = os.path.join(screenshots_dir, f"{uuid}_meta.json")
metadata = {}
if os.path.exists(metadata_path):
try:
with open(metadata_path, "r", encoding="utf-8") as meta_file:
metadata = json.load(meta_file)
except Exception:
metadata = {}
metadata.update({
"latest_screenshot_type": screenshot_type,
"latest_received_at": now_utc.isoformat(),
})
if screenshot_type in ("event_start", "event_stop"):
metadata["last_priority_screenshot_type"] = screenshot_type
metadata["last_priority_received_at"] = now_utc.isoformat()
with open(metadata_path, "w", encoding="utf-8") as meta_file:
json.dump(metadata, meta_file)
# Update screenshot receive timestamp for monitoring dashboard
client.last_screenshot_analyzed = now_utc
client.last_screenshot_hash = hashlib.md5(image_data).hexdigest()
session.commit()
# Cleanup: keep only last 20 timestamped screenshots per client
pattern = os.path.join(screenshots_dir, f"{uuid}_*.jpg")
-existing_screenshots = sorted(glob.glob(pattern))
existing_screenshots = sorted(
[path for path in glob.glob(pattern) if not path.endswith("_priority.jpg")]
)
# Keep last 20, delete older ones
max_screenshots = 20
@@ -345,11 +427,15 @@ def upload_screenshot(uuid):
"success": True,
"message": f"Screenshot received for client {uuid}",
"filename": filename,
"size": len(image_data)
"size": len(image_data),
"screenshot_type": screenshot_type,
}), 200
except Exception as e:
session.rollback()
return jsonify({"error": f"Failed to process screenshot: {str(e)}"}), 500
finally:
session.close()
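# Illustrative upload (a sketch; the host and exact route path are assumptions,
# the JSON fields match the parsing above):
#     requests.post(
#         f"https://infoscreen.example/api/clients/{uuid}/screenshot",
#         json={
#             "image": base64.b64encode(jpeg_bytes).decode("ascii"),
#             "timestamp": "2026-03-09T21:08:38Z",
#             "screenshot_type": "event_start",  # or "event_stop" / "periodic"
#         },
#     )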
@clients_bp.route("/<uuid>", methods=["DELETE"])

View File

@@ -104,6 +104,9 @@ def get_events():
"end_time": e.end.isoformat() if e.end else None,
"is_all_day": False,
"media_id": e.event_media_id,
"slideshow_interval": e.slideshow_interval,
"page_progress": e.page_progress,
"auto_progress": e.auto_progress,
"type": e.event_type.value if e.event_type else None,
"icon": get_icon_for_type(e.event_type.value if e.event_type else None),
# Recurrence metadata
@@ -267,6 +270,8 @@ def detach_event_occurrence(event_id, occurrence_date):
'event_type': master.event_type,
'event_media_id': master.event_media_id,
'slideshow_interval': getattr(master, 'slideshow_interval', None),
'page_progress': getattr(master, 'page_progress', None),
'auto_progress': getattr(master, 'auto_progress', None),
'created_by': master.created_by,
}
@@ -318,6 +323,8 @@ def detach_event_occurrence(event_id, occurrence_date):
event_type=master_data['event_type'],
event_media_id=master_data['event_media_id'],
slideshow_interval=master_data['slideshow_interval'],
page_progress=data.get("page_progress", master_data['page_progress']),
auto_progress=data.get("auto_progress", master_data['auto_progress']),
recurrence_rule=None,
recurrence_end=None,
skip_holidays=False,
@@ -361,11 +368,15 @@ def create_event():
event_type = data["event_type"]
event_media_id = None
slideshow_interval = None
page_progress = None
auto_progress = None
# Presentation: adopt event_media_id, slideshow_interval, page_progress, auto_progress
if event_type == "presentation":
event_media_id = data.get("event_media_id")
slideshow_interval = data.get("slideshow_interval")
page_progress = data.get("page_progress")
auto_progress = data.get("auto_progress")
if not event_media_id:
return jsonify({"error": "event_media_id required for presentation"}), 400
@@ -443,6 +454,8 @@ def create_event():
is_active=True,
event_media_id=event_media_id,
slideshow_interval=slideshow_interval,
page_progress=page_progress,
auto_progress=auto_progress,
autoplay=autoplay,
loop=loop,
volume=volume,
@@ -519,6 +532,10 @@ def update_event(event_id):
event.event_type = data.get("event_type", event.event_type)
event.event_media_id = data.get("event_media_id", event.event_media_id)
event.slideshow_interval = data.get("slideshow_interval", event.slideshow_interval)
if "page_progress" in data:
event.page_progress = data.get("page_progress")
if "auto_progress" in data:
event.auto_progress = data.get("auto_progress")
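# Note: the "key in data" checks let callers clear page_progress/auto_progress
# explicitly by sending null, while omitting the keys preserves stored values.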
# Video-specific fields
if "autoplay" in data:
event.autoplay = data.get("autoplay")

View File

@@ -8,6 +8,7 @@ from server.routes.holidays import holidays_bp
from server.routes.academic_periods import academic_periods_bp
from server.routes.groups import groups_bp
from server.routes.clients import clients_bp
from server.routes.client_logs import client_logs_bp
from server.routes.auth import auth_bp
from server.routes.users import users_bp
from server.routes.system_settings import system_settings_bp
@@ -46,6 +47,7 @@ else:
app.register_blueprint(auth_bp)
app.register_blueprint(users_bp)
app.register_blueprint(clients_bp)
app.register_blueprint(client_logs_bp)
app.register_blueprint(groups_bp)
app.register_blueprint(events_bp)
app.register_blueprint(event_exceptions_bp)
@@ -66,13 +68,31 @@ def index():
return "Hello from InfoscreenAPI!"
@app.route("/screenshots/<uuid>/priority")
def get_priority_screenshot(uuid):
normalized_uuid = uuid[:-4] if uuid.lower().endswith('.jpg') else uuid
priority_filename = f"{normalized_uuid}_priority.jpg"
priority_path = os.path.join("screenshots", priority_filename)
if os.path.exists(priority_path):
return send_from_directory("screenshots", priority_filename)
return get_screenshot(uuid)
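# Resolution order for /screenshots/<uuid>/priority:
#   1. serve screenshots/<uuid>_priority.jpg if a priority copy exists
#   2. otherwise fall through to the regular latest-screenshot lookup below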
@app.route("/screenshots/<uuid>")
@app.route("/screenshots/<uuid>.jpg")
def get_screenshot(uuid):
pattern = os.path.join("screenshots", f"{uuid}*.jpg")
normalized_uuid = uuid[:-4] if uuid.lower().endswith('.jpg') else uuid
latest_filename = f"{normalized_uuid}.jpg"
latest_path = os.path.join("screenshots", latest_filename)
if os.path.exists(latest_path):
return send_from_directory("screenshots", latest_filename)
pattern = os.path.join("screenshots", f"{normalized_uuid}_*.jpg")
# Exclude the dedicated priority copy so it cannot shadow newer periodic shots
files = [path for path in glob.glob(pattern) if not path.endswith("_priority.jpg")]
if not files:
# No screenshot found: return an error with a placeholder image URL
return jsonify({"error": "Screenshot not found", "dummy": "https://placehold.co/400x300?text=No+Screenshot"}), 404
files.sort(reverse=True)
filename = os.path.basename(files[0])
return send_from_directory("screenshots", filename)