feat(monitoring): add priority screenshot pipeline with screenshot_type + docs cleanup

Implement end-to-end support for typed screenshots and priority rendering in monitoring.

Added
- Accept and forward screenshot_type from MQTT screenshot/dashboard payloads (periodic, event_start, event_stop)
- Extend screenshot upload handling to persist typed screenshots and metadata
- Add dedicated priority screenshot serving endpoint with fallback behavior
- Extend monitoring overview with priority screenshot fields and summary count
- Add configurable PRIORITY_SCREENSHOT_TTL_SECONDS window for active priority state

Fixed
- Ensure screenshot cache-busting updates reliably via screenshot hash updates
- Preserve normal periodic screenshot flow while introducing event_start/event_stop priority path

Improved
- Monitoring dashboard now displays screenshot type badges
- Adaptive polling: faster refresh while priority screenshots are active
- Priority screenshot presentation is surfaced immediately to operators

Docs
- Update README and copilot-instructions to match new screenshot_type behavior, priority endpoint, TTL config, monitoring fields, and retention model
- Remove redundant/duplicate documentation blocks and improve troubleshooting section clarity
2026-03-29 13:13:13 +00:00
parent 9c330f984f
commit 24cdf07279
10 changed files with 258 additions and 57 deletions
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -51,7 +51,10 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro

 ### Screenshot retention
 - Screenshots sent via dashboard MQTT are stored in `server/screenshots/`.
- For each client, only the latest and last 20 timestamped screenshots are kept; older files are deleted automatically on each upload.
+- Screenshot payloads support `screenshot_type` with values `periodic`, `event_start`, `event_stop`.
+- `periodic` is the normal heartbeat/dashboard screenshot path; `event_start` and `event_stop` are high-priority screenshots for monitoring.
+- For each client, the API keeps `{uuid}.jpg` as latest and the last 20 timestamped screenshots (`{uuid}_..._{type}.jpg`), deleting older timestamped files automatically.
+- For high-priority screenshots, the API additionally maintains `{uuid}_priority.jpg` and metadata in `{uuid}_meta.json` (`latest_screenshot_type`, `last_priority_*`).
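
Review note: the retention rule in the bullets above could look roughly like the Python sketch below. It is not taken from this commit. `prune_screenshots`, its `keep` parameter, and the assumption that timestamped file names sort chronologically are hypothetical; only the file names `{uuid}.jpg`, `{uuid}_priority.jpg`, `{uuid}_meta.json` and the limit of 20 come from the text.

```python
from pathlib import Path


def prune_screenshots(folder: Path, uuid: str, keep: int = 20) -> list:
    """Delete all but the newest `keep` timestamped screenshots of one client.

    Assumption: timestamped names look like `{uuid}_<timestamp>_<type>.jpg`
    and sort chronologically by name. The fixed files `{uuid}.jpg` and
    `{uuid}_priority.jpg` are never deleted, and `{uuid}_meta.json` is not
    matched by the `.jpg` pattern at all.
    """
    protected = {f"{uuid}.jpg", f"{uuid}_priority.jpg"}
    timestamped = sorted(
        p for p in folder.glob(f"{uuid}_*.jpg") if p.name not in protected
    )
    # Everything before the last `keep` entries is stale.
    stale = timestamped[:-keep] if keep > 0 else timestamped
    for path in stale:
        path.unlink()
    return stale
```

Calling such a helper at the end of each upload would match the "deleting older timestamped files automatically" wording, though where the real handler does this is not visible in this hunk.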

  ## Recent changes since last commit

@@ -61,6 +64,11 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
    - End-to-end monitoring pipeline completed: MQTT logs/health → listener persistence → monitoring APIs → superadmin dashboard
    - API now serves aggregated monitoring via `GET /api/client-logs/monitoring-overview` and system-wide recent errors via `GET /api/client-logs/recent-errors`
    - Monitoring dashboard (`dashboard/src/monitoring.tsx`) is active and displays client health states, screenshots, process metadata, and recent log activity
+  - **Screenshot Priority Pipeline (no version bump)**:
+    - Listener forwards `screenshot_type` from MQTT screenshot/dashboard payloads to `POST /api/clients/<uuid>/screenshot`.
+    - API stores typed screenshots, tracks latest/priority metadata, and serves priority images via `GET /screenshots/<uuid>/priority`.
+    - Monitoring overview exposes screenshot priority state (`latestScreenshotType`, `priorityScreenshotType`, `priorityScreenshotReceivedAt`, `hasActivePriorityScreenshot`) and `summary.activePriorityScreenshots`.
+    - Monitoring UI shows screenshot type badges and switches to faster refresh while priority screenshots are active.
  - **Presentation Flags Persistence Fix**:
    - Fixed persistence for presentation `page_progress` and `auto_progress` to ensure values are reliably stored and returned across create/update paths and detached occurrences

@@ -129,7 +137,6 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
 ## Service boundaries & data flow
 - Database connection string is passed as `DB_CONN` (mysql+pymysql) to Python services.
  - API builds its engine in `server/database.py` (loads `.env` only in development).
-  - Scheduler loads `DB_CONN` in `scheduler/db_utils.py`. Recurring events are expanded for the next 7 days, and event exceptions (skipped dates, detached occurrences) are respected. Only recurring events with recurrence_end in the future remain active. The scheduler publishes only events that are active at the current time and clears retained topics (publishes `[]`) for groups without active events. Time comparisons are UTC and naive timestamps are normalized.
  - Listener also creates its own engine for writes to `clients`.
  - Scheduler queries a future window (default: 7 days) to expand recurring events using RFC 5545 rules, applies event exceptions (skipped dates, detached occurrences), and publishes only events that are active at the current time (UTC). When a group has no active events, the scheduler clears its retained topic by publishing an empty list. Time comparisons are UTC; naive timestamps are normalized. Logging is concise; conversion lookups are cached and logged only once per media.
 - MQTT topics (paho-mqtt v2, use Callback API v2):
@@ -139,7 +146,7 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
  - Per-client group assignment (retained): `infoscreen/{uuid}/group_id` via `server/mqtt_helper.py`.
  - Client logs: `infoscreen/{uuid}/logs/{error|warn|info}` with JSON payload (timestamp, message, context); QoS 1 for ERROR/WARN, QoS 0 for INFO.
  - Client health: `infoscreen/{uuid}/health` with metrics (expected_state, actual_state, health_metrics); QoS 0, published every 5 seconds.
- Screenshots: server-side folders `server/received_screenshots/` and `server/screenshots/`; Nginx exposes `/screenshots/{uuid}.jpg` via `server/wsgi.py` route.
+- Screenshots: server-side folder `server/screenshots/`; API serves `/screenshots/{uuid}.jpg` (latest) and `/screenshots/{uuid}/priority` (active high-priority fallback to latest).
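
Review note: the "fallback to latest" behavior of `/screenshots/{uuid}/priority` described in the line above can be pictured with a small Python sketch. This is an illustration under assumptions, not the route added in this commit: `resolve_priority_image` and its `priority_active` flag are hypothetical names, and the sketch only decides which file would be returned.

```python
from pathlib import Path
from typing import Optional


def resolve_priority_image(
    folder: Path, uuid: str, priority_active: bool
) -> Optional[Path]:
    """Pick the file a priority request would return, or None if none exists.

    Assumption: the priority image is preferred only while the caller
    considers the priority state active; otherwise the latest screenshot
    `{uuid}.jpg` is used.
    """
    priority = folder / f"{uuid}_priority.jpg"
    latest = folder / f"{uuid}.jpg"
    if priority_active and priority.is_file():
        return priority
    if latest.is_file():
        return latest
    return None
```

Returning `None` leaves the HTTP response for a client without any screenshot (for example a 404) to the caller.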

 - Dev Container guidance: If extensions reappear inside the container, remove UI-only extensions from `devcontainer.json` `extensions` and map them in `remote.extensionKind` as `"ui"`.

@@ -210,6 +217,7 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
    - `GET /api/client-logs/<uuid>/logs` – Query client logs with filters (level, limit, since); admin_or_higher
    - `GET /api/client-logs/summary` – Log counts by level per client (last 24h); admin_or_higher
    - `GET /api/client-logs/recent-errors` – System-wide error monitoring; admin_or_higher
+    - `GET /api/client-logs/monitoring-overview` – Includes screenshot priority fields per client plus `summary.activePriorityScreenshots`; superadmin_only
    - `GET /api/client-logs/test` – Infrastructure validation (no auth); returns recent logs with counts

  Documentation maintenance: keep this file aligned with real patterns; update when routes/session/UTC rules change. Avoid long prose; link exact paths.
@@ -272,7 +280,8 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
  - Superadmin-only dashboard for client monitoring and diagnostics; menu item is hidden for lower roles and the route redirects non-superadmins.
  - Uses `GET /api/client-logs/monitoring-overview` for aggregated live status, `GET /api/client-logs/recent-errors` for system-wide errors, and `GET /api/client-logs/<uuid>/logs` for per-client details.
  - Shows per-client status (`healthy`, `warning`, `critical`, `offline`) based on heartbeat freshness, process state, screen state, and recent log counts.
-  - Displays latest screenshot preview from `/screenshots/{uuid}.jpg`, current process metadata, and recent ERROR/WARN activity.
+  - Displays latest screenshot preview and active priority screenshot (`/screenshots/{uuid}/priority` when active), screenshot type badges, current process metadata, and recent ERROR/WARN activity.
+  - Uses adaptive refresh: normal interval in steady state, faster polling while `activePriorityScreenshots > 0`.

 - Settings page (`dashboard/src/settings.tsx`):
  - Structure: Syncfusion TabComponent with role-gated tabs
@@ -351,6 +360,7 @@ Note: Syncfusion usage in the dashboard is already documented above; if a UI for
 - VITE_API_URL — Dashboard build-time base URL (prod); in dev the Vite proxy serves `/api` to `server:8000`.
 - HEARTBEAT_GRACE_PERIOD_DEV / HEARTBEAT_GRACE_PERIOD_PROD — Groups "alive" window (defaults 180s dev / 170s prod). Clients send heartbeats every ~65s; grace periods allow 2 missed heartbeats plus safety margin.
 - REFRESH_SECONDS — Optional scheduler republish interval; `0` disables periodic refresh.
+- PRIORITY_SCREENSHOT_TTL_SECONDS — Optional monitoring priority window in seconds (default `120`); controls when event screenshots are considered active priority.
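
Review note: one way to read the `PRIORITY_SCREENSHOT_TTL_SECONDS` window described above is sketched below in Python. The helper names `priority_ttl_seconds` and `is_priority_active` are hypothetical, and treating the boundary as inclusive is an assumption; the variable name and the default of `120` come from the line above.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import Optional


def priority_ttl_seconds() -> int:
    """Read the TTL from the environment, defaulting to 120 seconds."""
    return int(os.environ.get("PRIORITY_SCREENSHOT_TTL_SECONDS", "120"))


def _as_utc(value: datetime) -> datetime:
    """Treat naive datetimes as UTC, following the UTC convention in this file."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value


def is_priority_active(
    received_at: Optional[datetime], now: datetime, ttl_seconds: int
) -> bool:
    """True while the last priority screenshot is no older than the TTL."""
    if received_at is None:
        return False  # no event_start/event_stop screenshot received yet
    age = _as_utc(now) - _as_utc(received_at)
    return age <= timedelta(seconds=ttl_seconds)
```

Under this reading, a per-client flag such as `hasActivePriorityScreenshot` would be the result of `is_priority_active(...)`, and `summary.activePriorityScreenshots` would count the clients for which it is true.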

 ## Conventions & gotchas
 - **Datetime Handling**:
@@ -360,7 +370,6 @@ Note: Syncfusion usage in the dashboard is already documented above; if a UI for
  - Frontend **must** append 'Z' before parsing: `const utcStr = dateStr.endsWith('Z') ? dateStr : dateStr + 'Z'; new Date(utcStr);`
  - Display in local timezone using `toLocaleTimeString('de-DE', { hour: '2-digit', minute: '2-digit' })`
  - When sending to API, use `date.toISOString()` which includes 'Z' and is UTC
-  - Frontend must append `Z` to API strings before parsing; backend compares in UTC and returns ISO without `Z`.
 - **JSON Naming Convention**:
  - Backend uses snake_case internally (Python convention)
  - API returns camelCase JSON (web standard): `startTime`, `endTime`, `groupId`, etc.