Compare commits: 5 commits on `mqtt-trans` (7746e26385…)

| SHA1 |
| --- |
| a58e9d3fca |
| 90ccbdf920 |
| 24cdf07279 |
| 9c330f984f |
| 3107d0f671 |
**.github/copilot-instructions.md** (vendored, 68 changed lines)
@@ -34,6 +34,7 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- `dashboard/src/settings.tsx` — settings UI (nested tabs; system defaults for presentations and videos)
- `dashboard/src/ressourcen.tsx` — timeline view showing all groups' active events in parallel
- `dashboard/src/ressourcen.css` — timeline and resource view styling
- `dashboard/src/monitoring.tsx` — superadmin-only monitoring dashboard for client health, screenshots, and logs
@@ -50,11 +51,32 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
### Screenshot retention

- Screenshots sent via dashboard MQTT are stored in `server/screenshots/`.
- Screenshot payloads support `screenshot_type` with values `periodic`, `event_start`, `event_stop`.
  - `periodic` is the normal heartbeat/dashboard screenshot path; `event_start` and `event_stop` are high-priority screenshots for monitoring.
- For each client, the API keeps `{uuid}.jpg` as latest and the last 20 timestamped screenshots (`{uuid}_..._{type}.jpg`), deleting older timestamped files automatically.
- For high-priority screenshots, the API additionally maintains `{uuid}_priority.jpg` and metadata in `{uuid}_meta.json` (`latest_screenshot_type`, `last_priority_*`).
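The retention rule described above (keep `{uuid}.jpg`, `{uuid}_priority.jpg`, and the 20 newest timestamped screenshots per client) can be sketched as follows. This is a hedged illustration, not the actual server code: `prune_screenshots` is a hypothetical helper, and it assumes the middle component of timestamped filenames sorts chronologically.

```python
from pathlib import Path

def prune_screenshots(folder: Path, uuid: str, keep: int = 20) -> list[str]:
    """Delete all but the `keep` newest timestamped screenshots for one client.

    Assumption: timestamped names sort chronologically by filename.
    `{uuid}.jpg` (latest) and `{uuid}_priority.jpg` are never touched.
    """
    protected = {f"{uuid}.jpg", f"{uuid}_priority.jpg"}
    timestamped = sorted(
        p for p in folder.glob(f"{uuid}_*.jpg") if p.name not in protected
    )
    # Everything except the `keep` newest entries gets removed.
    removed = [p.name for p in timestamped[:-keep]]
    for name in removed:
        (folder / name).unlink()
    return removed
```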
## Recent changes since last commit

### Latest (March 2026)
- **Monitoring System Completion (no version bump)**:
  - End-to-end monitoring pipeline completed: MQTT logs/health → listener persistence → monitoring APIs → superadmin dashboard
  - API now serves aggregated monitoring via `GET /api/client-logs/monitoring-overview` and system-wide recent errors via `GET /api/client-logs/recent-errors`
  - Monitoring dashboard (`dashboard/src/monitoring.tsx`) is active and displays client health states, screenshots, process metadata, and recent log activity
- **Screenshot Priority Pipeline (no version bump)**:
  - Listener forwards `screenshot_type` from MQTT screenshot/dashboard payloads to `POST /api/clients/<uuid>/screenshot`.
  - API stores typed screenshots, tracks latest/priority metadata, and serves priority images via `GET /screenshots/<uuid>/priority`.
  - Monitoring overview exposes screenshot priority state (`latestScreenshotType`, `priorityScreenshotType`, `priorityScreenshotReceivedAt`, `hasActivePriorityScreenshot`) and `summary.activePriorityScreenshots`.
  - Monitoring UI shows screenshot type badges and switches to faster refresh while priority screenshots are active.
- **MQTT Dashboard Payload v2 Cutover (no version bump)**:
  - Dashboard payload parsing in `listener/listener.py` is now v2-only (`message`, `content`, `runtime`, `metadata`).
  - Legacy top-level dashboard fallback was removed after migration soak (`legacy_fallback=0`).
  - Listener observability summarizes parser health using `v2_success` and `parse_failures` counters.
- **Presentation Flags Persistence Fix**:
  - Fixed persistence for presentation `page_progress` and `auto_progress` so values are reliably stored and returned across create/update paths and detached occurrences

### Earlier (January 2026)

- **Ressourcen Page (Timeline View)**:
  - New 'Ressourcen' page with parallel timeline view showing active events for all room groups
@@ -119,15 +141,17 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
## Service boundaries & data flow

- Database connection string is passed as `DB_CONN` (mysql+pymysql) to Python services.
- API builds its engine in `server/database.py` (loads `.env` only in development).
- Listener also creates its own engine for writes to `clients`.
- Scheduler queries a future window (default: 7 days) to expand recurring events using RFC 5545 rules, applies event exceptions (skipped dates, detached occurrences), and publishes only events that are active at the current time (UTC). When a group has no active events, the scheduler clears its retained topic by publishing an empty list. Time comparisons are UTC; naive timestamps are normalized. Logging is concise; conversion lookups are cached and logged only once per media.
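The normalization rule above ("time comparisons are UTC; naive timestamps are normalized") can be sketched like this; `to_utc` is a hypothetical helper name, not necessarily what the scheduler code uses:

```python
from datetime import datetime, timezone

def to_utc(dt: datetime) -> datetime:
    """Treat naive timestamps as UTC; convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)
```

All comparisons then happen on timezone-aware UTC datetimes, so a naive DB value and an aware `datetime.now(timezone.utc)` can be compared safely.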
- MQTT topics (paho-mqtt v2, use Callback API v2):
  - Discovery: `infoscreen/discovery` (JSON includes `uuid`, hw/ip data). ACK to `infoscreen/{uuid}/discovery_ack`. See `listener/listener.py`.
  - Heartbeat: `infoscreen/{uuid}/heartbeat` updates `Client.last_alive` (UTC); enhanced payload includes `current_process`, `process_pid`, `process_status`, `current_event_id`.
  - Event lists (retained): `infoscreen/events/{group_id}` from `scheduler/scheduler.py`.
  - Per-client group assignment (retained): `infoscreen/{uuid}/group_id` via `server/mqtt_helper.py`.
  - Client logs: `infoscreen/{uuid}/logs/{error|warn|info}` with JSON payload (timestamp, message, context); QoS 1 for ERROR/WARN, QoS 0 for INFO.
  - Client health: `infoscreen/{uuid}/health` with metrics (expected_state, actual_state, health_metrics); QoS 0, published every 5 seconds.
  - Dashboard screenshots: `infoscreen/{uuid}/dashboard` uses grouped v2 payload blocks (`message`, `content`, `runtime`, `metadata`); listener reads screenshot data from `content.screenshot` and capture type from `metadata.capture.type`.
  - Screenshots: server-side folder `server/screenshots/`; API serves `/screenshots/{uuid}.jpg` (latest) and `/screenshots/{uuid}/priority` (active high-priority fallback to latest).
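The retained event-list convention above (publish the active list, publish `[]` to clear) can be sketched as follows. This is a hedged sketch: `publish_group_events` is a hypothetical helper, and `client` is any paho-mqtt-style object exposing `publish`.

```python
import json

def publish_group_events(client, group_id: int, events: list[dict]) -> None:
    """Publish a group's active events as a retained message.

    `retain=True` means newly connected clients immediately receive the
    latest list; publishing `[]` clears the retained topic for groups
    without active events.
    """
    payload = json.dumps(events)
    client.publish(f"infoscreen/events/{group_id}", payload, qos=1, retain=True)
```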
- Dev Container guidance: If extensions reappear inside the container, remove UI-only extensions from `devcontainer.json` `extensions` and map them in `remote.extensionKind` as `"ui"`.
@@ -146,6 +170,11 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- `locked_until`: TIMESTAMP placeholder for account lockout (infrastructure in place, not yet enforced)
- `deactivated_at`, `deactivated_by`: Soft-delete audit trail (FK self-reference); soft deactivation is the default, hard delete superadmin-only
- Role hierarchy (privilege escalation enforced): `user` < `editor` < `admin` < `superadmin`
- Client monitoring (migration: `c1d2e3f4g5h6_add_client_monitoring.py`):
  - `ClientLog` model: Centralized log storage with fields (id, client_uuid, timestamp, level, message, context, created_at); FK to clients.uuid (CASCADE)
  - `Client` model extended: 7 health monitoring fields (`current_event_id`, `current_process`, `process_status`, `process_pid`, `last_screenshot_analyzed`, `screen_health_status`, `last_screenshot_hash`)
  - Enums: `LogLevel` (ERROR, WARN, INFO, DEBUG), `ProcessStatus` (running, crashed, starting, stopped), `ScreenHealthStatus` (OK, BLACK, FROZEN, UNKNOWN)
  - Indexes: (client_uuid, timestamp DESC), (level, timestamp DESC), (created_at DESC) for performance
- System settings: `system_settings` key–value store via `SystemSetting` for global configuration (e.g., WebUntis/Vertretungsplan supplement-table). Managed through routes in `server/routes/system_settings.py`.
- Presentation defaults (system-wide):
  - `presentation_interval` (seconds, default "10")
@@ -189,6 +218,12 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- `PUT /api/users/<id>/password` — admin password reset (requires backend check to reject self-reset for consistency)
- `DELETE /api/users/<id>` — hard delete (superadmin only, with self-deletion check)
- Auth routes (`server/routes/auth.py`): Enhanced to track login events (sets `last_login_at`, resets `failed_login_attempts` on success; increments `failed_login_attempts` and `last_failed_login_at` on failure). Self-service password change via `PUT /api/auth/change-password` requires current password verification.
- Client logs (`server/routes/client_logs.py`): Centralized log retrieval for monitoring:
  - `GET /api/client-logs/<uuid>/logs` – Query client logs with filters (level, limit, since); admin_or_higher
  - `GET /api/client-logs/summary` – Log counts by level per client (last 24h); admin_or_higher
  - `GET /api/client-logs/recent-errors` – System-wide error monitoring; admin_or_higher
  - `GET /api/client-logs/monitoring-overview` – Includes screenshot priority fields per client plus `summary.activePriorityScreenshots`; superadmin_only
  - `GET /api/client-logs/test` – Infrastructure validation (no auth); returns recent logs with counts
Documentation maintenance: keep this file aligned with real patterns; update when routes/session/UTC rules change. Avoid long prose; link exact paths.
@@ -246,6 +281,13 @@ Keep docs synced with code. When you change services/MQTT/API/UTC/env or dev/pro
- API client in `dashboard/src/apiUsers.ts` for all user operations (listUsers, getUser, createUser, updateUser, resetUserPassword, deleteUser)
- Menu visibility: "Benutzer" menu item only visible to admin+ (role-gated in App.tsx)
- Monitoring page (`dashboard/src/monitoring.tsx`):
  - Superadmin-only dashboard for client monitoring and diagnostics; menu item is hidden for lower roles and the route redirects non-superadmins.
  - Uses `GET /api/client-logs/monitoring-overview` for aggregated live status, `GET /api/client-logs/recent-errors` for system-wide errors, and `GET /api/client-logs/<uuid>/logs` for per-client details.
  - Shows per-client status (`healthy`, `warning`, `critical`, `offline`) based on heartbeat freshness, process state, screen state, and recent log counts.
  - Displays latest screenshot preview and active priority screenshot (`/screenshots/{uuid}/priority` when active), screenshot type badges, current process metadata, and recent ERROR/WARN activity.
  - Uses adaptive refresh: normal interval in steady state, faster polling while `activePriorityScreenshots > 0`.
- Settings page (`dashboard/src/settings.tsx`):
  - Structure: Syncfusion TabComponent with role-gated tabs
    - 📅 Academic Calendar (all users)
@@ -323,6 +365,7 @@ Note: Syncfusion usage in the dashboard is already documented above; if a UI for
- VITE_API_URL — Dashboard build-time base URL (prod); in dev the Vite proxy serves `/api` to `server:8000`.
- HEARTBEAT_GRACE_PERIOD_DEV / HEARTBEAT_GRACE_PERIOD_PROD — Groups "alive" window (defaults 180s dev / 170s prod). Clients send heartbeats every ~65s; grace periods allow 2 missed heartbeats plus safety margin.
- REFRESH_SECONDS — Optional scheduler republish interval; `0` disables periodic refresh.
- PRIORITY_SCREENSHOT_TTL_SECONDS — Optional monitoring priority window in seconds (default `120`); controls when event screenshots are considered active priority.
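Both the heartbeat grace periods and `PRIORITY_SCREENSHOT_TTL_SECONDS` reduce to the same freshness check; a minimal sketch under that assumption (the helper name is hypothetical):

```python
from datetime import datetime, timedelta

def within_window(event_time: datetime, now: datetime, window_seconds: int) -> bool:
    """True while `event_time` is no older than `window_seconds`.

    Conceptually covers both the heartbeat "alive" grace period
    (HEARTBEAT_GRACE_PERIOD_*) and the priority-screenshot TTL
    (PRIORITY_SCREENSHOT_TTL_SECONDS).
    """
    return now - event_time <= timedelta(seconds=window_seconds)
```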
## Conventions & gotchas

- **Datetime Handling**:
@@ -332,7 +375,6 @@ Note: Syncfusion usage in the dashboard is already documented above; if a UI for
- Frontend **must** append 'Z' before parsing: `const utcStr = dateStr.endsWith('Z') ? dateStr : dateStr + 'Z'; new Date(utcStr);`
- Display in local timezone using `toLocaleTimeString('de-DE', { hour: '2-digit', minute: '2-digit' })`
- When sending to API, use `date.toISOString()` which includes 'Z' and is UTC
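The same rule applies to any Python consumer of the API: the backend returns ISO strings without `Z`, so a missing offset must be treated as UTC. A hedged sketch (the helper name is hypothetical):

```python
from datetime import datetime, timezone

def parse_api_timestamp(s: str) -> datetime:
    """Parse an API ISO timestamp, treating a missing offset as UTC."""
    dt = datetime.fromisoformat(s.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt
```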
- **JSON Naming Convention**:
  - Backend uses snake_case internally (Python convention)
  - API returns camelCase JSON (web standard): `startTime`, `endTime`, `groupId`, etc.
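A minimal sketch of the snake_case→camelCase mapping at the API boundary; the helper names are hypothetical and the server may implement this differently (e.g., per-route serializers):

```python
def to_camel(snake: str) -> str:
    """Convert one snake_case key to camelCase (start_time -> startTime)."""
    head, *rest = snake.split("_")
    return head + "".join(part.capitalize() for part in rest)

def camelize(obj):
    """Recursively rename dict keys from snake_case to camelCase."""
    if isinstance(obj, dict):
        return {to_camel(k): camelize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [camelize(v) for v in obj]
    return obj
```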
@@ -364,7 +406,8 @@ Docs maintenance guardrails (solo-friendly): Update this file alongside code cha
## Quick examples

- Add client description persists to DB and publishes group via MQTT: see `PUT /api/clients/<uuid>/description` in `routes/clients.py`.
- Bulk group assignment emits retained messages for each client: `PUT /api/clients/group`.
- Listener heartbeat path: `infoscreen/<uuid>/heartbeat` → sets `clients.last_alive` and captures process health data.
- Client monitoring flow: Client publishes to `infoscreen/{uuid}/logs/error` and `infoscreen/{uuid}/health` → listener stores/updates monitoring state → API serves `/api/client-logs/monitoring-overview`, `/api/client-logs/recent-errors`, and `/api/client-logs/<uuid>/logs` → superadmin monitoring dashboard displays live status.
## Scheduler payloads: presentation extras

- Presentation event payloads now include `page_progress` and `auto_progress` in addition to `slide_interval` and media files. These are sourced from per-event fields in the database (with system defaults applied on event creation).
@@ -393,3 +436,14 @@ Questions or unclear areas? Tell us if you need: exact devcontainer debugging st
- Breaking changes must be prefixed with `BREAKING:`
- Keep ≤ 8–10 bullets; summarize or group micro-changes
- JSON hygiene: valid JSON, no trailing commas, don’t edit historical entries except typos

## Versioning Convention (Tech vs UI)

- Use one unified app version across technical and user-facing release notes.
- `dashboard/public/program-info.json` is user-facing and should list only user-visible changes.
- `TECH-CHANGELOG.md` can include deeper technical details for the same released version.
- If server/infrastructure work is implemented but not yet released or not user-visible, document it under the latest released section as:
  - `Backend technical work (post-release notes; no version bump)`
- Do not create a new version header in `TECH-CHANGELOG.md` for internal milestones alone.
- Bump version numbers when a release is actually cut/deployed (or when user-facing release notes are published), not for intermediate backend-only steps.
- When UI integration lands later, include the user-visible part in the next release version and reference prior post-release technical groundwork when useful.
@@ -98,3 +98,6 @@ exit 0 # warn only; do not block commit
- MQTT workers: `listener/listener.py`, `scheduler/scheduler.py`, `server/mqtt_helper.py`
- Frontend: `dashboard/vite.config.ts`, `dashboard/package.json`, `dashboard/src/*`
- Dev/Prod docs: `deployment.md`, `.env.example`

## Documentation sync log

- 2026-03-24: Synced docs for completed monitoring rollout and presentation flag persistence fix (`page_progress` / `auto_progress`). Updated `.github/copilot-instructions.md`, `README.md`, `TECH-CHANGELOG.md`, `DEV-CHANGELOG.md`, and `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` without a user-version bump.
**CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md** (new file, 757 lines)
@@ -0,0 +1,757 @@
# 🚀 Client Monitoring Implementation Guide

**Phase-based implementation guide for basic monitoring in development phase**

---

## ✅ Phase 1: Server-Side Database Foundation

**Status:** ✅ COMPLETE
**Dependencies:** None - Already implemented
**Time estimate:** Completed
### ✅ Step 1.1: Database Migration

**File:** `server/alembic/versions/c1d2e3f4g5h6_add_client_monitoring.py`

**What it does:**

- Creates `client_logs` table for centralized logging
- Adds health monitoring columns to `clients` table
- Creates indexes for efficient querying

**To apply:**

```bash
cd /workspace/server
alembic upgrade head
```
### ✅ Step 1.2: Update Data Models

**File:** `models/models.py`

**What was added:**

- New enums: `LogLevel`, `ProcessStatus`, `ScreenHealthStatus`
- Updated `Client` model with health tracking fields
- New `ClientLog` model for log storage
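For reference, the three enums can be sketched from the values listed in the database-layer notes earlier in this diff; the actual definitions in `models/models.py` may differ in base classes or value casing:

```python
import enum

class LogLevel(enum.Enum):
    ERROR = "ERROR"
    WARN = "WARN"
    INFO = "INFO"
    DEBUG = "DEBUG"

class ProcessStatus(enum.Enum):
    running = "running"
    crashed = "crashed"
    starting = "starting"
    stopped = "stopped"

class ScreenHealthStatus(enum.Enum):
    OK = "OK"
    BLACK = "BLACK"
    FROZEN = "FROZEN"
    UNKNOWN = "UNKNOWN"
```

Name-based lookup (`LogLevel[level]`) is what the listener snippets below rely on when mapping MQTT topic suffixes to stored levels.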
---

## 🔧 Phase 2: Server-Side Backend Logic

**Status:** ✅ COMPLETE
**Dependencies:** Phase 1 complete
**Time estimate:** 2-3 hours
### Step 2.1: Extend MQTT Listener

**File:** `listener/listener.py`

**What to add:**

```python
# Add new topic subscriptions in on_connect():
client.subscribe("infoscreen/+/logs/error")
client.subscribe("infoscreen/+/logs/warn")
client.subscribe("infoscreen/+/logs/info")  # Dev mode only
client.subscribe("infoscreen/+/health")

# Add new handlers called from on_message():
def handle_log_message(uuid, level, payload):
    """Store client log in database"""
    from datetime import datetime, timezone  # needed for the timestamp fallback
    from models.models import ClientLog, LogLevel
    from server.database import Session
    import json

    session = Session()
    try:
        log_entry = ClientLog(
            client_uuid=uuid,
            timestamp=payload.get('timestamp', datetime.now(timezone.utc)),
            level=LogLevel[level],
            message=payload.get('message', ''),
            context=json.dumps(payload.get('context', {}))
        )
        session.add(log_entry)
        session.commit()
        print(f"[LOG] {uuid} {level}: {payload.get('message', '')}")
    except Exception as e:
        print(f"Error saving log: {e}")
        session.rollback()
    finally:
        session.close()

def handle_health_message(uuid, payload):
    """Update client health status"""
    from models.models import Client, ProcessStatus
    from server.database import Session

    session = Session()
    try:
        client = session.query(Client).filter_by(uuid=uuid).first()
        if client:
            client.current_event_id = payload.get('expected_state', {}).get('event_id')
            client.current_process = payload.get('actual_state', {}).get('process')

            status_str = payload.get('actual_state', {}).get('status')
            if status_str:
                client.process_status = ProcessStatus[status_str]

            client.process_pid = payload.get('actual_state', {}).get('pid')
            session.commit()
    except Exception as e:
        print(f"Error updating health: {e}")
        session.rollback()
    finally:
        session.close()
```

**Topic routing logic:**

```python
# In the on_message callback, add routing:
if topic.endswith('/logs/error'):
    handle_log_message(uuid, 'ERROR', payload)
elif topic.endswith('/logs/warn'):
    handle_log_message(uuid, 'WARN', payload)
elif topic.endswith('/logs/info'):
    handle_log_message(uuid, 'INFO', payload)
elif topic.endswith('/health'):
    handle_health_message(uuid, payload)
```
### Step 2.2: Create API Routes

**File:** `server/routes/client_logs.py` (NEW)

```python
from flask import Blueprint, jsonify, request
from server.database import Session
from server.permissions import admin_or_higher
from models.models import ClientLog, Client
from sqlalchemy import desc
import json

client_logs_bp = Blueprint("client_logs", __name__, url_prefix="/api/client-logs")

@client_logs_bp.route("/<uuid>/logs", methods=["GET"])
@admin_or_higher
def get_client_logs(uuid):
    """
    Get logs for a specific client
    Query params:
    - level: ERROR, WARN, INFO, DEBUG (optional)
    - limit: number of entries (default 50, max 500)
    - since: ISO timestamp (optional)
    """
    session = Session()
    try:
        level = request.args.get('level')
        limit = min(int(request.args.get('limit', 50)), 500)
        since = request.args.get('since')

        query = session.query(ClientLog).filter_by(client_uuid=uuid)

        if level:
            from models.models import LogLevel
            query = query.filter_by(level=LogLevel[level])

        if since:
            from datetime import datetime
            since_dt = datetime.fromisoformat(since.replace('Z', '+00:00'))
            query = query.filter(ClientLog.timestamp >= since_dt)

        logs = query.order_by(desc(ClientLog.timestamp)).limit(limit).all()

        result = []
        for log in logs:
            result.append({
                "id": log.id,
                "timestamp": log.timestamp.isoformat() if log.timestamp else None,
                "level": log.level.value if log.level else None,
                "message": log.message,
                "context": json.loads(log.context) if log.context else {}
            })

        session.close()
        return jsonify({"logs": result, "count": len(result)})

    except Exception as e:
        session.close()
        return jsonify({"error": str(e)}), 500

@client_logs_bp.route("/summary", methods=["GET"])
@admin_or_higher
def get_logs_summary():
    """Get summary of errors/warnings across all clients"""
    session = Session()
    try:
        from sqlalchemy import func
        from models.models import LogLevel
        from datetime import datetime, timedelta

        # Last 24 hours
        since = datetime.utcnow() - timedelta(hours=24)

        stats = session.query(
            ClientLog.client_uuid,
            ClientLog.level,
            func.count(ClientLog.id).label('count')
        ).filter(
            ClientLog.timestamp >= since
        ).group_by(
            ClientLog.client_uuid,
            ClientLog.level
        ).all()

        result = {}
        for stat in stats:
            uuid = stat.client_uuid
            if uuid not in result:
                result[uuid] = {"ERROR": 0, "WARN": 0, "INFO": 0}
            result[uuid][stat.level.value] = stat.count

        session.close()
        return jsonify({"summary": result, "period_hours": 24})

    except Exception as e:
        session.close()
        return jsonify({"error": str(e)}), 500
```

**Register in `server/wsgi.py`:**

```python
from server.routes.client_logs import client_logs_bp
app.register_blueprint(client_logs_bp)
```
### Step 2.3: Add Health Data to Heartbeat Handler

**File:** `listener/listener.py` (extend existing heartbeat handler)

```python
# Modify existing heartbeat handler to capture health data
def on_message(client, userdata, message):
    topic = message.topic

    # Existing heartbeat logic...
    if '/heartbeat' in topic:
        uuid = extract_uuid_from_topic(topic)
        try:
            payload = json.loads(message.payload.decode())

            # Update last_alive (existing)
            session = Session()
            client_obj = session.query(Client).filter_by(uuid=uuid).first()
            if client_obj:
                client_obj.last_alive = datetime.now(timezone.utc)

                # NEW: Update health data if present in heartbeat
                if 'process_status' in payload:
                    client_obj.process_status = ProcessStatus[payload['process_status']]
                if 'current_process' in payload:
                    client_obj.current_process = payload['current_process']
                if 'process_pid' in payload:
                    client_obj.process_pid = payload['process_pid']
                if 'current_event_id' in payload:
                    client_obj.current_event_id = payload['current_event_id']

                session.commit()
            session.close()
        except Exception as e:
            print(f"Error processing heartbeat: {e}")
```

---
## 🖥️ Phase 3: Client-Side Implementation

**Status:** ✅ COMPLETE
**Dependencies:** Phase 2 complete
**Time estimate:** 3-4 hours

### Step 3.1: Create Client Watchdog Script

**File:** `client/watchdog.py` (NEW - on client device)
```python
#!/usr/bin/env python3
"""
Client-side process watchdog.

Monitors VLC, Chromium, and the PDF viewer and reports health to the server.
"""
import json
import sys
import time
from datetime import datetime, timezone

import paho.mqtt.client as mqtt
import psutil


class MediaWatchdog:
    def __init__(self, client_uuid, mqtt_broker, mqtt_port=1883):
        self.uuid = client_uuid
        self.mqtt_client = mqtt.Client()
        self.mqtt_client.connect(mqtt_broker, mqtt_port, 60)
        self.mqtt_client.loop_start()

        self.current_process = None
        self.current_event_id = None
        self.restart_attempts = 0
        self.MAX_RESTARTS = 3

    def send_log(self, level, message, context=None):
        """Send a log message to the server via MQTT."""
        topic = f"infoscreen/{self.uuid}/logs/{level.lower()}"
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "message": message,
            "context": context or {}
        }
        self.mqtt_client.publish(topic, json.dumps(payload), qos=1)
        print(f"[{level}] {message}")

    def send_health(self, process_name, pid, status, event_id=None):
        """Send health status to the server."""
        topic = f"infoscreen/{self.uuid}/health"
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "expected_state": {
                "event_id": event_id
            },
            "actual_state": {
                "process": process_name,
                "pid": pid,
                "status": status  # 'running', 'crashed', 'starting', 'stopped'
            }
        }
        self.mqtt_client.publish(topic, json.dumps(payload), qos=1, retain=False)

    def is_process_running(self, process_name):
        """Return the PID if a matching process is running, else None."""
        for proc in psutil.process_iter(['name', 'pid']):
            try:
                if process_name.lower() in proc.info['name'].lower():
                    return proc.info['pid']
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass
        return None

    def monitor_loop(self):
        """Main monitoring loop."""
        print(f"Watchdog started for client {self.uuid}")
        self.send_log("INFO", "Watchdog service started", {"uuid": self.uuid})

        while True:
            try:
                # Check the expected process (set by the main event handler)
                if self.current_process:
                    pid = self.is_process_running(self.current_process)

                    if pid:
                        # Process is running
                        self.send_health(
                            self.current_process,
                            pid,
                            "running",
                            self.current_event_id
                        )
                        self.restart_attempts = 0  # Reset on success
                    else:
                        # Process crashed
                        self.send_log(
                            "ERROR",
                            f"Process {self.current_process} crashed or stopped",
                            {
                                "event_id": self.current_event_id,
                                "process": self.current_process,
                                "restart_attempt": self.restart_attempts
                            }
                        )

                        if self.restart_attempts < self.MAX_RESTARTS:
                            self.send_log("WARN", f"Attempting restart ({self.restart_attempts + 1}/{self.MAX_RESTARTS})")
                            self.restart_attempts += 1
                            # TODO: Implement restart logic (call event handler)
                        else:
                            self.send_log("ERROR", "Max restart attempts exceeded", {
                                "event_id": self.current_event_id
                            })

                time.sleep(5)  # Check every 5 seconds

            except KeyboardInterrupt:
                print("Watchdog stopped by user")
                break
            except Exception as e:
                self.send_log("ERROR", f"Watchdog error: {e}", {
                    "exception": str(e),
                    "traceback": str(sys.exc_info())
                })
                time.sleep(10)  # Wait longer on error


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python watchdog.py <client_uuid> <mqtt_broker>")
        sys.exit(1)

    uuid = sys.argv[1]
    broker = sys.argv[2]

    watchdog = MediaWatchdog(uuid, broker)
    watchdog.monitor_loop()
```
### Step 3.2: Integrate with Existing Event Handler

**File:** `client/event_handler.py` (modify existing)

```python
# When starting a new event, notify the watchdog
def play_event(event_data):
    event_type = event_data.get('event_type')
    event_id = event_data.get('id')

    if event_type == 'video':
        process_name = 'vlc'
        # Start VLC...
    elif event_type == 'website':
        process_name = 'chromium'
        # Start Chromium...
    elif event_type == 'presentation':
        process_name = 'pdf_viewer'  # or your PDF tool
        # Start PDF viewer...

    # Notify the watchdog about the expected process
    watchdog.current_process = process_name
    watchdog.current_event_id = event_id
    watchdog.restart_attempts = 0
```

### Step 3.3: Enhanced Heartbeat Payload

**File:** `client/heartbeat.py` (modify existing)

```python
# Modify the existing heartbeat to include process status
# (`watchdog` is the shared MediaWatchdog instance from watchdog.py)
def send_heartbeat(mqtt_client, uuid):
    # Get current process status
    current_process = None
    process_pid = None
    process_status = "stopped"

    # Check if the expected process is running
    if watchdog.current_process:
        pid = watchdog.is_process_running(watchdog.current_process)
        if pid:
            current_process = watchdog.current_process
            process_pid = pid
            process_status = "running"

    payload = {
        "uuid": uuid,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Existing fields...
        # NEW health fields:
        "current_process": current_process,
        "process_pid": process_pid,
        "process_status": process_status,
        "current_event_id": watchdog.current_event_id
    }

    mqtt_client.publish(f"infoscreen/{uuid}/heartbeat", json.dumps(payload))
```

---

## 🎨 Phase 4: Dashboard UI Integration

**Status:** ✅ COMPLETE
**Dependencies:** Phases 2 & 3 complete
**Time estimate:** 2-3 hours

### Step 4.1: Create Log Viewer Component

**File:** `dashboard/src/ClientLogs.tsx` (NEW)
```typescript
import React from 'react';
import { GridComponent, ColumnsDirective, ColumnDirective, Page, Inject } from '@syncfusion/ej2-react-grids';

interface LogEntry {
  id: number;
  timestamp: string;
  level: 'ERROR' | 'WARN' | 'INFO' | 'DEBUG';
  message: string;
  context: any;
}

interface ClientLogsProps {
  clientUuid: string;
}

export const ClientLogs: React.FC<ClientLogsProps> = ({ clientUuid }) => {
  const [logs, setLogs] = React.useState<LogEntry[]>([]);
  const [loading, setLoading] = React.useState(false);

  const loadLogs = async (level?: string) => {
    setLoading(true);
    try {
      const params = new URLSearchParams({ limit: '50' });
      if (level) params.append('level', level);

      const response = await fetch(`/api/client-logs/${clientUuid}/logs?${params}`);
      const data = await response.json();
      setLogs(data.logs);
    } catch (err) {
      console.error('Failed to load logs:', err);
    } finally {
      setLoading(false);
    }
  };

  React.useEffect(() => {
    loadLogs();
    const interval = setInterval(() => loadLogs(), 30000); // Refresh every 30s
    return () => clearInterval(interval);
  }, [clientUuid]);

  const levelTemplate = (props: any) => {
    const colors = {
      ERROR: 'text-red-600 bg-red-100',
      WARN: 'text-yellow-600 bg-yellow-100',
      INFO: 'text-blue-600 bg-blue-100',
      DEBUG: 'text-gray-600 bg-gray-100'
    };
    return (
      <span className={`px-2 py-1 rounded ${colors[props.level as keyof typeof colors]}`}>
        {props.level}
      </span>
    );
  };

  return (
    <div>
      <div className="mb-4 flex gap-2">
        <button onClick={() => loadLogs()} className="e-btn e-primary">All</button>
        <button onClick={() => loadLogs('ERROR')} className="e-btn e-danger">Errors</button>
        <button onClick={() => loadLogs('WARN')} className="e-btn e-warning">Warnings</button>
        <button onClick={() => loadLogs('INFO')} className="e-btn e-info">Info</button>
      </div>

      <GridComponent
        dataSource={logs}
        allowPaging={true}
        pageSettings={{ pageSize: 20 }}
      >
        <ColumnsDirective>
          <ColumnDirective field='timestamp' headerText='Time' width='180' format='yMd HH:mm:ss' />
          <ColumnDirective field='level' headerText='Level' width='100' template={levelTemplate} />
          <ColumnDirective field='message' headerText='Message' width='400' />
        </ColumnsDirective>
        <Inject services={[Page]} />
      </GridComponent>
    </div>
  );
};
```
### Step 4.2: Add Health Indicators to Client Cards

**File:** `dashboard/src/clients.tsx` (modify existing)

```typescript
// Add a health indicator to the client card
const getHealthBadge = (client: Client) => {
  if (!client.process_status) {
    return <span className="badge badge-secondary">Unknown</span>;
  }

  const badges = {
    running: <span className="badge badge-success">✓ Running</span>,
    crashed: <span className="badge badge-danger">✗ Crashed</span>,
    starting: <span className="badge badge-warning">⟳ Starting</span>,
    stopped: <span className="badge badge-secondary">■ Stopped</span>
  };

  return badges[client.process_status] || null;
};

// In the client card render:
<div className="client-card">
  <h3>{client.hostname || client.uuid}</h3>
  <div>Status: {getHealthBadge(client)}</div>
  <div>Process: {client.current_process || 'None'}</div>
  <div>Event ID: {client.current_event_id || 'None'}</div>
  <button onClick={() => showLogs(client.uuid)}>View Logs</button>
</div>
```

### Step 4.3: Add System Health Dashboard (Superadmin)

**File:** `dashboard/src/SystemMonitor.tsx` (NEW)

```typescript
import React from 'react';
import { ClientLogs } from './ClientLogs';

export const SystemMonitor: React.FC = () => {
  const [summary, setSummary] = React.useState<any>({});

  const loadSummary = async () => {
    const response = await fetch('/api/client-logs/summary');
    const data = await response.json();
    setSummary(data.summary);
  };

  React.useEffect(() => {
    loadSummary();
    const interval = setInterval(loadSummary, 30000);
    return () => clearInterval(interval);
  }, []);

  return (
    <div className="system-monitor">
      <h2>System Health Monitor (Superadmin)</h2>

      <div className="alert-panel">
        <h3>Active Issues</h3>
        {Object.entries(summary).map(([uuid, stats]: [string, any]) => (
          stats.ERROR > 0 || stats.WARN > 5 ? (
            <div key={uuid} className="alert">
              🔴 {uuid}: {stats.ERROR} errors, {stats.WARN} warnings (24h)
            </div>
          ) : null
        ))}
      </div>

      {/* Real-time log stream */}
      <div className="log-stream">
        <h3>Recent Logs (All Clients)</h3>
        {/* Implement real-time log aggregation */}
      </div>
    </div>
  );
};
```
---

## 🧪 Phase 5: Testing & Validation

**Status:** ✅ COMPLETE
**Dependencies:** All previous phases
**Time estimate:** 1-2 hours

### Step 5.1: Server-Side Tests

```bash
# Test the database migration
cd /workspace/server
alembic upgrade head
alembic downgrade -1
alembic upgrade head

# Test the API endpoints
curl -X GET "http://localhost:8000/api/client-logs/<uuid>/logs?limit=10"
curl -X GET "http://localhost:8000/api/client-logs/summary"
```

### Step 5.2: Client-Side Tests

```bash
# On the client device
python3 watchdog.py <your-uuid> <mqtt-broker-ip>

# Simulate a process crash
pkill vlc  # Should trigger an error log and a restart attempt

# Check MQTT messages
mosquitto_sub -h <broker> -t "infoscreen/+/logs/#" -v
mosquitto_sub -h <broker> -t "infoscreen/+/health" -v
```

### Step 5.3: Dashboard Tests

1. Open the dashboard and navigate to the Clients page
2. Verify that the health indicators show the correct status
3. Click "View Logs" and verify that logs appear
4. Navigate to the System Monitor (superadmin)
5. Verify that the summary statistics are correct

---
## 📝 Configuration Summary

### Environment Variables

**Server (docker-compose.yml):**
```yaml
- LOG_RETENTION_DAYS=90   # How long to keep logs
- DEBUG_MODE=true         # Enable INFO-level logging via MQTT
```

**Client:**
```bash
export MQTT_BROKER="your-server-ip"
export CLIENT_UUID="abc-123-def"
export WATCHDOG_ENABLED=true
```

### MQTT Topics Reference

| Topic Pattern | Direction | Purpose |
|--------------|-----------|---------|
| `infoscreen/{uuid}/logs/error` | Client → Server | Error messages |
| `infoscreen/{uuid}/logs/warn` | Client → Server | Warning messages |
| `infoscreen/{uuid}/logs/info` | Client → Server | Info (dev only) |
| `infoscreen/{uuid}/health` | Client → Server | Health metrics |
| `infoscreen/{uuid}/heartbeat` | Client → Server | Enhanced heartbeat |

### Database Tables

**client_logs:**
- Stores all centralized logs
- Indexed by client_uuid, timestamp, level
- Auto-cleanup after 90 days (recommended)
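The recommended auto-cleanup can be sketched as a small retention helper. This is a sketch only: the `ClientLog` model and `session` in the commented query are assumptions mirroring the models used earlier in this plan.

```python
from datetime import datetime, timedelta, timezone

LOG_RETENTION_DAYS = 90  # mirrors LOG_RETENTION_DAYS=90 above


def retention_cutoff(now=None, days=LOG_RETENTION_DAYS):
    """Timestamp before which client_logs rows should be deleted."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=days)


# Hypothetical cleanup job (model and session names assumed):
# deleted = session.query(ClientLog) \
#     .filter(ClientLog.timestamp < retention_cutoff()).delete()
# session.commit()
```

Running this once a day from a cronjob keeps table growth bounded regardless of log volume.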
**clients (extended):**
- `current_event_id`: Which event should be playing
- `current_process`: Expected process name
- `process_status`: running/crashed/starting/stopped
- `process_pid`: Process ID
- `screen_health_status`: OK/BLACK/FROZEN/UNKNOWN
- `last_screenshot_analyzed`: Last analysis time
- `last_screenshot_hash`: For frozen detection
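A minimal sketch of how `last_screenshot_hash` can drive frozen detection; the function names are illustrative, not the current implementation:

```python
import hashlib


def screenshot_hash(image_bytes):
    """Stable fingerprint of a screenshot (e.g. the raw JPEG bytes)."""
    return hashlib.sha256(image_bytes).hexdigest()


def looks_frozen(previous_hash, image_bytes):
    """True when the new screenshot is byte-identical to the previous one."""
    return previous_hash is not None and screenshot_hash(image_bytes) == previous_hash
```

In practice, repeated identical hashes would only set `screen_health_status` to FROZEN for content that is expected to change (video, rotating slides), not for static messages.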
---

## 🎯 Next Steps After Implementation

1. **Deploy Phases 1-2** to a staging environment
2. **Test with 1-2 pilot clients** before full rollout
3. **Monitor traffic & performance** (overhead should be minimal)
4. **Fine-tune log levels** based on actual noise
5. **Add alerting** (email/Slack when errors exceed a threshold)
6. **Implement screenshot analysis** (Phase 2 enhancement)
7. **Add trending/analytics** (which clients are least reliable)

---

## 🚨 Troubleshooting

**Logs not appearing in the database:**
- Check the MQTT broker logs: `docker logs infoscreen-mqtt`
- Verify listener subscriptions: check the `listener/listener.py` logs
- Test MQTT manually: `mosquitto_pub -h broker -t "infoscreen/test/logs/error" -m '{"message":"test"}'`

**High database growth:**
- Check the log-retention cleanup cronjob
- Reduce INFO-level logging frequency
- Add sampling (log every 10th occurrence instead of all)
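The sampling idea can be sketched like this (illustrative, not part of the current codebase); it forwards the 1st, 11th, 21st, ... occurrence of each repeated message and drops the rest:

```python
import collections


class SampledLogger:
    """Forward only every Nth occurrence of a repeated log message."""

    def __init__(self, send, every=10):
        self.send = send    # e.g. watchdog.send_log; injected for testability
        self.every = every  # keep 1 in `every` occurrences (every >= 2)
        self.counts = collections.Counter()

    def log(self, level, message, key=None):
        key = key or message
        self.counts[key] += 1
        if self.counts[key] % self.every == 1:  # 1st, 11th, 21st, ...
            self.send(level, message, {"occurrence": self.counts[key]})
```

The `occurrence` count in the context preserves how many events were suppressed, so the server-side totals stay interpretable.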
**Client watchdog not detecting crashes:**
- Verify psutil can see processes: `ps aux | grep vlc`
- Check permissions (some process checks may need sudo)
- Increase the monitor loop frequency for faster detection

---

## ✅ Completion Checklist

- [x] Phase 1: Database migration applied
- [x] Phase 2: Listener extended for log topics
- [x] Phase 2: API endpoints created and tested
- [x] Phase 3: Client watchdog implemented
- [x] Phase 3: Enhanced heartbeat deployed
- [x] Phase 4: Dashboard log viewer working
- [x] Phase 4: Health indicators visible
- [x] Phase 5: End-to-end testing complete
- [x] Documentation updated with new features
- [x] Production deployment plan created

---

**Last Updated:** 2026-03-24
**Author:** GitHub Copilot
**For:** Infoscreen 2025 Project
979 CLIENT_MONITORING_SPECIFICATION.md Normal file
@@ -0,0 +1,979 @@
# Client-Side Monitoring Specification

**Version:** 1.0
**Date:** 2026-03-10
**For:** Infoscreen Client Implementation
**Server Endpoint:** `192.168.43.201:8000` (or your production server)
**MQTT Broker:** `192.168.43.201:1883` (or your production MQTT broker)

---

## 1. Overview

Each infoscreen client must implement health monitoring and logging capabilities to report status to the central server via MQTT.

### 1.1 Goals
- **Detect failures:** Process crashes, frozen screens, content mismatches
- **Provide visibility:** Real-time health status visible on the server dashboard
- **Enable remote diagnosis:** Centralized log storage for debugging
- **Auto-recovery:** Attempt automatic restart on failure
### 1.2 Architecture
```
┌─────────────────────────────────────────┐
│            Infoscreen Client            │
│                                         │
│  ┌──────────────┐    ┌──────────────┐   │
│  │ Media Player │    │   Watchdog   │   │
│  │ (VLC/Chrome) │◄───│   Monitor    │   │
│  └──────────────┘    └──────┬───────┘   │
│                             │           │
│  ┌──────────────┐           │           │
│  │  Event Mgr   │           │           │
│  │  (receives   │           │           │
│  │  schedule)   │◄──────────┘           │
│  └──────┬───────┘                       │
│         │                               │
│  ┌──────▼───────────────────────┐       │
│  │        MQTT Client           │       │
│  │  - Heartbeat (every 60s)     │       │
│  │  - Logs (error/warn/info)    │       │
│  │  - Health metrics (every 5s) │       │
│  └──────┬───────────────────────┘       │
└─────────┼───────────────────────────────┘
          │
          │ MQTT over TCP
          ▼
   ┌─────────────┐
   │ MQTT Broker │
   │  (server)   │
   └─────────────┘
```
### 1.3 Current Compatibility Notes
- The server now accepts both the original specification payloads and the currently implemented Phase 3 client payloads.
- `infoscreen/{uuid}/health` may currently contain a reduced payload with only `expected_state.event_id` and `actual_state.process|pid|status`. The additional `health_metrics` fields from this specification remain recommended.
- `event_id` is still specified as an integer. For compatibility with the current Phase 3 client, the server also tolerates string values such as `event_123` and extracts the numeric suffix where possible.
- If the client sends `process_health` inside `infoscreen/{uuid}/dashboard`, the server treats it as a fallback source for `current_process`, `process_pid`, `process_status`, and `current_event_id`.
- Long term, the preferred client payload remains the structure in this specification so the server can surface richer monitoring data such as screen state and resource metrics.
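The tolerant `event_id` handling described above can be sketched as follows; this is a sketch of the parsing rule, not the server's actual code:

```python
import re


def normalize_event_id(value):
    """Coerce an event_id to int, tolerating strings such as 'event_123'.

    Returns None when no numeric suffix can be extracted.
    """
    if isinstance(value, bool):
        return None  # bools are ints in Python; reject them explicitly
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        match = re.search(r"(\d+)\s*$", value)
        if match:
            return int(match.group(1))
    return None
```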
---

## 2. MQTT Protocol Specification

### 2.1 Connection Parameters
```
Broker:            192.168.43.201 (or DNS hostname)
Port:              1883 (standard MQTT)
Protocol:          MQTT v3.1.1
Client ID:         "infoscreen-{client_uuid}"
Clean Session:     false (retain subscriptions)
Keep Alive:        60 seconds
Username/Password: (if configured on broker)
```

### 2.2 QoS Levels
- **Heartbeat:** QoS 0 (fire and forget, high frequency)
- **Logs (ERROR/WARN):** QoS 1 (at-least-once delivery, important)
- **Logs (INFO):** QoS 0 (optional, high volume)
- **Health metrics:** QoS 0 (frequent, latest value matters)

---
## 3. Topic Structure & Payload Formats

### 3.1 Log Messages

#### Topic Pattern:
```
infoscreen/{client_uuid}/logs/{level}
```

Where `{level}` is one of: `error`, `warn`, `info`

#### Payload Format (JSON):
```json
{
  "timestamp": "2026-03-10T07:30:00Z",
  "message": "Human-readable error description",
  "context": {
    "event_id": 42,
    "process": "vlc",
    "error_code": "NETWORK_TIMEOUT",
    "additional_key": "any relevant data"
  }
}
```

#### Field Specifications:
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `timestamp` | string (ISO 8601 UTC) | Yes | When the event occurred. Use `YYYY-MM-DDTHH:MM:SSZ` format |
| `message` | string | Yes | Human-readable description of the event (max 1000 chars) |
| `context` | object | No | Additional structured data (stored as JSON) |

#### Example Topics:
```
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/error
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/warn
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/info
```

#### When to Send Logs:

**ERROR (always send):**
- Process crashed (VLC/Chromium/PDF viewer terminated unexpectedly)
- Content failed to load (404, network timeout, corrupt file)
- Hardware failure detected (display off, audio device missing)
- Exception caught in the main event loop
- Maximum restart attempts exceeded

**WARN (always send):**
- Process restarted automatically (after a crash)
- High resource usage (CPU >80%, RAM >90%)
- Slow performance (frame drops, lag)
- Non-critical failures (screenshot capture failed, cache full)
- Fallback content displayed (primary source unavailable)

**INFO (send in development, optional in production):**
- Process started successfully
- Event transition (e.g., switched from video to presentation)
- Content loaded successfully
- Watchdog service started/stopped

---
### 3.2 Health Metrics

#### Topic Pattern:
```
infoscreen/{client_uuid}/health
```

#### Payload Format (JSON):
```json
{
  "timestamp": "2026-03-10T07:30:00Z",
  "expected_state": {
    "event_id": 42,
    "event_type": "video",
    "media_file": "presentation.mp4",
    "started_at": "2026-03-10T07:15:00Z"
  },
  "actual_state": {
    "process": "vlc",
    "pid": 1234,
    "status": "running",
    "uptime_seconds": 900,
    "position": 45.3,
    "duration": 180.0
  },
  "health_metrics": {
    "screen_on": true,
    "last_frame_update": "2026-03-10T07:29:58Z",
    "frames_dropped": 2,
    "network_errors": 0,
    "cpu_percent": 15.3,
    "memory_mb": 234
  }
}
```

#### Field Specifications:

**expected_state:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `event_id` | integer | Yes | Current event ID from the scheduler |
| `event_type` | string | Yes | `presentation`, `video`, `website`, `webuntis`, `message` |
| `media_file` | string | No | Filename or URL of the current content |
| `started_at` | string (ISO 8601) | Yes | When this event started playing |

**actual_state:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `process` | string | Yes | `vlc`, `chromium`, `pdf_viewer`, `none` |
| `pid` | integer | No | Process ID (if running) |
| `status` | string | Yes | `running`, `crashed`, `starting`, `stopped` |
| `uptime_seconds` | integer | No | How long the process has been running |
| `position` | float | No | Current playback position (seconds, for video/audio) |
| `duration` | float | No | Total content duration (seconds) |

**health_metrics:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `screen_on` | boolean | Yes | Is the display powered on? |
| `last_frame_update` | string (ISO 8601) | No | Last time the screen content changed |
| `frames_dropped` | integer | No | Video frames dropped (performance indicator) |
| `network_errors` | integer | No | Count of network errors in the last interval |
| `cpu_percent` | float | No | CPU usage (0-100) |
| `memory_mb` | integer | No | RAM usage in megabytes |

#### Sending Frequency:
- **Normal operation:** every 5 seconds
- **During startup/transition:** every 1 second
- **After an error:** immediately, then every 2 seconds until recovered
---

### 3.3 Enhanced Heartbeat

The existing heartbeat topic should be enhanced to include process status.

#### Topic Pattern:
```
infoscreen/{client_uuid}/heartbeat
```

#### Enhanced Payload Format (JSON):
```json
{
  "uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
  "timestamp": "2026-03-10T07:30:00Z",
  "current_process": "vlc",
  "process_pid": 1234,
  "process_status": "running",
  "current_event_id": 42
}
```

#### New Fields (add to existing heartbeat):
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `current_process` | string | No | Name of the active media player process |
| `process_pid` | integer | No | Process ID |
| `process_status` | string | No | `running`, `crashed`, `starting`, `stopped` |
| `current_event_id` | integer | No | Event ID currently being displayed |

#### Sending Frequency:
- Keep existing: **every 60 seconds**
- Include the new fields if available

---
## 4. Process Monitoring Requirements

### 4.1 Processes to Monitor

| Media Type | Process Name | How to Detect |
|------------|--------------|---------------|
| Video | `vlc` | `ps aux \| grep vlc` or `pgrep vlc` |
| Website/WebUntis | `chromium` or `chromium-browser` | `pgrep chromium` |
| PDF Presentation | `evince`, `okular`, or custom viewer | `pgrep {viewer_name}` |

### 4.2 Monitoring Checks (Every 5 seconds)

#### Check 1: Process Alive
```
Goal: Verify the expected process is running
Method:
- Get the list of running processes (psutil or `ps`)
- Check whether the expected process name exists
- Match the PID if known
Result:
- If missing → status = "crashed"
- If found → status = "running"
Action on crash:
- Send ERROR log immediately
- Attempt restart (max 3 attempts)
- Send WARN log on each restart
- If max restarts exceeded → send ERROR log, display fallback
```
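The crash-handling policy above can be sketched as a single testable step; this is a sketch, and the `restart` and `show_fallback` callbacks stand in for the real event handler:

```python
def handle_crash(attempts, max_restarts, restart, send_log, show_fallback):
    """One crash-handling step: log, retry up to max_restarts, then fall back.

    Returns the updated attempt counter; callbacks are injected for testing.
    """
    send_log("ERROR", "process crashed or stopped")
    if attempts < max_restarts:
        attempts += 1
        send_log("WARN", f"attempting restart ({attempts}/{max_restarts})")
        restart()
    else:
        send_log("ERROR", "max restart attempts exceeded")
        show_fallback()
    return attempts
```

On a successful health check the caller resets the counter to 0, matching the `restart_attempts = 0` behaviour of the watchdog.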
#### Check 2: Process Responsive
```
Goal: Detect frozen processes
Method:
- For VLC: query the HTTP interface (status.json)
- For Chromium: use the DevTools Protocol (CDP)
- For custom viewers: check the last screen-update time
Result:
- If the same frame persists >30 seconds → likely frozen
- If the playback position is not advancing → frozen
Action on freeze:
- Send WARN log
- Force a refresh (reload page, seek video, next slide)
- If the refresh fails → restart the process
```
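A sketch of the position-based freeze heuristic, using the 5-second sampling interval and 30-second threshold from Check 2; the function name and sample format are illustrative:

```python
def position_frozen(samples, min_samples=7):
    """Detect a stalled playback position.

    `samples` is a list of (wall_clock_seconds, playback_position_seconds)
    pairs collected every 5 seconds, e.g. from VLC's status.json `time` field.
    Seven samples at 5 s intervals cover the 30 s window from Check 2.
    """
    if len(samples) < min_samples:
        return False  # not enough history yet
    recent = samples[-min_samples:]
    positions = {pos for _, pos in recent}
    elapsed = recent[-1][0] - recent[0][0]
    # Same position for >= 30 s of wall time while supposedly playing
    return len(positions) == 1 and elapsed >= 30
```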
#### Check 3: Content Match
```
Goal: Verify the correct content is displayed
Method:
- Compare the expected event_id with the actual media/URL
- Check the scheduled time window (is the event still active?)
Result:
- Mismatch → content error
Action:
- Send WARN log
- Reload the correct event from the scheduler
```
---
|
||||||
|
|
||||||
|
## 5. Process Control Interface Requirements
|
||||||
|
|
||||||
|
### 5.1 VLC Control

**Requirement:** Enable VLC HTTP interface for monitoring

**Launch Command:**
```bash
vlc --intf http --http-host 127.0.0.1 --http-port 8080 --http-password "vlc_password" \
    --fullscreen --loop /path/to/video.mp4
```

**Status Query:**
```bash
curl http://127.0.0.1:8080/requests/status.json --user ":vlc_password"
```

**Response Fields to Monitor:**
```json
{
  "state": "playing",   // "playing", "paused", "stopped"
  "position": 0.25,     // 0.0-1.0 (25% through)
  "time": 45,           // seconds into playback
  "length": 180,        // total duration in seconds
  "volume": 256         // 0-512
}
```

---

### 5.2 Chromium Control

**Requirement:** Enable Chrome DevTools Protocol (CDP)

**Launch Command:**
```bash
chromium --remote-debugging-port=9222 --kiosk --app=https://example.com
```

**Status Query:**
```bash
curl http://127.0.0.1:9222/json
```

**Response Fields to Monitor:**
```json
[
  {
    "url": "https://example.com",
    "title": "Page Title",
    "type": "page"
  }
]
```

**Advanced:** Use CDP WebSocket for events (page load, navigation, errors)

---

### 5.3 PDF Viewer (Custom or Standard)

**Option A: Standard Viewer (e.g., Evince)**
- No built-in API
- Monitor via process check + screenshot comparison

**Option B: Custom Python Viewer**
- Implement REST API for status queries
- Track: current page, total pages, last transition time

---

## 6. Watchdog Service Architecture

### 6.1 Service Components

**Component 1: Process Monitor Thread**
```
Responsibilities:
- Check process alive every 5 seconds
- Detect crashes and frozen processes
- Attempt automatic restart
- Send health metrics via MQTT

State Machine:
IDLE → STARTING → RUNNING → (if crash) → RESTARTING → RUNNING
                                       → (if max restarts) → FAILED
```
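The state machine above can be sketched as a pure transition function; `MAX_RESTARTS` mirrors the spec's restart limit, and the enum/function names are illustrative:

```python
from enum import Enum, auto

class WatchdogState(Enum):
    IDLE = auto()
    STARTING = auto()
    RUNNING = auto()
    RESTARTING = auto()
    FAILED = auto()

MAX_RESTARTS = 3  # matches the spec's restart limit

def next_state(state, process_alive, restart_attempts):
    """One tick of the monitor state machine from the diagram above."""
    if state is WatchdogState.STARTING:
        return WatchdogState.RUNNING if process_alive else WatchdogState.RESTARTING
    if state is WatchdogState.RUNNING and not process_alive:
        return (WatchdogState.RESTARTING
                if restart_attempts < MAX_RESTARTS else WatchdogState.FAILED)
    if state is WatchdogState.RESTARTING:
        if process_alive:
            return WatchdogState.RUNNING
        return (WatchdogState.RESTARTING
                if restart_attempts < MAX_RESTARTS else WatchdogState.FAILED)
    return state
```

Driving the monitor thread through this function keeps the crash/restart policy in one testable place.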

**Component 2: MQTT Publisher Thread**
```
Responsibilities:
- Maintain MQTT connection
- Send heartbeat every 60 seconds
- Send logs on-demand (queued from other components)
- Send health metrics every 5 seconds
- Reconnect on connection loss
```

**Component 3: Event Manager Integration**
```
Responsibilities:
- Receive event schedule from server
- Notify watchdog of expected process/content
- Launch media player processes
- Handle event transitions
```

### 6.2 Service Lifecycle

**On Startup:**
1. Load configuration (client UUID, MQTT broker, etc.)
2. Connect to MQTT broker
3. Send INFO log: "Watchdog service started"
4. Wait for first event from scheduler

**During Operation:**
1. Monitor loop runs every 5 seconds
2. Check expected vs actual process state
3. Send health metrics
4. Handle failures (log + restart)

**On Shutdown:**
1. Send INFO log: "Watchdog service stopping"
2. Gracefully stop monitored processes
3. Disconnect from MQTT
4. Exit cleanly

---

## 7. Auto-Recovery Logic

### 7.1 Restart Strategy

**Step 1: Detect Failure**
```
Trigger: Process not found in process list
Action:
- Log ERROR: "Process {name} crashed"
- Increment restart counter
- Check if within retry limit (max 3)
```

**Step 2: Attempt Restart**
```
If restart_attempts < MAX_RESTARTS:
- Log WARN: "Attempting restart ({attempt}/{MAX_RESTARTS})"
- Kill any zombie processes
- Wait 2 seconds (cooldown)
- Launch process with same parameters
- Wait 5 seconds for startup
- Verify process is running
- If success: reset restart counter, log INFO
- If fail: increment counter, repeat
```

**Step 3: Permanent Failure**
```
If restart_attempts >= MAX_RESTARTS:
- Log ERROR: "Max restart attempts exceeded, failing over"
- Display fallback content (static image with error message)
- Send notification to server (separate alert topic, optional)
- Wait for manual intervention or scheduler event change
```

### 7.2 Restart Cooldown

**Purpose:** Prevent rapid restart loops that waste resources

**Implementation:**
```
After each restart attempt:
- Wait 2 seconds before next restart
- After 3 failures: wait 30 seconds before trying again
- Reset counter on successful run >5 minutes
```
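The cooldown rules above reduce to two small helpers; a minimal sketch (function names assumed):

```python
def restart_delay(consecutive_failures):
    """Cooldown per section 7.2: 2 s between attempts, 30 s once
    three consecutive attempts have failed."""
    return 30.0 if consecutive_failures >= 3 else 2.0

def should_reset_counter(uptime_s):
    """Reset the restart counter after 5 minutes of successful operation."""
    return uptime_s > 300.0
```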

---

## 8. Resource Monitoring

### 8.1 System Metrics to Track

**CPU Usage:**
```
Method: Read /proc/stat or use psutil.cpu_percent()
Frequency: Every 5 seconds
Threshold: Warn if >80% for >60 seconds
```

**Memory Usage:**
```
Method: Read /proc/meminfo or use psutil.virtual_memory()
Frequency: Every 5 seconds
Threshold: Warn if >90% for >30 seconds
```

**Display Status:**
```
Method: Check DPMS state or xset query
Frequency: Every 30 seconds
Threshold: Error if display off (unexpected)
```

**Network Connectivity:**
```
Method: Ping server or check MQTT connection
Frequency: Every 60 seconds
Threshold: Warn if no server connectivity
```
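The "above X% for Y seconds" thresholds share one shape, so a single alarm class can serve CPU and memory alike. A sketch with an injectable clock (an assumption for testability; in the watchdog the values would come from `psutil`):

```python
import time

class ThresholdAlarm:
    """Fires when a metric stays above `limit` for at least `hold_s`
    seconds (CPU > 80 % for 60 s, memory > 90 % for 30 s)."""

    def __init__(self, limit, hold_s, clock=time.monotonic):
        self.limit = limit
        self.hold_s = hold_s
        self.clock = clock              # injectable for testing
        self._since = None              # when the current breach started

    def check(self, value):
        now = self.clock()
        if value <= self.limit:
            self._since = None          # breach over, reset
            return False
        if self._since is None:
            self._since = now
        return (now - self._since) >= self.hold_s

cpu_alarm = ThresholdAlarm(limit=80.0, hold_s=60.0)
mem_alarm = ThresholdAlarm(limit=90.0, hold_s=30.0)
```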

---

## 9. Development vs Production Mode

### 9.1 Development Mode

**Enable via:** Environment variable `DEBUG=true` or `ENV=development`

**Behavior:**
- Send INFO level logs
- More verbose logging to console
- Shorter monitoring intervals (faster feedback)
- Screenshot capture every 30 seconds
- No rate limiting on logs

### 9.2 Production Mode

**Enable via:** `ENV=production`

**Behavior:**
- Send only ERROR and WARN logs
- Minimal console output
- Standard monitoring intervals
- Screenshot capture every 60 seconds
- Rate limiting: max 10 logs per minute per level

---

## 10. Configuration File Format

### 10.1 Recommended Config: JSON

**File:** `/etc/infoscreen/config.json` or `~/.config/infoscreen/config.json`

```json
{
  "client": {
    "uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
    "hostname": "infoscreen-room-101"
  },
  "mqtt": {
    "broker": "192.168.43.201",
    "port": 1883,
    "username": "",
    "password": "",
    "keepalive": 60
  },
  "monitoring": {
    "enabled": true,
    "health_interval_seconds": 5,
    "heartbeat_interval_seconds": 60,
    "max_restart_attempts": 3,
    "restart_cooldown_seconds": 2
  },
  "logging": {
    "level": "INFO",
    "send_info_logs": false,
    "console_output": true,
    "local_log_file": "/var/log/infoscreen/watchdog.log"
  },
  "processes": {
    "vlc": {
      "http_port": 8080,
      "http_password": "vlc_password"
    },
    "chromium": {
      "debug_port": 9222
    }
  }
}
```
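A config loader can overlay a partial user file onto built-in defaults so that missing keys never crash the watchdog. A sketch (the defaults dict is a subset of the file above; helper names are assumptions):

```python
import json

DEFAULTS = {
    "monitoring": {
        "enabled": True,
        "health_interval_seconds": 5,
        "heartbeat_interval_seconds": 60,
        "max_restart_attempts": 3,
        "restart_cooldown_seconds": 2,
    },
    "logging": {"level": "INFO", "send_info_logs": False},
}

def merge_defaults(defaults, overrides):
    """Recursively overlay the user's config.json on top of the defaults,
    so a partial file still yields a complete configuration."""
    out = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge_defaults(out[key], value)
        else:
            out[key] = value
    return out

def load_config(path):
    with open(path) as fh:
        return merge_defaults(DEFAULTS, json.load(fh))
```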

---

## 11. Error Scenarios & Expected Behavior

### Scenario 1: VLC Crashes Mid-Video
```
1. Watchdog detects: process_status = "crashed"
2. Send ERROR log: "VLC process crashed"
3. Attempt 1: Restart VLC with same video, seek to last position
4. If success: Send INFO log "VLC restarted successfully"
5. If fail: Repeat 2 more times
6. After 3 failures: Send ERROR "Max restarts exceeded", show fallback
```

### Scenario 2: Network Timeout Loading Website
```
1. Chromium fails to load page (CDP reports error)
2. Send WARN log: "Page load timeout"
3. Attempt reload (Chromium refresh)
4. If success after 10s: Continue monitoring
5. If timeout again: Send ERROR, try restarting Chromium
```

### Scenario 3: Display Powers Off (Hardware)
```
1. DPMS check detects display off
2. Send ERROR log: "Display powered off"
3. Attempt to wake display (xset dpms force on)
4. If success: Send INFO log
5. If fail: Hardware issue, alert admin
```

### Scenario 4: High CPU Usage
```
1. CPU >80% for 60 seconds
2. Send WARN log: "High CPU usage: 85%"
3. Check if expected (e.g., video playback is normal)
4. If unexpected: investigate process causing it
5. If critical (>95%): consider restarting offending process
```

---

## 12. Testing & Validation

### 12.1 Manual Tests (During Development)

**Test 1: Process Crash Simulation**
```bash
# Start video, then kill VLC manually
killall vlc
# Expected: ERROR log sent, automatic restart within 5 seconds
```

**Test 2: MQTT Connectivity**
```bash
# Subscribe to all client topics on server
mosquitto_sub -h 192.168.43.201 -t "infoscreen/{uuid}/#" -v
# Expected: See heartbeat every 60s, health every 5s
```

**Test 3: Log Levels**
```bash
# Trigger error condition and verify log appears in database
curl http://192.168.43.201:8000/api/client-logs/test
# Expected: See new log entry with correct level/message
```

### 12.2 Acceptance Criteria

✅ **Client must:**
1. Send heartbeat every 60 seconds without gaps
2. Send ERROR log within 5 seconds of process crash
3. Attempt automatic restart (max 3 times)
4. Report health metrics every 5 seconds
5. Survive MQTT broker restart (reconnect automatically)
6. Survive network interruption (buffer logs, send when reconnected)
7. Use correct timestamp format (ISO 8601 UTC)
8. Only send logs for real client UUID (FK constraint)
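Criterion 6 (buffer logs offline, send on reconnect) can be sketched with a bounded queue in front of the MQTT publish call. Class and method names here are illustrative; the bounded deque is an assumption to keep memory finite during long outages:

```python
from collections import deque

class BufferedPublisher:
    """Queues log payloads while the MQTT connection is down and
    flushes them in order on reconnect (acceptance criterion 6)."""

    def __init__(self, publish, maxlen=1000):
        self.publish = publish              # callable(topic, payload)
        self.pending = deque(maxlen=maxlen) # oldest entries drop when full
        self.connected = False

    def send(self, topic, payload):
        if self.connected:
            self.publish(topic, payload)
        else:
            self.pending.append((topic, payload))

    def on_connect(self):
        self.connected = True
        while self.pending:
            self.publish(*self.pending.popleft())

    def on_disconnect(self):
        self.connected = False
```

In a paho-mqtt client, `on_connect`/`on_disconnect` would be wired to the corresponding MQTT callbacks.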

---

## 13. Python Libraries (Recommended)

**For process monitoring:**
- `psutil` - Cross-platform process and system utilities

**For MQTT:**
- `paho-mqtt` - Official MQTT client (use v2.x with Callback API v2)

**For VLC control:**
- `requests` - HTTP client for status queries

**For Chromium control:**
- `websocket-client` or `pychrome` - Chrome DevTools Protocol

**For datetime:**
- `datetime` (stdlib) - Use `datetime.now(timezone.utc).isoformat()`

**Example requirements.txt:**
```
paho-mqtt>=2.0.0
psutil>=5.9.0
requests>=2.31.0
python-dateutil>=2.8.0
```

---

## 14. Security Considerations

### 14.1 MQTT Security
- If broker requires auth, store credentials in config file with restricted permissions (`chmod 600`)
- Consider TLS/SSL for MQTT (port 8883) if on untrusted network
- Use unique client ID to prevent impersonation

### 14.2 Process Control APIs
- VLC HTTP password should be random, not default
- Chromium debug port should bind to `127.0.0.1` only (not `0.0.0.0`)
- Restrict file system access for media player processes

### 14.3 Log Content
- **Do not log:** Passwords, API keys, personal data
- **Sanitize:** File paths (strip user directories), URLs (remove query params with tokens)
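The two sanitization rules can be sketched with stdlib helpers; function names are illustrative:

```python
import re
from urllib.parse import urlsplit, urlunsplit

def sanitize_path(path):
    """Hide the username in home-directory paths before logging."""
    return re.sub(r"^/home/[^/]+", "/home/<user>", path)

def sanitize_url(url):
    """Drop the query string, which may carry tokens or session IDs."""
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc, path, "", ""))
```

Apply both to any message or context field before it enters `send_log`.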

---

## 15. Performance Targets

| Metric | Target | Acceptable | Critical |
|--------|--------|------------|----------|
| Health check interval | 5s | 10s | 30s |
| Crash detection time | <5s | <10s | <30s |
| Restart time | <10s | <20s | <60s |
| MQTT publish latency | <100ms | <500ms | <2s |
| CPU usage (watchdog) | <2% | <5% | <10% |
| RAM usage (watchdog) | <50MB | <100MB | <200MB |
| Log message size | <1KB | <10KB | <100KB |

---

## 16. Troubleshooting Guide (For Client Development)

### Issue: Logs not appearing in server database
**Check:**
1. Is the MQTT broker reachable? (`mosquitto_pub` test from client)
2. Is the client UUID correct, and does it exist in the `clients` table?
3. Is the timestamp format correct (ISO 8601 with 'Z')?
4. Check server listener logs for errors

### Issue: Health metrics not updating
**Check:**
1. Is the health loop running? (check watchdog service status)
2. Is MQTT connected? (check connection status in logs)
3. Is the payload valid JSON? (use a JSON validator)

### Issue: Process restarts in loop
**Check:**
1. Is the media file/URL accessible?
2. Is the process command correct? (test manually)
3. Check the process exit code (crash reason)
4. Increase the restart cooldown to avoid rapid loops

---

## 17. Complete Message Flow Diagram

```
┌─────────────────────────────────────────────────────────┐
│                   Infoscreen Client                     │
│                                                         │
│  Event Occurs:                                          │
│  - Process crashed                                      │
│  - High CPU usage                                       │
│  - Content loaded                                       │
│                                                         │
│  ┌────────────────┐                                     │
│  │ Decision Logic │                                     │
│  │ - Is it ERROR? │                                     │
│  │ - Is it WARN?  │                                     │
│  │ - Is it INFO?  │                                     │
│  └────────┬───────┘                                     │
│           │                                             │
│           ▼                                             │
│  ┌────────────────────────────────┐                     │
│  │ Build JSON Payload             │                     │
│  │ {                              │                     │
│  │   "timestamp": "...",          │                     │
│  │   "message": "...",            │                     │
│  │   "context": {...}             │                     │
│  │ }                              │                     │
│  └────────┬───────────────────────┘                     │
│           │                                             │
│           ▼                                             │
│  ┌────────────────────────────────┐                     │
│  │ MQTT Publish                   │                     │
│  │ Topic: infoscreen/{uuid}/logs/error                  │
│  │ QoS: 1                         │                     │
│  └────────┬───────────────────────┘                     │
└───────────┼─────────────────────────────────────────────┘
            │
            │ TCP/IP (MQTT Protocol)
            │
            ▼
     ┌──────────────┐
     │ MQTT Broker  │
     │ (Mosquitto)  │
     └──────┬───────┘
            │
            │ Topic: infoscreen/+/logs/#
            │
            ▼
┌──────────────────────────────┐
│ Listener Service             │
│ (Python)                     │
│                              │
│ - Parse JSON                 │
│ - Validate UUID              │
│ - Store in database          │
└──────┬───────────────────────┘
       │
       ▼
┌──────────────────────────────┐
│ MariaDB Database             │
│                              │
│ Table: client_logs           │
│ - client_uuid                │
│ - timestamp                  │
│ - level                      │
│ - message                    │
│ - context (JSON)             │
└──────┬───────────────────────┘
       │
       │ SQL Query
       │
       ▼
┌──────────────────────────────┐
│ API Server (Flask)           │
│                              │
│ GET /api/client-logs/{uuid}/logs
│ GET /api/client-logs/summary │
└──────┬───────────────────────┘
       │
       │ HTTP/JSON
       │
       ▼
┌──────────────────────────────┐
│ Dashboard (React)            │
│                              │
│ - Display logs               │
│ - Filter by level            │
│ - Show health status         │
└──────────────────────────────┘
```

---

## 18. Quick Reference Card

### MQTT Topics Summary
```
infoscreen/{uuid}/logs/error  → Critical failures
infoscreen/{uuid}/logs/warn   → Non-critical issues
infoscreen/{uuid}/logs/info   → Informational (dev mode)
infoscreen/{uuid}/health      → Health metrics (every 5s)
infoscreen/{uuid}/heartbeat   → Enhanced heartbeat (every 60s)
```

### JSON Timestamp Format
```python
from datetime import datetime, timezone
timestamp = datetime.now(timezone.utc).isoformat()
# Output: "2026-03-10T07:30:00+00:00"
# isoformat() never emits "Z"; use .replace("+00:00", "Z") if a trailing "Z" is required
```

### Process Status Values
```
"running"  - Process is alive and responding
"crashed"  - Process terminated unexpectedly
"starting" - Process is launching (startup phase)
"stopped"  - Process intentionally stopped
```

### Restart Logic
```
Max attempts: 3
Cooldown: 2 seconds between attempts
Reset: After 5 minutes of successful operation
```

---

## 19. Contact & Support

**Server API Documentation:**
- Base URL: `http://192.168.43.201:8000`
- Health check: `GET /health`
- Test logs: `GET /api/client-logs/test` (no auth)
- Full API docs: See `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` on server

**MQTT Broker:**
- Host: `192.168.43.201`
- Port: `1883` (standard), `9001` (WebSocket)
- Test tool: `mosquitto_pub` / `mosquitto_sub`

**Database Schema:**
- Table: `client_logs`
- Foreign Key: `client_uuid` → `clients.uuid` (ON DELETE CASCADE)
- Constraint: UUID must exist in clients table before logging

**Server-Side Logs:**
```bash
# View listener logs (processes MQTT messages)
docker compose logs -f listener

# View server logs (API requests)
docker compose logs -f server
```

---

## 20. Appendix: Example Implementations

### A. Minimal Python Watchdog (Pseudocode)

```python
import time
import json
import psutil
import paho.mqtt.client as mqtt
from datetime import datetime, timezone

class MinimalWatchdog:
    def __init__(self, client_uuid, mqtt_broker):
        self.uuid = client_uuid
        self.mqtt_client = mqtt.Client(callback_api_version=mqtt.CallbackAPIVersion.VERSION2)
        self.mqtt_client.connect(mqtt_broker, 1883, 60)
        self.mqtt_client.loop_start()

        self.expected_process = None
        self.restart_attempts = 0
        self.MAX_RESTARTS = 3

    def send_log(self, level, message, context=None):
        topic = f"infoscreen/{self.uuid}/logs/{level}"
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "message": message,
            "context": context or {}
        }
        self.mqtt_client.publish(topic, json.dumps(payload), qos=1)

    def is_process_running(self, process_name):
        for proc in psutil.process_iter(['name']):
            # proc.info['name'] can be None for some processes, so guard it
            if proc.info['name'] and process_name in proc.info['name']:
                return True
        return False

    def restart_process(self):
        # Placeholder: relaunch the expected process with its original
        # parameters, wait for startup, then verify it is running (section 7.1).
        self.restart_attempts += 1
        self.send_log("warn", f"Attempting restart ({self.restart_attempts}/{self.MAX_RESTARTS})")

    def monitor_loop(self):
        while True:
            if self.expected_process:
                if not self.is_process_running(self.expected_process):
                    self.send_log("error", f"{self.expected_process} crashed")
                    if self.restart_attempts < self.MAX_RESTARTS:
                        self.restart_process()
                    else:
                        self.send_log("error", "Max restarts exceeded")

            time.sleep(5)

# Usage:
watchdog = MinimalWatchdog("9b8d1856-ff34-4864-a726-12de072d0f77", "192.168.43.201")
watchdog.expected_process = "vlc"
watchdog.monitor_loop()
```

---

**END OF SPECIFICATION**

Questions? Refer to:
- `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` (server repo)
- Server API: `http://192.168.43.201:8000/api/client-logs/test`
- MQTT test: `mosquitto_sub -h 192.168.43.201 -t infoscreen/#`
@@ -5,6 +5,10 @@ This changelog tracks all changes made in the development workspace, including i
---

## Unreleased (development workspace)
- Monitoring system completion: End-to-end monitoring pipeline is active (MQTT logs/health → listener persistence → monitoring APIs → superadmin dashboard).
- Monitoring API: Added/active endpoints `GET /api/client-logs/monitoring-overview` and `GET /api/client-logs/recent-errors`; per-client logs via `GET /api/client-logs/<uuid>/logs`.
- Dashboard monitoring UI: Superadmin monitoring page is integrated and displays client health status, screenshots, process metadata, and recent error activity.
- Bugfix: Presentation flags `page_progress` and `auto_progress` now persist reliably across create/update and detached-occurrence flows.
- Frontend (Settings → Events): Added Presentations defaults (slideshow interval, page-progress, auto-progress) with load/save via `/api/system-settings`; UI uses Syncfusion controls.
- Backend defaults: Seeded `presentation_interval` ("10"), `presentation_page_progress` ("true"), `presentation_auto_progress` ("true") in `server/init_defaults.py` when missing.
- Data model: Added per-event fields `page_progress` and `auto_progress` on `Event`; Alembic migration applied successfully.
194
MQTT_PAYLOAD_MIGRATION_GUIDE.md
Normal file
@@ -0,0 +1,194 @@

# MQTT Payload Migration Guide

## Purpose
This guide describes a practical migration from the current dashboard screenshot payload to a grouped schema, with client-side implementation first and server-side migration second.

## Scope
- Environment: development and alpha systems (no production installs)
- Message topic: infoscreen/<client_id>/dashboard
- Capture types to preserve: periodic, event_start, event_stop

## Target Schema (v2)
The canonical message should be grouped into four logical blocks in this order:

1. message
2. content
3. runtime
4. metadata

Example shape:

```json
{
  "message": {
    "client_id": "<uuid>",
    "status": "alive"
  },
  "content": {
    "screenshot": {
      "filename": "latest.jpg",
      "data": "<base64>",
      "timestamp": "2026-03-30T10:15:41.123456+00:00",
      "size": 183245
    }
  },
  "runtime": {
    "system_info": {
      "hostname": "pi-display-01",
      "ip": "192.168.1.42",
      "uptime": 123456.7
    },
    "process_health": {
      "event_id": "evt-123",
      "event_type": "presentation",
      "current_process": "impressive",
      "process_pid": 4123,
      "process_status": "running",
      "restart_count": 0
    }
  },
  "metadata": {
    "schema_version": "2.0",
    "producer": "simclient",
    "published_at": "2026-03-30T10:15:42.004321+00:00",
    "capture": {
      "type": "periodic",
      "captured_at": "2026-03-30T10:15:41.123456+00:00",
      "age_s": 0.9,
      "triggered": false,
      "send_immediately": false
    },
    "transport": {
      "qos": 0,
      "publisher": "simclient"
    }
  }
}
```

## Step-by-Step: Client-Side First

1. Create a migration branch.
   - Example: feature/payload-v2

2. Freeze a baseline sample from MQTT.
   - Capture one payload via mosquitto_sub and store it for comparison.

3. Implement one canonical payload builder.
   - Centralize JSON assembly in one function only.
   - Do not duplicate payload construction across code paths.

4. Add versioned metadata.
   - Set metadata.schema_version = "2.0".
   - Add metadata.producer = "simclient".
   - Add metadata.published_at in UTC ISO format.

5. Map existing data into grouped blocks.
   - client_id/status -> message
   - screenshot object -> content.screenshot
   - system_info/process_health -> runtime
   - capture mode and freshness -> metadata.capture

6. Preserve existing capture semantics.
   - Keep type values unchanged: periodic, event_start, event_stop.
   - Keep UTC ISO timestamps.
   - Keep screenshot encoding and size behavior unchanged.

7. Optional short-term compatibility mode (recommended for one sprint).
   - Either:
     - Keep current legacy fields in parallel, or
     - Add a legacy block with old field names.
   - Goal: prevent immediate server breakage while parser updates are merged.

8. Improve publish logs for verification.
   - Log schema_version, metadata.capture.type, metadata.capture.age_s.

9. Validate all three capture paths end-to-end.
   - periodic capture
   - event_start trigger capture
   - event_stop trigger capture

10. Lock the client contract.
    - Save one validated JSON sample per capture type.
    - Use those samples in server parser tests.
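Steps 3–5 can be sketched as a single canonical builder. This is a sketch, not the simclient implementation; the function signature is an assumption, and tying `send_immediately` to `triggered` is an assumption flagged in the code:

```python
from datetime import datetime, timezone

SCHEMA_VERSION = "2.0"

def build_payload_v2(client_id, status, screenshot=None, system_info=None,
                     process_health=None, capture_type="periodic",
                     captured_at=None, age_s=None, triggered=False):
    """Single assembly point for the v2 dashboard payload (step 3),
    mapping legacy fields into the grouped blocks (step 5)."""
    return {
        "message": {"client_id": client_id, "status": status},
        "content": {"screenshot": screenshot} if screenshot else {},
        "runtime": {
            "system_info": system_info or {},
            "process_health": process_health or {},
        },
        "metadata": {
            "schema_version": SCHEMA_VERSION,      # step 4
            "producer": "simclient",
            "published_at": datetime.now(timezone.utc).isoformat(),
            "capture": {
                "type": capture_type,              # periodic / event_start / event_stop
                "captured_at": captured_at,
                "age_s": age_s,
                "triggered": triggered,
                # assumption: triggered captures are also sent immediately
                "send_immediately": triggered,
            },
        },
    }
```

Every code path that publishes to infoscreen/<client_id>/dashboard should call this one function.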

## Step-by-Step: Server-Side Migration

1. Add support for grouped v2 parsing.
   - Parse from message/content/runtime/metadata first.

2. Add fallback parser for legacy payload (temporary).
   - If grouped keys are absent, parse old top-level keys.

3. Normalize to one internal server model.
   - Convert both parser paths into one DTO/entity used by dashboard logic.

4. Validate required fields.
   - Required:
     - message.client_id
     - message.status
     - metadata.schema_version
     - metadata.capture.type
   - Optional:
     - runtime.process_health
     - content.screenshot (if no screenshot available)

5. Update dashboard consumers.
   - Read grouped fields from internal model (not raw old keys).

6. Add migration observability.
   - Counters:
     - v2 parse success
     - legacy fallback usage
     - parse failures
   - Warning log for unknown schema_version.

7. Run mixed-format integration tests.
   - New client -> new server
   - Legacy client -> new server (fallback path)

8. Cut over to v2 preferred.
   - Keep fallback for short soak period only.

9. Remove fallback and legacy assumptions.
   - After stability window, remove old parser path.

10. Final cleanup.
    - Keep one schema doc and test fixtures.
    - Remove temporary compatibility switches.

## Legacy to v2 Field Mapping

| Legacy field | v2 field |
|---|---|
| client_id | message.client_id |
| status | message.status |
| screenshot | content.screenshot |
| screenshot_type | metadata.capture.type |
| screenshot_age_s | metadata.capture.age_s |
| timestamp | metadata.published_at |
| system_info | runtime.system_info |
| process_health | runtime.process_health |
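The mapping table translates directly into the server's temporary fallback parser. A sketch (function name assumed):

```python
def legacy_to_v2(old):
    """Lift a legacy flat payload into the grouped v2 shape using the
    field mapping above, so both parser paths yield one internal model."""
    return {
        "message": {
            "client_id": old.get("client_id"),
            "status": old.get("status"),
        },
        "content": {"screenshot": old.get("screenshot")},
        "runtime": {
            "system_info": old.get("system_info", {}),
            "process_health": old.get("process_health", {}),
        },
        "metadata": {
            "schema_version": "2.0",
            "published_at": old.get("timestamp"),
            "capture": {
                "type": old.get("screenshot_type", "periodic"),
                "age_s": old.get("screenshot_age_s"),
            },
        },
    }
```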

## Acceptance Criteria

1. All capture types parse and display correctly.
   - periodic
   - event_start
   - event_stop

2. Screenshot payload integrity is unchanged.
   - filename, data, timestamp, size remain valid.

3. Metadata is centrally visible at message end.
   - schema_version, capture metadata, transport metadata all inside metadata.

4. No regression in dashboard update timing.
   - Triggered screenshots still publish quickly.

## Suggested Timeline (Dev Only)

1. Day 1: client v2 payload implementation + local tests
2. Day 2: server v2 parser + fallback
3. Day 3-5: soak in dev, monitor parse metrics
4. Day 6+: remove fallback and finalize v2-only
533 PHASE_3_CLIENT_MONITORING_IMPLEMENTATION.md (new file)
@@ -0,0 +1,533 @@
# Phase 3: Client-Side Monitoring Implementation

**Status**: ✅ COMPLETE
**Date**: 11 March 2026
**Architecture**: Two-process design with health-state bridge

---

## Overview

This document describes the **Phase 3** client-side monitoring implementation, integrated into the existing infoscreen-dev codebase. The implementation adds:

1. ✅ **Health-state tracking** for all display processes (Impressive, Chromium, VLC)
2. ✅ **Tiered logging**: local rotating logs plus selective MQTT transmission
3. ✅ **Process crash detection** with bounded restart attempts
4. ✅ **MQTT health/log topics** feeding the monitoring server
5. ✅ **Impressive-aware process mapping** (presentations → impressive, websites → chromium, videos → vlc)

---

## Architecture

### Two-Process Design

```
┌─────────────────────────────────────────────────────────┐
│ simclient.py (MQTT Client)                              │
│ - Discovers device, sends heartbeat                     │
│ - Downloads presentation files                          │
│ - Reads health state from display_manager               │
│ - Publishes health/log messages to MQTT                 │
│ - Sends screenshots for dashboard                       │
└────────┬────────────────────────────────────┬───────────┘
         │                                    │
         │ reads: current_process_health.json │
         │                                    │
         │ writes: current_event.json         │
         │                                    │
┌────────▼────────────────────────────────────▼───────────┐
│ display_manager.py (Display Control)                    │
│ - Monitors events and manages displays                  │
│ - Launches Impressive (presentations)                   │
│ - Launches Chromium (websites)                          │
│ - Launches VLC (videos)                                 │
│ - Tracks process health and crashes                     │
│ - Detects and restarts crashed processes                │
│ - Writes health state to JSON bridge                    │
│ - Captures screenshots to shared folder                 │
└─────────────────────────────────────────────────────────┘
```

---

## Implementation Details

### 1. Health State Tracking (display_manager.py)

**File**: `src/display_manager.py`
**New Class**: `ProcessHealthState`

Tracks process health and persists it to JSON for simclient to read:

```python
class ProcessHealthState:
    """Track and persist process health state for monitoring integration"""
    # event_id:      currently active event identifier
    # event_type:    presentation, website, video, or None
    # process_name:  impressive, chromium-browser, vlc, or None
    # process_pid:   process ID, or None for libvlc
    # status:        running, crashed, starting, stopped
    # restart_count: number of restart attempts
    # max_restarts:  maximum allowed restarts (3)
```

Methods:
- `update_running()` - Mark process as started (logs to monitoring.log)
- `update_crashed()` - Mark process as crashed (warning to monitoring.log)
- `update_restart_attempt()` - Increment restart counter (logs attempt and checks max)
- `update_stopped()` - Mark process as stopped (info to monitoring.log)
- `save()` - Persist state to `src/current_process_health.json`

**New Health State File**: `src/current_process_health.json`

```json
{
  "event_id": "event_123",
  "event_type": "presentation",
  "current_process": "impressive",
  "process_pid": 1234,
  "process_status": "running",
  "restart_count": 0,
  "timestamp": "2026-03-11T10:30:45.123456+00:00"
}
```

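A minimal sketch of such a bridge class follows. This is not the actual implementation; in particular, the atomic write (temp file + rename) is an assumption added so simclient can never read a half-written JSON file:

```python
import json
import os
import tempfile
from datetime import datetime, timezone

class ProcessHealthState:
    """Sketch: track display-process health and persist it as a JSON bridge file."""

    def __init__(self, path="current_process_health.json", max_restarts=3):
        self.path = path
        self.max_restarts = max_restarts
        self.state = {"event_id": None, "event_type": None, "current_process": None,
                      "process_pid": None, "process_status": "stopped", "restart_count": 0}

    def update_running(self, event_id, event_type, process_name, pid):
        self.state.update(event_id=event_id, event_type=event_type,
                          current_process=process_name, process_pid=pid,
                          process_status="running", restart_count=0)
        self.save()

    def update_crashed(self):
        self.state["process_status"] = "crashed"
        self.save()

    def update_restart_attempt(self):
        # Returns True while restarts are still allowed (max 3 attempts).
        self.state["restart_count"] += 1
        self.save()
        return self.state["restart_count"] <= self.max_restarts

    def update_stopped(self):
        self.state.update(process_status="stopped", current_process=None, process_pid=None)
        self.save()

    def save(self):
        # Write to a temp file and rename, so readers never see partial JSON.
        self.state["timestamp"] = datetime.now(timezone.utc).isoformat()
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(self.state, f)
        os.replace(tmp, self.path)
```

simclient only ever needs to `json.load()` the file, which keeps the two processes fully decoupled.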
### 2. Monitoring Logger (both files)

**Local Rotating Logs**: 5 files × 5 MB each = 25 MB max per device

**display_manager.py**:
```python
import logging
from logging.handlers import RotatingFileHandler

MONITORING_LOG_PATH = "logs/monitoring.log"
monitoring_logger = logging.getLogger("monitoring")
monitoring_handler = RotatingFileHandler(MONITORING_LOG_PATH, maxBytes=5*1024*1024, backupCount=5)
monitoring_logger.addHandler(monitoring_handler)
```

**simclient.py**:
- Shares the same `logs/monitoring.log` file
- Both processes write health events through the monitoring logger
- Local logs stay on the device (rotated by size, kept for technician inspection)

**Log Filtering** (tiered strategy):
- **ERROR**: Local + MQTT (published to `infoscreen/{uuid}/logs/error`)
- **WARN**: Local + MQTT (published to `infoscreen/{uuid}/logs/warn`)
- **INFO**: Local only (unless `DEBUG_MODE=1`)
- **DEBUG**: Local only (always)

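The tiered strategy above can be expressed as a standard `logging.Handler` that forwards only the qualifying records to MQTT. This is a sketch, not the project's code; the handler class name and the injected `publish(topic_suffix, message)` callback are illustrative:

```python
import logging
import os

class MqttLogHandler(logging.Handler):
    """Sketch: forward WARNING/ERROR records to MQTT; everything else stays local."""

    def __init__(self, publish):
        # publish(topic_suffix, message) is supplied by the MQTT client wiring.
        super().__init__()
        self.publish = publish
        self.debug_mode = os.environ.get("DEBUG_MODE") == "1"

    def emit(self, record):
        if record.levelno >= logging.ERROR:
            self.publish("logs/error", record.getMessage())
        elif record.levelno >= logging.WARNING:
            self.publish("logs/warn", record.getMessage())
        elif record.levelno >= logging.INFO and self.debug_mode:
            # INFO is only forwarded in development (DEBUG_MODE=1).
            self.publish("logs/info", record.getMessage())
        # DEBUG records are never transmitted; they stay in the local files.
```

Attaching this handler alongside the `RotatingFileHandler` gives the local-plus-selective-MQTT behavior without touching call sites.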
### 3. Process Mapping with Impressive Support

**display_manager.py** - When starting processes:

| Event Type | Process Name | Health Status |
|-----------|--------------|---------------|
| presentation | `impressive` | tracked with PID |
| website/webpage/webuntis | `chromium` or `chromium-browser` | tracked with PID |
| video | `vlc` | tracked (may have no PID if using libvlc) |

**Per-Process Updates**:
- Presentation: `health.update_running('event_id', 'presentation', 'impressive', pid)`
- Website: `health.update_running('event_id', 'website', browser_name, pid)`
- Video: `health.update_running('event_id', 'video', 'vlc', pid or None)`

### 4. Crash Detection and Restart Logic

**display_manager.py** - `process_events()` method:

```
If process not running AND same event_id:
├─ Check exit code
├─ If presentation with exit code 0: Normal completion (no restart)
├─ Else: Mark crashed
│  ├─ health.update_crashed()
│  └─ health.update_restart_attempt()
│     ├─ If restart_count > max_restarts: Give up
│     └─ Else: Restart display (loop back to start_display_for_event)
└─ Log to monitoring.log at each step
```

**Restart Logic**:
- Max 3 restart attempts per event
- Restarts only if the same event is still active
- Graceful exit (code 0) for Impressive auto-quit presentations is treated as normal
- All crashes are logged to monitoring.log with context

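The decision tree above can be sketched as a pure function; the name `decide_restart` and the string return values are illustrative, not the actual `process_events()` code:

```python
def decide_restart(event_type, exit_code, restart_count, max_restarts=3):
    """Return the next action for a tracked process that is no longer running."""
    if event_type == "presentation" and exit_code == 0:
        return "completed"   # Impressive auto-quit: normal completion, no restart
    if restart_count + 1 > max_restarts:
        return "give_up"     # crashed too often for this event, stop trying
    return "restart"         # mark crashed and attempt another restart
```

Keeping the decision pure makes the bounded-restart behavior easy to unit-test independently of process management.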
### 5. MQTT Health and Log Topics

**simclient.py** - New functions:

**`read_health_state()`**
- Reads `src/current_process_health.json` written by display_manager
- Returns a dict, or None if no process is active

**`publish_health_message(client, client_id)`**
- Topic: `infoscreen/{uuid}/health`
- QoS: 1 (reliable)
- Payload:
```json
{
  "timestamp": "2026-03-11T10:30:45.123456+00:00",
  "expected_state": {
    "event_id": "event_123"
  },
  "actual_state": {
    "process": "impressive",
    "pid": 1234,
    "status": "running"
  }
}
```

**`publish_log_message(client, client_id, level, message, context)`**
- Topics: `infoscreen/{uuid}/logs/error` or `infoscreen/{uuid}/logs/warn`
- QoS: 1 (reliable)
- Log level filtering (only ERROR/WARN sent unless `DEBUG_MODE=1`)
- Payload:
```json
{
  "timestamp": "2026-03-11T10:30:45.123456+00:00",
  "message": "Process started: event_id=123 event_type=presentation process=impressive pid=1234",
  "context": {
    "event_id": "event_123",
    "process": "impressive",
    "event_type": "presentation"
  }
}
```

**Enhanced Dashboard Heartbeat**:
- Topic: `infoscreen/{uuid}/dashboard`
- Now includes a `process_health` block with event_id, process name, status, and restart count

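A minimal sketch of the health publisher follows. The real simclient signature may differ; here the MQTT client is assumed to be any object with a paho-style `publish(topic, payload, qos)` method, and the payload shape follows the JSON example above:

```python
import json
from datetime import datetime, timezone

def build_health_payload(expected_event_id, health_state):
    """Assemble the health payload published to infoscreen/{uuid}/health."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "expected_state": {"event_id": expected_event_id},
        "actual_state": {
            "process": health_state.get("current_process"),
            "pid": health_state.get("process_pid"),
            "status": health_state.get("process_status"),
        },
    }

def publish_health_message(client, client_id, expected_event_id, health_state):
    # QoS 1 so the monitoring server reliably receives each update.
    payload = build_health_payload(expected_event_id, health_state)
    client.publish(f"infoscreen/{client_id}/health", json.dumps(payload), qos=1)
```

Separating payload construction from publishing keeps the message shape testable without a broker.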
### 6. Integration Points

**Existing Features Preserved**:
- ✅ Impressive PDF presentations with auto-advance and loop
- ✅ Chromium website display with auto-scroll injection
- ✅ VLC video playback (python-vlc preferred, binary fallback)
- ✅ Screenshot capture and transmission
- ✅ HDMI-CEC TV control
- ✅ Two-process architecture

**New Integration Points**:

| File | Function | Change |
|------|----------|--------|
| display_manager.py | `__init__()` | Initialize `ProcessHealthState()` |
| display_manager.py | `start_presentation()` | Call `health.update_running()` with impressive |
| display_manager.py | `start_video()` | Call `health.update_running()` with vlc |
| display_manager.py | `start_webpage()` | Call `health.update_running()` with chromium |
| display_manager.py | `process_events()` | Detect crashes, call `health.update_crashed()` and `update_restart_attempt()` |
| display_manager.py | `stop_current_display()` | Call `health.update_stopped()` |
| simclient.py | `screenshot_service_thread()` | (No changes to interval) |
| simclient.py | Main heartbeat loop | Call `publish_health_message()` after successful heartbeat |
| simclient.py | `send_screenshot_heartbeat()` | Read health state and include it in the dashboard payload |

---

## Logging Hierarchy

### Local Rotating Files (5 × 5 MB)

**`logs/display_manager.log`** (existing - updated):
- Display event processing
- Process lifecycle (start/stop)
- HDMI-CEC operations
- Presentation status
- Video/website startup

**`logs/simclient.log`** (existing - updated):
- MQTT connection/reconnection
- Discovery and heartbeat
- File downloads
- Group membership changes
- Dashboard payload info

**`logs/monitoring.log`** (NEW):
- Process health events (start, crash, restart, stop)
- Both display_manager and simclient write here
- Centralized health tracking
- Technician-focused: "What happened to the processes?"

```
# Example monitoring.log entries:
2026-03-11 10:30:45 [INFO] Process started: event_id=event_123 event_type=presentation process=impressive pid=1234
2026-03-11 10:35:20 [WARNING] Process crashed: event_id=event_123 event_type=presentation process=impressive restart_count=0/3
2026-03-11 10:35:20 [WARNING] Restarting process: attempt 1/3 for impressive
2026-03-11 10:35:25 [INFO] Process started: event_id=event_123 event_type=presentation process=impressive pid=1245
```

### MQTT Transmission (Selective)

**Always sent** (whenever the condition occurs):
- `infoscreen/{uuid}/logs/error` - Critical failures
- `infoscreen/{uuid}/logs/warn` - Restarts, crashes, missing binaries

**Development mode only** (if `DEBUG_MODE=1`):
- `infoscreen/{uuid}/logs/info` - Event start/stop, process running status

**Never sent**:
- DEBUG messages (local-only debug details)
- INFO messages in production

---

## Environment Variables

No new required variables. The existing configuration supports monitoring:

```bash
# Existing (unchanged):
ENV=development|production
DEBUG_MODE=0|1                      # Enables INFO logs to MQTT
LOG_LEVEL=DEBUG|INFO|WARNING|ERROR  # Local log verbosity
HEARTBEAT_INTERVAL=5|60             # seconds
SCREENSHOT_INTERVAL=30|300          # seconds (display_manager_screenshot_capture)

# Recommended for monitoring:
SCREENSHOT_CAPTURE_INTERVAL=30      # How often display_manager captures screenshots
SCREENSHOT_MAX_WIDTH=800            # Downscale for bandwidth
SCREENSHOT_JPEG_QUALITY=70          # Balance quality/size

# File server (if different from MQTT broker):
FILE_SERVER_HOST=192.168.1.100
FILE_SERVER_PORT=8000
FILE_SERVER_SCHEME=http
```

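Reading the recommended tuning variables with safe defaults can be sketched as follows; the function name and the returned keys are illustrative, only the environment variable names come from the list above:

```python
import os

def screenshot_config(env=os.environ):
    """Sketch: read the screenshot tuning variables with the documented defaults."""
    return {
        "interval_s": int(env.get("SCREENSHOT_CAPTURE_INTERVAL", "30")),
        "max_width": int(env.get("SCREENSHOT_MAX_WIDTH", "800")),
        "jpeg_quality": int(env.get("SCREENSHOT_JPEG_QUALITY", "70")),
    }
```

Passing `env` explicitly makes the parsing testable without mutating the real process environment.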
---

## Testing Validation

### System-Level Test Sequence

**1. Start Services**:
```bash
# Terminal 1: Display Manager
./scripts/start-display-manager.sh

# Terminal 2: MQTT Client
./scripts/start-dev.sh

# Terminal 3: Monitor logs
tail -f logs/monitoring.log
```

**2. Trigger Each Event Type**:
```bash
# Via test menu or MQTT publish:
./scripts/test-display-manager.sh   # Options 1-3 trigger events
```

**3. Verify Health State File**:
```bash
# Check that the health state gets written immediately
cat src/current_process_health.json
# Should show: event_id, event_type, current_process (impressive/chromium/vlc), process_status=running
```

**4. Check MQTT Topics**:
```bash
# Monitor health messages:
mosquitto_sub -h localhost -t "infoscreen/+/health" -v

# Monitor log messages:
mosquitto_sub -h localhost -t "infoscreen/+/logs/#" -v

# Monitor dashboard heartbeat:
mosquitto_sub -h localhost -t "infoscreen/+/dashboard" -v | head -c 500 && echo "..."
```

**5. Simulate Process Crash**:
```bash
# Find the impressive/chromium/vlc PID:
ps aux | grep -E 'impressive|chromium|vlc'

# Kill the process:
kill -9 <pid>

# Watch monitoring.log for crash detection and restart
tail -f logs/monitoring.log
# Should see: [WARNING] Process crashed... [WARNING] Restarting process...
```

**6. Verify Server Integration**:
```bash
# Server receives health messages:
sqlite3 infoscreen.db "SELECT process_status, current_process, restart_count FROM clients WHERE uuid='...';"
# Should show the latest status from the health message

# Server receives logs:
sqlite3 infoscreen.db "SELECT level, message FROM client_logs WHERE client_uuid='...' ORDER BY timestamp DESC LIMIT 10;"
# Should show ERROR/WARN entries from crashes/restarts
```

---

## Troubleshooting

### Health State File Not Created

**Symptom**: `src/current_process_health.json` missing

**Causes**:
- No event active (the file is only created when a display starts)
- display_manager not running

**Check**:
```bash
ps aux | grep display_manager
tail -f logs/display_manager.log | grep "Process started\|Process stopped"
```

### MQTT Health Messages Not Arriving

**Symptom**: No health messages on the `infoscreen/{uuid}/health` topic

**Causes**:
- simclient not reading the health state file
- MQTT connection dropped
- Health update function not called

**Check**:
```bash
# Check that the health file exists and is recent:
ls -l src/current_process_health.json
stat src/current_process_health.json | grep Modify

# Monitor simclient logs:
tail -f logs/simclient.log | grep -E "Health|heartbeat|publish"

# Verify MQTT connection:
mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
```

### Restart Loop (Process Keeps Crashing)

**Symptom**: monitoring.log shows repeated crashes and restarts

**Check**:
```bash
# Read the last log lines of the process (stored by display_manager):
tail -f logs/impressive.out.log     # for presentations
tail -f logs/browser.out.log        # for websites
tail -f logs/video_player.out.log   # for videos
```

**Common Causes**:
- Missing binary (impressive not installed, chromium not found, vlc not available)
- Corrupt presentation file
- Invalid URL for a website
- Insufficient permissions for screenshots

### Log Messages Not Reaching Server

**Symptom**: the `client_logs` table in the server DB is empty

**Causes**:
- Log level filtering: INFO messages in production are local-only
- Logs are only published at ERROR/WARN level
- MQTT publish failing silently

**Check**:
```bash
# Force DEBUG_MODE to see all logs:
export DEBUG_MODE=1
export LOG_LEVEL=DEBUG
# Restart simclient and trigger an event

# Monitor local logs first:
tail -f logs/monitoring.log | grep -i error
```

---

## Performance Considerations

**Bandwidth per Client**:
- Health message: ~200 bytes per heartbeat interval (every 5-60 s)
- Screenshot heartbeat: ~50-100 KB (every 30-300 s)
- Log messages: ~100-500 bytes per crash/error (rare)
- **Total**: ~0.5-2 MB/day per device (very minimal)

**Disk Space on Client**:
- Monitoring logs: 5 files × 5 MB = 25 MB max
- Display manager logs: 5 files × 2 MB = 10 MB max
- MQTT client logs: 5 files × 2 MB = 10 MB max
- Screenshots: 20 files × 50-100 KB = 1-2 MB max
- **Total**: ~50 MB max (fine for a typical Raspberry Pi USB/SSD)

**Rotation Strategy**:
- Old files are automatically deleted when the size limit is reached
- A technician can SSH in and `tail -f` at any time
- No database overhead (file-based rotation is minimal CPU)

---

## Integration with Server (Phase 2)

The client implementation sends data to the server's Phase 2 endpoints.

**Expected Server Implementation** (from CLIENT_MONITORING_SETUP.md):

1. **MQTT Listener** receives and stores:
   - `infoscreen/{uuid}/logs/error`, `/logs/warn`, `/logs/info`
   - `infoscreen/{uuid}/health` messages
   - Updates the `clients` table with health fields

2. **Database Tables**:
   - `clients.process_status`: running/crashed/starting/stopped
   - `clients.current_process`: impressive/chromium/vlc/None
   - `clients.process_pid`: PID value
   - `clients.current_event_id`: Active event
   - `client_logs`: stores log entries with level/message/context

3. **API Endpoints**:
   - `GET /api/client-logs/{uuid}/logs?level=ERROR&limit=50`
   - `GET /api/client-logs/summary` (errors/warnings across all clients)

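The per-client logs endpoint above takes its filters as query parameters; building that URL can be sketched as a tiny helper (the helper itself is illustrative, only the route and parameter names come from the endpoint list):

```python
from urllib.parse import urlencode

def client_logs_url(base, uuid, level=None, limit=50):
    """Sketch: build the filtered per-client logs query URL."""
    query = {"limit": limit}
    if level:
        query["level"] = level
    return f"{base}/api/client-logs/{uuid}/logs?{urlencode(query)}"
```

Using `urlencode` keeps the query string correct even if filter values later contain characters that need escaping.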
---

## Summary of Changes

### Files Modified

1. **`src/display_manager.py`**:
   - Added `psutil` import for future process monitoring
   - Added `ProcessHealthState` class (60 lines)
   - Added monitoring logger setup (8 lines)
   - Added `health.update_running()` calls in `start_presentation()`, `start_video()`, `start_webpage()`
   - Added crash detection and restart logic in `process_events()`
   - Added `health.update_stopped()` in `stop_current_display()`

2. **`src/simclient.py`**:
   - Added `timezone` import
   - Added monitoring logger setup (8 lines)
   - Added `read_health_state()` function
   - Added `publish_health_message()` function
   - Added `publish_log_message()` function (with level filtering)
   - Updated `send_screenshot_heartbeat()` to include health data
   - Updated the heartbeat loop to call `publish_health_message()`

### Files Created

1. **`src/current_process_health.json`** (at runtime):
   - Bridge file between display_manager and simclient
   - Shared-volume compatible (works in the container setup)

2. **`logs/monitoring.log`** (at runtime):
   - New rotating log file (5 × 5 MB)
   - Health events from both processes

---

## Next Steps

1. **Deploy to a test client** and run the validation sequence above
2. **Deploy server Phase 2** (if not yet done) to receive health/log messages
3. **Verify database updates** in the server-side `clients` and `client_logs` tables
4. **Test the dashboard UI** (Phase 4) to display health indicators
5. **Configure alerting** (email/Slack) for ERROR-level messages

---

**Implementation Date**: 11 March 2026
**Part of**: Infoscreen 2025 Client Monitoring System
**Status**: Production Ready (with server Phase 2 integration)

35 README.md

```diff
@@ -39,6 +39,7 @@ A comprehensive multi-service digital signage solution for educational instituti
 Data flow summary:
 - Listener: consumes discovery and heartbeat messages from the MQTT Broker and updates the API Server (client registration/heartbeats).
+- Listener screenshot flow: consumes `infoscreen/{uuid}/screenshot` and `infoscreen/{uuid}/dashboard`. Dashboard messages use grouped v2 schema (`message`, `content`, `runtime`, `metadata`); screenshot data is read from `content.screenshot`, capture type from `metadata.capture.type`, and forwarded to `POST /api/clients/{uuid}/screenshot`.
 - Scheduler: reads events from the API Server and publishes only currently active content to the MQTT Broker (retained topics per group). When a group has no active events, the scheduler clears its retained topic by publishing an empty list. All time comparisons are done in UTC; any naive timestamps are normalized.
 - Clients: send discovery/heartbeat via the MQTT Broker (handled by the Listener) and receive content from the Scheduler via MQTT.
 - Worker: receives conversion commands directly from the API Server and reports results/status back to the API (no MQTT involved).
```
```diff
@@ -225,17 +226,15 @@ For detailed deployment instructions, see:
 ## Recent changes since last commit
 
-- Video / Streaming support: Added end-to-end support for video events. The API and dashboard now allow creating `video` events referencing uploaded media. The server exposes a range-capable streaming endpoint at `/api/eventmedia/stream/<media_id>/<filename>` so clients can seek during playback.
-- Scheduler metadata: Scheduler now performs a best-effort HEAD probe for video stream URLs and includes basic metadata in the retained MQTT payload: `mime_type`, `size` (bytes) and `accept_ranges` (bool). Placeholders for richer metadata (`duration`, `resolution`, `bitrate`, `qualities`, `thumbnails`, `checksum`) are emitted as null/empty until a background worker fills them.
-- Dashboard & uploads: The dashboard's FileManager upload limits were increased (to support Full-HD uploads) and client-side validation enforces a maximum video length (10 minutes). The event modal exposes playback flags (`autoplay`, `loop`, `volume`, `muted`) and initializes them from system defaults for new events.
-- DB model & API: `Event` includes `muted` in addition to `autoplay`, `loop`, and `volume`; endpoints accept, persist, and return these fields for video events. Events reference uploaded media via `event_media_id`.
-- Settings UI: Settings page refactored to nested tabs; added Events → Videos defaults (autoplay, loop, volume, mute) backed by system settings keys (`video_autoplay`, `video_loop`, `video_volume`, `video_muted`).
-- Academic Calendar UI: Merged "School Holidays Import" and "List" into a single "📥 Import & Liste" tab; nested tab selection is persisted with controlled `selectedItem` state to avoid jumps.
+- Monitoring system: End-to-end monitoring is now implemented. The listener ingests `logs/*` and `health` MQTT topics, the API exposes monitoring endpoints (`/api/client-logs/monitoring-overview`, `/api/client-logs/recent-errors`, `/api/client-logs/<uuid>/logs`), and the superadmin dashboard page shows live client status, screenshots, and recent errors.
+- Screenshot priority flow: Screenshot payloads now support `screenshot_type` (`periodic`, `event_start`, `event_stop`). `event_start` and `event_stop` are treated as high-priority screenshots; the API stores typed screenshots, maintains priority metadata, and serves active priority screenshots through `/screenshots/{uuid}/priority`.
+- MQTT dashboard payload v2 cutover: Listener parsing is now v2-only for dashboard JSON payloads (`message/content/runtime/metadata`). Legacy top-level dashboard fallback has been removed after migration completion; parser observability tracks `v2_success` and `parse_failures`.
+- Presentation persistence fix: Fixed persistence of presentation flags so `page_progress` and `auto_progress` are reliably stored and returned for create/update flows and detached occurrences.
+- Additional improvements: Video/streaming, scheduler metadata, settings defaults, and UI refinements remain documented in the detailed sections below.
 
 These changes are designed to be safe if metadata extraction or probes fail — clients should still attempt playback using the provided `url` and fall back to requesting/resolving richer metadata when available.
 
 See `MQTT_EVENT_PAYLOAD_GUIDE.md` for details.
-- `infoscreen/{uuid}/group_id` - Client group assignment
 
 ## 🧩 Developer Environment Notes (Dev Container)
 - Extensions: UI-only `Dev Containers` runs on the host UI; not installed inside the container to avoid reinstallation loops. See `/.devcontainer/devcontainer.json` (`remote.extensionKind`).
```
```diff
@@ -345,8 +344,9 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
 - `POST /api/conversions/{media_id}/pdf` - Request conversion
 - `GET /api/conversions/{media_id}/status` - Check conversion status
 - `GET /api/eventmedia/stream/<media_id>/<filename>` - Stream media with byte-range support (206) for seeking
-- `POST /api/clients/{uuid}/screenshot` - Upload screenshot for client (base64 JPEG)
-- **Screenshot retention:** Only the latest and last 20 timestamped screenshots per client are stored on the server. Older screenshots are automatically deleted.
+- `POST /api/clients/{uuid}/screenshot` - Upload screenshot for client (base64 JPEG, optional `timestamp`, optional `screenshot_type` = `periodic|event_start|event_stop`)
+- **Screenshot retention:** The API stores `{uuid}.jpg` as latest plus the last 20 timestamped screenshots per client; older timestamped files are deleted automatically.
+- **Priority screenshots:** For `event_start`/`event_stop`, the API also keeps `{uuid}_priority.jpg` and metadata (`{uuid}_meta.json`) used by monitoring priority selection.
 
 ### System Settings
 - `GET /api/system-settings` - List all system settings (admin+)
```
```diff
@@ -380,7 +380,11 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
 ### Health & Monitoring
 - `GET /health` - Service health check
-- `GET /api/screenshots/{uuid}.jpg` - Client screenshots
+- `GET /screenshots/{uuid}.jpg` - Latest client screenshot
+- `GET /screenshots/{uuid}/priority` - Active high-priority screenshot (falls back to latest)
+- `GET /api/client-logs/monitoring-overview` - Aggregated monitoring overview for dashboard (superadmin)
+- `GET /api/client-logs/recent-errors` - Recent error feed across clients (admin+)
+- `GET /api/client-logs/{uuid}/logs` - Filtered per-client logs (admin+)
 
 ## 🎨 Frontend Features
```
@@ -444,6 +448,11 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
 - Real-time event status: shows currently running events with type, title, and time window
 - Filters out unassigned groups for focused view
 - Resource-based Syncfusion timeline scheduler with resize and drag-drop support
+- **Monitoring**: Superadmin-only monitoring dashboard
+  - Live client health states (`healthy`, `warning`, `critical`, `offline`) from heartbeat/process/log data
+  - Latest screenshot preview with screenshot-type badges (`periodic`, `event_start`, `event_stop`) and process metadata per client
+  - Active priority screenshots are surfaced immediately and polled faster while priority items are active
+  - System-wide recent error stream and per-client log drill-down
 - **Program info**: Version, build info, tech stack and paginated changelog (reads `dashboard/public/program-info.json`)
 
 ## 🔒 Security & Authentication

@@ -474,7 +483,8 @@ mosquitto_sub -h localhost -t "infoscreen/+/heartbeat" -v
 - MQTT: Pub/sub functionality test
 - Dashboard: Nginx availability
 - **Scheduler**: Logging is concise; conversion lookups are cached and logged only once per media.
-- Dashboard: Nginx availability
+- Monitoring API: `/api/client-logs/monitoring-overview` and `/api/client-logs/recent-errors` for live diagnostics
+- Monitoring overview includes screenshot priority state (`latestScreenshotType`, `priorityScreenshotType`, `priorityScreenshotReceivedAt`, `hasActivePriorityScreenshot`) and `summary.activePriorityScreenshots`
 ### Logging Strategy
 - **Development**: Docker Compose logs with service prefixes

@@ -549,7 +559,6 @@ docker exec -it infoscreen-db mysqladmin ping
 # Restart dependent services
 ```
 
-**MQTT communication issues**
 **Vite import-analysis errors (Syncfusion splitbuttons)**
 ```bash
 # Symptom

@@ -565,6 +574,8 @@ docker compose rm -sf dashboard
 docker volume rm <project>_dashboard-node-modules <project>_dashboard-vite-cache || true
 docker compose up -d --build dashboard
 ```
 
+**MQTT communication issues**
+
 ```bash
 # Test MQTT broker
 mosquitto_pub -h localhost -t test -m "hello"
@@ -56,6 +56,58 @@ Notes for integrators:
 - CSS follows modern Material 3 color-function notation (`rgb(r g b / alpha%)`)
 - Syncfusion ScheduleComponent requires TimelineViews, Resize, and DragAndDrop modules injected
 
+Backend technical work (post-release notes; no version bump):
+
+- 📊 **Client Monitoring Infrastructure (Server-Side) (2026-03-10)**:
+  - Database schema: New Alembic migration `c1d2e3f4g5h6_add_client_monitoring.py` (idempotent) adds:
+    - `client_logs` table: Stores centralized logs with columns (id, client_uuid, timestamp, level, message, context, created_at)
+    - Foreign key: `client_logs.client_uuid` → `clients.uuid` (ON DELETE CASCADE)
+    - Health monitoring columns added to `clients` table: `current_event_id`, `current_process`, `process_status`, `process_pid`, `last_screenshot_analyzed`, `screen_health_status`, `last_screenshot_hash`
+    - Indexes for performance: (client_uuid, timestamp DESC), (level, timestamp DESC), (created_at DESC)
+  - Data models (`models/models.py`):
+    - New enums: `LogLevel` (ERROR, WARN, INFO, DEBUG), `ProcessStatus` (running, crashed, starting, stopped), `ScreenHealthStatus` (OK, BLACK, FROZEN, UNKNOWN)
+    - New model: `ClientLog` with foreign key to `Client` (CASCADE on delete)
+    - Extended `Client` model with 7 health monitoring fields
+  - MQTT listener extensions (`listener/listener.py`):
+    - New topic subscriptions: `infoscreen/+/logs/error`, `infoscreen/+/logs/warn`, `infoscreen/+/logs/info`, `infoscreen/+/health`
+    - Log handler: Parses JSON payloads, creates `ClientLog` entries, validates client UUID exists (FK constraint)
+    - Health handler: Updates client state from MQTT health messages
+    - Enhanced heartbeat handler: Captures `process_status`, `current_process`, `process_pid`, `current_event_id` from payload
+  - API endpoints (`server/routes/client_logs.py`):
+    - `GET /api/client-logs/<uuid>/logs` – Retrieve client logs with filters (level, limit, since); authenticated (admin_or_higher)
+    - `GET /api/client-logs/summary` – Get log counts by level per client for last 24h; authenticated (admin_or_higher)
+    - `GET /api/client-logs/monitoring-overview` – Aggregated monitoring overview for dashboard clients/statuses; authenticated (admin_or_higher)
+    - `GET /api/client-logs/recent-errors` – System-wide error monitoring; authenticated (admin_or_higher)
+    - `GET /api/client-logs/test` – Infrastructure validation endpoint (no auth required)
+    - Blueprint registered in `server/wsgi.py` as `client_logs_bp`
+  - Dev environment fix: Updated `docker-compose.override.yml` listener service to use `working_dir: /workspace` and direct command path for live code reload
+- 🖥️ **Monitoring Dashboard Integration (2026-03-24)**:
+  - Frontend monitoring dashboard (`dashboard/src/monitoring.tsx`) is active and wired to monitoring APIs
+  - Superadmin-only route/menu integration completed in `dashboard/src/App.tsx`
+  - Added dashboard monitoring API client (`dashboard/src/apiClientMonitoring.ts`) for overview and recent errors
+- 🐛 **Presentation Flags Persistence Fix (2026-03-24)**:
+  - Fixed persistence for presentation flags `page_progress` and `auto_progress` across create/update and detached-occurrence flows
+  - API serialization now reliably returns stored values for presentation behavior fields
+- 📡 **MQTT Protocol Extensions**:
+  - New log topics: `infoscreen/{uuid}/logs/{error|warn|info}` with JSON payload (timestamp, message, context)
+  - New health topic: `infoscreen/{uuid}/health` with metrics (expected_state, actual_state, health_metrics)
+  - Enhanced heartbeat: `infoscreen/{uuid}/heartbeat` now includes `current_process`, `process_pid`, `process_status`, `current_event_id`
+  - QoS levels: ERROR/WARN logs use QoS 1 (at least once), INFO/health use QoS 0 (fire and forget)
+- 📖 **Documentation**:
+  - New file: `CLIENT_MONITORING_SPECIFICATION.md` – Comprehensive 20-section technical spec for client-side implementation (MQTT protocol, process monitoring, auto-recovery, payload formats, testing guide)
+  - New file: `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` – 5-phase implementation guide (database, backend, client watchdog, dashboard UI, testing)
+  - Updated `.github/copilot-instructions.md`: Added MQTT topics section, client monitoring integration notes
+- ✅ **Validation**:
+  - End-to-end testing completed: MQTT message → listener → database → API confirmed working
+  - Test flow: Published message to `infoscreen/{real-uuid}/logs/error` → listener logs showed receipt → database stored entry → test API returned log data
+  - Known client UUIDs validated: 9b8d1856-ff34-4864-a726-12de072d0f77, 7f65c615-5827-4ada-9ac8-4727c2e8ee55, bdbfff95-0b2b-4265-8cc7-b0284509540a
+
+Notes for integrators:
+- Tiered logging strategy: ERROR/WARN always centralized (QoS 1), INFO dev-only (QoS 0), DEBUG local-only
+- Monitoring dashboard is implemented and consumes `/api/client-logs/monitoring-overview`, `/api/client-logs/recent-errors`, and `/api/client-logs/<uuid>/logs`
+- Foreign key constraint prevents logging for non-existent clients (data integrity enforced)
+- Migration is idempotent and can be safely rerun after interruption
+- Use `GET /api/client-logs/test` for quick infrastructure validation without authentication
 
 ## 2025.1.0-beta.1 (TBD)
 - 🔐 **User Management & Role-Based Access Control**:
   - Backend: Implemented comprehensive user management API (`server/routes/users.py`) with 6 endpoints (GET, POST, PUT, DELETE users + password reset).
dashboard/src/App.tsx

@@ -1,5 +1,5 @@
 import React, { useState } from 'react';
-import { BrowserRouter as Router, Routes, Route, Link, Outlet, useNavigate } from 'react-router-dom';
+import { BrowserRouter as Router, Routes, Route, Link, Outlet, useNavigate, Navigate } from 'react-router-dom';
 import { SidebarComponent } from '@syncfusion/ej2-react-navigations';
 import { ButtonComponent } from '@syncfusion/ej2-react-buttons';
 import { DropDownButtonComponent } from '@syncfusion/ej2-react-splitbuttons';

@@ -19,6 +19,7 @@ import {
   Settings,
   Monitor,
   MonitorDotIcon,
+  Activity,
   LogOut,
   Wrench,
   Info,

@@ -31,6 +32,7 @@ const sidebarItems = [
   { name: 'Ressourcen', path: '/ressourcen', icon: Boxes, minRole: 'editor' },
   { name: 'Raumgruppen', path: '/infoscr_groups', icon: MonitorDotIcon, minRole: 'admin' },
   { name: 'Infoscreen-Clients', path: '/clients', icon: Monitor, minRole: 'admin' },
+  { name: 'Monitor-Dashboard', path: '/monitoring', icon: Activity, minRole: 'superadmin' },
   { name: 'Erweiterungsmodus', path: '/setup', icon: Wrench, minRole: 'admin' },
   { name: 'Medien', path: '/medien', icon: Image, minRole: 'editor' },
   { name: 'Benutzer', path: '/benutzer', icon: User, minRole: 'admin' },

@@ -49,6 +51,7 @@ import Benutzer from './users';
 import Einstellungen from './settings';
 import SetupMode from './SetupMode';
 import Programminfo from './programminfo';
+import MonitoringDashboard from './monitoring';
 import Logout from './logout';
 import Login from './login';
 import { useAuth } from './useAuth';

@@ -436,7 +439,7 @@ const Layout: React.FC = () => {
               type="password"
               placeholder="Aktuelles Passwort"
               value={pwdCurrent}
-              input={(e: any) => setPwdCurrent(e.value)}
+              input={(e: { value?: string }) => setPwdCurrent(e.value ?? '')}
               disabled={pwdBusy}
             />
           </div>

@@ -446,7 +449,7 @@ const Layout: React.FC = () => {
               type="password"
               placeholder="Mindestens 6 Zeichen"
               value={pwdNew}
-              input={(e: any) => setPwdNew(e.value)}
+              input={(e: { value?: string }) => setPwdNew(e.value ?? '')}
               disabled={pwdBusy}
             />
           </div>

@@ -456,7 +459,7 @@ const Layout: React.FC = () => {
               type="password"
               placeholder="Wiederholen"
               value={pwdConfirm}
-              input={(e: any) => setPwdConfirm(e.value)}
+              input={(e: { value?: string }) => setPwdConfirm(e.value ?? '')}
               disabled={pwdBusy}
             />
           </div>

@@ -480,6 +483,14 @@ const App: React.FC = () => {
   return <>{children}</>;
 };
 
+const RequireSuperadmin: React.FC<{ children: React.ReactNode }> = ({ children }) => {
+  const { isAuthenticated, loading, user } = useAuth();
+  if (loading) return <div style={{ padding: 24 }}>Lade ...</div>;
+  if (!isAuthenticated) return <Login />;
+  if (user?.role !== 'superadmin') return <Navigate to="/" replace />;
+  return <>{children}</>;
+};
+
   return (
     <ToastProvider>
       <Routes>

@@ -499,6 +510,14 @@ const App: React.FC = () => {
           <Route path="benutzer" element={<Benutzer />} />
           <Route path="einstellungen" element={<Einstellungen />} />
           <Route path="clients" element={<Infoscreens />} />
+          <Route
+            path="monitoring"
+            element={
+              <RequireSuperadmin>
+                <MonitoringDashboard />
+              </RequireSuperadmin>
+            }
+          />
           <Route path="setup" element={<SetupMode />} />
           <Route path="programminfo" element={<Programminfo />} />
         </Route>
dashboard/src/apiClientMonitoring.ts (new file, 111 lines)

@@ -0,0 +1,111 @@
export interface MonitoringLogEntry {
  id: number;
  timestamp: string | null;
  level: 'ERROR' | 'WARN' | 'INFO' | 'DEBUG' | null;
  message: string;
  context: Record<string, unknown>;
  client_uuid?: string;
}

export interface MonitoringClient {
  uuid: string;
  hostname?: string | null;
  description?: string | null;
  ip?: string | null;
  model?: string | null;
  groupId?: number | null;
  groupName?: string | null;
  registrationTime?: string | null;
  lastAlive?: string | null;
  isAlive: boolean;
  status: 'healthy' | 'warning' | 'critical' | 'offline';
  currentEventId?: number | null;
  currentProcess?: string | null;
  processStatus?: string | null;
  processPid?: number | null;
  screenHealthStatus?: string | null;
  lastScreenshotAnalyzed?: string | null;
  lastScreenshotHash?: string | null;
  latestScreenshotType?: 'periodic' | 'event_start' | 'event_stop' | null;
  priorityScreenshotType?: 'event_start' | 'event_stop' | null;
  priorityScreenshotReceivedAt?: string | null;
  hasActivePriorityScreenshot?: boolean;
  screenshotUrl: string;
  logCounts24h: {
    error: number;
    warn: number;
    info: number;
    debug: number;
  };
  latestLog?: MonitoringLogEntry | null;
  latestError?: MonitoringLogEntry | null;
}

export interface MonitoringOverview {
  summary: {
    totalClients: number;
    onlineClients: number;
    offlineClients: number;
    healthyClients: number;
    warningClients: number;
    criticalClients: number;
    errorLogs: number;
    warnLogs: number;
    activePriorityScreenshots: number;
  };
  periodHours: number;
  gracePeriodSeconds: number;
  since: string;
  timestamp: string;
  clients: MonitoringClient[];
}

export interface ClientLogsResponse {
  client_uuid: string;
  logs: MonitoringLogEntry[];
  count: number;
  limit: number;
}

async function parseJsonResponse<T>(response: Response, fallbackMessage: string): Promise<T> {
  const data = await response.json();
  if (!response.ok) {
    throw new Error(data.error || fallbackMessage);
  }
  return data as T;
}

export async function fetchMonitoringOverview(hours = 24): Promise<MonitoringOverview> {
  const response = await fetch(`/api/client-logs/monitoring-overview?hours=${hours}`, {
    credentials: 'include',
  });
  return parseJsonResponse<MonitoringOverview>(response, 'Fehler beim Laden der Monitoring-Übersicht');
}

export async function fetchRecentClientErrors(limit = 20): Promise<MonitoringLogEntry[]> {
  const response = await fetch(`/api/client-logs/recent-errors?limit=${limit}`, {
    credentials: 'include',
  });
  const data = await parseJsonResponse<{ errors: MonitoringLogEntry[] }>(
    response,
    'Fehler beim Laden der letzten Fehler'
  );
  return data.errors;
}

export async function fetchClientMonitoringLogs(
  uuid: string,
  options: { level?: string; limit?: number } = {}
): Promise<MonitoringLogEntry[]> {
  const params = new URLSearchParams();
  if (options.level && options.level !== 'ALL') {
    params.set('level', options.level);
  }
  params.set('limit', String(options.limit ?? 100));

  const response = await fetch(`/api/client-logs/${uuid}/logs?${params.toString()}`, {
    credentials: 'include',
  });
  const data = await parseJsonResponse<ClientLogsResponse>(response, 'Fehler beim Laden der Client-Logs');
  return data.logs;
}
@@ -523,28 +523,10 @@ const Appointments: React.FC = () => {
   }, [holidays, allowScheduleOnHolidays]);
 
   const dataSource = useMemo(() => {
-    // Filter: Events with SkipHolidays=true (from internal Event type) are never shown on holidays
-    const filteredEvents = events.filter(ev => {
-      if (ev.SkipHolidays) {
-        // If event falls within a holiday, hide it
-        const s = ev.StartTime instanceof Date ? ev.StartTime : new Date(ev.StartTime);
-        const e = ev.EndTime instanceof Date ? ev.EndTime : new Date(ev.EndTime);
-        for (const h of holidays) {
-          const hs = new Date(h.start_date + 'T00:00:00');
-          const he = new Date(h.end_date + 'T23:59:59');
-          if (
-            (s >= hs && s <= he) ||
-            (e >= hs && e <= he) ||
-            (s <= hs && e >= he)
-          ) {
-            return false;
-          }
-        }
-      }
-      return true;
-    });
-    return [...filteredEvents, ...holidayDisplayEvents, ...holidayBlockEvents];
-  }, [events, holidayDisplayEvents, holidayBlockEvents, holidays]);
+    // Existing events should always be visible; holiday skipping for recurring events
+    // is handled via RecurrenceException from the backend.
+    return [...events, ...holidayDisplayEvents, ...holidayBlockEvents];
+  }, [events, holidayDisplayEvents, holidayBlockEvents]);
 
   // Removed dataSource logging

@@ -1227,37 +1209,6 @@ const Appointments: React.FC = () => {
       }
     }}
     eventRendered={(args: EventRenderedArgs) => {
-      // Always hide events that skip holidays when they fall on holidays, regardless of toggle
-      if (args.data) {
-        const ev = args.data as unknown as Partial<Event>;
-        if (ev.SkipHolidays && !args.data.isHoliday) {
-          const s =
-            args.data.StartTime instanceof Date
-              ? args.data.StartTime
-              : new Date(args.data.StartTime);
-          const e =
-            args.data.EndTime instanceof Date ? args.data.EndTime : new Date(args.data.EndTime);
-          if (isWithinHolidayRange(s, e)) {
-            args.cancel = true;
-            return;
-          }
-        }
-      }
-
-      // Blende Nicht-Ferien-Events aus, falls sie in Ferien fallen und Terminieren nicht erlaubt ist
-      // Hide events on holidays if not allowed
-      if (!allowScheduleOnHolidays && args.data && !args.data.isHoliday) {
-        const s =
-          args.data.StartTime instanceof Date
-            ? args.data.StartTime
-            : new Date(args.data.StartTime);
-        const e =
-          args.data.EndTime instanceof Date ? args.data.EndTime : new Date(args.data.EndTime);
-        if (isWithinHolidayRange(s, e)) {
-          args.cancel = true;
-          return;
-        }
-      }
-
       if (selectedGroupId && args.data && args.data.Id) {
         const groupColor = getGroupColor(selectedGroupId, groups);
dashboard/src/monitoring.css (new file, 373 lines)

@@ -0,0 +1,373 @@
.monitoring-page {
  display: flex;
  flex-direction: column;
  gap: 1.25rem;
  padding: 0.5rem 0.25rem 1rem;
}

.monitoring-header-row {
  display: flex;
  justify-content: space-between;
  align-items: flex-start;
  gap: 1rem;
  flex-wrap: wrap;
}

.monitoring-title {
  margin: 0;
  font-size: 1.75rem;
  font-weight: 700;
  color: #5c4318;
}

.monitoring-subtitle {
  margin: 0.35rem 0 0;
  color: #6b7280;
  max-width: 60ch;
}

.monitoring-toolbar {
  display: flex;
  align-items: end;
  gap: 0.75rem;
  flex-wrap: wrap;
}

.monitoring-toolbar-field {
  display: flex;
  flex-direction: column;
  gap: 0.35rem;
  min-width: 190px;
}

.monitoring-toolbar-field-compact {
  min-width: 160px;
}

.monitoring-toolbar-field label {
  font-size: 0.875rem;
  font-weight: 600;
  color: #5b4b32;
}

.monitoring-meta-row {
  display: flex;
  gap: 1rem;
  flex-wrap: wrap;
  color: #6b7280;
  font-size: 0.92rem;
}

.monitoring-summary-grid {
  display: grid;
  grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
  gap: 1rem;
}

.monitoring-metric-card {
  overflow: hidden;
}

.monitoring-metric-content {
  display: flex;
  flex-direction: column;
  gap: 0.35rem;
}

.monitoring-metric-title {
  font-size: 0.9rem;
  font-weight: 600;
  color: #6b7280;
}

.monitoring-metric-value {
  font-size: 2rem;
  font-weight: 700;
  color: #1f2937;
  line-height: 1;
}

.monitoring-metric-subtitle {
  font-size: 0.85rem;
  color: #64748b;
}

.monitoring-main-grid {
  display: grid;
  grid-template-columns: minmax(0, 2fr) minmax(320px, 1fr);
  gap: 1rem;
  align-items: start;
}

.monitoring-sidebar-column {
  display: flex;
  flex-direction: column;
  gap: 1rem;
}

.monitoring-panel {
  background: #fff;
  border: 1px solid #e5e7eb;
  border-radius: 16px;
  padding: 1.1rem;
  box-shadow: 0 12px 40px rgb(120 89 28 / 8%);
}

.monitoring-clients-panel {
  min-width: 0;
}

.monitoring-panel-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  gap: 0.75rem;
  margin-bottom: 0.85rem;
}

.monitoring-panel-header-stacked {
  align-items: end;
  flex-wrap: wrap;
}

.monitoring-panel-header h3 {
  margin: 0;
  font-size: 1.1rem;
  font-weight: 700;
}

.monitoring-panel-header span {
  color: #6b7280;
  font-size: 0.9rem;
}

.monitoring-detail-card .e-card-content {
  padding-top: 0;
}

.monitoring-detail-list {
  display: flex;
  flex-direction: column;
  gap: 0.75rem;
}

.monitoring-detail-row {
  display: flex;
  justify-content: space-between;
  gap: 1rem;
  align-items: flex-start;
  border-bottom: 1px solid #f1f5f9;
  padding-bottom: 0.55rem;
}

.monitoring-detail-row span {
  color: #64748b;
  font-size: 0.9rem;
}

.monitoring-detail-row strong {
  text-align: right;
  color: #111827;
}

.monitoring-status-badge {
  display: inline-flex;
  align-items: center;
  justify-content: center;
  padding: 0.22rem 0.6rem;
  border-radius: 999px;
  font-weight: 700;
  font-size: 0.78rem;
  letter-spacing: 0.01em;
}

.monitoring-screenshot {
  width: 100%;
  border-radius: 12px;
  border: 1px solid #e5e7eb;
  background: linear-gradient(135deg, #f8fafc, #e2e8f0);
  min-height: 180px;
  object-fit: cover;
}

.monitoring-screenshot-meta {
  margin-top: 0.55rem;
  font-size: 0.88rem;
  color: #64748b;
  display: flex;
  flex-direction: column;
  gap: 0.35rem;
}

.monitoring-shot-type {
  display: inline-flex;
  align-items: center;
  border-radius: 999px;
  padding: 0.15rem 0.55rem;
  font-size: 0.78rem;
  font-weight: 700;
}

.monitoring-shot-type-periodic {
  background: #e2e8f0;
  color: #334155;
}

.monitoring-shot-type-event {
  background: #ffedd5;
  color: #9a3412;
}

.monitoring-shot-type-active {
  box-shadow: 0 0 0 2px #fdba74;
}

.monitoring-error-box {
  display: flex;
  flex-direction: column;
  gap: 0.5rem;
  padding: 0.85rem;
  border-radius: 12px;
  background: linear-gradient(135deg, #fff1f2, #fee2e2);
  border: 1px solid #fecdd3;
}

.monitoring-error-time {
  color: #9f1239;
  font-size: 0.85rem;
  font-weight: 600;
}

.monitoring-error-message {
  color: #4c0519;
  font-weight: 600;
}

.monitoring-mono {
  font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, 'Liberation Mono', 'Courier New', monospace;
  font-size: 0.85rem;
}

.monitoring-log-detail-row {
  display: flex;
  justify-content: space-between;
  gap: 1rem;
  align-items: flex-start;
  border-bottom: 1px solid #f1f5f9;
  padding-bottom: 0.55rem;
}

.monitoring-log-detail-row span {
  color: #64748b;
  font-size: 0.9rem;
}

.monitoring-log-detail-row strong {
  text-align: right;
  color: #111827;
}

.monitoring-log-context {
  margin: 0;
  background: #f8fafc;
  border: 1px solid #e2e8f0;
  border-radius: 10px;
  padding: 0.75rem;
  white-space: pre-wrap;
  overflow-wrap: anywhere;
  max-height: 280px;
  overflow: auto;
  font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, 'Liberation Mono', 'Courier New', monospace;
  font-size: 0.84rem;
  color: #0f172a;
}

.monitoring-log-dialog-content {
  display: flex;
  flex-direction: column;
  gap: 1rem;
  padding: 0.9rem 1rem 0.55rem;
}

.monitoring-log-dialog-body {
  min-height: 340px;
  display: flex;
  flex-direction: column;
  justify-content: space-between;
}

.monitoring-log-dialog-actions {
  margin-top: 0.5rem;
  padding: 0 1rem 0.9rem;
  display: flex;
  justify-content: flex-end;
}

.monitoring-log-context-title {
  font-weight: 600;
  margin-bottom: 0.55rem;
}

.monitoring-log-dialog-content .monitoring-log-detail-row {
  padding: 0.1rem 0 0.75rem;
}

.monitoring-log-dialog-content .monitoring-log-context {
  padding: 0.95rem;
  border-radius: 12px;
}

.monitoring-lower-grid {
  display: grid;
  grid-template-columns: repeat(2, minmax(0, 1fr));
  gap: 1rem;
}

@media (width <= 1200px) {
  .monitoring-main-grid,
  .monitoring-lower-grid {
    grid-template-columns: 1fr;
  }
}

@media (width <= 720px) {
  .monitoring-page {
    padding: 0.25rem 0 0.75rem;
  }

  .monitoring-title {
    font-size: 1.5rem;
  }

  .monitoring-header-row,
  .monitoring-panel-header,
  .monitoring-detail-row,
  .monitoring-log-detail-row {
    flex-direction: column;
    align-items: flex-start;
  }

  .monitoring-detail-row strong,
  .monitoring-log-detail-row strong {
    text-align: left;
  }

  .monitoring-toolbar,
  .monitoring-toolbar-field,
  .monitoring-toolbar-field-compact {
    width: 100%;
  }

  .monitoring-log-dialog-content {
    padding: 0.4rem 0.2rem 0.1rem;
    gap: 0.75rem;
  }

  .monitoring-log-dialog-body {
    min-height: 300px;
  }

  .monitoring-log-dialog-actions {
    padding: 0 0.2rem 0.4rem;
  }
}
573
dashboard/src/monitoring.tsx
Normal file
@@ -0,0 +1,573 @@
import React from 'react';
import {
  fetchClientMonitoringLogs,
  fetchMonitoringOverview,
  fetchRecentClientErrors,
  type MonitoringClient,
  type MonitoringLogEntry,
  type MonitoringOverview,
} from './apiClientMonitoring';
import { useAuth } from './useAuth';
import { ButtonComponent } from '@syncfusion/ej2-react-buttons';
import { DropDownListComponent } from '@syncfusion/ej2-react-dropdowns';
import {
  GridComponent,
  ColumnsDirective,
  ColumnDirective,
  Inject,
  Page,
  Search,
  Sort,
  Toolbar,
} from '@syncfusion/ej2-react-grids';
import { MessageComponent } from '@syncfusion/ej2-react-notifications';
import { DialogComponent } from '@syncfusion/ej2-react-popups';
import './monitoring.css';

const REFRESH_INTERVAL_MS = 15000;
const PRIORITY_REFRESH_INTERVAL_MS = 3000;

const hourOptions = [
  { text: 'Letzte 6 Stunden', value: 6 },
  { text: 'Letzte 24 Stunden', value: 24 },
  { text: 'Letzte 72 Stunden', value: 72 },
  { text: 'Letzte 168 Stunden', value: 168 },
];

const logLevelOptions = [
  { text: 'Alle Logs', value: 'ALL' },
  { text: 'ERROR', value: 'ERROR' },
  { text: 'WARN', value: 'WARN' },
  { text: 'INFO', value: 'INFO' },
  { text: 'DEBUG', value: 'DEBUG' },
];

const statusPalette: Record<string, { label: string; color: string; background: string }> = {
  healthy: { label: 'Stabil', color: '#166534', background: '#dcfce7' },
  warning: { label: 'Warnung', color: '#92400e', background: '#fef3c7' },
  critical: { label: 'Kritisch', color: '#991b1b', background: '#fee2e2' },
  offline: { label: 'Offline', color: '#334155', background: '#e2e8f0' },
};

function parseUtcDate(value?: string | null): Date | null {
  if (!value) return null;
  const trimmed = value.trim();
  if (!trimmed) return null;

  const hasTimezone = /[zZ]$|[+-]\d{2}:?\d{2}$/.test(trimmed);
  const utcValue = hasTimezone ? trimmed : `${trimmed}Z`;
  const parsed = new Date(utcValue);
  if (Number.isNaN(parsed.getTime())) return null;
  return parsed;
}

function formatTimestamp(value?: string | null): string {
  if (!value) return 'Keine Daten';
  const date = parseUtcDate(value);
  if (!date) return value;
  return date.toLocaleString('de-DE');
}

function formatRelative(value?: string | null): string {
  if (!value) return 'Keine Daten';
  const date = parseUtcDate(value);
  if (!date) return 'Unbekannt';

  const diffMs = Date.now() - date.getTime();
  const diffMinutes = Math.floor(diffMs / 60000);
  const diffHours = Math.floor(diffMinutes / 60);
  const diffDays = Math.floor(diffHours / 24);

  if (diffMinutes < 1) return 'gerade eben';
  if (diffMinutes < 60) return `vor ${diffMinutes} Min.`;
  if (diffHours < 24) return `vor ${diffHours} Std.`;
  return `vor ${diffDays} Tag${diffDays === 1 ? '' : 'en'}`;
}

function statusBadge(status: string) {
  const palette = statusPalette[status] || statusPalette.offline;
  return (
    <span
      className="monitoring-status-badge"
      style={{ color: palette.color, backgroundColor: palette.background }}
    >
      {palette.label}
    </span>
  );
}

function screenshotTypeBadge(type?: string | null, hasPriority = false) {
  const normalized = (type || 'periodic').toLowerCase();
  const map: Record<string, { label: string; className: string }> = {
    periodic: { label: 'Periodisch', className: 'monitoring-shot-type-periodic' },
    event_start: { label: 'Event-Start', className: 'monitoring-shot-type-event' },
    event_stop: { label: 'Event-Stopp', className: 'monitoring-shot-type-event' },
  };

  const info = map[normalized] || map.periodic;
  const classes = `monitoring-shot-type ${info.className}${hasPriority ? ' monitoring-shot-type-active' : ''}`;
  return <span className={classes}>{info.label}</span>;
}

function renderMetricCard(title: string, value: number, subtitle: string, accent: string) {
  return (
    <div className="e-card monitoring-metric-card" style={{ borderTop: `4px solid ${accent}` }}>
      <div className="e-card-content monitoring-metric-content">
        <div className="monitoring-metric-title">{title}</div>
        <div className="monitoring-metric-value">{value}</div>
        <div className="monitoring-metric-subtitle">{subtitle}</div>
      </div>
    </div>
  );
}

function renderContext(context?: Record<string, unknown>): string {
  if (!context || Object.keys(context).length === 0) {
    return 'Kein Kontext vorhanden';
  }
  try {
    return JSON.stringify(context, null, 2);
  } catch {
    return 'Kontext konnte nicht formatiert werden';
  }
}

function buildScreenshotUrl(client: MonitoringClient, overviewTimestamp?: string | null): string {
  const refreshKey = client.lastScreenshotHash || client.lastScreenshotAnalyzed || overviewTimestamp;
  if (!refreshKey) {
    return client.screenshotUrl;
  }

  const separator = client.screenshotUrl.includes('?') ? '&' : '?';
  return `${client.screenshotUrl}${separator}v=${encodeURIComponent(refreshKey)}`;
}
const MonitoringDashboard: React.FC = () => {
  const { user } = useAuth();
  const [hours, setHours] = React.useState<number>(24);
  const [logLevel, setLogLevel] = React.useState<string>('ALL');
  const [overview, setOverview] = React.useState<MonitoringOverview | null>(null);
  const [recentErrors, setRecentErrors] = React.useState<MonitoringLogEntry[]>([]);
  const [clientLogs, setClientLogs] = React.useState<MonitoringLogEntry[]>([]);
  const [selectedClientUuid, setSelectedClientUuid] = React.useState<string | null>(null);
  const [loading, setLoading] = React.useState<boolean>(true);
  const [error, setError] = React.useState<string | null>(null);
  const [logsLoading, setLogsLoading] = React.useState<boolean>(false);
  const [screenshotErrored, setScreenshotErrored] = React.useState<boolean>(false);
  const selectedClientUuidRef = React.useRef<string | null>(null);
  const [selectedLogEntry, setSelectedLogEntry] = React.useState<MonitoringLogEntry | null>(null);

  const selectedClient = React.useMemo<MonitoringClient | null>(() => {
    if (!overview || !selectedClientUuid) return null;
    return overview.clients.find(client => client.uuid === selectedClientUuid) || null;
  }, [overview, selectedClientUuid]);

  const selectedClientScreenshotUrl = React.useMemo<string | null>(() => {
    if (!selectedClient) return null;
    return buildScreenshotUrl(selectedClient, overview?.timestamp || null);
  }, [selectedClient, overview?.timestamp]);

  React.useEffect(() => {
    selectedClientUuidRef.current = selectedClientUuid;
  }, [selectedClientUuid]);

  const loadOverview = React.useCallback(async (requestedHours: number, preserveSelection = true) => {
    setLoading(true);
    setError(null);
    try {
      const [overviewData, errorsData] = await Promise.all([
        fetchMonitoringOverview(requestedHours),
        fetchRecentClientErrors(25),
      ]);
      setOverview(overviewData);
      setRecentErrors(errorsData);

      const currentSelection = selectedClientUuidRef.current;
      const nextSelectedUuid =
        preserveSelection && currentSelection && overviewData.clients.some(client => client.uuid === currentSelection)
          ? currentSelection
          : overviewData.clients[0]?.uuid || null;

      setSelectedClientUuid(nextSelectedUuid);
      setScreenshotErrored(false);
    } catch (loadError) {
      setError(loadError instanceof Error ? loadError.message : 'Monitoring-Daten konnten nicht geladen werden');
    } finally {
      setLoading(false);
    }
  }, []);

  React.useEffect(() => {
    loadOverview(hours, false);
  }, [hours, loadOverview]);

  React.useEffect(() => {
    const hasActivePriorityScreenshots = (overview?.summary.activePriorityScreenshots || 0) > 0;
    const intervalMs = hasActivePriorityScreenshots ? PRIORITY_REFRESH_INTERVAL_MS : REFRESH_INTERVAL_MS;
    const intervalId = window.setInterval(() => {
      loadOverview(hours);
    }, intervalMs);

    return () => window.clearInterval(intervalId);
  }, [hours, loadOverview, overview?.summary.activePriorityScreenshots]);

  React.useEffect(() => {
    if (!selectedClientUuid) {
      setClientLogs([]);
      return;
    }

    let active = true;
    const loadLogs = async () => {
      setLogsLoading(true);
      try {
        const logs = await fetchClientMonitoringLogs(selectedClientUuid, { level: logLevel, limit: 100 });
        if (active) {
          setClientLogs(logs);
        }
      } catch (loadError) {
        if (active) {
          setClientLogs([]);
          setError(loadError instanceof Error ? loadError.message : 'Client-Logs konnten nicht geladen werden');
        }
      } finally {
        if (active) {
          setLogsLoading(false);
        }
      }
    };

    loadLogs();
    return () => {
      active = false;
    };
  }, [selectedClientUuid, logLevel]);

  React.useEffect(() => {
    setScreenshotErrored(false);
  }, [selectedClientUuid]);

  if (!user || user.role !== 'superadmin') {
    return (
      <MessageComponent severity="Error" content="Dieses Monitoring-Dashboard ist nur für Superadministratoren sichtbar." />
    );
  }

  const clientGridData = (overview?.clients || []).map(client => ({
    ...client,
    displayName: client.description || client.hostname || client.uuid,
    lastAliveDisplay: formatTimestamp(client.lastAlive),
    currentProcessDisplay: client.currentProcess || 'kein Prozess',
    processStatusDisplay: client.processStatus || 'unbekannt',
    errorCount: client.logCounts24h.error,
    warnCount: client.logCounts24h.warn,
  }));
  return (
    <div className="monitoring-page">
      <div className="monitoring-header-row">
        <div>
          <h2 className="monitoring-title">Monitor-Dashboard</h2>
          <p className="monitoring-subtitle">
            Live-Zustand der Infoscreen-Clients, Prozessstatus und zentrale Fehlerprotokolle.
          </p>
        </div>
        <div className="monitoring-toolbar">
          <div className="monitoring-toolbar-field">
            <label>Zeitraum</label>
            <DropDownListComponent
              dataSource={hourOptions}
              fields={{ text: 'text', value: 'value' }}
              value={hours}
              change={(args: { value: number }) => setHours(Number(args.value))}
            />
          </div>
          <ButtonComponent cssClass="e-primary" onClick={() => loadOverview(hours)} disabled={loading}>
            Aktualisieren
          </ButtonComponent>
        </div>
      </div>

      {error && <MessageComponent severity="Error" content={error} />}

      {overview && (
        <div className="monitoring-meta-row">
          <span>Stand: {formatTimestamp(overview.timestamp)}</span>
          <span>Alive-Fenster: {overview.gracePeriodSeconds} Sekunden</span>
          <span>Betrachtungszeitraum: {overview.periodHours} Stunden</span>
        </div>
      )}

      <div className="monitoring-summary-grid">
        {renderMetricCard('Clients gesamt', overview?.summary.totalClients || 0, 'Registrierte Displays', '#7c3aed')}
        {renderMetricCard('Online', overview?.summary.onlineClients || 0, 'Heartbeat innerhalb der Grace-Periode', '#15803d')}
        {renderMetricCard('Warnungen', overview?.summary.warningClients || 0, 'Warn-Logs oder Übergangszustände', '#d97706')}
        {renderMetricCard('Kritisch', overview?.summary.criticalClients || 0, 'Crashs oder Fehler-Logs', '#dc2626')}
        {renderMetricCard('Offline', overview?.summary.offlineClients || 0, 'Keine frischen Signale', '#475569')}
        {renderMetricCard('Prioritäts-Screens', overview?.summary.activePriorityScreenshots || 0, 'Event-Start/Stop aktiv', '#ea580c')}
        {renderMetricCard('Fehler-Logs', overview?.summary.errorLogs || 0, 'Im gewählten Zeitraum', '#b91c1c')}
      </div>

      {loading && !overview ? (
        <MessageComponent severity="Info" content="Monitoring-Daten werden geladen ..." />
      ) : (
        <div className="monitoring-main-grid">
          <div className="monitoring-panel monitoring-clients-panel">
            <div className="monitoring-panel-header">
              <h3>Client-Zustand</h3>
              <span>{overview?.clients.length || 0} Einträge</span>
            </div>
            <GridComponent
              dataSource={clientGridData}
              allowPaging={true}
              pageSettings={{ pageSize: 10 }}
              allowSorting={true}
              toolbar={['Search']}
              height={460}
              rowSelected={(args: { data: MonitoringClient }) => {
                setSelectedClientUuid(args.data.uuid);
              }}
            >
              <ColumnsDirective>
                <ColumnDirective
                  field="status"
                  headerText="Status"
                  width="120"
                  template={(props: MonitoringClient) => statusBadge(props.status)}
                />
                <ColumnDirective field="displayName" headerText="Client" width="190" />
                <ColumnDirective field="groupName" headerText="Gruppe" width="150" />
                <ColumnDirective field="currentProcessDisplay" headerText="Prozess" width="130" />
                <ColumnDirective field="processStatusDisplay" headerText="Prozessstatus" width="130" />
                <ColumnDirective field="errorCount" headerText="ERROR" textAlign="Right" width="90" />
                <ColumnDirective field="warnCount" headerText="WARN" textAlign="Right" width="90" />
                <ColumnDirective field="lastAliveDisplay" headerText="Letztes Signal" width="170" />
              </ColumnsDirective>
              <Inject services={[Page, Search, Sort, Toolbar]} />
            </GridComponent>
          </div>

          <div className="monitoring-sidebar-column">
            <div className="e-card monitoring-detail-card">
              <div className="e-card-header">
                <div className="e-card-header-caption">
                  <div className="e-card-title">Aktiver Client</div>
                </div>
              </div>
              <div className="e-card-content">
                {selectedClient ? (
                  <div className="monitoring-detail-list">
                    <div className="monitoring-detail-row">
                      <span>Name</span>
                      <strong>{selectedClient.description || selectedClient.hostname || selectedClient.uuid}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Status</span>
                      <strong>{statusBadge(selectedClient.status)}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>UUID</span>
                      <strong className="monitoring-mono">{selectedClient.uuid}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Raumgruppe</span>
                      <strong>{selectedClient.groupName || 'Nicht zugeordnet'}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Prozess</span>
                      <strong>{selectedClient.currentProcess || 'kein Prozess'}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>PID</span>
                      <strong>{selectedClient.processPid || 'keine PID'}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Event-ID</span>
                      <strong>{selectedClient.currentEventId || 'keine Zuordnung'}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Letztes Signal</span>
                      <strong>{formatRelative(selectedClient.lastAlive)}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Bildschirmstatus</span>
                      <strong>{selectedClient.screenHealthStatus || 'UNKNOWN'}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Letzte Analyse</span>
                      <strong>{formatTimestamp(selectedClient.lastScreenshotAnalyzed)}</strong>
                    </div>
                    <div className="monitoring-detail-row">
                      <span>Screenshot-Typ</span>
                      <strong>
                        {screenshotTypeBadge(
                          selectedClient.latestScreenshotType,
                          !!selectedClient.hasActivePriorityScreenshot
                        )}
                      </strong>
                    </div>
                    {selectedClient.priorityScreenshotReceivedAt && (
                      <div className="monitoring-detail-row">
                        <span>Priorität empfangen</span>
                        <strong>{formatTimestamp(selectedClient.priorityScreenshotReceivedAt)}</strong>
                      </div>
                    )}
                  </div>
                ) : (
                  <MessageComponent severity="Info" content="Wählen Sie links einen Client aus." />
                )}
              </div>
            </div>

            <div className="e-card monitoring-detail-card">
              <div className="e-card-header">
                <div className="e-card-header-caption">
                  <div className="e-card-title">Der letzte Screenshot</div>
                </div>
              </div>
              <div className="e-card-content">
                {selectedClient ? (
                  <>
                    {screenshotErrored ? (
                      <MessageComponent severity="Warning" content="Für diesen Client liegt noch kein Screenshot vor." />
                    ) : (
                      <img
                        src={selectedClientScreenshotUrl || selectedClient.screenshotUrl}
                        alt={`Screenshot ${selectedClient.uuid}`}
                        className="monitoring-screenshot"
                        onError={() => setScreenshotErrored(true)}
                      />
                    )}
                    <div className="monitoring-screenshot-meta">
                      <span>Empfangen: {formatTimestamp(selectedClient.lastScreenshotAnalyzed)}</span>
                      <span>
                        Typ:{' '}
                        {screenshotTypeBadge(
                          selectedClient.latestScreenshotType,
                          !!selectedClient.hasActivePriorityScreenshot
                        )}
                      </span>
                    </div>
                  </>
                ) : (
                  <MessageComponent severity="Info" content="Kein Client ausgewählt." />
                )}
              </div>
            </div>

            <div className="e-card monitoring-detail-card">
              <div className="e-card-header">
                <div className="e-card-header-caption">
                  <div className="e-card-title">Letzter Fehler</div>
                </div>
              </div>
              <div className="e-card-content">
                {selectedClient?.latestError ? (
                  <div className="monitoring-error-box">
                    <div className="monitoring-error-time">{formatTimestamp(selectedClient.latestError.timestamp)}</div>
                    <div className="monitoring-error-message">{selectedClient.latestError.message}</div>
                  </div>
                ) : (
                  <MessageComponent severity="Success" content="Kein ERROR-Log für den ausgewählten Client gefunden." />
                )}
              </div>
            </div>
          </div>
        </div>
      )}

      <div className="monitoring-lower-grid">
        <div className="monitoring-panel">
          <div className="monitoring-panel-header monitoring-panel-header-stacked">
            <div>
              <h3>Client-Logs</h3>
              <span>{selectedClient ? `Client ${selectedClient.uuid}` : 'Kein Client ausgewählt'}</span>
            </div>
            <div className="monitoring-toolbar-field monitoring-toolbar-field-compact">
              <label>Level</label>
              <DropDownListComponent
                dataSource={logLevelOptions}
                fields={{ text: 'text', value: 'value' }}
                value={logLevel}
                change={(args: { value: string }) => setLogLevel(String(args.value))}
              />
            </div>
          </div>
          {logsLoading && <MessageComponent severity="Info" content="Client-Logs werden geladen ..." />}
          <GridComponent
            dataSource={clientLogs}
            allowPaging={true}
            pageSettings={{ pageSize: 8 }}
            allowSorting={true}
            height={320}
            rowSelected={(args: { data: MonitoringLogEntry }) => {
              setSelectedLogEntry(args.data);
            }}
          >
            <ColumnsDirective>
              <ColumnDirective field="timestamp" headerText="Zeit" width="180" template={(props: MonitoringLogEntry) => formatTimestamp(props.timestamp)} />
              <ColumnDirective field="level" headerText="Level" width="90" />
              <ColumnDirective field="message" headerText="Nachricht" width="360" />
            </ColumnsDirective>
            <Inject services={[Page, Sort]} />
          </GridComponent>
        </div>

        <div className="monitoring-panel">
          <div className="monitoring-panel-header">
            <h3>Letzte Fehler systemweit</h3>
            <span>{recentErrors.length} Einträge</span>
          </div>
          <GridComponent dataSource={recentErrors} allowPaging={true} pageSettings={{ pageSize: 8 }} allowSorting={true} height={320}>
            <ColumnsDirective>
              <ColumnDirective field="timestamp" headerText="Zeit" width="180" template={(props: MonitoringLogEntry) => formatTimestamp(props.timestamp)} />
              <ColumnDirective field="client_uuid" headerText="Client" width="220" />
              <ColumnDirective field="message" headerText="Nachricht" width="360" />
            </ColumnsDirective>
            <Inject services={[Page, Sort]} />
          </GridComponent>
        </div>
      </div>

      <DialogComponent
        isModal={true}
        visible={!!selectedLogEntry}
        width="860px"
        minHeight="420px"
        header="Log-Details"
        animationSettings={{ effect: 'None' }}
        buttons={[]}
        showCloseIcon={true}
        close={() => setSelectedLogEntry(null)}
      >
        {selectedLogEntry && (
          <div className="monitoring-log-dialog-body">
            <div className="monitoring-log-dialog-content">
              <div className="monitoring-log-detail-row">
                <span>Zeit</span>
                <strong>{formatTimestamp(selectedLogEntry.timestamp)}</strong>
              </div>
              <div className="monitoring-log-detail-row">
                <span>Level</span>
                <strong>{selectedLogEntry.level || 'Unbekannt'}</strong>
              </div>
              <div className="monitoring-log-detail-row">
                <span>Nachricht</span>
                <strong style={{ whiteSpace: 'normal', textAlign: 'left' }}>{selectedLogEntry.message}</strong>
              </div>
              <div>
                <div className="monitoring-log-context-title">Kontext</div>
                <pre className="monitoring-log-context">{renderContext(selectedLogEntry.context)}</pre>
              </div>
            </div>
            <div className="monitoring-log-dialog-actions">
              <ButtonComponent onClick={() => setSelectedLogEntry(null)}>Schließen</ButtonComponent>
            </div>
          </div>
        )}
      </DialogComponent>
    </div>
  );
};

export default MonitoringDashboard;
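The `parseUtcDate` helper in monitoring.tsx treats backend timestamps without an explicit offset as UTC by appending `Z` before handing them to the `Date` constructor. A minimal standalone sketch of that idea (the `toUtcDate` name is illustrative, not part of the codebase):

```typescript
// Naive timestamps (no trailing Z or ±hh:mm offset) are assumed to be UTC.
function toUtcDate(value: string): Date | null {
  const trimmed = value.trim();
  if (!trimmed) return null;
  const hasTimezone = /[zZ]$|[+-]\d{2}:?\d{2}$/.test(trimmed);
  const parsed = new Date(hasTimezone ? trimmed : `${trimmed}Z`);
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}

// A naive timestamp and its explicit-UTC form resolve to the same instant.
const naive = toUtcDate('2024-05-01T12:00:00');
const explicit = toUtcDate('2024-05-01T12:00:00Z');
console.log(naive?.getTime() === explicit?.getTime()); // true
```

Without the appended `Z`, `new Date('2024-05-01T12:00:00')` would be interpreted in the browser's local zone, which is exactly the drift this helper avoids.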
@@ -33,7 +33,7 @@ const Ressourcen: React.FC = () => {
   const [groupOrder, setGroupOrder] = useState<number[]>([]);
   const [showOrderPanel, setShowOrderPanel] = useState<boolean>(false);
   const [timelineView] = useState<TimelineView>('day');
-  const [viewDate] = useState<Date>(() => {
+  const [viewDate, setViewDate] = useState<Date>(() => {
     const now = new Date();
     now.setHours(0, 0, 0, 0);
     return now;
@@ -110,23 +110,31 @@ const Ressourcen: React.FC = () => {
     for (const group of groups) {
       try {
         console.log(`[Ressourcen] Fetching events for group "${group.name}" (ID: ${group.id})`);
-        const apiEvents = await fetchEvents(group.id.toString(), false, {
+        const apiEvents = await fetchEvents(group.id.toString(), true, {
           start,
           end,
         });
         console.log(`[Ressourcen] Got ${apiEvents?.length || 0} events for group "${group.name}"`);

         if (Array.isArray(apiEvents) && apiEvents.length > 0) {
-          const event = apiEvents[0];
+          for (const event of apiEvents) {
             const eventTitle = event.subject || event.title || 'Unnamed Event';
             const eventType = event.type || event.event_type || 'other';
             const eventStart = event.startTime || event.start;
             const eventEnd = event.endTime || event.end;

-          if (eventStart && eventEnd) {
+            if (!eventStart || !eventEnd) {
+              continue;
+            }
+
             const parsedStart = parseUTCDate(eventStart);
             const parsedEnd = parseUTCDate(eventEnd);
+
+            // Keep only events that overlap the visible range.
+            if (parsedEnd < start || parsedStart > end) {
+              continue;
+            }
+
             // Capitalize first letter of event type
             const formattedType = eventType.charAt(0).toUpperCase() + eventType.slice(1);
@@ -138,7 +146,6 @@ const Ressourcen: React.FC = () => {
             ResourceId: group.id,
             EventType: eventType,
           });
-          console.log(`[Ressourcen] Group "${group.name}" has event: ${eventTitle}`);
         }
       }
     } catch (error) {
@@ -324,6 +331,16 @@ const Ressourcen: React.FC = () => {
       group={{ resources: ['Groups'], allowGroupEdit: false }}
       timeScale={{ interval: 60, slotCount: 1 }}
       rowAutoHeight={false}
+      actionComplete={(args) => {
+        if (args.requestType === 'dateNavigate' || args.requestType === 'viewNavigate') {
+          const selected = scheduleRef.current?.selectedDate;
+          if (selected) {
+            const normalized = new Date(selected);
+            normalized.setHours(0, 0, 0, 0);
+            setViewDate(normalized);
+          }
+        }
+      }}
     >
       <ViewsDirective>
         <ViewDirective option="TimelineDay" displayName="Tag"></ViewDirective>
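The overlap check added above (`parsedEnd < start || parsedStart > end`) keeps an event exactly when its interval intersects the visible range: it is dropped only if it ends before the range starts or starts after the range ends. A small standalone sketch (the `overlapsRange` name is illustrative):

```typescript
// Closed-interval intersection test, as used to filter timeline events.
function overlapsRange(eventStart: Date, eventEnd: Date, rangeStart: Date, rangeEnd: Date): boolean {
  return !(eventEnd < rangeStart || eventStart > rangeEnd);
}

const rangeStart = new Date('2024-05-01T00:00:00Z');
const rangeEnd = new Date('2024-05-01T23:59:59Z');

// Spans midnight into the visible day: kept.
console.log(overlapsRange(new Date('2024-04-30T22:00:00Z'), new Date('2024-05-01T02:00:00Z'), rangeStart, rangeEnd)); // true
// Ended the day before: dropped.
console.log(overlapsRange(new Date('2024-04-30T08:00:00Z'), new Date('2024-04-30T09:00:00Z'), rangeStart, rangeEnd)); // false
```

Writing the condition as a negated "no overlap" test also keeps events that merely touch the range boundary, which is usually what a timeline view wants.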
@@ -18,8 +18,9 @@ services:
     environment:
       - DB_CONN=mysql+pymysql://${DB_USER}:${DB_PASSWORD}@db/${DB_NAME}
       - DB_URL=mysql+pymysql://${DB_USER}:${DB_PASSWORD}@db/${DB_NAME}
+      - API_BASE_URL=http://server:8000
       - ENV=${ENV:-development}
       - FLASK_SECRET_KEY=${FLASK_SECRET_KEY:-dev-secret-key-change-in-production}
       - DEFAULT_SUPERADMIN_USERNAME=${DEFAULT_SUPERADMIN_USERNAME:-superadmin}
       - DEFAULT_SUPERADMIN_PASSWORD=${DEFAULT_SUPERADMIN_PASSWORD}
       # 🔧 ENTFERNT: Volume-Mount ist nur für die Entwicklung
@@ -3,15 +3,17 @@ import json
 import logging
 import datetime
 import base64
+import re
 import requests
 import paho.mqtt.client as mqtt
 from sqlalchemy import create_engine
 from sqlalchemy.orm import sessionmaker
-from models.models import Client
+from models.models import Client, ClientLog, LogLevel, ProcessStatus, ScreenHealthStatus
 logging.basicConfig(level=logging.DEBUG, format='%(asctime)s [%(levelname)s] %(message)s')
 
-# Load .env in development
-if os.getenv("ENV", "development") == "development":
+# Load .env only when not already configured by Docker (API_BASE_URL not set by compose means we're outside a container)
+_api_already_set = bool(os.environ.get("API_BASE_URL"))
+if not _api_already_set and os.getenv("ENV", "development") == "development":
     try:
         from dotenv import load_dotenv
         load_dotenv(".env")
@@ -30,6 +32,288 @@ Session = sessionmaker(bind=engine)
 # API configuration
 API_BASE_URL = os.getenv("API_BASE_URL", "http://server:8000")
+
+# Dashboard payload migration observability
+DASHBOARD_METRICS_LOG_EVERY = int(os.getenv("DASHBOARD_METRICS_LOG_EVERY", "5"))
+DASHBOARD_PARSE_METRICS = {
+    "v2_success": 0,
+    "parse_failures": 0,
+}
+
+
+def normalize_process_status(value):
+    if value is None:
+        return None
+    if isinstance(value, ProcessStatus):
+        return value
+
+    normalized = str(value).strip().lower()
+    if not normalized:
+        return None
+
+    try:
+        return ProcessStatus(normalized)
+    except ValueError:
+        return None
+
+
+def normalize_event_id(value):
+    if value is None or isinstance(value, bool):
+        return None
+    if isinstance(value, int):
+        return value
+    if isinstance(value, float):
+        return int(value)
+
+    normalized = str(value).strip()
+    if not normalized:
+        return None
+    if normalized.isdigit():
+        return int(normalized)
+
+    match = re.search(r"(\d+)$", normalized)
+    if match:
+        return int(match.group(1))
+
+    return None
+
+
+def parse_timestamp(value):
+    if not value:
+        return None
+    if isinstance(value, (int, float)):
+        try:
+            ts_value = float(value)
+            if ts_value > 1e12:
+                ts_value = ts_value / 1000.0
+            return datetime.datetime.fromtimestamp(ts_value, datetime.UTC)
+        except (TypeError, ValueError, OverflowError):
+            return None
+    try:
+        value_str = str(value).strip()
+        if value_str.isdigit():
+            ts_value = float(value_str)
+            if ts_value > 1e12:
+                ts_value = ts_value / 1000.0
+            return datetime.datetime.fromtimestamp(ts_value, datetime.UTC)
+
+        parsed = datetime.datetime.fromisoformat(value_str.replace('Z', '+00:00'))
+        if parsed.tzinfo is None:
+            return parsed.replace(tzinfo=datetime.UTC)
+        return parsed.astimezone(datetime.UTC)
+    except ValueError:
+        return None
+
+
+def infer_screen_health_status(payload_data):
+    explicit = payload_data.get('screen_health_status')
+    if explicit:
+        try:
+            return ScreenHealthStatus[str(explicit).strip().upper()]
+        except KeyError:
+            pass
+
+    metrics = payload_data.get('health_metrics') or {}
+    if metrics.get('screen_on') is False:
+        return ScreenHealthStatus.BLACK
+
+    last_frame_update = parse_timestamp(metrics.get('last_frame_update'))
+    if last_frame_update:
+        age_seconds = (datetime.datetime.now(datetime.UTC) - last_frame_update).total_seconds()
+        if age_seconds > 30:
+            return ScreenHealthStatus.FROZEN
+        return ScreenHealthStatus.OK
+
+    return None
+
+
+def apply_monitoring_update(client_obj, *, event_id=None, process_name=None, process_pid=None,
+                            process_status=None, last_seen=None, screen_health_status=None,
+                            last_screenshot_analyzed=None):
+    if last_seen:
+        client_obj.last_alive = last_seen
+
+    normalized_event_id = normalize_event_id(event_id)
+    if normalized_event_id is not None:
+        client_obj.current_event_id = normalized_event_id
+
+    if process_name is not None:
+        client_obj.current_process = process_name
+
+    if process_pid is not None:
+        client_obj.process_pid = process_pid
+
+    normalized_status = normalize_process_status(process_status)
+    if normalized_status is not None:
+        client_obj.process_status = normalized_status
+
+    if screen_health_status is not None:
+        client_obj.screen_health_status = screen_health_status
+
+    if last_screenshot_analyzed is not None:
+        existing = client_obj.last_screenshot_analyzed
+        if existing is not None and existing.tzinfo is None:
+            existing = existing.replace(tzinfo=datetime.UTC)
+
+        candidate = last_screenshot_analyzed
+        if candidate.tzinfo is None:
+            candidate = candidate.replace(tzinfo=datetime.UTC)
+
+        if existing is None or candidate >= existing:
+            client_obj.last_screenshot_analyzed = candidate
+
+
+def _normalize_screenshot_type(raw_type):
+    if raw_type is None:
+        return None
+
+    normalized = str(raw_type).strip().lower()
+    if normalized in ("periodic", "event_start", "event_stop"):
+        return normalized
+    return None
+
+
+def _classify_dashboard_payload(data):
+    """
+    Classify dashboard payload into migration categories for observability.
+    """
+    if not isinstance(data, dict):
+        return "parse_failures", None
+
+    message_obj = data.get("message") if isinstance(data.get("message"), dict) else None
+    content_obj = data.get("content") if isinstance(data.get("content"), dict) else None
+    metadata_obj = data.get("metadata") if isinstance(data.get("metadata"), dict) else None
+    schema_version = metadata_obj.get("schema_version") if metadata_obj else None
+
+    # v2 detection: grouped blocks available with metadata.
+    if message_obj is not None and content_obj is not None and metadata_obj is not None:
+        return "v2_success", schema_version
+
+    return "parse_failures", schema_version
+
+
+def _record_dashboard_parse_metric(mode, uuid, schema_version=None, reason=None):
+    if mode not in DASHBOARD_PARSE_METRICS:
+        mode = "parse_failures"
+
+    DASHBOARD_PARSE_METRICS[mode] += 1
+    total = sum(DASHBOARD_PARSE_METRICS.values())
+
+    if mode == "v2_success":
+        if schema_version is None:
+            logging.warning(f"Dashboard payload from {uuid}: missing metadata.schema_version for grouped payload")
+        else:
+            version_text = str(schema_version).strip()
+            if not version_text.startswith("2"):
+                logging.warning(f"Dashboard payload from {uuid}: unknown schema_version={version_text}")
+
+    if mode == "parse_failures":
+        if reason:
+            logging.warning(f"Dashboard payload parse failure for {uuid}: {reason}")
+        else:
+            logging.warning(f"Dashboard payload parse failure for {uuid}")
+
+    if DASHBOARD_METRICS_LOG_EVERY > 0 and total % DASHBOARD_METRICS_LOG_EVERY == 0:
+        logging.info(
+            "Dashboard payload metrics: "
+            f"total={total}, "
+            f"v2_success={DASHBOARD_PARSE_METRICS['v2_success']}, "
+            f"parse_failures={DASHBOARD_PARSE_METRICS['parse_failures']}"
+        )
+
+
+def _validate_v2_required_fields(data, uuid):
+    """
+    Soft validation of required v2 fields for grouped dashboard payloads.
+    Logs a WARNING for each missing field. Never drops the message.
+    """
+    message_obj = data.get("message") if isinstance(data.get("message"), dict) else {}
+    metadata_obj = data.get("metadata") if isinstance(data.get("metadata"), dict) else {}
+    capture_obj = metadata_obj.get("capture") if isinstance(metadata_obj.get("capture"), dict) else {}
+
+    missing = []
+    if not message_obj.get("client_id"):
+        missing.append("message.client_id")
+    if not message_obj.get("status"):
+        missing.append("message.status")
+    if not metadata_obj.get("schema_version"):
+        missing.append("metadata.schema_version")
+    if not capture_obj.get("type"):
+        missing.append("metadata.capture.type")
+
+    if missing:
+        logging.warning(
+            f"Dashboard v2 payload from {uuid} missing required fields: {', '.join(missing)}"
+        )
+
+
+def _extract_dashboard_payload_fields(data):
+    """
+    Parse dashboard payload fields from the grouped v2 schema only.
+    """
+    if not isinstance(data, dict):
+        return {
+            "image": None,
+            "timestamp": None,
+            "screenshot_type": None,
+            "status": None,
+            "process_health": {},
+        }
+
+    # v2 grouped payload blocks
+    message_obj = data.get("message") if isinstance(data.get("message"), dict) else None
+    content_obj = data.get("content") if isinstance(data.get("content"), dict) else None
+    runtime_obj = data.get("runtime") if isinstance(data.get("runtime"), dict) else None
+    metadata_obj = data.get("metadata") if isinstance(data.get("metadata"), dict) else None
+
+    screenshot_obj = None
+    if isinstance(content_obj, dict) and isinstance(content_obj.get("screenshot"), dict):
+        screenshot_obj = content_obj.get("screenshot")
+
+    capture_obj = metadata_obj.get("capture") if metadata_obj and isinstance(metadata_obj.get("capture"), dict) else None
+
+    # Screenshot type comes from v2 metadata.capture.type.
+    screenshot_type = _normalize_screenshot_type(capture_obj.get("type") if capture_obj else None)
+
+    # Image from v2 content.screenshot.
+    image_value = None
+    for container in (screenshot_obj,):
+        if not isinstance(container, dict):
+            continue
+        for key in ("data", "image"):
+            value = container.get(key)
+            if isinstance(value, str) and value:
+                image_value = value
+                break
+        if image_value is not None:
+            break
+
+    # Timestamp precedence: v2 screenshot.timestamp -> capture.captured_at -> metadata.published_at
+    timestamp_value = None
+    timestamp_candidates = [
+        screenshot_obj.get("timestamp") if screenshot_obj else None,
+        capture_obj.get("captured_at") if capture_obj else None,
+        metadata_obj.get("published_at") if metadata_obj else None,
+    ]
+
+    for value in timestamp_candidates:
+        if value is not None:
+            timestamp_value = value
+            break
+
+    # Monitoring fields from v2 message/runtime.
+    status_value = (message_obj or {}).get("status")
+    process_health = (runtime_obj or {}).get("process_health")
+    if not isinstance(process_health, dict):
+        process_health = {}
+
+    return {
+        "image": image_value,
+        "timestamp": timestamp_value,
+        "screenshot_type": screenshot_type,
+        "status": status_value,
+        "process_health": process_health,
+    }
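The timestamp and event-id normalizers above carry most of the payload-migration weight: epoch values above ~1e12 are treated as milliseconds, and string event ids like `"evt-7"` collapse to their trailing digits. A standalone sketch of that behavior — logic simplified from the functions in this hunk (error handling trimmed; `datetime.timezone.utc` used in place of the listener's `datetime.UTC` for broader compatibility):

```python
import datetime
import re

UTC = datetime.timezone.utc  # equivalent to datetime.UTC used in the listener

def parse_timestamp(value):
    """Simplified copy of the listener's parse_timestamp."""
    if not value:
        return None
    if isinstance(value, (int, float)):
        ts = float(value)
        if ts > 1e12:          # values this large are epoch milliseconds
            ts /= 1000.0
        return datetime.datetime.fromtimestamp(ts, UTC)
    s = str(value).strip()
    if s.isdigit():
        return parse_timestamp(float(s))
    parsed = datetime.datetime.fromisoformat(s.replace('Z', '+00:00'))
    return parsed.replace(tzinfo=UTC) if parsed.tzinfo is None else parsed.astimezone(UTC)

def normalize_event_id(value):
    """Simplified copy: ints pass through, strings keep their trailing digits."""
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, (int, float)):
        return int(value)
    m = re.search(r"(\d+)$", str(value).strip())
    return int(m.group(1)) if m else None

print(parse_timestamp(1767000000000))  # same instant as parse_timestamp(1767000000)
print(normalize_event_id("evt-7"))     # 7
```

Clients can thus report event ids as plain ints, numeric strings, or prefixed strings without the DB column type changing.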
 def handle_screenshot(uuid, payload):
     """
@@ -40,13 +324,21 @@ def handle_screenshot(uuid, payload):
     # Try to parse as JSON first
     try:
         data = json.loads(payload.decode())
-        if "image" in data:
+        extracted = _extract_dashboard_payload_fields(data)
+        image_b64 = extracted["image"]
+        timestamp_value = extracted["timestamp"]
+        screenshot_type = extracted["screenshot_type"]
+        if image_b64:
             # Payload is JSON with base64 image
-            api_payload = {"image": data["image"]}
+            api_payload = {"image": image_b64}
+            if timestamp_value is not None:
+                api_payload["timestamp"] = timestamp_value
+            if screenshot_type:
+                api_payload["screenshot_type"] = screenshot_type
             headers = {"Content-Type": "application/json"}
             logging.debug(f"Forwarding base64 screenshot from {uuid} to API")
         else:
-            logging.warning(f"Screenshot JSON from {uuid} missing 'image' field")
+            logging.warning(f"Screenshot JSON from {uuid} missing image/data field")
             return
     except (json.JSONDecodeError, UnicodeDecodeError):
         # Payload is raw binary image data - encode to base64 for API
@@ -78,7 +370,14 @@ def on_connect(client, userdata, flags, reasonCode, properties):
         client.subscribe("infoscreen/+/heartbeat")
         client.subscribe("infoscreen/+/screenshot")
         client.subscribe("infoscreen/+/dashboard")
-        logging.info(f"MQTT connected (reasonCode: {reasonCode}); (re)subscribed to discovery, heartbeats, screenshots, and dashboards")
+
+        # Subscribe to monitoring topics
+        client.subscribe("infoscreen/+/logs/error")
+        client.subscribe("infoscreen/+/logs/warn")
+        client.subscribe("infoscreen/+/logs/info")
+        client.subscribe("infoscreen/+/health")
+
+        logging.info(f"MQTT connected (reasonCode: {reasonCode}); (re)subscribed to discovery, heartbeats, screenshots, dashboards, logs, and health")
     except Exception as e:
         logging.error(f"Subscribe failed on connect: {e}")
@@ -94,24 +393,37 @@ def on_message(client, userdata, msg):
         try:
             payload_text = msg.payload.decode()
             data = json.loads(payload_text)
-            shot = data.get("screenshot")
-            if isinstance(shot, dict):
-                # Prefer 'data' field (base64) inside screenshot object
-                image_b64 = shot.get("data")
-                if image_b64:
-                    logging.debug(f"Dashboard enthält Screenshot für {uuid}; Weiterleitung an API")
-                    # Build a lightweight JSON with image field for API handler
-                    api_payload = json.dumps({"image": image_b64}).encode("utf-8")
-                    handle_screenshot(uuid, api_payload)
+            parse_mode, schema_version = _classify_dashboard_payload(data)
+            _record_dashboard_parse_metric(parse_mode, uuid, schema_version=schema_version)
+            if parse_mode == "v2_success":
+                _validate_v2_required_fields(data, uuid)
+
+            extracted = _extract_dashboard_payload_fields(data)
+            image_b64 = extracted["image"]
+            ts_value = extracted["timestamp"]
+            screenshot_type = extracted["screenshot_type"]
+            if image_b64:
+                logging.debug(f"Dashboard enthält Screenshot für {uuid}; Weiterleitung an API")
+                # Forward original v2 payload so handle_screenshot can parse grouped fields.
+                handle_screenshot(uuid, msg.payload)
             # Update last_alive if status present
-            if data.get("status") == "alive":
+            if extracted["status"] == "alive":
                 session = Session()
                 client_obj = session.query(Client).filter_by(uuid=uuid).first()
                 if client_obj:
-                    client_obj.last_alive = datetime.datetime.now(datetime.UTC)
+                    process_health = extracted["process_health"]
+                    apply_monitoring_update(
+                        client_obj,
+                        last_seen=datetime.datetime.now(datetime.UTC),
+                        event_id=process_health.get('event_id'),
+                        process_name=process_health.get('current_process') or process_health.get('process'),
+                        process_pid=process_health.get('process_pid') or process_health.get('pid'),
+                        process_status=process_health.get('process_status') or process_health.get('status'),
+                    )
                 session.commit()
                 session.close()
         except Exception as e:
+            _record_dashboard_parse_metric("parse_failures", uuid, reason=str(e))
             logging.error(f"Fehler beim Verarbeiten des Dashboard-Payloads von {uuid}: {e}")
         return
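The classification rule this hunk relies on is structural: a payload counts as v2 only when all three grouped blocks (`message`, `content`, `metadata`) are present as dicts. A minimal standalone sketch — logic mirrored from `_classify_dashboard_payload` in this diff, with an example v2 payload shaped like the test fixtures:

```python
def classify_dashboard_payload(data):
    # Mirrors _classify_dashboard_payload: all three grouped blocks must be dicts.
    if not isinstance(data, dict):
        return "parse_failures", None
    message = data.get("message") if isinstance(data.get("message"), dict) else None
    content = data.get("content") if isinstance(data.get("content"), dict) else None
    metadata = data.get("metadata") if isinstance(data.get("metadata"), dict) else None
    version = metadata.get("schema_version") if metadata else None
    if message is not None and content is not None and metadata is not None:
        return "v2_success", version
    return "parse_failures", version

v2 = {
    "message": {"client_id": "uuid-v2", "status": "alive"},
    "content": {"screenshot": {"data": "aGVsbG8=", "timestamp": "2026-03-30T10:15:41+00:00"}},
    "metadata": {"schema_version": "2.0", "capture": {"type": "periodic"}},
}
legacy = {"client_id": "uuid-legacy", "status": "alive", "screenshot": {"data": "aGVsbG8="}}

print(classify_dashboard_payload(v2))      # ('v2_success', '2.0')
print(classify_dashboard_payload(legacy))  # ('parse_failures', None)
```

Note that legacy top-level payloads are deliberately counted as `parse_failures` — this is a v2-only cut-over, with the metric counters providing the migration observability.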
@@ -124,15 +436,110 @@ def on_message(client, userdata, msg):
     # Heartbeat-Handling
     if topic.startswith("infoscreen/") and topic.endswith("/heartbeat"):
         uuid = topic.split("/")[1]
+        try:
+            # Parse payload to get optional health data
+            payload_data = json.loads(msg.payload.decode())
+        except (json.JSONDecodeError, UnicodeDecodeError):
+            payload_data = {}
+
         session = Session()
         client_obj = session.query(Client).filter_by(uuid=uuid).first()
         if client_obj:
-            client_obj.last_alive = datetime.datetime.now(datetime.UTC)
+            apply_monitoring_update(
+                client_obj,
+                last_seen=datetime.datetime.now(datetime.UTC),
+                event_id=payload_data.get('current_event_id'),
+                process_name=payload_data.get('current_process'),
+                process_pid=payload_data.get('process_pid'),
+                process_status=payload_data.get('process_status'),
+            )
             session.commit()
-            logging.info(
-                f"Heartbeat von {uuid} empfangen, last_alive (UTC) aktualisiert.")
+            logging.info(f"Heartbeat von {uuid} empfangen, last_alive (UTC) aktualisiert.")
         session.close()
         return
+
+    # Log-Handling (ERROR, WARN, INFO)
+    if topic.startswith("infoscreen/") and "/logs/" in topic:
+        parts = topic.split("/")
+        if len(parts) >= 4:
+            uuid = parts[1]
+            level_str = parts[3].upper()  # 'error', 'warn', 'info' -> 'ERROR', 'WARN', 'INFO'
+
+            try:
+                payload_data = json.loads(msg.payload.decode())
+                message = payload_data.get('message', '')
+                timestamp_str = payload_data.get('timestamp')
+                context = payload_data.get('context', {})
+
+                # Parse timestamp or use current time
+                if timestamp_str:
+                    try:
+                        log_timestamp = datetime.datetime.fromisoformat(timestamp_str.replace('Z', '+00:00'))
+                        if log_timestamp.tzinfo is None:
+                            log_timestamp = log_timestamp.replace(tzinfo=datetime.UTC)
+                    except ValueError:
+                        log_timestamp = datetime.datetime.now(datetime.UTC)
+                else:
+                    log_timestamp = datetime.datetime.now(datetime.UTC)
+
+                # Store in database
+                session = Session()
+                try:
+                    log_level = LogLevel[level_str]
+                    log_entry = ClientLog(
+                        client_uuid=uuid,
+                        timestamp=log_timestamp,
+                        level=log_level,
+                        message=message,
+                        context=json.dumps(context) if context else None
+                    )
+                    session.add(log_entry)
+                    session.commit()
+                    logging.info(f"[{level_str}] {uuid}: {message}")
+                except Exception as e:
+                    logging.error(f"Error saving log from {uuid}: {e}")
+                    session.rollback()
+                finally:
+                    session.close()
+
+            except (json.JSONDecodeError, UnicodeDecodeError) as e:
+                logging.error(f"Could not parse log payload from {uuid}: {e}")
+        return
+
+    # Health-Handling
+    if topic.startswith("infoscreen/") and topic.endswith("/health"):
+        uuid = topic.split("/")[1]
+        try:
+            payload_data = json.loads(msg.payload.decode())
+
+            session = Session()
+            client_obj = session.query(Client).filter_by(uuid=uuid).first()
+            if client_obj:
+                # Update expected state
+                expected = payload_data.get('expected_state', {})
+
+                # Update actual state
+                actual = payload_data.get('actual_state', {})
+                screen_health_status = infer_screen_health_status(payload_data)
+                apply_monitoring_update(
+                    client_obj,
+                    last_seen=datetime.datetime.now(datetime.UTC),
+                    event_id=expected.get('event_id'),
+                    process_name=actual.get('process'),
+                    process_pid=actual.get('pid'),
+                    process_status=actual.get('status'),
+                    screen_health_status=screen_health_status,
+                    last_screenshot_analyzed=parse_timestamp((payload_data.get('health_metrics') or {}).get('last_frame_update')),
+                )
+                session.commit()
+                logging.debug(f"Health update from {uuid}: {actual.get('process')} ({actual.get('status')})")
+            session.close()
+
+        except (json.JSONDecodeError, UnicodeDecodeError) as e:
+            logging.error(f"Could not parse health payload from {uuid}: {e}")
+        except Exception as e:
+            logging.error(f"Error processing health from {uuid}: {e}")
+        return
+
     # Discovery-Handling
     if topic == "infoscreen/discovery":
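The log topics added above follow the pattern `infoscreen/{uuid}/logs/{level}`, and the handler derives the client UUID and log level purely from topic segments. A minimal sketch of that split (hypothetical helper name; the listener does this inline, and this version checks the fixed segments slightly more strictly than the `"/logs/" in topic` test above):

```python
def parse_log_topic(topic):
    # Returns (uuid, LEVEL) for infoscreen/{uuid}/logs/{level}, else None.
    parts = topic.split("/")
    if len(parts) >= 4 and parts[0] == "infoscreen" and parts[2] == "logs":
        return parts[1], parts[3].upper()
    return None

print(parse_log_topic("infoscreen/abc-123/logs/error"))  # ('abc-123', 'ERROR')
print(parse_log_topic("infoscreen/abc-123/heartbeat"))   # None
```

Because the level segment maps directly onto `LogLevel[level_str]`, publishing to a topic with an unknown level (e.g. `logs/debug`) would raise a `KeyError` inside the DB try-block and be logged and rolled back rather than stored.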
378
listener/test_listener_parser.py
Normal file
378
listener/test_listener_parser.py
Normal file
@@ -0,0 +1,378 @@
|
|||||||
|
"""
|
||||||
|
Mixed-format integration tests for the dashboard payload parser.
|
||||||
|
|
||||||
|
Tests cover:
|
||||||
|
- Legacy top-level payload is rejected (v2-only mode)
|
||||||
|
- v2 grouped payload: periodic capture
|
||||||
|
- v2 grouped payload: event_start capture
|
||||||
|
- v2 grouped payload: event_stop capture
|
||||||
|
- Classification into v2_success / parse_failures
|
||||||
|
- Soft required-field validation (v2 only, never drops message)
|
||||||
|
- Edge cases: missing image, missing status, non-dict payload
|
||||||
|
"""
|
||||||
|
|
||||||
|
import sys
|
||||||
|
import os
|
||||||
|
import logging
|
||||||
|
import importlib.util
|
||||||
|
|
||||||
|
# listener/ has no __init__.py — load the module directly from its file path
|
||||||
|
os.environ.setdefault("DB_CONN", "sqlite:///:memory:") # prevent DB engine errors on import
|
||||||
|
_LISTENER_PATH = os.path.join(os.path.dirname(__file__), "listener.py")
|
||||||
|
_spec = importlib.util.spec_from_file_location("listener_module", _LISTENER_PATH)
|
||||||
|
_mod = importlib.util.module_from_spec(_spec)
|
||||||
|
_spec.loader.exec_module(_mod)
|
||||||
|
|
||||||
|
_extract_dashboard_payload_fields = _mod._extract_dashboard_payload_fields
|
||||||
|
_classify_dashboard_payload = _mod._classify_dashboard_payload
|
||||||
|
_validate_v2_required_fields = _mod._validate_v2_required_fields
|
||||||
|
_normalize_screenshot_type = _mod._normalize_screenshot_type
|
||||||
|
DASHBOARD_PARSE_METRICS = _mod.DASHBOARD_PARSE_METRICS
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Fixtures
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
IMAGE_B64 = "aGVsbG8=" # base64("hello")
|
||||||
|
|
||||||
|
LEGACY_PAYLOAD = {
|
||||||
|
"client_id": "uuid-legacy",
|
||||||
|
"status": "alive",
|
||||||
|
"screenshot": {
|
||||||
|
"data": IMAGE_B64,
|
||||||
|
"timestamp": "2026-03-30T10:00:00+00:00",
|
||||||
|
},
|
||||||
|
"screenshot_type": "periodic",
|
||||||
|
"process_health": {
|
||||||
|
"current_process": "impressive",
|
||||||
|
"process_pid": 1234,
|
||||||
|
"process_status": "running",
|
||||||
|
"event_id": 42,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
def _make_v2(capture_type):
|
||||||
|
return {
|
||||||
|
"message": {
|
||||||
|
"client_id": "uuid-v2",
|
||||||
|
"status": "alive",
|
||||||
|
},
|
||||||
|
"content": {
|
||||||
|
"screenshot": {
|
||||||
|
"filename": "latest.jpg",
|
||||||
|
"data": IMAGE_B64,
|
||||||
|
"timestamp": "2026-03-30T10:15:41.123456+00:00",
|
||||||
|
"size": 6,
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"runtime": {
|
||||||
|
"system_info": {
|
||||||
|
"hostname": "pi-display-01",
|
||||||
|
"ip": "192.168.1.42",
|
||||||
|
"uptime": 12345.0,
|
||||||
|
},
|
||||||
|
"process_health": {
|
||||||
|
"event_id": "evt-7",
|
||||||
|
"event_type": "presentation",
|
||||||
|
"current_process": "impressive",
|
||||||
|
"process_pid": 4123,
|
||||||
|
"process_status": "running",
|
||||||
|
"restart_count": 0,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
"metadata": {
|
||||||
|
"schema_version": "2.0",
|
||||||
|
"producer": "simclient",
|
||||||
|
"published_at": "2026-03-30T10:15:42.004321+00:00",
|
||||||
|
"capture": {
|
||||||
|
"type": capture_type,
|
||||||
|
"captured_at": "2026-03-30T10:15:41.123456+00:00",
|
||||||
|
"age_s": 0.9,
|
||||||
|
"triggered": capture_type != "periodic",
|
||||||
|
"send_immediately": capture_type != "periodic",
|
||||||
|
},
|
||||||
|
"transport": {"qos": 0, "publisher": "simclient"},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
V2_PERIODIC = _make_v2("periodic")
|
||||||
|
V2_EVT_START = _make_v2("event_start")
|
||||||
|
V2_EVT_STOP = _make_v2("event_stop")
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def assert_eq(label, actual, expected):
|
||||||
|
assert actual == expected, f"FAIL [{label}]: expected {expected!r}, got {actual!r}"
|
||||||
|
|
||||||
|
def assert_not_none(label, actual):
|
||||||
|
assert actual is not None, f"FAIL [{label}]: expected non-None, got None"
|
||||||
|
|
||||||
|
def assert_none(label, actual):
|
||||||
|
assert actual is None, f"FAIL [{label}]: expected None, got {actual!r}"
|
||||||
|
|
||||||
|
def assert_warns(label, fn, substring):
|
||||||
|
"""Assert that fn() emits a logging.WARNING containing substring."""
|
||||||
|
records = []
|
||||||
|
handler = logging.handlers_collector(records)
|
||||||
|
logger = logging.getLogger()
|
||||||
|
logger.addHandler(handler)
|
||||||
|
try:
|
||||||
|
fn()
|
||||||
|
finally:
|
||||||
|
logger.removeHandler(handler)
|
||||||
|
warnings = [r.getMessage() for r in records if r.levelno == logging.WARNING]
|
||||||
|
assert any(substring in w for w in warnings), (
|
||||||
|
f"FAIL [{label}]: no WARNING containing {substring!r} found in {warnings}"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class _CapturingHandler(logging.Handler):
|
||||||
|
def __init__(self, records):
|
||||||
|
super().__init__()
|
||||||
|
self._records = records
|
||||||
|
|
||||||
|
def emit(self, record):
|
||||||
|
self._records.append(record)
|
||||||
|
|
||||||
|
|
||||||
|
def capture_warnings(fn):
|
||||||
|
"""Run fn(), return list of WARNING message strings."""
|
||||||
|
records = []
|
||||||
|
handler = _CapturingHandler(records)
|
||||||
|
logger = logging.getLogger()
|
||||||
|
logger.addHandler(handler)
|
||||||
|
try:
|
||||||
|
fn()
|
||||||
|
finally:
|
||||||
|
logger.removeHandler(handler)
|
||||||
|
return [r.getMessage() for r in records if r.levelno == logging.WARNING]
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Tests: _normalize_screenshot_type
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def test_normalize_known_types():
|
||||||
|
for t in ("periodic", "event_start", "event_stop"):
|
||||||
|
assert_eq(f"normalize_{t}", _normalize_screenshot_type(t), t)
|
||||||
|
assert_eq(f"normalize_{t}_upper", _normalize_screenshot_type(t.upper()), t)
|
||||||
|
|
||||||
|
def test_normalize_unknown_returns_none():
|
||||||
|
assert_none("normalize_unknown", _normalize_screenshot_type("unknown"))
|
||||||
|
assert_none("normalize_none", _normalize_screenshot_type(None))
|
||||||
|
    assert_none("normalize_empty", _normalize_screenshot_type(""))


# ---------------------------------------------------------------------------
# Tests: _classify_dashboard_payload
# ---------------------------------------------------------------------------


def test_classify_legacy():
    mode, ver = _classify_dashboard_payload(LEGACY_PAYLOAD)
    assert_eq("classify_legacy_mode", mode, "parse_failures")
    assert_none("classify_legacy_version", ver)


def test_classify_v2_periodic():
    mode, ver = _classify_dashboard_payload(V2_PERIODIC)
    assert_eq("classify_v2_periodic_mode", mode, "v2_success")
    assert_eq("classify_v2_periodic_version", ver, "2.0")


def test_classify_v2_event_start():
    mode, ver = _classify_dashboard_payload(V2_EVT_START)
    assert_eq("classify_v2_event_start_mode", mode, "v2_success")


def test_classify_v2_event_stop():
    mode, ver = _classify_dashboard_payload(V2_EVT_STOP)
    assert_eq("classify_v2_event_stop_mode", mode, "v2_success")


def test_classify_non_dict():
    mode, ver = _classify_dashboard_payload("not a dict")
    assert_eq("classify_non_dict", mode, "parse_failures")


def test_classify_empty_dict():
    mode, ver = _classify_dashboard_payload({})
    assert_eq("classify_empty_dict", mode, "parse_failures")


# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — legacy payload rejected in v2-only mode
# ---------------------------------------------------------------------------


def test_legacy_image_not_extracted():
    r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
    assert_none("legacy_image", r["image"])


def test_legacy_screenshot_type_missing():
    r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
    assert_none("legacy_screenshot_type", r["screenshot_type"])


def test_legacy_status_missing():
    r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
    assert_none("legacy_status", r["status"])


def test_legacy_process_health_empty():
    r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
    assert_eq("legacy_process_health", r["process_health"], {})


def test_legacy_timestamp_missing():
    r = _extract_dashboard_payload_fields(LEGACY_PAYLOAD)
    assert_none("legacy_timestamp", r["timestamp"])


# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — v2 periodic
# ---------------------------------------------------------------------------


def test_v2_periodic_image():
    r = _extract_dashboard_payload_fields(V2_PERIODIC)
    assert_eq("v2_periodic_image", r["image"], IMAGE_B64)


def test_v2_periodic_screenshot_type():
    r = _extract_dashboard_payload_fields(V2_PERIODIC)
    assert_eq("v2_periodic_type", r["screenshot_type"], "periodic")


def test_v2_periodic_status():
    r = _extract_dashboard_payload_fields(V2_PERIODIC)
    assert_eq("v2_periodic_status", r["status"], "alive")


def test_v2_periodic_process_health():
    r = _extract_dashboard_payload_fields(V2_PERIODIC)
    assert_eq("v2_periodic_pid", r["process_health"]["process_pid"], 4123)
    assert_eq("v2_periodic_process", r["process_health"]["current_process"], "impressive")


def test_v2_periodic_timestamp_prefers_screenshot():
    r = _extract_dashboard_payload_fields(V2_PERIODIC)
    # screenshot.timestamp must take precedence over capture.captured_at / published_at
    assert_eq("v2_periodic_ts", r["timestamp"], "2026-03-30T10:15:41.123456+00:00")


# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — v2 event_start
# ---------------------------------------------------------------------------


def test_v2_event_start_type():
    r = _extract_dashboard_payload_fields(V2_EVT_START)
    assert_eq("v2_event_start_type", r["screenshot_type"], "event_start")


def test_v2_event_start_image():
    r = _extract_dashboard_payload_fields(V2_EVT_START)
    assert_eq("v2_event_start_image", r["image"], IMAGE_B64)


# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — v2 event_stop
# ---------------------------------------------------------------------------


def test_v2_event_stop_type():
    r = _extract_dashboard_payload_fields(V2_EVT_STOP)
    assert_eq("v2_event_stop_type", r["screenshot_type"], "event_stop")


def test_v2_event_stop_image():
    r = _extract_dashboard_payload_fields(V2_EVT_STOP)
    assert_eq("v2_event_stop_image", r["image"], IMAGE_B64)


# ---------------------------------------------------------------------------
# Tests: _extract_dashboard_payload_fields — edge cases
# ---------------------------------------------------------------------------


def test_non_dict_returns_nulls():
    r = _extract_dashboard_payload_fields("bad")
    assert_none("non_dict_image", r["image"])
    assert_none("non_dict_type", r["screenshot_type"])
    assert_none("non_dict_status", r["status"])


def test_missing_image_returns_none():
    payload = {**V2_PERIODIC, "content": {"screenshot": {"timestamp": "2026-03-30T10:00:00+00:00"}}}
    r = _extract_dashboard_payload_fields(payload)
    assert_none("missing_image", r["image"])


def test_missing_process_health_returns_empty_dict():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["runtime"]["process_health"]
    r = _extract_dashboard_payload_fields(payload)
    assert_eq("missing_ph", r["process_health"], {})


def test_timestamp_fallback_to_captured_at_when_no_screenshot_ts():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["content"]["screenshot"]["timestamp"]
    r = _extract_dashboard_payload_fields(payload)
    assert_eq("ts_fallback_captured_at", r["timestamp"], "2026-03-30T10:15:41.123456+00:00")


def test_timestamp_fallback_to_published_at_when_no_capture_ts():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["content"]["screenshot"]["timestamp"]
    del payload["metadata"]["capture"]["captured_at"]
    r = _extract_dashboard_payload_fields(payload)
    assert_eq("ts_fallback_published_at", r["timestamp"], "2026-03-30T10:15:42.004321+00:00")


# ---------------------------------------------------------------------------
# Tests: _validate_v2_required_fields (soft — never raises)
# ---------------------------------------------------------------------------


def test_v2_valid_payload_no_warnings():
    warnings = capture_warnings(lambda: _validate_v2_required_fields(V2_PERIODIC, "uuid-v2"))
    assert warnings == [], f"FAIL: unexpected warnings for valid payload: {warnings}"


def test_v2_missing_client_id_warns():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["message"]["client_id"]
    warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
    assert any("message.client_id" in w for w in warnings), f"FAIL: {warnings}"


def test_v2_missing_status_warns():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["message"]["status"]
    warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
    assert any("message.status" in w for w in warnings), f"FAIL: {warnings}"


def test_v2_missing_schema_version_warns():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["metadata"]["schema_version"]
    warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
    assert any("metadata.schema_version" in w for w in warnings), f"FAIL: {warnings}"


def test_v2_missing_capture_type_warns():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["metadata"]["capture"]["type"]
    warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
    assert any("metadata.capture.type" in w for w in warnings), f"FAIL: {warnings}"


def test_v2_multiple_missing_fields_all_reported():
    import copy
    payload = copy.deepcopy(V2_PERIODIC)
    del payload["message"]["client_id"]
    del payload["metadata"]["capture"]["type"]
    warnings = capture_warnings(lambda: _validate_v2_required_fields(payload, "uuid-v2"))
    assert len(warnings) == 1, f"FAIL: expected 1 combined warning, got {warnings}"
    assert "message.client_id" in warnings[0], f"FAIL: {warnings}"
    assert "metadata.capture.type" in warnings[0], f"FAIL: {warnings}"


# ---------------------------------------------------------------------------
# Runner
# ---------------------------------------------------------------------------


def run_all():
    tests = {k: v for k, v in globals().items() if k.startswith("test_") and callable(v)}
    passed = failed = 0
    for name, fn in sorted(tests.items()):
        try:
            fn()
            print(f" PASS {name}")
            passed += 1
        except AssertionError as e:
            print(f" FAIL {name}: {e}")
            failed += 1
        except Exception as e:
            print(f" ERROR {name}: {type(e).__name__}: {e}")
            failed += 1
    print(f"\n{passed} passed, {failed} failed out of {passed + failed} tests")
    return failed == 0


if __name__ == "__main__":
    ok = run_all()
    sys.exit(0 if ok else 1)
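The tests above rely on three helpers (`assert_eq`, `assert_none`, `capture_warnings`) whose definitions fall outside this hunk. A minimal sketch of compatible implementations, purely as an assumption about their shape (the real bodies may differ):

```python
import logging


def assert_eq(name, actual, expected):
    # Named equality assertion matching the tests' calling convention.
    assert actual == expected, f"{name}: expected {expected!r}, got {actual!r}"


def assert_none(name, actual):
    # Named is-None assertion.
    assert actual is None, f"{name}: expected None, got {actual!r}"


def capture_warnings(fn):
    """Run fn and return the list of WARNING-level log messages it emitted."""
    records = []

    class _Collector(logging.Handler):
        def emit(self, record):
            if record.levelno >= logging.WARNING:
                records.append(record.getMessage())

    handler = _Collector()
    root = logging.getLogger()
    root.addHandler(handler)
    old_level = root.level
    root.setLevel(logging.WARNING)
    try:
        fn()
    finally:
        root.removeHandler(handler)
        root.setLevel(old_level)
    return records
```

The real `capture_warnings` may hook the application's own logger rather than the root logger; the collect-and-restore handler pattern is the same either way.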
@@ -21,6 +21,27 @@ class AcademicPeriodType(enum.Enum):
    trimester = "trimester"


class LogLevel(enum.Enum):
    ERROR = "ERROR"
    WARN = "WARN"
    INFO = "INFO"
    DEBUG = "DEBUG"


class ProcessStatus(enum.Enum):
    running = "running"
    crashed = "crashed"
    starting = "starting"
    stopped = "stopped"


class ScreenHealthStatus(enum.Enum):
    OK = "OK"
    BLACK = "BLACK"
    FROZEN = "FROZEN"
    UNKNOWN = "UNKNOWN"


class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True, autoincrement=True)
@@ -106,6 +127,31 @@ class Client(Base):
    is_active = Column(Boolean, default=True, nullable=False)
    group_id = Column(Integer, ForeignKey(
        'client_groups.id'), nullable=False, default=1)

    # Health monitoring fields
    current_event_id = Column(Integer, nullable=True)
    current_process = Column(String(50), nullable=True)  # 'vlc', 'chromium', 'pdf_viewer'
    process_status = Column(Enum(ProcessStatus), nullable=True)
    process_pid = Column(Integer, nullable=True)
    last_screenshot_analyzed = Column(TIMESTAMP(timezone=True), nullable=True)
    screen_health_status = Column(Enum(ScreenHealthStatus), nullable=True, server_default='UNKNOWN')
    last_screenshot_hash = Column(String(32), nullable=True)


class ClientLog(Base):
    __tablename__ = 'client_logs'
    id = Column(Integer, primary_key=True, autoincrement=True)
    client_uuid = Column(String(36), ForeignKey('clients.uuid', ondelete='CASCADE'), nullable=False, index=True)
    timestamp = Column(TIMESTAMP(timezone=True), nullable=False, index=True)
    level = Column(Enum(LogLevel), nullable=False, index=True)
    message = Column(Text, nullable=False)
    context = Column(Text, nullable=True)  # JSON stored as text
    created_at = Column(TIMESTAMP(timezone=True), server_default=func.current_timestamp(), nullable=False)

    __table_args__ = (
        Index('ix_client_logs_client_timestamp', 'client_uuid', 'timestamp'),
        Index('ix_client_logs_level_timestamp', 'level', 'timestamp'),
    )


class EventType(enum.Enum):
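Since `ClientLog.context` is JSON stored in a `Text` column, writers are expected to serialize with `json.dumps` and readers must tolerate malformed or legacy rows. A small standalone sketch of that round-trip (the helper names are illustrative, not from the codebase; the lenient read mirrors the server's `_safe_context` behavior):

```python
import json


def pack_context(ctx: dict) -> str:
    # Serialize a context dict for the Text column; compact and key-sorted
    # so identical contexts hash/compare identically.
    return json.dumps(ctx, separators=(",", ":"), sort_keys=True)


def unpack_context(raw):
    # Lenient read: empty/NULL becomes {}, unparseable text is preserved
    # under a "raw" key instead of raising.
    if not raw:
        return {}
    try:
        return json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        return {"raw": raw}
```
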
@@ -306,6 +306,7 @@ def format_event_with_media(event):
        "autoplay": getattr(event, "autoplay", True),
        "loop": getattr(event, "loop", False),
        "volume": getattr(event, "volume", 0.8),
        "muted": getattr(event, "muted", False),
        # Best-effort metadata to help clients decide how to stream
        "mime_type": mime_type,
        "size": size,
@@ -0,0 +1,84 @@
"""add client monitoring tables and columns

Revision ID: c1d2e3f4g5h6
Revises: 4f0b8a3e5c20
Create Date: 2026-03-09 21:08:38.000000

"""
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic.
revision = 'c1d2e3f4g5h6'
down_revision = '4f0b8a3e5c20'
branch_labels = None
depends_on = None


def upgrade():
    bind = op.get_bind()
    inspector = sa.inspect(bind)

    # 1. Add health monitoring columns to clients table (safe on rerun)
    existing_client_columns = {c['name'] for c in inspector.get_columns('clients')}
    if 'current_event_id' not in existing_client_columns:
        op.add_column('clients', sa.Column('current_event_id', sa.Integer(), nullable=True))
    if 'current_process' not in existing_client_columns:
        op.add_column('clients', sa.Column('current_process', sa.String(50), nullable=True))
    if 'process_status' not in existing_client_columns:
        op.add_column('clients', sa.Column('process_status', sa.Enum('running', 'crashed', 'starting', 'stopped', name='processstatus'), nullable=True))
    if 'process_pid' not in existing_client_columns:
        op.add_column('clients', sa.Column('process_pid', sa.Integer(), nullable=True))
    if 'last_screenshot_analyzed' not in existing_client_columns:
        op.add_column('clients', sa.Column('last_screenshot_analyzed', sa.TIMESTAMP(timezone=True), nullable=True))
    if 'screen_health_status' not in existing_client_columns:
        op.add_column('clients', sa.Column('screen_health_status', sa.Enum('OK', 'BLACK', 'FROZEN', 'UNKNOWN', name='screenhealthstatus'), nullable=True, server_default='UNKNOWN'))
    if 'last_screenshot_hash' not in existing_client_columns:
        op.add_column('clients', sa.Column('last_screenshot_hash', sa.String(32), nullable=True))

    # 2. Create client_logs table (safe on rerun)
    if not inspector.has_table('client_logs'):
        op.create_table('client_logs',
            sa.Column('id', sa.Integer(), autoincrement=True, nullable=False),
            sa.Column('client_uuid', sa.String(36), nullable=False),
            sa.Column('timestamp', sa.TIMESTAMP(timezone=True), nullable=False),
            sa.Column('level', sa.Enum('ERROR', 'WARN', 'INFO', 'DEBUG', name='loglevel'), nullable=False),
            sa.Column('message', sa.Text(), nullable=False),
            sa.Column('context', sa.JSON(), nullable=True),
            sa.Column('created_at', sa.TIMESTAMP(timezone=True), server_default=sa.func.current_timestamp(), nullable=False),
            sa.PrimaryKeyConstraint('id'),
            sa.ForeignKeyConstraint(['client_uuid'], ['clients.uuid'], ondelete='CASCADE'),
            mysql_charset='utf8mb4',
            mysql_collate='utf8mb4_unicode_ci',
            mysql_engine='InnoDB'
        )

    # 3. Create indexes for efficient querying (safe on rerun)
    client_log_indexes = {idx['name'] for idx in inspector.get_indexes('client_logs')} if inspector.has_table('client_logs') else set()
    client_indexes = {idx['name'] for idx in inspector.get_indexes('clients')}

    if 'ix_client_logs_client_timestamp' not in client_log_indexes:
        op.create_index('ix_client_logs_client_timestamp', 'client_logs', ['client_uuid', 'timestamp'])
    if 'ix_client_logs_level_timestamp' not in client_log_indexes:
        op.create_index('ix_client_logs_level_timestamp', 'client_logs', ['level', 'timestamp'])
    if 'ix_clients_process_status' not in client_indexes:
        op.create_index('ix_clients_process_status', 'clients', ['process_status'])


def downgrade():
    # Drop indexes
    op.drop_index('ix_clients_process_status', table_name='clients')
    op.drop_index('ix_client_logs_level_timestamp', table_name='client_logs')
    op.drop_index('ix_client_logs_client_timestamp', table_name='client_logs')

    # Drop table
    op.drop_table('client_logs')

    # Drop columns from clients
    op.drop_column('clients', 'last_screenshot_hash')
    op.drop_column('clients', 'screen_health_status')
    op.drop_column('clients', 'last_screenshot_analyzed')
    op.drop_column('clients', 'process_pid')
    op.drop_column('clients', 'process_status')
    op.drop_column('clients', 'current_process')
    op.drop_column('clients', 'current_event_id')
491 server/routes/client_logs.py Normal file
@@ -0,0 +1,491 @@
from flask import Blueprint, jsonify, request
from server.database import Session
from server.permissions import admin_or_higher, superadmin_only
from models.models import ClientLog, Client, ClientGroup, LogLevel
from sqlalchemy import desc, func
from datetime import datetime, timedelta, timezone
import json
import os
import glob

from server.serializers import dict_to_camel_case

client_logs_bp = Blueprint("client_logs", __name__, url_prefix="/api/client-logs")
PRIORITY_SCREENSHOT_TTL_SECONDS = int(os.environ.get("PRIORITY_SCREENSHOT_TTL_SECONDS", "120"))


def _grace_period_seconds():
    env = os.environ.get("ENV", "production").lower()
    if env in ("development", "dev"):
        return int(os.environ.get("HEARTBEAT_GRACE_PERIOD_DEV", "180"))
    return int(os.environ.get("HEARTBEAT_GRACE_PERIOD_PROD", "170"))


def _to_utc(dt):
    if dt is None:
        return None
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def _is_client_alive(last_alive, is_active):
    if not last_alive or not is_active:
        return False
    return (datetime.now(timezone.utc) - _to_utc(last_alive)) <= timedelta(seconds=_grace_period_seconds())


def _safe_context(raw_context):
    if not raw_context:
        return {}
    try:
        return json.loads(raw_context)
    except (TypeError, json.JSONDecodeError):
        return {"raw": raw_context}


def _serialize_log_entry(log, include_client_uuid=False):
    if not log:
        return None

    entry = {
        "id": log.id,
        "timestamp": log.timestamp.isoformat() if log.timestamp else None,
        "level": log.level.value if log.level else None,
        "message": log.message,
        "context": _safe_context(log.context),
    }
    if include_client_uuid:
        entry["client_uuid"] = log.client_uuid
    return entry


def _determine_client_status(is_alive, process_status, screen_health_status, log_counts):
    if not is_alive:
        return "offline"
    if process_status == "crashed" or screen_health_status in ("BLACK", "FROZEN"):
        return "critical"
    if log_counts.get("ERROR", 0) > 0:
        return "critical"
    if process_status in ("starting", "stopped") or log_counts.get("WARN", 0) > 0:
        return "warning"
    return "healthy"


def _infer_last_screenshot_ts(client_uuid):
    screenshots_dir = os.path.join(os.path.dirname(__file__), "..", "screenshots")

    candidate_files = []
    latest_file = os.path.join(screenshots_dir, f"{client_uuid}.jpg")
    if os.path.exists(latest_file):
        candidate_files.append(latest_file)

    candidate_files.extend(glob.glob(os.path.join(screenshots_dir, f"{client_uuid}_*.jpg")))
    if not candidate_files:
        return None

    try:
        newest_path = max(candidate_files, key=os.path.getmtime)
        return datetime.fromtimestamp(os.path.getmtime(newest_path), timezone.utc)
    except Exception:
        return None


def _load_screenshot_metadata(client_uuid):
    screenshots_dir = os.path.join(os.path.dirname(__file__), "..", "screenshots")
    metadata_path = os.path.join(screenshots_dir, f"{client_uuid}_meta.json")
    if not os.path.exists(metadata_path):
        return {}

    try:
        with open(metadata_path, "r", encoding="utf-8") as metadata_file:
            data = json.load(metadata_file)
        return data if isinstance(data, dict) else {}
    except Exception:
        return {}


def _is_priority_screenshot_active(priority_received_at):
    if not priority_received_at:
        return False

    try:
        normalized = str(priority_received_at).replace("Z", "+00:00")
        parsed = datetime.fromisoformat(normalized)
        parsed_utc = _to_utc(parsed)
    except Exception:
        return False

    return (datetime.now(timezone.utc) - parsed_utc) <= timedelta(seconds=PRIORITY_SCREENSHOT_TTL_SECONDS)
@client_logs_bp.route("/test", methods=["GET"])
def test_client_logs():
    """Test endpoint to verify logging infrastructure (no auth required)"""
    session = Session()
    try:
        # Count total logs
        total_logs = session.query(func.count(ClientLog.id)).scalar()

        # Count by level
        error_count = session.query(func.count(ClientLog.id)).filter_by(level=LogLevel.ERROR).scalar()
        warn_count = session.query(func.count(ClientLog.id)).filter_by(level=LogLevel.WARN).scalar()
        info_count = session.query(func.count(ClientLog.id)).filter_by(level=LogLevel.INFO).scalar()

        # Get last 5 logs
        recent_logs = session.query(ClientLog).order_by(desc(ClientLog.timestamp)).limit(5).all()

        recent = []
        for log in recent_logs:
            recent.append({
                "client_uuid": log.client_uuid,
                "level": log.level.value if log.level else None,
                "message": log.message,
                "timestamp": log.timestamp.isoformat() if log.timestamp else None
            })

        session.close()
        return jsonify({
            "status": "ok",
            "infrastructure": "working",
            "total_logs": total_logs,
            "counts": {
                "ERROR": error_count,
                "WARN": warn_count,
                "INFO": info_count
            },
            "recent_5": recent
        })
    except Exception as e:
        session.close()
        return jsonify({"status": "error", "message": str(e)}), 500
@client_logs_bp.route("/<uuid>/logs", methods=["GET"])
@admin_or_higher
def get_client_logs(uuid):
    """
    Get logs for a specific client
    Query params:
    - level: ERROR, WARN, INFO, DEBUG (optional)
    - limit: number of entries (default 50, max 500)
    - since: ISO timestamp (optional)

    Example: /api/client-logs/abc-123/logs?level=ERROR&limit=100
    """
    session = Session()
    try:
        # Verify client exists
        client = session.query(Client).filter_by(uuid=uuid).first()
        if not client:
            session.close()
            return jsonify({"error": "Client not found"}), 404

        # Parse query parameters
        level_param = request.args.get('level')
        limit = min(int(request.args.get('limit', 50)), 500)
        since_param = request.args.get('since')

        # Build query
        query = session.query(ClientLog).filter_by(client_uuid=uuid)

        # Filter by log level
        if level_param:
            try:
                level_enum = LogLevel[level_param.upper()]
                query = query.filter_by(level=level_enum)
            except KeyError:
                session.close()
                return jsonify({"error": f"Invalid level: {level_param}. Must be ERROR, WARN, INFO, or DEBUG"}), 400

        # Filter by timestamp
        if since_param:
            try:
                # Handle both with and without 'Z' suffix
                since_str = since_param.replace('Z', '+00:00')
                since_dt = datetime.fromisoformat(since_str)
                if since_dt.tzinfo is None:
                    since_dt = since_dt.replace(tzinfo=timezone.utc)
                query = query.filter(ClientLog.timestamp >= since_dt)
            except ValueError:
                session.close()
                return jsonify({"error": "Invalid timestamp format. Use ISO 8601"}), 400

        # Execute query
        logs = query.order_by(desc(ClientLog.timestamp)).limit(limit).all()

        # Format results
        result = []
        for log in logs:
            result.append(_serialize_log_entry(log))

        session.close()
        return jsonify({
            "client_uuid": uuid,
            "logs": result,
            "count": len(result),
            "limit": limit
        })

    except Exception as e:
        session.close()
        return jsonify({"error": f"Server error: {str(e)}"}), 500
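A caller-side sketch of building the query URL for the logs endpoint above (the base URL and helper name are assumptions; the path and parameters come from the route and its docstring):

```python
from urllib.parse import urlencode


def build_logs_url(base, uuid, level=None, limit=50, since=None):
    # Compose /api/client-logs/<uuid>/logs with the documented query params:
    # level (ERROR/WARN/INFO/DEBUG), limit (capped server-side at 500),
    # since (ISO 8601 timestamp).
    params = {"limit": limit}
    if level:
        params["level"] = level
    if since:
        params["since"] = since
    return f"{base}/api/client-logs/{uuid}/logs?{urlencode(params)}"
```

For example, `build_logs_url("http://localhost:5000", "abc-123", level="ERROR", limit=100)` reproduces the docstring's example request against a hypothetical local server.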
@client_logs_bp.route("/summary", methods=["GET"])
@admin_or_higher
def get_logs_summary():
    """
    Get summary of errors/warnings across all clients in last 24 hours
    Returns count of ERROR, WARN, INFO logs per client

    Example response:
    {
        "summary": {
            "client-uuid-1": {"ERROR": 5, "WARN": 12, "INFO": 45},
            "client-uuid-2": {"ERROR": 0, "WARN": 3, "INFO": 20}
        },
        "period_hours": 24,
        "timestamp": "2026-03-09T21:00:00Z"
    }
    """
    session = Session()
    try:
        # Get hours parameter (default 24, max 168 = 1 week)
        hours = min(int(request.args.get('hours', 24)), 168)
        since = datetime.now(timezone.utc) - timedelta(hours=hours)

        # Query log counts grouped by client and level
        stats = session.query(
            ClientLog.client_uuid,
            ClientLog.level,
            func.count(ClientLog.id).label('count')
        ).filter(
            ClientLog.timestamp >= since
        ).group_by(
            ClientLog.client_uuid,
            ClientLog.level
        ).all()

        # Build summary dictionary
        summary = {}
        for stat in stats:
            uuid = stat.client_uuid
            if uuid not in summary:
                # Initialize all levels to 0
                summary[uuid] = {
                    "ERROR": 0,
                    "WARN": 0,
                    "INFO": 0,
                    "DEBUG": 0
                }

            summary[uuid][stat.level.value] = stat.count

        # Get client info for enrichment
        clients = session.query(Client.uuid, Client.hostname, Client.description).all()
        client_info = {c.uuid: {"hostname": c.hostname, "description": c.description} for c in clients}

        # Enrich summary with client info
        enriched_summary = {}
        for uuid, counts in summary.items():
            enriched_summary[uuid] = {
                "counts": counts,
                "info": client_info.get(uuid, {})
            }

        session.close()
        return jsonify({
            "summary": enriched_summary,
            "period_hours": hours,
            "since": since.isoformat(),
            "timestamp": datetime.now(timezone.utc).isoformat()
        })

    except Exception as e:
        session.close()
        return jsonify({"error": f"Server error: {str(e)}"}), 500
@client_logs_bp.route("/monitoring-overview", methods=["GET"])
@superadmin_only
def get_monitoring_overview():
    """Return a dashboard-friendly monitoring overview for all clients."""
    session = Session()
    try:
        hours = min(int(request.args.get("hours", 24)), 168)
        since = datetime.now(timezone.utc) - timedelta(hours=hours)

        clients = (
            session.query(Client, ClientGroup.name.label("group_name"))
            .outerjoin(ClientGroup, Client.group_id == ClientGroup.id)
            .order_by(ClientGroup.name.asc(), Client.description.asc(), Client.hostname.asc(), Client.uuid.asc())
            .all()
        )

        log_stats = (
            session.query(
                ClientLog.client_uuid,
                ClientLog.level,
                func.count(ClientLog.id).label("count"),
            )
            .filter(ClientLog.timestamp >= since)
            .group_by(ClientLog.client_uuid, ClientLog.level)
            .all()
        )

        counts_by_client = {}
        for stat in log_stats:
            if stat.client_uuid not in counts_by_client:
                counts_by_client[stat.client_uuid] = {
                    "ERROR": 0,
                    "WARN": 0,
                    "INFO": 0,
                    "DEBUG": 0,
                }
            counts_by_client[stat.client_uuid][stat.level.value] = stat.count

        clients_payload = []
        summary_counts = {
            "total_clients": 0,
            "online_clients": 0,
            "offline_clients": 0,
            "healthy_clients": 0,
            "warning_clients": 0,
            "critical_clients": 0,
            "error_logs": 0,
            "warn_logs": 0,
            "active_priority_screenshots": 0,
        }

        for client, group_name in clients:
            log_counts = counts_by_client.get(
                client.uuid,
                {"ERROR": 0, "WARN": 0, "INFO": 0, "DEBUG": 0},
            )
            is_alive = _is_client_alive(client.last_alive, client.is_active)
            process_status = client.process_status.value if client.process_status else None
            screen_health_status = client.screen_health_status.value if client.screen_health_status else None
            status = _determine_client_status(is_alive, process_status, screen_health_status, log_counts)

            latest_log = (
                session.query(ClientLog)
                .filter_by(client_uuid=client.uuid)
                .order_by(desc(ClientLog.timestamp))
                .first()
            )
            latest_error = (
                session.query(ClientLog)
                .filter_by(client_uuid=client.uuid, level=LogLevel.ERROR)
                .order_by(desc(ClientLog.timestamp))
                .first()
            )

            screenshot_ts = client.last_screenshot_analyzed or _infer_last_screenshot_ts(client.uuid)
            screenshot_meta = _load_screenshot_metadata(client.uuid)
            latest_screenshot_type = screenshot_meta.get("latest_screenshot_type") or "periodic"
            priority_screenshot_type = screenshot_meta.get("last_priority_screenshot_type")
            priority_screenshot_received_at = screenshot_meta.get("last_priority_received_at")
            has_active_priority = _is_priority_screenshot_active(priority_screenshot_received_at)
            screenshot_url = f"/screenshots/{client.uuid}/priority" if has_active_priority else f"/screenshots/{client.uuid}"

            clients_payload.append({
                "uuid": client.uuid,
                "hostname": client.hostname,
                "description": client.description,
                "ip": client.ip,
                "model": client.model,
                "group_id": client.group_id,
                "group_name": group_name,
                "registration_time": client.registration_time.isoformat() if client.registration_time else None,
                "last_alive": client.last_alive.isoformat() if client.last_alive else None,
                "is_alive": is_alive,
                "status": status,
                "current_event_id": client.current_event_id,
                "current_process": client.current_process,
                "process_status": process_status,
                "process_pid": client.process_pid,
                "screen_health_status": screen_health_status,
                "last_screenshot_analyzed": screenshot_ts.isoformat() if screenshot_ts else None,
                "last_screenshot_hash": client.last_screenshot_hash,
                "latest_screenshot_type": latest_screenshot_type,
                "priority_screenshot_type": priority_screenshot_type,
                "priority_screenshot_received_at": priority_screenshot_received_at,
                "has_active_priority_screenshot": has_active_priority,
                "screenshot_url": screenshot_url,
                "log_counts_24h": {
                    "error": log_counts["ERROR"],
                    "warn": log_counts["WARN"],
                    "info": log_counts["INFO"],
                    "debug": log_counts["DEBUG"],
                },
                "latest_log": _serialize_log_entry(latest_log),
                "latest_error": _serialize_log_entry(latest_error),
|
||||||
|
})
|
||||||
|
|
||||||
|
summary_counts["total_clients"] += 1
|
||||||
|
summary_counts["error_logs"] += log_counts["ERROR"]
|
||||||
|
summary_counts["warn_logs"] += log_counts["WARN"]
|
||||||
|
if has_active_priority:
|
||||||
|
summary_counts["active_priority_screenshots"] += 1
|
||||||
|
if is_alive:
|
||||||
|
summary_counts["online_clients"] += 1
|
||||||
|
else:
|
||||||
|
summary_counts["offline_clients"] += 1
|
||||||
|
if status == "healthy":
|
||||||
|
summary_counts["healthy_clients"] += 1
|
||||||
|
elif status == "warning":
|
||||||
|
summary_counts["warning_clients"] += 1
|
||||||
|
elif status == "critical":
|
||||||
|
summary_counts["critical_clients"] += 1
|
||||||
|
|
||||||
|
payload = {
|
||||||
|
"summary": summary_counts,
|
||||||
|
"period_hours": hours,
|
||||||
|
"grace_period_seconds": _grace_period_seconds(),
|
||||||
|
"since": since.isoformat(),
|
||||||
|
"timestamp": datetime.now(timezone.utc).isoformat(),
|
||||||
|
"clients": clients_payload,
|
||||||
|
}
|
||||||
|
session.close()
|
||||||
|
return jsonify(dict_to_camel_case(payload))
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
session.close()
|
||||||
|
return jsonify({"error": f"Server error: {str(e)}"}), 500
|
||||||
|
|
||||||
|
|
||||||
|
@client_logs_bp.route("/recent-errors", methods=["GET"])
|
||||||
|
@admin_or_higher
|
||||||
|
def get_recent_errors():
|
||||||
|
"""
|
||||||
|
Get recent ERROR logs across all clients
|
||||||
|
Query params:
|
||||||
|
- limit: number of entries (default 20, max 100)
|
||||||
|
|
||||||
|
Useful for system-wide error monitoring
|
||||||
|
"""
|
||||||
|
session = Session()
|
||||||
|
try:
|
||||||
|
limit = min(int(request.args.get('limit', 20)), 100)
|
||||||
|
|
||||||
|
# Get recent errors from all clients
|
||||||
|
logs = session.query(ClientLog).filter_by(
|
||||||
|
level=LogLevel.ERROR
|
||||||
|
).order_by(
|
||||||
|
desc(ClientLog.timestamp)
|
||||||
|
).limit(limit).all()
|
||||||
|
|
||||||
|
result = []
|
||||||
|
for log in logs:
|
||||||
|
result.append(_serialize_log_entry(log, include_client_uuid=True))
|
||||||
|
|
||||||
|
session.close()
|
||||||
|
return jsonify({
|
||||||
|
"errors": result,
|
||||||
|
"count": len(result)
|
||||||
|
})
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
session.close()
|
||||||
|
return jsonify({"error": f"Server error: {str(e)}"}), 500
|
||||||
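The `counts_by_client` loop at the top of this hunk collapses grouped log rows into fixed-shape dicts so the dashboard never has to guess which levels exist. A standalone sketch of the same rollup — plain `(client_uuid, level, count)` tuples stand in for the ORM result rows, which is an assumption:

```python
def rollup_log_counts(log_stats):
    """Collapse (client_uuid, level, count) rows, e.g. from a GROUP BY query,
    into one dict per client with all four log levels always present."""
    counts = {}
    for client_uuid, level, count in log_stats:
        per_client = counts.setdefault(
            client_uuid, {"ERROR": 0, "WARN": 0, "INFO": 0, "DEBUG": 0}
        )
        per_client[level] = count
    return counts
```

Because every client dict starts from the same zeroed template, a client that only ever logged errors still reports `"INFO": 0` rather than a missing key.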
```diff
@@ -4,10 +4,58 @@ from flask import Blueprint, request, jsonify
 from server.permissions import admin_or_higher
 from server.mqtt_helper import publish_client_group, delete_client_group_message, publish_multiple_client_groups
 import sys
+import os
+import glob
+import base64
+import hashlib
+import json
+from datetime import datetime, timezone

 sys.path.append('/workspace')

 clients_bp = Blueprint("clients", __name__, url_prefix="/api/clients")
+
+VALID_SCREENSHOT_TYPES = {"periodic", "event_start", "event_stop"}
+
+
+def _normalize_screenshot_type(raw_type):
+    if raw_type is None:
+        return "periodic"
+    normalized = str(raw_type).strip().lower()
+    if normalized in VALID_SCREENSHOT_TYPES:
+        return normalized
+    return "periodic"
+
+
+def _parse_screenshot_timestamp(raw_timestamp):
+    if raw_timestamp is None:
+        return None
+    try:
+        if isinstance(raw_timestamp, (int, float)):
+            ts_value = float(raw_timestamp)
+            if ts_value > 1e12:
+                ts_value = ts_value / 1000.0
+            return datetime.fromtimestamp(ts_value, timezone.utc)
+
+        if isinstance(raw_timestamp, str):
+            ts = raw_timestamp.strip()
+            if not ts:
+                return None
+            if ts.isdigit():
+                ts_value = float(ts)
+                if ts_value > 1e12:
+                    ts_value = ts_value / 1000.0
+                return datetime.fromtimestamp(ts_value, timezone.utc)
+
+            ts_normalized = ts.replace("Z", "+00:00") if ts.endswith("Z") else ts
+            parsed = datetime.fromisoformat(ts_normalized)
+            if parsed.tzinfo is None:
+                return parsed.replace(tzinfo=timezone.utc)
+            return parsed.astimezone(timezone.utc)
+    except Exception:
+        return None
+
+    return None
+
+
 @clients_bp.route("/sync-all-groups", methods=["POST"])
 @admin_or_higher
```
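`_parse_screenshot_timestamp` accepts three input shapes and always returns an aware UTC datetime (or `None`): epoch seconds, epoch milliseconds (anything above `1e12` is treated as milliseconds), and ISO-8601 strings with `Z` or a numeric offset. A condensed, standalone restatement of the same rules — same behavior, branches merged for brevity:

```python
from datetime import datetime, timezone

def parse_screenshot_timestamp(raw):
    """Epoch seconds, epoch milliseconds (> 1e12), or ISO-8601 -> aware UTC datetime."""
    if raw is None:
        return None
    try:
        # Numeric input, or a string that is all digits, is an epoch value.
        if isinstance(raw, (int, float)) or (isinstance(raw, str) and raw.strip().isdigit()):
            ts_value = float(raw)
            if ts_value > 1e12:          # heuristic: values this large are milliseconds
                ts_value /= 1000.0
            return datetime.fromtimestamp(ts_value, timezone.utc)
        if isinstance(raw, str):
            ts = raw.strip()
            if not ts:
                return None
            parsed = datetime.fromisoformat(ts.replace("Z", "+00:00") if ts.endswith("Z") else ts)
            # Naive timestamps are assumed to already be UTC.
            return parsed.replace(tzinfo=timezone.utc) if parsed.tzinfo is None else parsed.astimezone(timezone.utc)
    except Exception:
        return None
    return None
```

The `Z` replacement matters on Python < 3.11, where `datetime.fromisoformat` does not accept a trailing `Z`.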
```diff
@@ -281,24 +329,24 @@ def upload_screenshot(uuid):
     Screenshots are stored as {uuid}.jpg in the screenshots folder.
     Keeps last 20 screenshots per client (auto-cleanup).
     """
-    import os
-    import base64
-    import glob
-    from datetime import datetime
-
     session = Session()
     client = session.query(Client).filter_by(uuid=uuid).first()
    if not client:
         session.close()
         return jsonify({"error": "Client nicht gefunden"}), 404
-    session.close()

     try:
+        screenshot_timestamp = None
+        screenshot_type = "periodic"
+
         # Handle JSON payload with base64-encoded image
         if request.is_json:
             data = request.get_json()
             if "image" not in data:
                 return jsonify({"error": "Missing 'image' field in JSON payload"}), 400
+
+            screenshot_timestamp = _parse_screenshot_timestamp(data.get("timestamp"))
+            screenshot_type = _normalize_screenshot_type(data.get("screenshot_type") or data.get("screenshotType"))
+
             # Decode base64 image
             image_data = base64.b64decode(data["image"])
```
```diff
@@ -314,8 +362,9 @@ def upload_screenshot(uuid):
         os.makedirs(screenshots_dir, exist_ok=True)

         # Store screenshot with timestamp to track latest
-        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        filename = f"{uuid}_{timestamp}.jpg"
+        now_utc = screenshot_timestamp or datetime.now(timezone.utc)
+        timestamp = now_utc.strftime("%Y%m%d_%H%M%S_%f")
+        filename = f"{uuid}_{timestamp}_{screenshot_type}.jpg"
         filepath = os.path.join(screenshots_dir, filename)

         with open(filepath, "wb") as f:
```
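After this change a timestamped file is named `{uuid}_{YYYYmmdd_HHMMSS_ffffff}_{type}.jpg`: the zero-padded UTC timestamp keeps lexicographic order equal to chronological order, and the type suffix lets the monitoring UI tell periodic from event screenshots without opening the file. A hypothetical helper (not part of the diff) that splits such a name back apart:

```python
from datetime import datetime

def parse_screenshot_filename(filename, uuid):
    """Split "<uuid>_<YYYYmmdd>_<HHMMSS>_<ffffff>_<type>.jpg" into (timestamp, type).
    Illustrative only; assumes the uuid prefix is known and contains no ".jpg"."""
    stem = filename[: -len(".jpg")]
    rest = stem[len(uuid) + 1 :]          # drop "<uuid>_"
    # maxsplit=3 keeps underscores inside the type ("event_start") intact.
    date_s, time_s, micro_s, shot_type = rest.split("_", 3)
    ts = datetime.strptime(f"{date_s}_{time_s}_{micro_s}", "%Y%m%d_%H%M%S_%f")
    return ts, shot_type
```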
```diff
@@ -326,9 +375,42 @@ def upload_screenshot(uuid):
         with open(latest_filepath, "wb") as f:
             f.write(image_data)

+        # Keep a dedicated copy for high-priority event screenshots.
+        if screenshot_type in ("event_start", "event_stop"):
+            priority_filepath = os.path.join(screenshots_dir, f"{uuid}_priority.jpg")
+            with open(priority_filepath, "wb") as f:
+                f.write(image_data)
+
+        metadata_path = os.path.join(screenshots_dir, f"{uuid}_meta.json")
+        metadata = {}
+        if os.path.exists(metadata_path):
+            try:
+                with open(metadata_path, "r", encoding="utf-8") as meta_file:
+                    metadata = json.load(meta_file)
+            except Exception:
+                metadata = {}
+
+        metadata.update({
+            "latest_screenshot_type": screenshot_type,
+            "latest_received_at": now_utc.isoformat(),
+        })
+        if screenshot_type in ("event_start", "event_stop"):
+            metadata["last_priority_screenshot_type"] = screenshot_type
+            metadata["last_priority_received_at"] = now_utc.isoformat()
+
+        with open(metadata_path, "w", encoding="utf-8") as meta_file:
+            json.dump(metadata, meta_file)
+
+        # Update screenshot receive timestamp for monitoring dashboard
+        client.last_screenshot_analyzed = now_utc
+        client.last_screenshot_hash = hashlib.md5(image_data).hexdigest()
+        session.commit()
+
         # Cleanup: keep only last 20 timestamped screenshots per client
         pattern = os.path.join(screenshots_dir, f"{uuid}_*.jpg")
-        existing_screenshots = sorted(glob.glob(pattern))
+        existing_screenshots = sorted(
+            [path for path in glob.glob(pattern) if not path.endswith("_priority.jpg")]
+        )

         # Keep last 20, delete older ones
         max_screenshots = 20
```
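The retention step that follows this hunk works because the timestamped filenames sort lexicographically in chronological order, and the `_priority.jpg` copy is filtered out so an event screenshot is never aged out by the periodic heartbeat. A minimal sketch of that cleanup, assuming the same naming scheme:

```python
import glob
import os

def cleanup_screenshots(screenshots_dir, uuid, max_screenshots=20):
    """Delete all but the newest max_screenshots timestamped files for one client.
    The {uuid}_priority.jpg copy is exempt from cleanup."""
    pattern = os.path.join(screenshots_dir, f"{uuid}_*.jpg")
    timestamped = sorted(
        path for path in glob.glob(pattern) if not path.endswith("_priority.jpg")
    )
    # A slice of [:-20] is empty when 20 or fewer files exist, so this is a no-op then.
    for stale in timestamped[:-max_screenshots]:
        os.remove(stale)
```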
```diff
@@ -345,11 +427,15 @@ def upload_screenshot(uuid):
             "success": True,
             "message": f"Screenshot received for client {uuid}",
             "filename": filename,
-            "size": len(image_data)
+            "size": len(image_data),
+            "screenshot_type": screenshot_type,
         }), 200

     except Exception as e:
+        session.rollback()
         return jsonify({"error": f"Failed to process screenshot: {str(e)}"}), 500
+    finally:
+        session.close()


 @clients_bp.route("/<uuid>", methods=["DELETE"])
```
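The upload route now accepts a JSON body with a base64 `image` plus optional `timestamp` (epoch seconds, epoch milliseconds, or ISO-8601) and `screenshot_type`/`screenshotType` fields. A hypothetical client-side payload builder — the Raspberry Pi client code is not part of this diff, so field usage beyond these names is an assumption:

```python
import base64
import time

def build_screenshot_payload(image_bytes, screenshot_type="periodic"):
    """Assemble the JSON body the upload route expects."""
    return {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "timestamp": int(time.time() * 1000),  # epoch millis; ISO-8601 is also accepted
        "screenshot_type": screenshot_type,    # periodic | event_start | event_stop
    }
```

Unknown type values are not rejected by the server; `_normalize_screenshot_type` silently falls back to `periodic`.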
```diff
@@ -104,6 +104,9 @@ def get_events():
             "end_time": e.end.isoformat() if e.end else None,
             "is_all_day": False,
             "media_id": e.event_media_id,
+            "slideshow_interval": e.slideshow_interval,
+            "page_progress": e.page_progress,
+            "auto_progress": e.auto_progress,
             "type": e.event_type.value if e.event_type else None,
             "icon": get_icon_for_type(e.event_type.value if e.event_type else None),
             # Recurrence metadata
@@ -267,6 +270,8 @@ def detach_event_occurrence(event_id, occurrence_date):
         'event_type': master.event_type,
         'event_media_id': master.event_media_id,
         'slideshow_interval': getattr(master, 'slideshow_interval', None),
+        'page_progress': getattr(master, 'page_progress', None),
+        'auto_progress': getattr(master, 'auto_progress', None),
         'created_by': master.created_by,
     }

@@ -318,6 +323,8 @@ def detach_event_occurrence(event_id, occurrence_date):
         event_type=master_data['event_type'],
         event_media_id=master_data['event_media_id'],
         slideshow_interval=master_data['slideshow_interval'],
+        page_progress=data.get("page_progress", master_data['page_progress']),
+        auto_progress=data.get("auto_progress", master_data['auto_progress']),
         recurrence_rule=None,
         recurrence_end=None,
         skip_holidays=False,
@@ -361,11 +368,15 @@ def create_event():
     event_type = data["event_type"]
     event_media_id = None
     slideshow_interval = None
+    page_progress = None
+    auto_progress = None

     # Presentation: take event_media_id and slideshow_interval from the payload
     if event_type == "presentation":
         event_media_id = data.get("event_media_id")
         slideshow_interval = data.get("slideshow_interval")
+        page_progress = data.get("page_progress")
+        auto_progress = data.get("auto_progress")
         if not event_media_id:
             return jsonify({"error": "event_media_id required for presentation"}), 400

@@ -443,6 +454,8 @@ def create_event():
         is_active=True,
         event_media_id=event_media_id,
         slideshow_interval=slideshow_interval,
+        page_progress=page_progress,
+        auto_progress=auto_progress,
         autoplay=autoplay,
         loop=loop,
         volume=volume,
@@ -519,6 +532,10 @@ def update_event(event_id):
     event.event_type = data.get("event_type", event.event_type)
     event.event_media_id = data.get("event_media_id", event.event_media_id)
     event.slideshow_interval = data.get("slideshow_interval", event.slideshow_interval)
+    if "page_progress" in data:
+        event.page_progress = data.get("page_progress")
+    if "auto_progress" in data:
+        event.auto_progress = data.get("auto_progress")
     # Video-specific fields
     if "autoplay" in data:
         event.autoplay = data.get("autoplay")
```
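Note the deliberate asymmetry in the event hunks: `update_event` uses a key-presence check (`if "page_progress" in data:`) rather than `data.get(...)` with a default, so a client can send an explicit `null` to clear a field while simply omitting it leaves the stored value untouched. A sketch of that partial-update pattern with a plain dict standing in for the ORM object (an assumption for illustration):

```python
def apply_progress_update(event, data):
    """Only overwrite when the key is present: omitted field -> keep old value,
    explicit null -> clear the value."""
    if "page_progress" in data:
        event["page_progress"] = data.get("page_progress")
    if "auto_progress" in data:
        event["auto_progress"] = data.get("auto_progress")
    return event
```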
```diff
@@ -8,6 +8,7 @@ from server.routes.holidays import holidays_bp
 from server.routes.academic_periods import academic_periods_bp
 from server.routes.groups import groups_bp
 from server.routes.clients import clients_bp
+from server.routes.client_logs import client_logs_bp
 from server.routes.auth import auth_bp
 from server.routes.users import users_bp
 from server.routes.system_settings import system_settings_bp
@@ -46,6 +47,7 @@ else:
 app.register_blueprint(auth_bp)
 app.register_blueprint(users_bp)
 app.register_blueprint(clients_bp)
+app.register_blueprint(client_logs_bp)
 app.register_blueprint(groups_bp)
 app.register_blueprint(events_bp)
 app.register_blueprint(event_exceptions_bp)
@@ -66,13 +68,31 @@ def index():
     return "Hello from Infoscreen-API!"


+@app.route("/screenshots/<uuid>/priority")
+def get_priority_screenshot(uuid):
+    normalized_uuid = uuid[:-4] if uuid.lower().endswith('.jpg') else uuid
+    priority_filename = f"{normalized_uuid}_priority.jpg"
+    priority_path = os.path.join("screenshots", priority_filename)
+    if os.path.exists(priority_path):
+        return send_from_directory("screenshots", priority_filename)
+    return get_screenshot(uuid)
+
+
 @app.route("/screenshots/<uuid>")
+@app.route("/screenshots/<uuid>.jpg")
 def get_screenshot(uuid):
-    pattern = os.path.join("screenshots", f"{uuid}*.jpg")
+    normalized_uuid = uuid[:-4] if uuid.lower().endswith('.jpg') else uuid
+    latest_filename = f"{normalized_uuid}.jpg"
+    latest_path = os.path.join("screenshots", latest_filename)
+    if os.path.exists(latest_path):
+        return send_from_directory("screenshots", latest_filename)
+
+    pattern = os.path.join("screenshots", f"{normalized_uuid}_*.jpg")
     files = glob.glob(pattern)
     if not files:
         # Dummy image as a redirect or direct response
         return jsonify({"error": "Screenshot not found", "dummy": "https://placehold.co/400x300?text=No+Screenshot"}), 404
+    files.sort(reverse=True)
     filename = os.path.basename(files[0])
     return send_from_directory("screenshots", filename)
```
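The two routes above define a fallback chain: the priority route serves `{uuid}_priority.jpg` when it exists and otherwise delegates to `get_screenshot`, which prefers the `{uuid}.jpg` latest copy and only then falls back to the newest timestamped file. A standalone sketch of that resolution order (filesystem paths instead of Flask responses):

```python
import glob
import os

def resolve_screenshot_path(screenshots_dir, uuid, prefer_priority=False):
    """Mirror the routes' lookup order: optional priority copy, then the latest
    copy, then the newest timestamped file, else None."""
    if prefer_priority:
        priority = os.path.join(screenshots_dir, f"{uuid}_priority.jpg")
        if os.path.exists(priority):
            return priority
    latest = os.path.join(screenshots_dir, f"{uuid}.jpg")
    if os.path.exists(latest):
        return latest
    # reverse lexicographic sort == newest first for the zero-padded timestamp names
    candidates = sorted(glob.glob(os.path.join(screenshots_dir, f"{uuid}_*.jpg")), reverse=True)
    return candidates[0] if candidates else None
```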