infoscreen/MQTT_EVENT_PAYLOAD_GUIDE.md
Olaf 03e3c11e90 feat: crash recovery, service_failed monitoring, broker health fields, command expiry sweep
- Add GET /api/clients/crashed endpoint (process_status=crashed or stale heartbeat)
- Add restart_app command action with same lifecycle + lockout as reboot_host
- Scheduler: crash auto-recovery loop (CRASH_RECOVERY_ENABLED flag, lockout, MQTT publish)
- Scheduler: unconditional command expiry sweep per poll cycle (sweep_expired_commands)
- Listener: subscribe to infoscreen/+/service_failed; persist service_failed_at + unit
- Listener: extract broker_connection block from health payload; persist reconnect_count + last_disconnect_at
- DB migration b1c2d3e4f5a6: service_failed_at, service_failed_unit, mqtt_reconnect_count, mqtt_last_disconnect_at on clients
- Add GET /api/clients/service_failed and POST /api/clients/<uuid>/clear_service_failed
- Monitoring overview API: include mqtt_reconnect_count + mqtt_last_disconnect_at per client
- Frontend: orange service-failed alert panel (hidden when empty, auto-refresh, acknowledge ("quittieren") action)
- Frontend: MQTT reconnect count + last disconnect in client detail panel
- MQTT auth hardening: listener/scheduler/server use env credentials; broker enforces allow_anonymous false
- Client command lifecycle foundation: ClientCommand model, reboot_host/shutdown_host, full ACK lifecycle
- Docs: TECH-CHANGELOG, DEV-CHANGELOG, MQTT_EVENT_PAYLOAD_GUIDE, copilot-instructions updated
- Add implementation-plans/, RESTART_VALIDATION_CHECKLIST.md, TODO.md
2026-04-05 10:17:56 +00:00


MQTT Event Payload Guide

Overview

This document describes the MQTT message structure used by the Infoscreen system to deliver event information from the scheduler to display clients, together with the related power-intent, client-command, acknowledgement, and service-failed topics. It covers payload formats, best practices, and versioning strategies.

MQTT Topics

Event Distribution

  • Topic: infoscreen/events/{group_id}
  • Retained: Yes
  • Format: JSON array of event objects
  • Purpose: Delivers active events to client groups

Per-Client Configuration

  • Topic: infoscreen/{uuid}/group_id
  • Retained: Yes
  • Format: Integer (group ID)
  • Purpose: Assigns clients to groups

TV Power Intent (Phase 1)

  • Topic: infoscreen/groups/{group_id}/power/intent
  • QoS: 1
  • Retained: Yes
  • Format: JSON object
  • Purpose: Group-level desired power state for clients assigned to that group

Phase 1 is group-only. Per-client power intent topics and client state/ack topics are deferred to Phase 2.

Example payload:

{
  "schema_version": "1.0",
  "intent_id": "9cf26d9b-87a3-42f1-8446-e90bb6f6ce63",
  "group_id": 12,
  "desired_state": "on",
  "reason": "active_event",
  "issued_at": "2026-03-31T10:15:30Z",
  "expires_at": "2026-03-31T10:17:00Z",
  "poll_interval_sec": 30,
  "active_event_ids": [148],
  "event_window_start": "2026-03-31T10:15:00Z",
  "event_window_end": "2026-03-31T11:00:00Z"
}

Contract notes:

  • intent_id changes only on semantic transition (desired_state/reason changes).
  • Heartbeat republishes keep intent_id stable while refreshing issued_at and expires_at.
  • Expiry is poll-based: expires_at = issued_at + max(3 × poll_interval_sec, 90) seconds (90 seconds in the example above).
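
The expiry rule can be sketched in a few lines. This is an illustration only: the helper names intent_expiry and is_intent_expired are hypothetical, and it assumes the expiry window is measured from issued_at, which matches the example payload (10:15:30 + 90 s = 10:17:00).

```python
from datetime import datetime, timedelta, timezone


def intent_expiry(issued_at: datetime, poll_interval_sec: int) -> datetime:
    """Return issued_at + max(3 x poll_interval_sec, 90) seconds."""
    ttl_sec = max(3 * poll_interval_sec, 90)
    return issued_at + timedelta(seconds=ttl_sec)


def is_intent_expired(intent: dict, now: datetime) -> bool:
    """True once `now` (timezone-aware) is past the intent's expires_at."""
    # fromisoformat() on older Python versions does not accept a trailing "Z".
    expires_at = datetime.fromisoformat(intent["expires_at"].replace("Z", "+00:00"))
    return now > expires_at
```

With poll_interval_sec = 30 the 90-second floor and the 3 × poll term coincide; a longer poll interval extends the window accordingly.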

Service-Failed Notification (client → server, retained)

  • Topic: infoscreen/{uuid}/service_failed
  • QoS: 1
  • Retained: Yes
  • Direction: client → server
  • Purpose: Client signals that systemd has exhausted restart attempts (StartLimitBurst exceeded) — manual intervention is required.

Example payload:

{
  "event": "service_failed",
  "unit": "infoscreen-simclient.service",
  "client_uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
  "failed_at": "2026-04-05T08:00:00Z"
}

Contract notes:

  • Message is retained so the server receives it even after a broker restart.
  • Server persists service_failed_at and service_failed_unit to the clients table.
  • To clear after resolution: POST /api/clients/<uuid>/clear_service_failed — clears the DB flag and publishes an empty retained payload to delete the retained message from the broker.
  • An empty payload (zero bytes) on this topic is the retain-clear in transit; the listener ignores it.
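
A minimal sketch of how a listener might handle this topic, assuming a hypothetical helper parse_service_failed that receives the raw MQTT payload bytes. The returned keys mirror the column names mentioned above; the actual listener code may be structured differently.

```python
import json
from typing import Optional


def parse_service_failed(payload: bytes) -> Optional[dict]:
    """Return the fields to persist, or None if the message should be ignored."""
    if not payload:
        # Empty retained payload is the retain-clear, not a new failure.
        return None
    try:
        data = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(data, dict) or data.get("event") != "service_failed":
        return None
    return {
        "service_failed_at": data.get("failed_at"),
        "service_failed_unit": data.get("unit"),
    }
```

Returning None for both the empty payload and malformed input keeps the caller's logic simple: persist only when a dict comes back.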

Client Command Intent (Phase 1)

  • Topic: infoscreen/{uuid}/commands
  • QoS: 1
  • Retained: No
  • Format: JSON object
  • Purpose: Per-client control commands (currently reboot_host, shutdown_host, and restart_app)

Compatibility note:

  • During restart transition, server also publishes legacy restart command to clients/{uuid}/restart with payload { "action": "restart" }.
  • During topic naming transition, server also publishes command payload to infoscreen/{uuid}/command.

Example payload:

{
  "schema_version": "1.0",
  "command_id": "5d1f8b4b-7e85-44fb-8f38-3f5d5da5e2e4",
  "client_uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
  "action": "reboot_host",
  "issued_at": "2026-04-03T12:48:10Z",
  "expires_at": "2026-04-03T12:52:10Z",
  "requested_by": 1,
  "reason": "operator_request"
}

Contract notes:

  • Clients must reject stale commands where local UTC time is greater than expires_at.
  • Clients must deduplicate by command_id and never execute the same command twice.
  • schema_version is required for forward-compatibility.
  • Allowed command action values in v1: reboot_host, shutdown_host, restart_app.
  • restart_app = soft app restart (no OS reboot); reboot_host = full OS reboot.
  • API mapping for operators: restart endpoint emits reboot_host; shutdown endpoint emits shutdown_host.
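
The client-side checks in these contract notes could look roughly like the following. The function should_execute and the seen_ids set are illustrative names, not part of the actual client.

```python
from datetime import datetime, timezone

ALLOWED_ACTIONS = {"reboot_host", "shutdown_host", "restart_app"}


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-04-03T12:52:10Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def should_execute(command: dict, seen_ids: set, now: datetime) -> bool:
    """Apply the contract checks; records the command_id when accepted."""
    if "schema_version" not in command:
        return False  # schema_version is required
    if command.get("action") not in ALLOWED_ACTIONS:
        return False  # unknown action for v1
    if now > parse_utc(command["expires_at"]):
        return False  # stale command
    command_id = command["command_id"]
    if command_id in seen_ids:
        return False  # duplicate delivery (possible with QoS 1)
    seen_ids.add(command_id)
    return True
```

An in-memory set does not survive a reboot, so a real client would probably need to persist the seen command IDs if a command could be redelivered after reboot_host.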

Client Command Acknowledgements (Phase 1)

  • Topic: infoscreen/{uuid}/commands/ack
  • QoS: 1 (recommended)
  • Retained: No
  • Format: JSON object
  • Purpose: Client reports command lifecycle progression back to server

Compatibility note:

  • During topic naming transition, listener also accepts acknowledgements from infoscreen/{uuid}/command/ack.

Example payload:

{
  "command_id": "5d1f8b4b-7e85-44fb-8f38-3f5d5da5e2e4",
  "status": "execution_started",
  "error_code": null,
  "error_message": null
}

Allowed status values:

  • accepted
  • execution_started
  • completed
  • failed
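
As an illustration, a client could build acknowledgement messages with a small helper like this. build_ack and ack_topic are hypothetical names, and the string type for error_code is an assumption, since the example only shows null.

```python
import json
from typing import Optional

ACK_STATUSES = ("accepted", "execution_started", "completed", "failed")


def ack_topic(client_uuid: str) -> str:
    """Topic on which the client publishes acknowledgements."""
    return f"infoscreen/{client_uuid}/commands/ack"


def build_ack(command_id: str, status: str,
              error_code: Optional[str] = None,
              error_message: Optional[str] = None) -> str:
    """Serialize an acknowledgement; rejects statuses outside the allowed set."""
    if status not in ACK_STATUSES:
        raise ValueError(f"unknown ack status: {status}")
    return json.dumps({
        "command_id": command_id,
        "status": status,
        "error_code": error_code,
        "error_message": error_message,
    })
```

Validating the status before publishing catches typos on the client rather than leaving the server to handle an unknown lifecycle state.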

Message Structure

General Principles

  1. Type Safety: Always include event_type to allow clients to parse appropriately
  2. Backward Compatibility: Add new fields without removing old ones
  3. Extensibility: Use nested objects for event-type-specific data
  4. UTC Timestamps: All times in ISO 8601 format with timezone info

Base Event Structure

Every event includes these common fields:

{
  "id": 123,
  "title": "Event Title",
  "start": "2025-10-19T09:00:00+00:00",
  "end": "2025-10-19T09:30:00+00:00",
  "group_id": 1,
  "event_type": "presentation",
  "recurrence_rule": "FREQ=WEEKLY;BYDAY=MO,WE,FR",
  "recurrence_end": "2025-12-31T23:59:59+00:00"
}

event_type is one of presentation, website, webuntis, video, message, or other. recurrence_rule and recurrence_end may be null.

Event Type-Specific Payloads

Presentation Events

{
  "id": 123,
  "event_type": "presentation",
  "title": "Morning Announcements",
  "start": "2025-10-19T09:00:00+00:00",
  "end": "2025-10-19T09:30:00+00:00",
  "group_id": 1,
  "presentation": {
    "type": "slideshow",
    "files": [
      {
        "name": "slides.pdf",
        "url": "http://server:8000/api/files/converted/abc123.pdf",
        "checksum": null,
        "size": null
      }
    ],
    "slide_interval": 10000,
    "auto_advance": true,
    "page_progress": true,
    "auto_progress": true
  }
}

Fields:

  • type: Always "slideshow" for presentations
  • files: Array of file objects with download URLs
  • slide_interval: Milliseconds between slides (default: 5000)
  • auto_advance: Whether to automatically advance slides
  • page_progress: Show page number indicator
  • auto_progress: Enable automatic progression

Website Events

{
  "id": 124,
  "event_type": "website",
  "title": "School Website",
  "start": "2025-10-19T09:00:00+00:00",
  "end": "2025-10-19T09:30:00+00:00",
  "group_id": 1,
  "website": {
    "type": "browser",
    "url": "https://example.com/page"
  }
}

Fields:

  • type: Always "browser" for website display
  • url: Full URL to display in embedded browser

WebUntis Events

{
  "id": 125,
  "event_type": "webuntis",
  "title": "Schedule Display",
  "start": "2025-10-19T09:00:00+00:00",
  "end": "2025-10-19T09:30:00+00:00",
  "group_id": 1,
  "website": {
    "type": "browser",
    "url": "https://webuntis.example.com/schedule"
  }
}

Note: WebUntis events use the same payload structure as website events. The URL is fetched from system settings (supplement_table_url) rather than being specified per-event. Clients treat webuntis and website event types identically—both display a website.

Video Events

{
  "id": 126,
  "event_type": "video",
  "title": "Video Playback",
  "start": "2025-10-19T09:00:00+00:00",
  "end": "2025-10-19T09:30:00+00:00",
  "group_id": 1,
  "video": {
    "type": "media",
    "url": "http://server:8000/api/eventmedia/stream/123/video.mp4",
    "autoplay": true,
    "loop": false,
    "volume": 0.8
  }
}

Fields:

  • type: Always "media" for video playback
  • url: Video streaming URL with range request support
  • autoplay: Whether to start playing automatically (default: true)
  • loop: Whether to loop the video (default: false)
  • volume: Playback volume from 0.0 to 1.0 (default: 0.8)
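
The documented defaults can be applied on the client with a small normalizer. This is a sketch; normalize_video is a hypothetical name, and clamping out-of-range volume values is an assumption rather than documented behaviour.

```python
def normalize_video(video: dict) -> dict:
    """Fill in documented defaults and keep volume within 0.0..1.0."""
    volume = video.get("volume", 0.8)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(volume, bool) or not isinstance(volume, (int, float)):
        volume = 0.8
    return {
        "url": video.get("url"),
        "autoplay": bool(video.get("autoplay", True)),
        "loop": bool(video.get("loop", False)),
        "volume": min(1.0, max(0.0, float(volume))),
    }
```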

Message Events (Future)

{
  "id": 127,
  "event_type": "message",
  "title": "Important Announcement",
  "start": "2025-10-19T09:00:00+00:00",
  "end": "2025-10-19T09:30:00+00:00",
  "group_id": 1,
  "message": {
    "type": "html",
    "content": "<h1>Important</h1><p>Message content</p>",
    "style": "default"
  }
}

Best Practices

1. Type-Based Parsing

Clients should:

  1. Read the event_type field first
  2. Switch/dispatch based on type
  3. Parse type-specific nested objects (presentation, website, etc.)

// Example client parsing
function parseEvent(event) {
  switch (event.event_type) {
    case 'presentation':
      return handlePresentation(event.presentation);
    case 'website':
    case 'webuntis':
      // webuntis uses the same payload structure as website
      return handleWebsite(event.website);
    case 'video':
      return handleVideo(event.video);
    default:
      // Unknown or not yet supported type: skip it instead of failing
      return null;
  }
}

2. Graceful Degradation

  • Always provide fallback values for optional fields
  • Validate URLs before attempting to load
  • Handle missing or malformed data gracefully
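
URL validation before loading might look like this. It is a minimal sketch: is_loadable_url is a hypothetical helper, and restricting schemes to http and https is an assumption based on the example payloads.

```python
from urllib.parse import urlparse


def is_loadable_url(url) -> bool:
    """True for absolute http(s) URLs with a host; False for anything else."""
    if not isinstance(url, str):
        return False
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError for some malformed inputs.
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

A client would typically skip the event or show fallback content when this returns False, for example when supplement_table_url has not been configured and the URL is empty.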

3. Performance Optimization

  • Cache downloaded presentation files
  • Use checksums to avoid re-downloading unchanged content
  • Preload resources before event start time
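
Checksum-based cache reuse could be sketched as follows. The guide does not state which hash algorithm the checksum field uses, so SHA-256 (hex) is an assumption here, and the function names are hypothetical. Because checksum may be null, as in the presentation example, the sketch re-downloads when there is nothing to compare against.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_download(file_info: dict, cached_path: Path) -> bool:
    """Decide whether a file object from the payload must be fetched again."""
    if not cached_path.exists():
        return True
    checksum = file_info.get("checksum")
    if checksum is None:
        # Assumption: without a checksum the cache cannot be verified.
        return True
    return file_sha256(cached_path) != checksum.lower()
```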

4. Time Handling

  • Always parse ISO 8601 timestamps with timezone awareness
  • Compare event start/end times in UTC
  • Account for clock drift on embedded devices
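
A minimal sketch of timezone-aware comparison. parse_ts and is_active are hypothetical names, and the 5-second drift tolerance is an arbitrary illustrative value.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalize it to UTC."""
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        raise ValueError(f"timestamp lacks timezone info: {ts}")
    return dt.astimezone(timezone.utc)


def is_active(event: dict, now: datetime,
              drift_tolerance: timedelta = timedelta(seconds=5)) -> bool:
    """True if `now` falls inside the event window, widened by the tolerance."""
    start = parse_ts(event["start"]) - drift_tolerance
    end = parse_ts(event["end"]) + drift_tolerance
    return start <= now <= end
```

Rejecting naive timestamps outright avoids silently interpreting them in the device's local timezone.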

5. Error Recovery

  • Retry failed downloads with exponential backoff
  • Log errors but continue operation
  • Display fallback content if event data is invalid
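
Retrying with exponential backoff can be sketched like this. The helper names, the attempt count, and the delay values are illustrative assumptions; fetch stands for whatever download function the client uses.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(count: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Delays of base, 2*base, 4*base, ... seconds, each limited to `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(count)]


def retry_with_backoff(fetch: Callable[[], T], attempts: int = 5,
                       base: float = 1.0, cap: float = 60.0,
                       sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call fetch() up to `attempts` times; return None if every attempt fails."""
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            if attempt == attempts - 1:
                print(f"giving up after {attempts} attempts: {exc}")
                return None
            print(f"download failed ({exc}); retrying in {delays[attempt]:.0f}s")
            sleep(delays[attempt])
    return None
```

Returning None instead of raising lets the caller log the failure and fall back to other content, in line with the "log errors but continue operation" guidance.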

Message Flow

  1. Scheduler queries active events from database
  2. Scheduler formats events with type-specific payloads
  3. Scheduler publishes JSON array to infoscreen/events/{group_id} (retained)
  4. Client receives retained message on connect
  5. Client parses events and schedules display
  6. Client downloads resources (presentations, etc.)
  7. Client displays events at scheduled times
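
Steps 4 and 5 on the client side could start from a decoder like the one below. This is a sketch with hypothetical helper names; the MQTT client wiring is omitted, and treating an empty or malformed payload as "no events" is an assumption.

```python
import json


def events_topic(group_id: int) -> str:
    """Topic carrying the retained event list for a group."""
    return f"infoscreen/events/{group_id}"


def decode_events(payload: bytes) -> list:
    """Decode the retained JSON array; returns [] for empty or malformed input."""
    if not payload:
        return []
    try:
        data = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return []
    if not isinstance(data, list):
        return []
    # Keep only objects that carry the event_type discriminator.
    return [item for item in data if isinstance(item, dict) and "event_type" in item]
```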

Versioning Strategy

Adding New Event Types

  1. Add enum value to EventType in models/models.py
  2. Update scheduler's format_event_with_media() in scheduler/db_utils.py
  3. Update events API in server/routes/events.py
  4. Add icon mapping in get_icon_for_type()
  5. Document payload structure in this guide

Adding Fields to Existing Types

  • Safe: Add new optional fields to nested objects
  • Unsafe: Remove or rename existing fields
  • Migration: Provide both old and new field names during transition

Example: Adding a New Field

{
  "event_type": "presentation",
  "presentation": {
    "type": "slideshow",
    "files": [...],
    "slide_interval": 10000,
    "transition_effect": "fade"  // NEW FIELD (optional)
  }
}

Old clients ignore unknown fields; new clients use enhanced features.
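
One way an older client ends up ignoring unknown fields is by reading only the keys it knows. The sketch below uses a hypothetical read_presentation helper; the field list is taken from the presentation payload documented above.

```python
KNOWN_PRESENTATION_FIELDS = (
    "type", "files", "slide_interval",
    "auto_advance", "page_progress", "auto_progress",
)


def read_presentation(presentation: dict) -> dict:
    """Copy only the fields this client version understands."""
    return {
        key: presentation[key]
        for key in KNOWN_PRESENTATION_FIELDS
        if key in presentation
    }
```

A newer client would add transition_effect to its known fields and read it with a default, so payloads from an older server still parse.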

Common Pitfalls

  1. Hardcoding Event Types: Use event_type field, not assumptions
  2. Timezone Confusion: Always use UTC internally
  3. Missing Error Handling: Network failures, malformed URLs, etc.
  4. Resource Leaks: Clean up downloaded files periodically
  5. Not Handling Recurrence: Events may repeat; check recurrence_rule

System Settings Integration

Some event types rely on system-wide settings rather than per-event configuration:

WebUntis / Supplement Table URL

  • Setting Key: supplement_table_url
  • API Endpoint: GET/POST /api/system-settings/supplement-table
  • Usage: Automatically applied when creating webuntis events
  • Default: Empty string (must be configured by admin)
  • Description: This URL is shared for both Vertretungsplan (supplement table) and WebUntis displays

Presentation Defaults

  • presentation_interval: Default slide interval (seconds)
  • presentation_page_progress: Show page indicators by default
  • presentation_auto_progress: Auto-advance by default

These are applied when creating new events but can be overridden per-event.

Testing Recommendations

  1. Unit Tests: Validate payload serialization/deserialization
  2. Integration Tests: Full scheduler → MQTT → client flow
  3. Edge Cases: Empty event lists, missing URLs, malformed data
  4. Performance Tests: Large file downloads, many events
  5. Time Tests: Events across midnight, timezone boundaries, DST

Related Documentation

  • AUTH_SYSTEM.md - Authentication and authorization
  • DATABASE_GUIDE.md - Database schema and models
  • .github/copilot-instructions.md - System architecture overview
  • scheduler/scheduler.py - Event publishing implementation
  • scheduler/db_utils.py - Event formatting logic

Changelog

  • 2025-10-19: Initial documentation
    • Documented base event structure
    • Added presentation and website/webuntis payload formats
    • Established best practices and versioning strategy