# Server Team Action Items — Infoscreen Client

This document lists everything the server/infrastructure/frontend team must implement to complete the client integration. The client-side code is production-ready for all items listed here.

---

## 1. MQTT Broker Hardening (prerequisite for everything else)

- Disable anonymous access on the broker.
- Create one broker account **per client device**:
  - Username convention: `infoscreen-client-` (e.g. `infoscreen-client-9b8d1856`)
  - Provision the password to the device `.env` as `MQTT_PASSWORD_BROKER=`
- Create a **server/publisher account** (e.g. `infoscreen-server`) for all server-side publishes.
- Enforce ACLs:

| Topic | Publisher |
|---|---|
| `infoscreen/{uuid}/commands` | server only |
| `infoscreen/{uuid}/command` (alias) | server only |
| `infoscreen/{uuid}/group_id` | server only |
| `infoscreen/events/{group_id}` | server only |
| `infoscreen/groups/+/power/intent` | server only |
| `infoscreen/{uuid}/commands/ack` | client only |
| `infoscreen/{uuid}/command/ack` | client only |
| `infoscreen/{uuid}/heartbeat` | client only |
| `infoscreen/{uuid}/health` | client only |
| `infoscreen/{uuid}/logs/#` | client only |
| `infoscreen/{uuid}/service_failed` | client only |

---

## 2. Reboot / Shutdown Command — Ack Lifecycle

Client publishes ack status updates to two topics per command (canonical + transitional alias):

- `infoscreen/{uuid}/commands/ack`
- `infoscreen/{uuid}/command/ack`

**Ack payload schema (v1, frozen):**

```json
{
  "command_id": "07aab032-53c2-45ef-a5a3-6aa58e9d9fae",
  "status": "accepted | execution_started | completed | failed",
  "error_code": null,
  "error_message": null
}
```

**Status lifecycle:**

| Status | When | Notes |
|---|---|---|
| `accepted` | Command received and validated | Immediate |
| `execution_started` | Helper invoked | Immediate after accepted |
| `completed` | Execution confirmed | For `reboot_host`: arrives after reconnect (10–90 s after `execution_started`) |
| `failed` | Helper returned error | `error_code` and `error_message` will be set |

**Server must:**

- Track `command_id` through the full lifecycle and update status in DB/UI.
- Surface `failed` + `error_code` to the operator UI.
- Expect `reboot_host` `completed` to arrive after a reconnect delay — do not treat the gap as a timeout.
- Use `expires_at` from the original command to determine when to abandon waiting.

---

## 3. Health Dashboard — Broker Connection Fields (Gap 2)

Every `infoscreen/{uuid}/health` payload now includes a `broker_connection` block:

```json
{
  "timestamp": "2026-04-05T08:00:00.000000+00:00",
  "expected_state": { "event_id": 42 },
  "actual_state": { "process": "display_manager", "pid": 1234, "status": "running" },
  "broker_connection": {
    "broker_reachable": true,
    "reconnect_count": 2,
    "last_disconnect_at": "2026-04-04T10:30:00Z"
  }
}
```

**Server must:**

- Display `reconnect_count` and `last_disconnect_at` per device in the health dashboard.
- Implement alerting heuristic:
  - **All** clients go silent simultaneously → likely broker outage, not device crash.
  - **Single** client goes silent → device crash, network failure, or process hang.

---

## 4. Service-Failed MQTT Notification (Gap 3)

When systemd gives up restarting a service after repeated crashes (`StartLimitBurst` exceeded), the client automatically publishes a **retained** message:

**Topic:** `infoscreen/{uuid}/service_failed`

**Payload:**

```json
{
  "event": "service_failed",
  "unit": "infoscreen-simclient.service",
  "client_uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
  "failed_at": "2026-04-05T08:00:00Z"
}
```

**Server must:**

- Subscribe to `infoscreen/+/service_failed` on startup (retained — message survives broker restart).
- Alert the operator immediately when this topic receives a payload.
- **Clear the retained message** once the device is acknowledged or recovered:

  ```
  mosquitto_pub -t "infoscreen/{uuid}/service_failed" -n --retain
  ```

---

## 5. No Server Action Required

These items are fully implemented client-side and require no server changes:

- systemd watchdog (`WatchdogSec=60`) — hangs detected and process restarted automatically.
- Command deduplication — `command_id` deduplicated with 24-hour TTL.
- Ack retry backoff — client retries ack publish on broker disconnect until `expires_at`.
- Mock helper / test mode (`COMMAND_MOCK_REBOOT_IMMEDIATE_COMPLETE`) — development only.
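
---

As an illustration of the section 4 requirements, the following is a minimal sketch of how a server-side subscriber might decode messages arriving on `infoscreen/+/service_failed`. It is library-agnostic: it takes the raw topic and payload as delivered by whatever MQTT client the server uses. The names `ServiceFailedAlert` and `parse_service_failed` are hypothetical and not part of the client code. One point worth handling explicitly: clearing a retained message publishes a zero-length payload on the same topic, so a live subscriber will also see that empty message and should not raise an alert for it.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ServiceFailedAlert:
    """Hypothetical server-side record for one service_failed notification."""
    client_uuid: str
    unit: str
    failed_at: datetime


def parse_service_failed(topic: str, payload: bytes) -> Optional[ServiceFailedAlert]:
    """Return an alert for a service_failed message, or None if nothing should be raised."""
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != "infoscreen" or parts[2] != "service_failed":
        return None  # not an infoscreen/{uuid}/service_failed topic

    if not payload:
        # Zero-length payload: the retained message was cleared, not a new failure.
        return None

    data = json.loads(payload)
    if data.get("event") != "service_failed":
        return None

    # Normalise the trailing "Z" so older Python versions can parse the timestamp.
    failed_at = datetime.fromisoformat(data["failed_at"].replace("Z", "+00:00"))

    # Assumption: the {uuid} topic segment is used as the device identity.
    return ServiceFailedAlert(client_uuid=parts[1], unit=data["unit"], failed_at=failed_at)
```

In this sketch the caller would raise the operator alert whenever the function returns a value, and would publish the empty retained message only after the operator acknowledges the device or it recovers.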