infoscreen-dev/CHANGELOG.md
RobbStarkAustria 0cd0d95612 feat: remote commands, systemd units, process observability, broker auth split
- Command intake (reboot/shutdown) on infoscreen/{uuid}/commands with ack lifecycle
- MQTT_USER/MQTT_PASSWORD_BROKER split from identity vars; configure_mqtt_security() updated
- infoscreen-simclient.service: Type=notify, WatchdogSec=60, Restart=on-failure
- infoscreen-notify-failure@.service + script: retained MQTT alert when systemd gives up (Gap 3)
- _sd_notify() watchdog keepalive in simclient main loop (Gap 1)
- broker_connection block in health payload: reconnect_count, last_disconnect_at (Gap 2)
- COMMAND_MOCK_REBOOT_IMMEDIATE_COMPLETE canary flag with safety guard
- SERVER_TEAM_ACTIONS.md: server-side integration action items
- Docs: README, CHANGELOG, src/README, copilot-instructions updated
- 43 tests passing
2026-04-05 08:36:50 +02:00


# Changelog
## April 2026
### Remote Command Intake
- Added MQTT command intake on `infoscreen/{client_id}/commands` (supports `reboot` and `shutdown`).
- Added command acknowledgement publishing to `infoscreen/{client_id}/commands/ack` and `infoscreen/{client_id}/command/ack` with states `accepted`, `rejected`, `execution_started`, `completed`, `failed`.
- Added `COMMAND_HELPER_PATH` environment variable; command execution delegated to an external shell helper so `simclient.py` requires no elevated privileges.
- Added deduplication of commands by `command_id` with configurable TTL (`COMMAND_DEDUPE_TTL_HOURS`) and max-entries cap (`COMMAND_DEDUPE_MAX_ENTRIES`).
- Added execution timeout (`COMMAND_EXEC_TIMEOUT_SEC`).
- Added `COMMAND_MOCK_REBOOT_IMMEDIATE_COMPLETE` flag for canary and test environments — immediately completes a mock reboot without waiting for process restart. Safety-guarded: only activates when the helper basename is `mock-command-helper.sh`.
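
The deduplication described above can be pictured as a small in-memory cache keyed by `command_id`. The following is only a sketch: the class name `CommandDedupeCache`, its method names, and the default values are hypothetical and not taken from `simclient.py`; the two constructor arguments merely stand in for `COMMAND_DEDUPE_TTL_HOURS` and `COMMAND_DEDUPE_MAX_ENTRIES`.

```python
import time
from collections import OrderedDict


class CommandDedupeCache:
    """Remembers recently seen command_id values (hypothetical helper)."""

    def __init__(self, ttl_hours=24.0, max_entries=1000, clock=time.time):
        # Defaults are assumptions; the real values come from the environment.
        self._ttl_sec = ttl_hours * 3600.0
        self._max_entries = max_entries
        self._clock = clock
        self._seen = OrderedDict()  # command_id -> first-seen timestamp

    def _evict(self, now):
        # Entries are stored in insertion order, so the oldest sit at the front.
        while self._seen:
            oldest_id, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self._ttl_sec:
                break
            del self._seen[oldest_id]
        # Enforce the size cap by discarding the oldest survivors.
        while len(self._seen) > self._max_entries:
            self._seen.popitem(last=False)

    def is_duplicate(self, command_id):
        """Return True if command_id was seen within the TTL; otherwise record it."""
        now = self._clock()
        self._evict(now)
        if command_id in self._seen:
            return True
        self._seen[command_id] = now
        self._evict(now)
        return False
```

A command handler would presumably call `is_duplicate(command_id)` before executing anything, so that a redelivered MQTT message is ignored rather than run twice.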
### MQTT Broker Authentication Split
- Split broker connection credentials (`MQTT_USER`, `MQTT_PASSWORD_BROKER`) from legacy per-device identity fields (`MQTT_USERNAME`, `MQTT_PASSWORD`).
- `configure_mqtt_security()` now prefers `MQTT_USER`/`MQTT_PASSWORD_BROKER` for broker login, with fallback to legacy vars if broker-specific vars are absent.
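
As an illustration of the preference order, a credential lookup could look like the sketch below. The function `resolve_broker_credentials` is hypothetical, and the changelog does not say whether the fallback applies per variable or only when both broker-specific variables are missing; this sketch assumes a per-variable fallback.

```python
import os


def resolve_broker_credentials(env=None):
    """Pick broker login credentials, preferring the broker-specific variables.

    Hypothetical sketch: falls back per variable to the legacy identity fields.
    Returns (user, password); either may be None if nothing is configured.
    """
    env = os.environ if env is None else env
    user = env.get("MQTT_USER") or env.get("MQTT_USERNAME") or None
    password = env.get("MQTT_PASSWORD_BROKER") or env.get("MQTT_PASSWORD") or None
    return user, password
```

The returned pair would then be handed to the MQTT client's login call; empty strings are treated as absent so that a blank entry in an `.env` file does not mask the legacy value.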
### Systemd Service Units
- Added `scripts/infoscreen-simclient.service` — systemd unit for `simclient.py` with `Type=notify`, `WatchdogSec=60`, `Restart=on-failure`, `StartLimitBurst=5`.
- Added `scripts/start-simclient.sh` — launcher script mirroring `start-display-manager.sh`.
- Updated `scripts/infoscreen-display.service` with `OnFailure=infoscreen-notify-failure@%n.service`.
- Updated `src/pi-setup.sh` to install and enable both units plus the failure notifier template.
### Process Watchdog (Gap 1 — Hung Process Detection)
- Added zero-dependency `_sd_notify()` raw socket helper in `simclient.py` (no `systemd-python` package required).
- Sends `READY=1` on main loop entry and `WATCHDOG=1` on every 5-second iteration.
- Service unit uses `Type=notify` and `WatchdogSec=60`; systemd will restart the process if it stops sending keepalives for 60 seconds.
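
For readers unfamiliar with the notify protocol, a dependency-free helper can be sketched as follows. This is not the project's `_sd_notify()` but a minimal version of the same general technique, assuming systemd has exported `NOTIFY_SOCKET` to the process; the function name `sd_notify` is chosen here for illustration.

```python
import os
import socket


def sd_notify(state):
    """Send one state string (e.g. "READY=1") to systemd; return True if sent."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by systemd with a notify socket
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
            sock.sendto(state.encode("utf-8"), path)
        return True
    except OSError:
        return False  # never let a notification failure crash the main loop
```

In a loop like the one described above, `sd_notify("READY=1")` would be called once on entry and `sd_notify("WATCHDOG=1")` on each iteration. With a 5-second iteration and `WatchdogSec=60`, the process can miss several keepalives before systemd intervenes.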
### OnFailure MQTT Notifier (Gap 3 — systemd Give-Up Detection)
- Added `scripts/infoscreen-notify-failure@.service` — systemd template unit triggered by `OnFailure=`.
- Added `scripts/infoscreen-notify-failure.sh` — publishes a retained JSON payload to `infoscreen/{uuid}/service_failed` via `mosquitto_pub` so the monitoring dashboard gets an alert even when the process is fully dead.
- Payload: `{"event":"service_failed","unit":"<unit-name>","client_uuid":"...","failed_at":"<ISO-UTC>"}`.
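
The notifier itself is a shell script, but the payload shape is easy to show in Python. The sketch below is an illustration only: the function names are hypothetical, and the exact timestamp format behind `<ISO-UTC>` is assumed to be second-resolution UTC with a trailing `Z`.

```python
import json
from datetime import datetime, timezone


def service_failed_topic(client_uuid):
    """Topic the retained alert is published to."""
    return "infoscreen/{}/service_failed".format(client_uuid)


def build_service_failed_payload(unit, client_uuid, now=None):
    """Build the JSON alert body (timestamp format is an assumption)."""
    now = now or datetime.now(timezone.utc)
    failed_at = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    payload = {
        "event": "service_failed",
        "unit": unit,
        "client_uuid": client_uuid,
        "failed_at": failed_at,
    }
    # Compact separators match the whitespace-free form shown in the changelog.
    return json.dumps(payload, separators=(",", ":"))
```

Because the message is retained, a dashboard that subscribes after the failure still receives the last alert for that client.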
### Health Payload Broker Connection Block (Gap 2 — Broker vs. Process Ambiguity)
- Added `broker_connection` block to the health payload: `broker_reachable`, `reconnect_count`, `connect_count`, `last_disconnect_at`.
- `simclient.py` now updates `connect_count` and `reconnect_count` in every `on_connect` callback and records the `last_disconnect_at` timestamp in `on_disconnect`.
- `publish_health_message()` accepts an optional `connection_state` parameter; both heartbeat-success call sites pass the enriched state.
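
One way to picture the bookkeeping behind the `broker_connection` block is a small state object fed by the MQTT client callbacks. This is a sketch under stated assumptions, not the code in `simclient.py`: the class `BrokerConnectionState` is hypothetical, `reconnect_count` is assumed to count every connect after the first, and `broker_reachable` is assumed to mirror the current connection status.

```python
from datetime import datetime, timezone


class BrokerConnectionState:
    """Tracks broker connectivity for the health payload (hypothetical helper)."""

    def __init__(self):
        self.connected = False
        self.connect_count = 0
        self.reconnect_count = 0
        self.last_disconnect_at = None

    def on_connect(self):
        # Intended to be called from the MQTT client's on_connect callback.
        self.connect_count += 1
        if self.connect_count > 1:
            self.reconnect_count += 1  # assumption: first connect is not a reconnect
        self.connected = True

    def on_disconnect(self, now=None):
        # Intended to be called from the MQTT client's on_disconnect callback.
        now = now or datetime.now(timezone.utc)
        self.connected = False
        self.last_disconnect_at = now.astimezone(timezone.utc).isoformat()

    def to_payload(self):
        """Return the dict that would be embedded as broker_connection."""
        return {
            "broker_reachable": self.connected,
            "reconnect_count": self.reconnect_count,
            "connect_count": self.connect_count,
            "last_disconnect_at": self.last_disconnect_at,
        }
```

A rising `reconnect_count` alongside a steady heartbeat suggests a flaky broker link rather than a crashing process, which is the ambiguity this block is meant to resolve.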
### TV Power Coordination
- Added Phase 1 TV power coordination on `infoscreen/groups/{group_id}/power/intent`.
- Added `POWER_CONTROL_MODE` with `local`, `hybrid`, and `mqtt` behavior.
- Added `src/power_intent_state.json` and `src/power_state.json` for power IPC and telemetry.
- Added `infoscreen/{client_id}/power/state` publishing from `simclient.py`.
- Added turn-off guard logic to avoid unintended TV-off races at event boundaries.
- Added [TV_POWER_RUNBOOK.md](TV_POWER_RUNBOOK.md) and test tooling in `scripts/test-power-intent.sh`.
## March 2026
- Hardened event-trigger screenshots (`event_start`, `event_stop`) against periodic overwrite races.
- Improved `latest.jpg` and `meta.json` synchronization for reliable dashboard updates.
- Added self-healing for stale or invalid pending screenshot trigger metadata.
- Improved display environment fallbacks (`DISPLAY=:0`, `XAUTHORITY`) for non-interactive starts.
- Allowed periodic idle captures in development mode so dashboard previews stay fresh without active events.
- Added content-type-aware trigger delays for event screenshots.
- Changed screenshot transmission to a 1-second polling tick so triggered sends fire within 1 second.
- Migrated dashboard payload to grouped schema v2 (`message`, `content`, `runtime`, `metadata`).
## November 2025
- Implemented the two-process screenshot pipeline (`display_manager.py` capture, `simclient.py` transmission).
- Added Wayland/X11 screenshot tool fallback chains.
- Extended dashboard payloads with screenshot and system metadata.
- Extended scheduler event type support for `presentation`, `webuntis`, `webpage`, and `website`.
- Added website autoscroll support via CDP injection and extension fallback.