DEV-CHANGELOG
This changelog tracks all changes made in the development workspace, including internal, experimental, and in-progress updates. Entries here may not be reflected in public releases or the user-facing changelog.
Unreleased (development workspace)
- Crash detection API: Added `GET /api/clients/crashed` returning clients with `process_status=crashed` or stale heartbeat; includes `crash_reason` field (`process_crashed` | `heartbeat_stale`).
- Crash auto-recovery (scheduler): Feature-flagged loop (`CRASH_RECOVERY_ENABLED`) scans crash candidates, issues `reboot_host` command, publishes to primary + compat MQTT topics; lockout window and expiry configurable via env.
- Command expiry sweep (scheduler): Unconditional per-cycle sweep in `sweep_expired_commands()` marks non-terminal `ClientCommand` rows past `expires_at` as `expired`.
- `restart_app` action registered in `server/routes/clients.py` API action map; sends same command lifecycle as `reboot_host`; safety lockout covers both actions.
- `service_failed` listener: subscribes to `infoscreen/+/service_failed` on every connect; persists `service_failed_at` + `service_failed_unit` to `Client`; empty payload (retain clear) silently ignored.
- Broker connection health: Listener health handler now extracts `broker_connection.reconnect_count` + `broker_connection.last_disconnect_at` and persists to `Client`.
- DB migration `b1c2d3e4f5a6`: adds `service_failed_at`, `service_failed_unit`, `mqtt_reconnect_count`, `mqtt_last_disconnect_at` to `clients` table.
- Model update: `models/models.py` Client class updated with all four new columns.
- `GET /api/clients/service_failed`: lists clients with `service_failed_at` set, admin-or-higher gated.
- `POST /api/clients/<uuid>/clear_service_failed`: clears DB flag and publishes empty retained MQTT to `infoscreen/{uuid}/service_failed`.
- Monitoring overview includes `mqtt_reconnect_count` + `mqtt_last_disconnect_at` per client.
- Frontend monitoring: orange service-failed alert panel (hidden when count=0), auto-refresh 15s, per-row Quittieren action.
- Frontend monitoring: client detail now shows MQTT reconnect count + last disconnect timestamp.
- Frontend types: `ServiceFailedClient`, `ServiceFailedClientsResponse`; helpers `fetchServiceFailedClients()`, `clearServiceFailed()` added to `dashboard/src/apiClients.ts`.
- `MQTT_EVENT_PAYLOAD_GUIDE.md`: added `service_failed` topic contract.
- MQTT auth hardening: Listener and scheduler now connect to broker with env-configured credentials (`MQTT_BROKER_HOST`, `MQTT_BROKER_PORT`, `MQTT_USER`, `MQTT_PASSWORD`) instead of anonymous fixed host/port defaults; optional TLS env toggles added in code path (`MQTT_TLS_*`).
- Broker auth enforcement: `mosquitto/config/mosquitto.conf` now disables anonymous access and enables password-file authentication. `docker-compose.yml` MQTT service now bootstraps/updates password entries from env (`MQTT_USER`/`MQTT_PASSWORD`, optional canary user) before starting broker.
- Compose wiring: Added MQTT credential env propagation for listener/scheduler in both base and dev override compose files and switched MQTT healthcheck publish to authenticated mode.
- Backend implementation: Introduced client command lifecycle foundation for remote control in `server/routes/clients.py` with command persistence (`ClientCommand`), schema-based MQTT publish to `infoscreen/{uuid}/commands` (QoS1, non-retained), new endpoints `POST /api/clients/<uuid>/shutdown` and `GET /api/clients/commands/<command_id>`, and restart safety lockout (`blocked_safety` after 3 restarts in 15 minutes). Added migration `server/alembic/versions/aa12bb34cc56_add_client_commands_table.py` and model updates in `models/models.py`. Restart path keeps transitional legacy MQTT publish to `clients/{uuid}/restart` for compatibility.
- Listener integration: `listener/listener.py` now subscribes to `infoscreen/+/commands/ack` and updates command lifecycle states from client ACK payloads (`accepted`, `execution_started`, `completed`, `failed`).
- Frontend API client prep: Extended `dashboard/src/apiClients.ts` with `ClientCommand` typing and helper calls for lifecycle consumption (`shutdownClient`, `fetchClientCommandStatus`), and updated `restartClient` to accept optional reason payload.
- Contract freeze clarification: implementation-plan docs now explicitly freeze canonical MQTT topics (`infoscreen/{uuid}/commands`, `infoscreen/{uuid}/commands/ack`) and JSON schemas with examples; added transitional singular-topic compatibility aliases (`infoscreen/{uuid}/command`, `infoscreen/{uuid}/command/ack`) in server publish and listener ingest.
- Action value canonicalization: command payload actions are now frozen as host-level values (`reboot_host`, `shutdown_host`). API endpoint mapping is explicit (`/restart` -> `reboot_host`, `/shutdown` -> `shutdown_host`), and docs/examples were updated to remove `restart` payload ambiguity.
- Client helper snippets: Added frozen payload validation artifacts `implementation-plans/reboot-command-payload-schemas.md` and `implementation-plans/reboot-command-payload-schemas.json` (copy-ready snippets plus machine-validated JSON Schema).
- Documentation alignment: Added active reboot implementation handoff docs under `implementation-plans/` and linked them in `README.md` for immediate cross-team access (`reboot-implementation-handoff-share.md`, `reboot-implementation-handoff-client-team.md`, `reboot-kickoff-summary.md`).
- Programminfo GUI regression/fix: `dashboard/public/program-info.json` could not be loaded in Programminfo menu due to invalid JSON in the new alpha.16 changelog line (malformed quote in a text entry). Fixed JSON entry and verified file parses correctly again.
- Dashboard holiday banner fix: `dashboard/src/dashboard.tsx`: `loadHolidayStatus` now uses a stable `useCallback` with empty deps, preventing repeated re-creation on render. `useEffect` depends only on the stable callback reference.
- Dashboard Syncfusion stale-render fix: `MessageComponent` in the holiday banner now receives ``key={`${severity}:${text}`}`` to force remount when severity or text changes; without this Syncfusion cached stale DOM and the banner did not update reactively.
- Dashboard German text: Replaced transliterated forms (ae/oe/ue) with correct Umlauts throughout visible dashboard UI strings: `Präsentation`, `für`, `prüfen`, `Ferienüberschneidungen`, `verfügbar`, `Vorfälle`, `Ausfälle`.
- TV power intent (Phase 1): Scheduler publishes retained QoS1 group-level intents to `infoscreen/groups/{group_id}/power/intent` with transition+heartbeat semantics, startup/reconnect republish, and poll-based expiry (`max(3 × poll_interval_sec, 90s)`).
- TV power validation: Added unit/integration/canary coverage in `scheduler/test_power_intent_utils.py`, `scheduler/test_power_intent_scheduler.py`, and `test_power_intent_canary.py`.
- Monitoring system completion: End-to-end monitoring pipeline is active (MQTT logs/health → listener persistence → monitoring APIs → superadmin dashboard).
- Monitoring API: Added/active endpoints `GET /api/client-logs/monitoring-overview` and `GET /api/client-logs/recent-errors`; per-client logs via `GET /api/client-logs/<uuid>/logs`.
- Dashboard monitoring UI: Superadmin monitoring page is integrated and displays client health status, screenshots, process metadata, and recent error activity.
- Bugfix: Presentation flags `page_progress` and `auto_progress` now persist reliably across create/update and detached-occurrence flows.
- Frontend (Settings → Events): Added Presentations defaults (slideshow interval, page-progress, auto-progress) with load/save via `/api/system-settings`; UI uses Syncfusion controls.
- Backend defaults: Seeded `presentation_interval` ("10"), `presentation_page_progress` ("true"), `presentation_auto_progress` ("true") in `server/init_defaults.py` when missing.
- Data model: Added per-event fields `page_progress` and `auto_progress` on `Event`; Alembic migration applied successfully.
- Event modal (dashboard): Extended to show and persist presentation `pageProgress`/`autoProgress`; applies system defaults on create and preserves per-event values on edit; payload includes `page_progress`, `auto_progress`, and `slideshow_interval`.
- Scheduler behavior: Now publishes only currently active events per group (at "now"); clears retained topics by publishing `[]` for groups with no active events; normalizes naive timestamps and compares times in UTC; presentation payloads include `page_progress` and `auto_progress`.
- Recurrence handling: Still queries a 7-day window to expand recurring events and apply exceptions; recurring events only deactivate after `recurrence_end` (UNTIL).
- Logging: Temporarily added filter diagnostics during debugging; removed verbose logs after verification.
- WebUntis event type: Implemented new `webuntis` type. Event creation resolves URL from system `supplement_table_url`; returns 400 if not configured. WebUntis behaves like Website on clients (shared website payload).
- Settings consolidation: Removed separate `webuntis_url` (if present during dev); WebUntis and Vertretungsplan share `supplement_table_url`. Removed `/api/system-settings/webuntis-url` endpoints; use `/api/system-settings/supplement-table`.
- Scheduler payloads: Added top-level `event_type` for all events; introduced unified nested `website` payload for both `website` and `webuntis` events: `{ "type": "browser", "url": "…" }`.
- Frontend: Program info bumped to `2025.1.0-alpha.13`; changelog includes WebUntis/Website unification and settings update. Event modal shows no per-event URL for WebUntis.
- Documentation: Added `MQTT_EVENT_PAYLOAD_GUIDE.md` and `WEBUNTIS_EVENT_IMPLEMENTATION.md`. Updated `.github/copilot-instructions.md` and `README.md` for unified Website/WebUntis handling and system settings usage.
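
The restart safety lockout listed above (`blocked_safety` after 3 restarts in 15 minutes) can be read as a sliding-window check. The sketch below is illustrative only and is not the code in `server/routes/clients.py`: the function name `is_restart_blocked`, the constant names, and the plain list of timestamps as input are assumptions, as is the reading that a request is blocked once three restarts already fall inside the window.

```python
from datetime import datetime, timedelta, timezone

# Threshold and window come from the changelog entry; the names are hypothetical.
MAX_RESTARTS = 3
LOCKOUT_WINDOW = timedelta(minutes=15)


def is_restart_blocked(previous_restarts, now=None):
    """Return True if a new restart request should be answered with blocked_safety.

    previous_restarts: iterable of timezone-aware datetimes, one per earlier
    restart command for a single client (assumed input shape).
    """
    now = now or datetime.now(timezone.utc)
    window_start = now - LOCKOUT_WINDOW
    # Count only restarts that fall inside the trailing 15-minute window.
    recent = [ts for ts in previous_restarts if window_start <= ts <= now]
    return len(recent) >= MAX_RESTARTS
```

In a real handler the timestamps would presumably come from persisted `ClientCommand` rows rather than an in-memory list; that lookup is left out here.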
Note: These changes are available in the development environment and may be included in future releases. For released changes, see TECH-CHANGELOG.md.
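
For readers unfamiliar with the command expiry sweep mentioned in the Unreleased entries, the rule (non-terminal `ClientCommand` rows past `expires_at` become `expired`) can be sketched as follows. This is a minimal in-memory illustration, not the scheduler's `sweep_expired_commands()`: the `CommandRow` dataclass, the function name `sweep_expired_rows`, and the exact set of terminal states are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumed set of terminal states; the real model may define others.
TERMINAL_STATES = {"completed", "failed", "expired", "blocked_safety"}


@dataclass
class CommandRow:
    """Hypothetical stand-in for a persisted ClientCommand row."""
    command_id: str
    status: str
    expires_at: datetime


def sweep_expired_rows(rows, now=None):
    """Mark non-terminal rows past expires_at as expired; return the number changed."""
    now = now or datetime.now(timezone.utc)
    changed = 0
    for row in rows:
        if row.status not in TERMINAL_STATES and row.expires_at < now:
            row.status = "expired"
            changed += 1
    return changed
```

Because the sweep only touches non-terminal rows, running it on every poll cycle is idempotent: a second pass over the same rows changes nothing.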