feat: crash recovery, service_failed monitoring, broker health fields, command expiry sweep
- Add GET /api/clients/crashed endpoint (process_status=crashed or stale heartbeat)
- Add restart_app command action with same lifecycle + lockout as reboot_host
- Scheduler: crash auto-recovery loop (CRASH_RECOVERY_ENABLED flag, lockout, MQTT publish)
- Scheduler: unconditional command expiry sweep per poll cycle (sweep_expired_commands)
- Listener: subscribe to infoscreen/+/service_failed; persist service_failed_at + unit
- Listener: extract broker_connection block from health payload; persist reconnect_count + last_disconnect_at
- DB migration b1c2d3e4f5a6: service_failed_at, service_failed_unit, mqtt_reconnect_count, mqtt_last_disconnect_at on clients
- Add GET /api/clients/service_failed and POST /api/clients/<uuid>/clear_service_failed
- Monitoring overview API: include mqtt_reconnect_count + mqtt_last_disconnect_at per client
- Frontend: orange service-failed alert panel (hidden when empty, auto-refresh, acknowledge ("quittieren") action)
- Frontend: MQTT reconnect count + last disconnect in client detail panel
- MQTT auth hardening: listener/scheduler/server use env credentials; broker enforces allow_anonymous false
- Client command lifecycle foundation: ClientCommand model, reboot_host/shutdown_host, full ACK lifecycle
- Docs: TECH-CHANGELOG, DEV-CHANGELOG, MQTT_EVENT_PAYLOAD_GUIDE, copilot-instructions updated
- Add implementation-plans/, RESTART_VALIDATION_CHECKLIST.md, TODO.md
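The crash detection and lockout rules named in the message above can be illustrated with a minimal sketch. This is not the scheduler code from this commit: `ClientState`, `is_crashed`, and `should_recover` are hypothetical names, the 180-second grace and 15-minute lockout are taken from the commented-out values in `.env.example` and are only assumed to be the defaults, and treating a client that has never sent a heartbeat as "not stale" is an assumption as well.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed defaults, mirroring the commented-out values in .env.example.
GRACE_SECONDS = 180
LOCKOUT_MINUTES = 15


@dataclass
class ClientState:
    """Hypothetical, simplified view of a client row."""
    process_status: str
    last_heartbeat: Optional[datetime]
    last_recovery_at: Optional[datetime] = None


def is_crashed(client: ClientState, now: datetime,
               grace_seconds: int = GRACE_SECONDS) -> bool:
    """Crashed means: status says so, or the heartbeat is older than the grace period."""
    if client.process_status == "crashed":
        return True
    if client.last_heartbeat is None:
        # Assumption: a client that never reported is not counted as stale.
        return False
    return now - client.last_heartbeat > timedelta(seconds=grace_seconds)


def should_recover(client: ClientState, now: datetime, enabled: bool,
                   lockout_minutes: int = LOCKOUT_MINUTES) -> bool:
    """Issue a recovery command only if enabled, crashed, and outside the lockout window."""
    if not enabled or not is_crashed(client, now):
        return False
    if client.last_recovery_at is None:
        return True
    return now - client.last_recovery_at >= timedelta(minutes=lockout_minutes)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = ClientState("running", now - timedelta(seconds=600))
    print(should_recover(stale, now, enabled=True))
```

The lockout check exists so that a client stuck in a crash loop does not receive a new restart command on every poll cycle; the real scheduler may track the lockout per command rather than per client.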
20
.env.example
@@ -20,8 +20,18 @@ DB_HOST=db
 # MQTT
 MQTT_BROKER_HOST=mqtt
 MQTT_BROKER_PORT=1883
-# MQTT_USER=your_mqtt_user
-# MQTT_PASSWORD=your_mqtt_password
+# Required for authenticated broker access
+MQTT_USER=your_mqtt_user
+MQTT_PASSWORD=replace_with_a_32plus_char_random_password
+# Optional: dedicated canary client account
+MQTT_CANARY_USER=your_canary_mqtt_user
+MQTT_CANARY_PASSWORD=replace_with_a_different_32plus_char_random_password
+# Optional TLS settings
+MQTT_TLS_ENABLED=false
+MQTT_TLS_CA_CERT=
+MQTT_TLS_CERTFILE=
+MQTT_TLS_KEYFILE=
+MQTT_TLS_INSECURE=false
 MQTT_KEEPALIVE=60
 
 # Dashboard
@@ -39,6 +49,12 @@ HEARTBEAT_GRACE_PERIOD_PROD=170
 # Optional: force periodic republish even without changes
 # REFRESH_SECONDS=0
 
+# Crash recovery (scheduler auto-recovery)
+# CRASH_RECOVERY_ENABLED=false
+# CRASH_RECOVERY_GRACE_SECONDS=180
+# CRASH_RECOVERY_LOCKOUT_MINUTES=15
+# CRASH_RECOVERY_COMMAND_EXPIRY_SECONDS=240
+
 # Default superadmin bootstrap (server/init_defaults.py)
 # REQUIRED: Must be set for superadmin creation
 DEFAULT_SUPERADMIN_USERNAME=superadmin
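As a rough illustration of how a service might consume the variables in this `.env.example`, the sketch below parses them into a typed settings object. `Settings`, `load_settings`, and `_as_bool` are hypothetical names, not code from this repository; the fallback values are taken from the example file and are only assumed to match the real defaults. Failing fast on missing credentials is one possible policy, motivated by the broker enforcing `allow_anonymous false`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    mqtt_host: str
    mqtt_port: int
    mqtt_user: str
    mqtt_password: str
    mqtt_tls_enabled: bool
    mqtt_keepalive: int
    crash_recovery_enabled: bool
    crash_recovery_grace_seconds: int
    crash_recovery_lockout_minutes: int
    crash_recovery_command_expiry_seconds: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    user = env.get("MQTT_USER", "")
    password = env.get("MQTT_PASSWORD", "")
    if not user or not password:
        # Anonymous access is rejected by the broker, so fail early.
        raise ValueError("MQTT_USER and MQTT_PASSWORD must be set")
    return Settings(
        mqtt_host=env.get("MQTT_BROKER_HOST", "mqtt"),
        mqtt_port=int(env.get("MQTT_BROKER_PORT", "1883")),
        mqtt_user=user,
        mqtt_password=password,
        mqtt_tls_enabled=_as_bool(env.get("MQTT_TLS_ENABLED", "false")),
        mqtt_keepalive=int(env.get("MQTT_KEEPALIVE", "60")),
        crash_recovery_enabled=_as_bool(env.get("CRASH_RECOVERY_ENABLED", "false")),
        crash_recovery_grace_seconds=int(env.get("CRASH_RECOVERY_GRACE_SECONDS", "180")),
        crash_recovery_lockout_minutes=int(env.get("CRASH_RECOVERY_LOCKOUT_MINUTES", "15")),
        crash_recovery_command_expiry_seconds=int(
            env.get("CRASH_RECOVERY_COMMAND_EXPIRY_SECONDS", "240")
        ),
    )


if __name__ == "__main__":
    demo = load_settings({"MQTT_USER": "demo", "MQTT_PASSWORD": "demo-secret"})
    print(demo.mqtt_host, demo.mqtt_port, demo.crash_recovery_enabled)
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the parsing testable without touching the process environment.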