feat(client-monitoring): finalize client-side monitoring and UTC logging
- add process health bridge and monitoring flow between display_manager and simclient
- publish health + warn/error log topics over MQTT
- standardize log/payload/screenshot timestamps to UTC (Z) to avoid DST drift
- improve video handling: python-vlc fullscreen enforcement and runtime PID reporting
- update README and copilot instructions with monitoring architecture and troubleshooting
- add Phase 3 monitoring implementation documentation
- update gitignore for new runtime/log artifacts
.github/copilot-instructions.md
@@ -14,8 +14,9 @@
 - **Display logic**: `src/display_manager.py` (controls presentations/video/web)
 - **MQTT client**: `src/simclient.py` (event management, heartbeat, discovery)
 - **Runtime state**: `src/current_event.json` (current active event)
+- **Process health bridge**: `src/current_process_health.json` (display_manager -> simclient)
 - **Config**: `src/config/client_uuid.txt`, `src/config/last_group_id.txt`, `.env`
-- **Logs**: `logs/display_manager.log`, `logs/simclient.log`
+- **Logs**: `logs/display_manager.log`, `logs/simclient.log`, `logs/monitoring.log`
 - **Screenshots**: `src/screenshots/` (shared volume between processes)

 ### Common Tasks Quick Reference
@@ -23,6 +24,8 @@
 |------|------|-------------------|
 | Add event type | `display_manager.py` | `start_display_for_event()` |
 | Modify presentation | `display_manager.py` | `start_presentation()` |
+| Modify process monitoring | `display_manager.py` | `ProcessHealthState`, `process_events()` |
+| Publish health/log topics | `simclient.py` | `read_health_state()`, `publish_health_message()`, `publish_log_message()` |
 | Change MQTT topics | `simclient.py` | Topic constants/handlers |
 | Update screenshot | `display_manager.py` | `_capture_screenshot()` |
 | File downloads | `simclient.py` | `resolve_file_url()` |
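The process health bridge named in the hunks above is a JSON file written by one process and read by another. A minimal sketch of how such a file handoff can be made safe against partial reads follows. It is not the repository's `ProcessHealthState` or `read_health_state()` implementation: the helper names `save_health_snapshot` and `load_health_snapshot` are hypothetical, and the field names are assumptions based on the `event/process/pid/status` summary in the MQTT section.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_health_snapshot(path: Path, event_type: str, process: str, pid: int, status: str) -> dict:
    """Writer side (display_manager role): atomically replace the snapshot file."""
    # Field names are assumptions for illustration, not the project's schema.
    state = {
        "event_type": event_type,
        "process": process,
        "pid": pid,
        "status": status,
        "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a concurrent reader sees either the old file or the new one.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return state


def load_health_snapshot(path: Path):
    """Reader side (simclient role): return the snapshot, or None if missing or unreadable."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```

Returning `None` for a missing or corrupt file lets the reader skip one publish cycle instead of crashing; whether the real client behaves this way is not stated in this diff.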
@@ -62,6 +65,8 @@
 ### MQTT Communication Patterns
 - **Discovery**: `infoscreen/discovery` → `infoscreen/{client_id}/discovery_ack`
 - **Heartbeat**: Regular `infoscreen/{client_id}/heartbeat` messages
+- **Health**: `infoscreen/{client_id}/health` (event/process/pid/status)
+- **Client logs**: `infoscreen/{client_id}/logs/error|warn` (selective forwarding)
 ### MQTT Reconnection & Heartbeat (Nov 2025)
 - The client uses Paho MQTT v2 callback API with `client.loop_start()` and `client.reconnect_delay_set()` to handle automatic reconnection.
 - `on_connect` re-subscribes to all topics (`discovery_ack`, `config`, `group_id`, current group events) and re-sends discovery on reconnect to re-register with the server.
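The "selective forwarding" of client logs described in the hunk above could look roughly like the following. This is a sketch under stated assumptions, not the code in `simclient.py`: the `MqttLogForwarder` class, the `log_topic` helper, and the payload keys are all hypothetical, and mapping `CRITICAL` records to the `error` topic is a choice made here. The `publish` argument is any callable taking `(topic, payload)`, so the example runs without a broker.

```python
import json
import logging
from datetime import datetime, timezone


def health_topic(client_id: str) -> str:
    return f"infoscreen/{client_id}/health"


def log_topic(client_id: str, levelno: int):
    """Return the log topic for a record level, or None if the level is not forwarded."""
    if levelno >= logging.ERROR:  # ERROR and CRITICAL (assumption: both go to "error")
        return f"infoscreen/{client_id}/logs/error"
    if levelno >= logging.WARNING:
        return f"infoscreen/{client_id}/logs/warn"
    return None


class MqttLogForwarder(logging.Handler):
    """Forward WARNING-and-above records to MQTT; lower levels stay local."""

    def __init__(self, client_id: str, publish):
        super().__init__(level=logging.WARNING)
        self.client_id = client_id
        self.publish = publish  # callable(topic: str, payload: str)

    def emit(self, record: logging.LogRecord) -> None:
        topic = log_topic(self.client_id, record.levelno)
        if topic is None:
            return
        created_utc = datetime.fromtimestamp(record.created, timezone.utc)
        payload = json.dumps({
            "timestamp": created_utc.isoformat().replace("+00:00", "Z"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })
        try:
            self.publish(topic, payload)
        except Exception:
            # Never let a publish failure break the calling code path.
            self.handleError(record)
```

With a connected Paho client, the callable could be as small as `lambda topic, payload: client.publish(topic, payload, qos=1)`; the QoS value here is an assumption.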
@@ -255,11 +260,20 @@ FILE_SERVER_SCHEME=http # http or https
 - **Fallback**: External vlc binary
 - **Fields**: `url`, `autoplay` (bool), `loop` (bool), `volume` (0.0-1.0 → 0-100)
 - **URL rewriting**: `server` host → configured file server
+- **Fullscreen**: enforced for python-vlc on startup (with short retry toggles); external fallback uses `--fullscreen`
+- **Monitoring PID semantics**: python-vlc runs in-process, so PID is `display_manager.py` runtime PID; external fallback uses external `vlc` PID
 - **HW decode errors**: `h264_v4l2m2m` failures are normal if V4L2 M2M unavailable; use software decode
 - Robust payload parsing with fallbacks
 - Topic-specific message handlers
 - Retained message support where appropriate

+### Logging & Timestamp Policy (Mar 2026)
+- Client logs are standardized to UTC with `Z` suffix to avoid DST/localtime drift.
+- Applies to `display_manager.log`, `simclient.log`, and `monitoring.log`.
+- MQTT payload timestamps for heartbeat/dashboard/health/log messages are UTC ISO timestamps.
+- Screenshot metadata timestamps included by `simclient.py` are UTC ISO timestamps.
+- Prefer UTC-aware calls (`datetime.now(timezone.utc)`) and UTC log formatters for new code.
+
 ## Hardware Considerations

 ### Target Platform
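The logging and timestamp policy added in the hunk above asks for UTC log formatters and UTC-aware `datetime` calls. One way to satisfy both with the standard library is sketched below. The `UtcFormatter` class and `utc_now_iso` helper are hypothetical names, and the exact timestamp layout (milliseconds for logs, whole seconds for payloads) is an assumption rather than the format the project uses.

```python
import logging
import time
from datetime import datetime, timezone


class UtcFormatter(logging.Formatter):
    """Log formatter that renders %(asctime)s in UTC with a trailing 'Z'."""

    # logging.Formatter calls self.converter to turn record.created into a
    # struct_time; switching it to gmtime removes local time and DST effects.
    converter = time.gmtime

    def formatTime(self, record, datefmt=None):
        base = time.strftime("%Y-%m-%dT%H:%M:%S", self.converter(record.created))
        return f"{base}.{int(record.msecs):03d}Z"


def utc_now_iso() -> str:
    """UTC ISO 8601 timestamp with 'Z' suffix, e.g. for MQTT payload fields."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z")


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(UtcFormatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    log = logging.getLogger("display_manager")
    log.addHandler(handler)
    log.setLevel(logging.INFO)
    log.info("started at %s", utc_now_iso())
```

Overriding `formatTime` keeps the `Z` suffix in one place, so every handler that uses this formatter produces the same timestamp shape.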
@@ -490,6 +504,7 @@ The screenshot capture and transmission system has been implemented with separat
 - **Large payloads**: Reduce `SCREENSHOT_MAX_WIDTH` or `SCREENSHOT_JPEG_QUALITY`
 - **Stale screenshots**: Check `latest.jpg` symlink, verify display_manager is running
 - **MQTT errors**: Check dashboard topic logs for publish return codes
+- **Pulse overflow in remote sessions**: warnings like `pulse audio output error: overflow, flushing` can occur with NoMachine/dummy displays; if HDMI playback is stable, treat as environment-related
 ### Testing & Troubleshooting
 **Setup:**
 - X11: `sudo apt install scrot imagemagick`