Client-Side Monitoring Specification
Version: 1.0
Date: 2026-03-10
For: Infoscreen Client Implementation
Server Endpoint: 192.168.43.201:8000 (or your production server)
MQTT Broker: 192.168.43.201:1883 (or your production MQTT broker)
1. Overview
Each infoscreen client must implement health monitoring and logging capabilities to report status to the central server via MQTT.
1.1 Goals
- Detect failures: Process crashes, frozen screens, content mismatches
- Provide visibility: Real-time health status visible on server dashboard
- Enable remote diagnosis: Centralized log storage for debugging
- Auto-recovery: Attempt automatic restart on failure
1.2 Architecture
┌─────────────────────────────────────────┐
│  Infoscreen Client                      │
│                                         │
│  ┌──────────────┐    ┌──────────────┐   │
│  │ Media Player │    │ Watchdog     │   │
│  │ (VLC/Chrome) │◄───│ Monitor      │   │
│  └──────────────┘    └──────┬───────┘   │
│                             │           │
│  ┌──────────────┐           │           │
│  │ Event Mgr    │           │           │
│  │ (receives    │           │           │
│  │  schedule)   │◄──────────┘           │
│  └──────┬───────┘                       │
│         │                               │
│  ┌──────▼───────────────────────┐       │
│  │ MQTT Client                  │       │
│  │ - Heartbeat (every 60s)      │       │
│  │ - Logs (error/warn/info)     │       │
│  │ - Health metrics (every 5s)  │       │
│  └──────┬───────────────────────┘       │
└─────────┼───────────────────────────────┘
          │
          │ MQTT over TCP
          ▼
   ┌─────────────┐
   │ MQTT Broker │
   │  (server)   │
   └─────────────┘
1.3 Current Compatibility Notes
- The server now accepts both the original specification payloads and the currently implemented Phase 3 client payloads.
- `infoscreen/{uuid}/health` may currently contain a reduced payload with only `expected_state.event_id` and `actual_state.process|pid|status`. Additional `health_metrics` fields from this specification remain recommended.
- `event_id` is still specified as an integer. For compatibility with the current Phase 3 client, the server also tolerates string values such as `event_123` and extracts the numeric suffix where possible.
- If the client sends `process_health` inside `infoscreen/{uuid}/dashboard`, the server treats it as a fallback source for `current_process`, `process_pid`, `process_status`, and `current_event_id`.
- Long term, the preferred client payload remains the structure in this specification so the server can surface richer monitoring data such as screen state and resource metrics.
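The tolerant `event_id` handling described above could be sketched as follows. This is only an illustration under stated assumptions: the function name `normalize_event_id` is hypothetical, and the actual listener code may behave differently.

```python
import re
from typing import Optional, Union


def normalize_event_id(value: Union[int, str, None]) -> Optional[int]:
    """Return an integer event ID, or None if no ID can be recovered."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if value is None or isinstance(value, bool):
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        # Accept plain "42" as well as prefixed forms such as "event_123".
        match = re.search(r"(\d+)$", value.strip())
        if match:
            return int(match.group(1))
    return None
```

With this sketch, both `42` and `"event_123"` yield an integer, while values without a numeric suffix yield `None`.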
2. MQTT Protocol Specification
2.1 Connection Parameters
Broker: 192.168.43.201 (or DNS hostname)
Port: 1883 (standard MQTT)
Protocol: MQTT v3.1.1
Client ID: "infoscreen-{client_uuid}"
Clean Session: false (broker keeps subscriptions across reconnects)
Keep Alive: 60 seconds
Username/Password: (if configured on broker)
2.2 QoS Levels
- Heartbeat: QoS 0 (fire and forget; the next beat supersedes a lost one)
- Logs (ERROR/WARN): QoS 1 (at least once delivery, important)
- Logs (INFO): QoS 0 (optional, high volume)
- Health metrics: QoS 0 (frequent, latest value matters)
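The QoS rules above can be captured in one small helper so that every publish call uses the same mapping. This is a minimal sketch; the function name `qos_for` and the `kind` labels are assumptions made for illustration.

```python
def qos_for(kind: str, level: str = "") -> int:
    """Return the MQTT QoS for a message kind: "heartbeat", "health", or "log"."""
    if kind == "log" and level.lower() in ("error", "warn"):
        return 1  # at-least-once delivery for important logs
    return 0      # heartbeat, health metrics, and INFO logs
```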
3. Topic Structure & Payload Formats
3.1 Log Messages
Topic Pattern:
infoscreen/{client_uuid}/logs/{level}
Where {level} is one of: error, warn, info
Payload Format (JSON):
{
"timestamp": "2026-03-10T07:30:00Z",
"message": "Human-readable error description",
"context": {
"event_id": 42,
"process": "vlc",
"error_code": "NETWORK_TIMEOUT",
"additional_key": "any relevant data"
}
}
Field Specifications:
| Field | Type | Required | Description |
|---|---|---|---|
| `timestamp` | string (ISO 8601 UTC) | Yes | When the event occurred. Use YYYY-MM-DDTHH:MM:SSZ format |
| `message` | string | Yes | Human-readable description of the event (max 1000 chars) |
| `context` | object | No | Additional structured data (will be stored as JSON) |
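A log message following the fields above could be assembled as in the sketch below. The helper names `utc_timestamp` and `build_log_message` are hypothetical, and truncating an over-long message to 1000 characters is an assumption; the specification only states the limit.

```python
import json
from datetime import datetime, timezone
from typing import Optional

MAX_MESSAGE_CHARS = 1000  # limit taken from the field table above


def utc_timestamp(now: Optional[datetime] = None) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSZ (UTC)."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_log_message(client_uuid, level, message, context=None, now=None):
    """Return (topic, JSON payload) for one log entry."""
    if level not in ("error", "warn", "info"):
        raise ValueError(f"unsupported log level: {level!r}")
    payload = {
        "timestamp": utc_timestamp(now),
        "message": message[:MAX_MESSAGE_CHARS],  # assumption: truncate, do not reject
    }
    if context:  # context is optional, so it is omitted when empty
        payload["context"] = context
    return f"infoscreen/{client_uuid}/logs/{level}", json.dumps(payload)
```

The returned topic and payload can be handed to whatever MQTT publish call the client uses.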
Example Topics:
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/error
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/warn
infoscreen/9b8d1856-ff34-4864-a726-12de072d0f77/logs/info
When to Send Logs:
ERROR (Always send):
- Process crashed (VLC/Chromium/PDF viewer terminated unexpectedly)
- Content failed to load (404, network timeout, corrupt file)
- Hardware failure detected (display off, audio device missing)
- Exception caught in main event loop
- Maximum restart attempts exceeded
WARN (Always send):
- Process restarted automatically (after crash)
- High resource usage (CPU >80%, RAM >90%)
- Slow performance (frame drops, lag)
- Non-critical failures (screenshot capture failed, cache full)
- Fallback content displayed (primary source unavailable)
INFO (Send in development, optional in production):
- Process started successfully
- Event transition (switched from video to presentation)
- Content loaded successfully
- Watchdog service started/stopped
3.2 Health Metrics
Topic Pattern:
infoscreen/{client_uuid}/health
Payload Format (JSON):
{
"timestamp": "2026-03-10T07:30:00Z",
"expected_state": {
"event_id": 42,
"event_type": "video",
"media_file": "presentation.mp4",
"started_at": "2026-03-10T07:15:00Z"
},
"actual_state": {
"process": "vlc",
"pid": 1234,
"status": "running",
"uptime_seconds": 900,
"position": 45.3,
"duration": 180.0
},
"health_metrics": {
"screen_on": true,
"last_frame_update": "2026-03-10T07:29:58Z",
"frames_dropped": 2,
"network_errors": 0,
"cpu_percent": 15.3,
"memory_mb": 234
}
}
Field Specifications:
expected_state:
| Field | Type | Required | Description |
|---|---|---|---|
| `event_id` | integer | Yes | Current event ID from scheduler |
| `event_type` | string | Yes | presentation, video, website, webuntis, message |
| `media_file` | string | No | Filename or URL of current content |
| `started_at` | string (ISO 8601) | Yes | When this event started playing |
actual_state:
| Field | Type | Required | Description |
|---|---|---|---|
| `process` | string | Yes | vlc, chromium, pdf_viewer, none |
| `pid` | integer | No | Process ID (if running) |
| `status` | string | Yes | running, crashed, starting, stopped |
| `uptime_seconds` | integer | No | How long process has been running |
| `position` | float | No | Current playback position (seconds, for video/audio) |
| `duration` | float | No | Total content duration (seconds) |
health_metrics:
| Field | Type | Required | Description |
|---|---|---|---|
| `screen_on` | boolean | Yes | Is display powered on? |
| `last_frame_update` | string (ISO 8601) | No | Last time screen content changed |
| `frames_dropped` | integer | No | Video frames dropped (performance indicator) |
| `network_errors` | integer | No | Count of network errors in last interval |
| `cpu_percent` | float | No | CPU usage (0-100) |
| `memory_mb` | integer | No | RAM usage in megabytes |
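Before publishing, a client could check a health payload against the required fields in the three tables above. The sketch below is one way to do that; `validate_health` is a hypothetical helper, and it checks the full specification, so the reduced Phase 3 payload mentioned in section 1.3 would be reported as incomplete.

```python
REQUIRED_FIELDS = {
    "expected_state": ("event_id", "event_type", "started_at"),
    "actual_state": ("process", "status"),
    "health_metrics": ("screen_on",),
}
VALID_STATUS = {"running", "crashed", "starting", "stopped"}


def validate_health(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload conforms."""
    problems = []
    for section, fields in REQUIRED_FIELDS.items():
        data = payload.get(section)
        if not isinstance(data, dict):
            problems.append(f"missing section: {section}")
            continue
        for field in fields:
            if field not in data:
                problems.append(f"missing field: {section}.{field}")
    actual = payload.get("actual_state")
    if isinstance(actual, dict) and "status" in actual:
        if actual["status"] not in VALID_STATUS:
            problems.append(f"invalid status: {actual['status']}")
    return problems
```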
Sending Frequency:
- Normal operation: Every 5 seconds
- During startup/transition: Every 1 second
- After error: Immediately + every 2 seconds until recovered
3.3 Enhanced Heartbeat
The existing heartbeat topic should be enhanced to include process status.
Topic Pattern:
infoscreen/{client_uuid}/heartbeat
Enhanced Payload Format (JSON):
{
"uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
"timestamp": "2026-03-10T07:30:00Z",
"current_process": "vlc",
"process_pid": 1234,
"process_status": "running",
"current_event_id": 42
}
New Fields (add to existing heartbeat):
| Field | Type | Required | Description |
|---|---|---|---|
| `current_process` | string | No | Name of active media player process |
| `process_pid` | integer | No | Process ID |
| `process_status` | string | No | running, crashed, starting, stopped |
| `current_event_id` | integer | No | Event ID currently being displayed |
Sending Frequency:
- Keep existing: Every 60 seconds
- Include new fields if available
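Because the new heartbeat fields are optional, the payload builder can simply leave out whatever is unknown. A minimal sketch, assuming a hypothetical helper named `build_heartbeat`:

```python
import json


def build_heartbeat(client_uuid, timestamp, process=None, pid=None,
                    status=None, event_id=None) -> str:
    """Return the heartbeat JSON, adding the new fields only when available."""
    payload = {"uuid": client_uuid, "timestamp": timestamp}
    optional = {
        "current_process": process,
        "process_pid": pid,
        "process_status": status,
        "current_event_id": event_id,
    }
    # Skip fields that are not known yet, e.g. before the first event starts.
    payload.update({key: val for key, val in optional.items() if val is not None})
    return json.dumps(payload)
```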
4. Process Monitoring Requirements
4.1 Processes to Monitor
| Media Type | Process Name | How to Detect |
|---|---|---|
| Video | `vlc` | `ps aux \| grep vlc` or `pgrep vlc` |
| Website/WebUntis | `chromium` or `chromium-browser` | `pgrep chromium` |
| PDF Presentation | `evince`, `okular`, or custom viewer | `pgrep {viewer_name}` |
4.2 Monitoring Checks (Every 5 seconds)
Check 1: Process Alive
Goal: Verify expected process is running
Method:
- Get list of running processes (psutil or `ps`)
- Check if expected process name exists
- Match PID if known
Result:
- If missing → status = "crashed"
- If found → status = "running"
Action on crash:
- Send ERROR log immediately
- Attempt restart (max 3 attempts)
- Send WARN log on each restart
- If max restarts exceeded → send ERROR log, display fallback
Check 2: Process Responsive
Goal: Detect frozen processes
Method:
- For VLC: Query HTTP interface (status.json)
- For Chromium: Use DevTools Protocol (CDP)
- For custom viewers: Check last screen update time
Result:
- If same frame >30 seconds → likely frozen
- If playback position not advancing → frozen
Action on freeze:
- Send WARN log
- Force refresh (reload page, seek video, next slide)
- If refresh fails → restart process
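The "position not advancing" rule from Check 2 can be sketched as a small tracker that is fed one sample per monitoring cycle. The class name `FreezeDetector` is hypothetical, and timestamps are passed in explicitly so the logic can be tested without waiting. Note that a deliberately paused video would also look frozen to this sketch.

```python
from typing import Optional

FREEZE_AFTER_SECONDS = 30.0  # threshold from Check 2


class FreezeDetector:
    """Flags a player as frozen when its playback position stops advancing."""

    def __init__(self, threshold: float = FREEZE_AFTER_SECONDS):
        self.threshold = threshold
        self._last_position: Optional[float] = None
        self._last_change: Optional[float] = None

    def update(self, position: float, now: float) -> bool:
        """Record a sample; return True if the position has been stuck too long."""
        if self._last_position is None or position != self._last_position:
            self._last_position = position
            self._last_change = now
            return False
        return (now - self._last_change) > self.threshold
```

In a monitor loop, `now` would typically come from `time.monotonic()` and `position` from the player's status query.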
Check 3: Content Match
Goal: Verify correct content is displayed
Method:
- Compare expected event_id with actual media/URL
- Check scheduled time window (is event still active?)
Result:
- Mismatch → content error
Action:
- Send WARN log
- Reload correct event from scheduler
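Check 3 combines two conditions: the displayed event must be the expected one, and the event must still be inside its scheduled window. A minimal sketch follows; the function name and the `starts_at`/`ends_at` parameters are assumptions, since the specification does not name the schedule fields.

```python
from datetime import datetime, timezone


def content_matches(expected_event_id, actual_event_id, starts_at, ends_at, now):
    """Return True when the displayed event is the expected one and still active."""
    still_active = starts_at <= now < ends_at
    return still_active and expected_event_id == actual_event_id


# Example call with a 45-minute window (values are illustrative only).
start = datetime(2026, 3, 10, 7, 15, tzinfo=timezone.utc)
end = datetime(2026, 3, 10, 8, 0, tzinfo=timezone.utc)
in_sync = content_matches(42, 42, start, end, datetime(2026, 3, 10, 7, 30, tzinfo=timezone.utc))
```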
5. Process Control Interface Requirements
5.1 VLC Control
Requirement: Enable VLC HTTP interface for monitoring
Launch Command:
vlc --intf http --http-host 127.0.0.1 --http-port 8080 --http-password "vlc_password" \
--fullscreen --loop /path/to/video.mp4
Status Query:
curl http://127.0.0.1:8080/requests/status.json --user ":vlc_password"
Response Fields to Monitor:
{
"state": "playing", // "playing", "paused", "stopped"
"position": 0.25, // 0.0-1.0 (25% through)
"time": 45, // seconds into playback
"length": 180, // total duration in seconds
"volume": 256 // 0-512
}
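The VLC response can be mapped to the `actual_state` section of the health payload roughly as shown below. One detail matters: VLC's `position` is a 0.0-1.0 fraction, whereas the health payload expects seconds, so the sketch uses `time` instead. The function name and the mapping of VLC states to `status` values are assumptions.

```python
def vlc_status_to_actual_state(status: dict, pid=None) -> dict:
    """Map a VLC status.json document to the actual_state section."""
    # Assumption: "playing" and "paused" both mean the process is responsive.
    healthy = status.get("state") in ("playing", "paused")
    actual = {"process": "vlc", "status": "running" if healthy else "stopped"}
    if pid is not None:
        actual["pid"] = pid
    if "time" in status:
        actual["position"] = float(status["time"])    # seconds, not the 0.0-1.0 fraction
    if "length" in status:
        actual["duration"] = float(status["length"])  # seconds
    return actual
```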
5.2 Chromium Control
Requirement: Enable Chrome DevTools Protocol (CDP)
Launch Command:
chromium --remote-debugging-port=9222 --kiosk --app=https://example.com
Status Query:
curl http://127.0.0.1:9222/json
Response Fields to Monitor:
[
{
"url": "https://example.com",
"title": "Page Title",
"type": "page"
}
]
Advanced: Use CDP WebSocket for events (page load, navigation, errors)
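For the basic status query, the JSON list returned by the debug port can be scanned for the first page target and its URL compared with the expected one. A minimal sketch, with `current_page_url` as a hypothetical helper that takes the raw response body:

```python
import json


def current_page_url(targets_json: str):
    """Return the URL of the first target of type "page", or None."""
    for target in json.loads(targets_json):
        if target.get("type") == "page":
            return target.get("url")
    return None
```

The caller would fetch `http://127.0.0.1:9222/json` and pass the response text to this function.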
5.3 PDF Viewer (Custom or Standard)
Option A: Standard Viewer (e.g., Evince)
- No built-in API
- Monitor via process check + screenshot comparison
Option B: Custom Python Viewer
- Implement REST API for status queries
- Track: current page, total pages, last transition time
6. Watchdog Service Architecture
6.1 Service Components
Component 1: Process Monitor Thread
Responsibilities:
- Check process alive every 5 seconds
- Detect crashes and frozen processes
- Attempt automatic restart
- Send health metrics via MQTT
State Machine:
IDLE → STARTING → RUNNING → (if crash) → RESTARTING → RUNNING
→ (if max restarts) → FAILED
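The state machine above can be written as an explicit transition table so that illegal jumps are caught early. This is a sketch under stated assumptions: the names `State`, `TRANSITIONS`, and `advance` are hypothetical, and only the transitions shown in the diagram are included (how a client returns to IDLE is not defined here).

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    STARTING = "starting"
    RUNNING = "running"
    RESTARTING = "restarting"
    FAILED = "failed"


# Only the transitions shown in the state machine above.
TRANSITIONS = {
    State.IDLE: {State.STARTING},
    State.STARTING: {State.RUNNING},
    State.RUNNING: {State.RESTARTING},                 # crash detected
    State.RESTARTING: {State.RUNNING, State.FAILED},   # recovered or max restarts
}


def advance(current: State, target: State) -> State:
    """Return the new state, or raise ValueError for an undefined transition."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"invalid transition: {current.name} -> {target.name}")
    return target
```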
Component 2: MQTT Publisher Thread
Responsibilities:
- Maintain MQTT connection
- Send heartbeat every 60 seconds
- Send logs on-demand (queued from other components)
- Send health metrics every 5 seconds
- Reconnect on connection loss
Component 3: Event Manager Integration
Responsibilities:
- Receive event schedule from server
- Notify watchdog of expected process/content
- Launch media player processes
- Handle event transitions
6.2 Service Lifecycle
On Startup:
- Load configuration (client UUID, MQTT broker, etc.)
- Connect to MQTT broker
- Send INFO log: "Watchdog service started"
- Wait for first event from scheduler
During Operation:
- Monitor loop runs every 5 seconds
- Check expected vs actual process state
- Send health metrics
- Handle failures (log + restart)
On Shutdown:
- Send INFO log: "Watchdog service stopping"
- Gracefully stop monitored processes
- Disconnect from MQTT
- Exit cleanly
7. Auto-Recovery Logic
7.1 Restart Strategy
Step 1: Detect Failure
Trigger: Process not found in process list
Action:
- Log ERROR: "Process {name} crashed"
- Increment restart counter
- Check if within retry limit (max 3)
Step 2: Attempt Restart
If restart_attempts < MAX_RESTARTS:
- Log WARN: "Attempting restart ({attempt}/{MAX_RESTARTS})"
- Kill any zombie processes
- Wait 2 seconds (cooldown)
- Launch process with same parameters
- Wait 5 seconds for startup
- Verify process is running
- If success: log INFO (the restart counter resets only after 5 minutes of stable operation, see 7.2)
- If fail: increment counter, repeat
Step 3: Permanent Failure
If restart_attempts >= MAX_RESTARTS:
- Log ERROR: "Max restart attempts exceeded, failing over"
- Display fallback content (static image with error message)
- Send notification to server (separate alert topic, optional)
- Wait for manual intervention or scheduler event change
7.2 Restart Cooldown
Purpose: Prevent rapid restart loops that waste resources
Implementation:
After each restart attempt:
- Wait 2 seconds before next restart
- After 3 failures: wait 30 seconds before trying again
- Reset counter on successful run >5 minutes
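The counting and cooldown rules from 7.1 and 7.2 could be kept in one small object, as in this sketch. The class name `RestartPolicy` and its method names are hypothetical; the numbers are the ones given above.

```python
class RestartPolicy:
    """Tracks restart attempts and the wait time between them."""

    def __init__(self, max_restarts=3, cooldown=2.0, backoff=30.0, stable_after=300.0):
        self.max_restarts = max_restarts
        self.cooldown = cooldown          # wait after a normal failed attempt
        self.backoff = backoff            # wait once max_restarts failures are reached
        self.stable_after = stable_after  # uptime (seconds) that counts as recovered
        self.attempts = 0

    def exhausted(self) -> bool:
        """True once the retry limit has been reached."""
        return self.attempts >= self.max_restarts

    def record_failure(self) -> float:
        """Count one failed attempt and return the seconds to wait next."""
        self.attempts += 1
        return self.backoff if self.exhausted() else self.cooldown

    def record_uptime(self, uptime_seconds: float) -> None:
        """Reset the counter once the process has run long enough."""
        if uptime_seconds > self.stable_after:
            self.attempts = 0
```

A monitor loop would call `record_failure()` after each failed restart, sleep for the returned delay, and call `record_uptime()` while the process is running.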
8. Resource Monitoring
8.1 System Metrics to Track
CPU Usage:
Method: Read /proc/stat or use psutil.cpu_percent()
Frequency: Every 5 seconds
Threshold: Warn if >80% for >60 seconds
Memory Usage:
Method: Read /proc/meminfo or use psutil.virtual_memory()
Frequency: Every 5 seconds
Threshold: Warn if >90% for >30 seconds
Display Status:
Method: Check DPMS state or xset query
Frequency: Every 30 seconds
Threshold: Error if display off (unexpected)
Network Connectivity:
Method: Ping server or check MQTT connection
Frequency: Every 60 seconds
Threshold: Warn if no server connectivity
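The CPU and memory thresholds only fire when a value stays high for a sustained period. One way to express that is sketched below; the class name `SustainedThreshold` is hypothetical, and the readings themselves would come from `psutil` or `/proc` as described above.

```python
class SustainedThreshold:
    """Reports True once a value has stayed above a limit for a hold time."""

    def __init__(self, limit: float, hold_seconds: float):
        self.limit = limit
        self.hold_seconds = hold_seconds
        self._above_since = None  # time of the first sample above the limit

    def update(self, value: float, now: float) -> bool:
        if value <= self.limit:
            self._above_since = None  # back to normal: restart the clock
            return False
        if self._above_since is None:
            self._above_since = now
        return (now - self._above_since) > self.hold_seconds


# Thresholds from this section: CPU >80% for >60s, memory >90% for >30s.
cpu_alarm = SustainedThreshold(80.0, 60.0)
memory_alarm = SustainedThreshold(90.0, 30.0)
```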
9. Development vs Production Mode
9.1 Development Mode
Enable via: Environment variable DEBUG=true or ENV=development
Behavior:
- Send INFO level logs
- More verbose logging to console
- Shorter monitoring intervals (faster feedback)
- Screenshot capture every 30 seconds
- No rate limiting on logs
9.2 Production Mode
Enable via: ENV=production
Behavior:
- Send only ERROR and WARN logs
- Minimal console output
- Standard monitoring intervals
- Screenshot capture every 60 seconds
- Rate limiting: max 10 logs per minute per level
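The production rate limit (max 10 logs per minute per level) can be implemented with a sliding window per level. A minimal sketch, assuming a hypothetical class `LogRateLimiter`; what happens to suppressed logs (drop or count them) is left open.

```python
from collections import defaultdict, deque


class LogRateLimiter:
    """Sliding-window limiter: at most max_per_window logs per level."""

    def __init__(self, max_per_window: int = 10, window_seconds: float = 60.0):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # level -> send times inside the window

    def allow(self, level: str, now: float) -> bool:
        sent = self._sent[level]
        # Forget send times that have left the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_per_window:
            return False
        sent.append(now)
        return True
```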
10. Configuration File Format
10.1 Recommended Config: JSON
File: /etc/infoscreen/config.json or ~/.config/infoscreen/config.json
{
"client": {
"uuid": "9b8d1856-ff34-4864-a726-12de072d0f77",
"hostname": "infoscreen-room-101"
},
"mqtt": {
"broker": "192.168.43.201",
"port": 1883,
"username": "",
"password": "",
"keepalive": 60
},
"monitoring": {
"enabled": true,
"health_interval_seconds": 5,
"heartbeat_interval_seconds": 60,
"max_restart_attempts": 3,
"restart_cooldown_seconds": 2
},
"logging": {
"level": "INFO",
"send_info_logs": false,
"console_output": true,
"local_log_file": "/var/log/infoscreen/watchdog.log"
},
"processes": {
"vlc": {
"http_port": 8080,
"http_password": "vlc_password"
},
"chromium": {
"debug_port": 9222
}
}
}
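Loading this file could look like the sketch below, which fills in defaults for omitted keys. The `DEFAULTS` table mirrors the values shown above; treating `client.uuid` as mandatory and the name `load_config` are assumptions.

```python
import json

DEFAULTS = {
    "mqtt": {"port": 1883, "keepalive": 60},
    "monitoring": {
        "enabled": True,
        "health_interval_seconds": 5,
        "heartbeat_interval_seconds": 60,
        "max_restart_attempts": 3,
        "restart_cooldown_seconds": 2,
    },
}


def load_config(text: str) -> dict:
    """Parse the config JSON and fill in defaults for missing keys."""
    raw = json.loads(text)
    if not raw.get("client", {}).get("uuid"):
        raise ValueError("client.uuid is required")  # assumption: UUID is mandatory
    config = dict(raw)
    for section, defaults in DEFAULTS.items():
        # Values from the file win over the defaults.
        config[section] = {**defaults, **raw.get(section, {})}
    return config
```

A caller would read `/etc/infoscreen/config.json` (or the per-user path) and pass its contents to `load_config`.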
11. Error Scenarios & Expected Behavior
Scenario 1: VLC Crashes Mid-Video
1. Watchdog detects: process_status = "crashed"
2. Send ERROR log: "VLC process crashed"
3. Attempt 1: Restart VLC with same video, seek to last position
4. If success: Send INFO log "VLC restarted successfully"
5. If fail: Repeat 2 more times
6. After 3 failures: Send ERROR "Max restarts exceeded", show fallback
Scenario 2: Network Timeout Loading Website
1. Chromium fails to load page (CDP reports error)
2. Send WARN log: "Page load timeout"
3. Attempt reload (Chromium refresh)
4. If success after 10s: Continue monitoring
5. If timeout again: Send ERROR, try restarting Chromium
Scenario 3: Display Powers Off (Hardware)
1. DPMS check detects display off
2. Send ERROR log: "Display powered off"
3. Attempt to wake display (xset dpms force on)
4. If success: Send INFO log
5. If fail: Hardware issue, alert admin
Scenario 4: High CPU Usage
1. CPU >80% for 60 seconds
2. Send WARN log: "High CPU usage: 85%"
3. Check if expected (e.g., video playback is normal)
4. If unexpected: investigate process causing it
5. If critical (>95%): consider restarting offending process
12. Testing & Validation
12.1 Manual Tests (During Development)
Test 1: Process Crash Simulation
# Start video, then kill VLC manually
killall vlc
# Expected: ERROR log sent, automatic restart within 5 seconds
Test 2: MQTT Connectivity
# Subscribe to all client topics on server
mosquitto_sub -h 192.168.43.201 -t "infoscreen/{uuid}/#" -v
# Expected: See heartbeat every 60s, health every 5s
Test 3: Log Levels
# Trigger error condition and verify log appears in database
curl http://192.168.43.201:8000/api/client-logs/test
# Expected: See new log entry with correct level/message
12.2 Acceptance Criteria
✅ Client must:
- Send heartbeat every 60 seconds without gaps
- Send ERROR log within 5 seconds of process crash
- Attempt automatic restart (max 3 times)
- Report health metrics every 5 seconds
- Survive MQTT broker restart (reconnect automatically)
- Survive network interruption (buffer logs, send when reconnected)
- Use correct timestamp format (ISO 8601 UTC)
- Only send logs for real client UUID (FK constraint)
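The "buffer logs, send when reconnected" criterion can be met with a bounded in-memory queue. The sketch below is one possible shape; `OfflineBuffer` is a hypothetical class, the 500-item cap is an assumption, and `publish` stands for any callable that returns True when a message was handed to the broker.

```python
from collections import deque


class OfflineBuffer:
    """Holds messages while the broker is unreachable (oldest dropped when full)."""

    def __init__(self, max_items: int = 500):
        self._items = deque(maxlen=max_items)

    def add(self, topic: str, payload: str, qos: int) -> None:
        self._items.append((topic, payload, qos))

    def flush(self, publish) -> int:
        """Send buffered messages oldest first; stop at the first failure."""
        sent = 0
        while self._items:
            topic, payload, qos = self._items[0]
            if not publish(topic, payload, qos):
                break  # still offline: keep the rest for the next attempt
            self._items.popleft()
            sent += 1
        return sent
```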
13. Python Libraries (Recommended)
For process monitoring:
- `psutil` - Cross-platform process and system utilities
For MQTT:
- `paho-mqtt` - Official MQTT client (use v2.x with Callback API v2)
For VLC control:
- `requests` - HTTP client for status queries
For Chromium control:
- `websocket-client` or `pychrome` - Chrome DevTools Protocol
For datetime:
- `datetime` (stdlib) - Use `datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")`, which yields the required `Z` suffix (`isoformat()` would emit `+00:00` instead)
Example requirements.txt:
paho-mqtt>=2.0.0
psutil>=5.9.0
requests>=2.31.0
python-dateutil>=2.8.0
14. Security Considerations
14.1 MQTT Security
- If broker requires auth, store credentials in config file with restricted permissions (`chmod 600`)
- Consider TLS/SSL for MQTT (port 8883) if on untrusted network
- Use a unique client ID per client; a duplicate ID makes the broker disconnect the existing session
14.2 Process Control APIs
- VLC HTTP password should be random, not default
- Chromium debug port should bind to `127.0.0.1` only (not `0.0.0.0`)
- Restrict file system access for media player processes
14.3 Log Content
- Do not log: Passwords, API keys, personal data
- Sanitize: File paths (strip user directories), URLs (remove query params with tokens)
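The two sanitizing rules could be applied before a log is sent, as in this sketch. The helper names are hypothetical. As a simplification, `sanitize_url` drops the whole query string and fragment rather than only token parameters, and `sanitize_path` assumes user directories live under `/home`.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(url: str) -> str:
    """Drop the query string and fragment, which may carry tokens."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def sanitize_path(path: str) -> str:
    """Replace a leading /home/<user> with ~ so the user name is not logged."""
    return re.sub(r"^/home/[^/]+", "~", path)
```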
15. Performance Targets
| Metric | Target | Acceptable | Critical |
|---|---|---|---|
| Health check interval | 5s | 10s | 30s |
| Crash detection time | <5s | <10s | <30s |
| Restart time | <10s | <20s | <60s |
| MQTT publish latency | <100ms | <500ms | <2s |
| CPU usage (watchdog) | <2% | <5% | <10% |
| RAM usage (watchdog) | <50MB | <100MB | <200MB |
| Log message size | <1KB | <10KB | <100KB |
16. Troubleshooting Guide (For Client Development)
Issue: Logs not appearing in server database
Check:
- Is MQTT broker reachable? (`mosquitto_pub` test from client)
- Is client UUID correct and exists in `clients` table?
- Is timestamp format correct (ISO 8601 with 'Z')?
- Check server listener logs for errors
Issue: Health metrics not updating
Check:
- Is health loop running? (check watchdog service status)
- Is MQTT connected? (check connection status in logs)
- Is payload JSON valid? (use JSON validator)
Issue: Process restarts in loop
Check:
- Is media file/URL accessible?
- Is process command correct? (test manually)
- Check process exit code (crash reason)
- Increase restart cooldown to avoid rapid loops
17. Complete Message Flow Diagram
┌─────────────────────────────────────────────────────────┐
│  Infoscreen Client                                      │
│                                                         │
│  Event Occurs:                                          │
│  - Process crashed                                      │
│  - High CPU usage                                       │
│  - Content loaded                                       │
│                                                         │
│  ┌────────────────┐                                     │
│  │ Decision Logic │                                     │
│  │ - Is it ERROR? │                                     │
│  │ - Is it WARN?  │                                     │
│  │ - Is it INFO?  │                                     │
│  └────────┬───────┘                                     │
│           │                                             │
│           ▼                                             │
│  ┌────────────────────────────────────────┐             │
│  │ Build JSON Payload                     │             │
│  │ {                                      │             │
│  │   "timestamp": "...",                  │             │
│  │   "message": "...",                    │             │
│  │   "context": {...}                     │             │
│  │ }                                      │             │
│  └────────┬───────────────────────────────┘             │
│           │                                             │
│           ▼                                             │
│  ┌────────────────────────────────────────┐             │
│  │ MQTT Publish                           │             │
│  │ Topic: infoscreen/{uuid}/logs/error    │             │
│  │ QoS: 1                                 │             │
│  └────────┬───────────────────────────────┘             │
└───────────┼─────────────────────────────────────────────┘
            │
            │ TCP/IP (MQTT Protocol)
            │
            ▼
     ┌──────────────┐
     │ MQTT Broker  │
     │ (Mosquitto)  │
     └──────┬───────┘
            │
            │ Topic: infoscreen/+/logs/#
            │
            ▼
     ┌────────────────────────────────────┐
     │ Listener Service                   │
     │ (Python)                           │
     │                                    │
     │ - Parse JSON                       │
     │ - Validate UUID                    │
     │ - Store in database                │
     └──────┬─────────────────────────────┘
            │
            ▼
     ┌────────────────────────────────────┐
     │ MariaDB Database                   │
     │                                    │
     │ Table: client_logs                 │
     │ - client_uuid                      │
     │ - timestamp                        │
     │ - level                            │
     │ - message                          │
     │ - context (JSON)                   │
     └──────┬─────────────────────────────┘
            │
            │ SQL Query
            │
            ▼
     ┌────────────────────────────────────┐
     │ API Server (Flask)                 │
     │                                    │
     │ GET /api/client-logs/{uuid}/logs   │
     │ GET /api/client-logs/summary       │
     └──────┬─────────────────────────────┘
            │
            │ HTTP/JSON
            │
            ▼
     ┌────────────────────────────────────┐
     │ Dashboard (React)                  │
     │                                    │
     │ - Display logs                     │
     │ - Filter by level                  │
     │ - Show health status               │
     └────────────────────────────────────┘
18. Quick Reference Card
MQTT Topics Summary
infoscreen/{uuid}/logs/error → Critical failures
infoscreen/{uuid}/logs/warn → Non-critical issues
infoscreen/{uuid}/logs/info → Informational (dev mode)
infoscreen/{uuid}/health → Health metrics (every 5s)
infoscreen/{uuid}/heartbeat → Enhanced heartbeat (every 60s)
JSON Timestamp Format
from datetime import datetime, timezone
# isoformat() would yield "+00:00" and microseconds; strftime gives the required "Z" form
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
# Example output: "2026-03-10T07:30:00Z"
Process Status Values
"running" - Process is alive and responding
"crashed" - Process terminated unexpectedly
"starting" - Process is launching (startup phase)
"stopped" - Process intentionally stopped
Restart Logic
Max attempts: 3
Cooldown: 2 seconds between attempts
Reset: After 5 minutes of successful operation
19. Contact & Support
Server API Documentation:
- Base URL: `http://192.168.43.201:8000`
- Health check: `GET /health`
- Test logs: `GET /api/client-logs/test` (no auth)
- Full API docs: See `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` on server
MQTT Broker:
- Host: `192.168.43.201`
- Port: `1883` (standard), `9001` (WebSocket)
- Test tool: `mosquitto_pub` / `mosquitto_sub`
Database Schema:
- Table: `client_logs`
- Foreign Key: `client_uuid` → `clients.uuid` (ON DELETE CASCADE)
- Constraint: UUID must exist in clients table before logging
Server-Side Logs:
# View listener logs (processes MQTT messages)
docker compose logs -f listener
# View server logs (API requests)
docker compose logs -f server
20. Appendix: Example Implementations
A. Minimal Python Watchdog (Pseudocode)
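import json
import subprocess
import time
from datetime import datetime, timezone

import paho.mqtt.client as mqtt
import psutil


class MinimalWatchdog:
    MAX_RESTARTS = 3
    CHECK_INTERVAL = 5  # seconds between process checks

    def __init__(self, client_uuid, mqtt_broker):
        self.uuid = client_uuid
        self.mqtt_client = mqtt.Client(
            callback_api_version=mqtt.CallbackAPIVersion.VERSION2,
            client_id=f"infoscreen-{client_uuid}",
            clean_session=False,
        )
        self.mqtt_client.connect(mqtt_broker, 1883, 60)
        self.mqtt_client.loop_start()
        self.expected_process = None
        self.launch_command = None  # argv list used to (re)start the process
        self.restart_attempts = 0
        self.failed = False  # set once MAX_RESTARTS is exceeded

    def send_log(self, level, message, context=None):
        topic = f"infoscreen/{self.uuid}/logs/{level}"
        payload = {
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "message": message,
            "context": context or {},
        }
        # ERROR/WARN use QoS 1, INFO uses QoS 0 (see section 2.2)
        qos = 1 if level in ("error", "warn") else 0
        self.mqtt_client.publish(topic, json.dumps(payload), qos=qos)

    def is_process_running(self, process_name):
        for proc in psutil.process_iter(["name"]):
            # The name can be None when access to the process is denied
            if process_name in (proc.info["name"] or ""):
                return True
        return False

    def restart_process(self):
        self.restart_attempts += 1
        self.send_log("warn", f"Attempting restart ({self.restart_attempts}/{self.MAX_RESTARTS})")
        time.sleep(2)  # cooldown before relaunch
        subprocess.Popen(self.launch_command)

    def monitor_loop(self):
        while True:
            if self.expected_process and not self.failed:
                if not self.is_process_running(self.expected_process):
                    self.send_log("error", f"{self.expected_process} crashed")
                    if self.restart_attempts < self.MAX_RESTARTS:
                        self.restart_process()
                    else:
                        # Log once, then wait for manual intervention
                        self.send_log("error", "Max restarts exceeded")
                        self.failed = True
            time.sleep(self.CHECK_INTERVAL)


# Usage:
watchdog = MinimalWatchdog("9b8d1856-ff34-4864-a726-12de072d0f77", "192.168.43.201")
watchdog.expected_process = "vlc"
watchdog.launch_command = ["vlc", "--fullscreen", "--loop", "/path/to/video.mp4"]
watchdog.monitor_loop()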
END OF SPECIFICATION
Questions? Refer to:
- `CLIENT_MONITORING_IMPLEMENTATION_GUIDE.md` (server repo)
- Server API: `http://192.168.43.201:8000/api/client-logs/test`
- MQTT test: `mosquitto_sub -h 192.168.43.201 -t infoscreen/#`