# Screenshot MQTT Transmission Issue - Root Cause & Fix

## Issue Summary

Event-triggered screenshots (`event_start`, `event_stop`) were captured by `display_manager.py` but **not transmitted** via MQTT by `simclient.py`, so the dashboard received empty or missing screenshot data.

## Root Cause: Race Condition in Metadata Handling

### The Problem Timeline

1. **T=06:05:33.516Z** - Event starts (event_115)
   - display_manager captures `screenshot_20260329_060533.jpg` (event_start)
   - Writes `meta.json` with `"send_immediately": true, "type": "event_start"`

2. **T=06:05:33.517-06:05:47 (up to 14 seconds later)**
   - simclient's `screenshot_service_thread` sleeps 1-2 seconds
   - **Window**: simclient still has not read the event_start `meta.json`

3. **T=06:05:47.935Z** - Periodic screenshot capture
   - display_manager captures `screenshot_20260329_060547.jpg` (periodic)
   - **BUG**: Calls `_write_screenshot_meta("periodic", ...)`, which **overwrites meta.json**
   - New `meta.json`: `"send_immediately": false, "type": "periodic"`

4. **T=06:05:48 (next tick)**
   - simclient finally reads `meta.json`
   - Sees: `send_immediately=false, type=periodic`
   - Never transmits the event_start screenshot!

Result: The event-triggered screenshot is lost, and the periodic screenshot is sent late in its place.

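The timeline above can be reproduced in a few lines. This is a minimal sketch, not the project's code: `write_meta_unguarded` is a hypothetical stand-in for the pre-fix `_write_screenshot_meta()`, and the metadata is reduced to the two fields quoted in the timeline.

```python
import json
import tempfile
from pathlib import Path


def write_meta_unguarded(meta_path: Path, capture_type: str, send_immediately: bool) -> None:
    """Hypothetical stand-in for the pre-fix writer: always overwrites meta.json."""
    meta = {"type": capture_type, "send_immediately": send_immediately}
    meta_path.write_text(json.dumps(meta))


def read_meta(meta_path: Path) -> dict:
    """What the consumer (simclient in the real system) sees on its next tick."""
    return json.loads(meta_path.read_text())


def reproduce_race() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        meta_path = Path(tmp) / "meta.json"
        # Step 1: the event starts and trigger metadata is written.
        write_meta_unguarded(meta_path, "event_start", True)
        # Step 3: a periodic capture fires before the consumer has polled.
        write_meta_unguarded(meta_path, "periodic", False)
        # Step 4: the consumer finally reads the file.
        return read_meta(meta_path)


if __name__ == "__main__":
    print(reproduce_race())
```

Run as a script, this prints `{'type': 'periodic', 'send_immediately': False}`: the consumer never learns that an `event_start` capture was pending.
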
## Symptoms Observed

- Display manager logs show event_start/event_stop captures with correct file sizes
- MQTT messages from simclient show no screenshot data or empty arrays
- Dashboard receives only periodic screenshots, missing event transitions
- meta.json only contains periodic metadata, never event-triggered

## The Fix

### Part 1: display_manager.py - Protect Event Metadata

Modified the `_write_screenshot_meta()` method to **prevent periodic screenshots from overwriting pending event-triggered metadata**:

```python
# Before writing a periodic screenshot's metadata, check whether event-triggered
# metadata is still pending (send_immediately=True).
# existing_meta is assumed to hold the parsed contents of the current meta.json.
if not send_immediately and capture_type == "periodic":
    if existing_meta.get('send_immediately'):
        # Skip writing - preserve the event-triggered metadata
        logging.debug(f"Skipping periodic meta to preserve pending {existing_meta['type']}")
        return
```

**Result**: Once event_start metadata is written, it stays in place until simclient processes it (within 1 second) and is no longer overwritten by periodic captures.

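For readers who want to try the guard in isolation, here is a self-contained sketch. It is not the actual method from `display_manager.py`: the function signature, the `load_meta` helper, the boolean return value, and the write-then-rename step are all assumptions added for illustration.

```python
import json
import logging
import os
import tempfile
from pathlib import Path


def load_meta(meta_path: Path) -> dict:
    """Return the current metadata, or {} if the file is missing or unreadable."""
    try:
        return json.loads(meta_path.read_text())
    except (OSError, ValueError):
        return {}


def write_screenshot_meta(meta_path: Path, capture_type: str, send_immediately: bool) -> bool:
    """Write meta.json unless a periodic capture would clobber a pending trigger.

    Returns True if the file was written, False if the write was skipped.
    """
    existing_meta = load_meta(meta_path)
    if not send_immediately and capture_type == "periodic":
        if existing_meta.get("send_immediately"):
            logging.debug("Skipping periodic meta to preserve pending %s",
                          existing_meta.get("type"))
            return False

    meta = {"type": capture_type, "send_immediately": send_immediately}
    # Assumption: write to a temp file and rename it into place, so a reader
    # polling meta.json never sees a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=str(meta_path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(meta, handle)
    os.replace(tmp_name, meta_path)
    return True
```

Only periodic writes are blocked. A later triggered capture such as `event_stop` still replaces a pending `event_start` in this sketch.
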
### Part 2: simclient.py - Enhanced Logging

Added diagnostic logging to `screenshot_service_thread` to show:

- When meta.json is detected and its contents
- When triggered screenshots are being sent
- File information for troubleshooting

**Result**: Better visibility into how metadata is processed.

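As an illustration of the kind of diagnostic line this produces, the sketch below builds a one-line summary from the metadata and the image file. The helper name `describe_capture`, the logger name, and the message format are hypothetical. The actual log messages in `simclient.py` are not shown in this document.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("screenshot_diagnostics")  # hypothetical logger name


def describe_capture(meta_path: Path, image_path: Path) -> str:
    """Build a one-line summary of meta.json and the image it refers to."""
    try:
        meta = json.loads(meta_path.read_text())
    except (OSError, ValueError) as exc:
        return f"meta unreadable ({exc.__class__.__name__})"
    size = image_path.stat().st_size if image_path.exists() else None
    return (
        f"type={meta.get('type')} "
        f"send_immediately={meta.get('send_immediately')} "
        f"image={image_path.name} size_bytes={size}"
    )
```

A polling loop could then call `logger.info(describe_capture(meta_path, image_path))` on every tick where `meta.json` exists.
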
## Verification

Test script `test-screenshot-meta-fix.sh` confirms:

```
[PROTECTED] Not overwriting pending event_start (send_immediately=True)
Current meta.json preserved: {"type": "event_start", "send_immediately": true, ...}
[SUCCESS] Event-triggered metadata preserved!
```

## How It Works Now

1. display_manager captures event_start, writes meta.json with `send_immediately=true`
2. Next periodic capture: `_write_screenshot_meta()` detects the pending flag and **skips updating** meta.json
3. simclient reads meta.json within 1 second, sees `send_immediately=true`
4. simclient immediately calls `send_screenshot_heartbeat()` and transmits the event_start screenshot
5. simclient clears the `send_immediately` flag
6. On the next periodic capture, meta.json is safely updated

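Steps 3 to 5 on the simclient side can be sketched as follows. This is an assumption-laden illustration rather than the real `screenshot_service_thread`: `process_pending_trigger` is a hypothetical name, and the `send` callback stands in for `send_screenshot_heartbeat()`, whose real signature is not shown here.

```python
import json
from pathlib import Path
from typing import Callable


def process_pending_trigger(meta_path: Path, send: Callable[[dict], None]) -> bool:
    """Send a pending triggered screenshot once, then clear the flag.

    Returns True if something was sent, False otherwise.
    """
    try:
        meta = json.loads(meta_path.read_text())
    except (OSError, ValueError):
        return False  # no metadata yet, or the file is unreadable
    if not meta.get("send_immediately"):
        return False  # nothing pending; periodic handling happens elsewhere
    send(meta)
    # Clear the flag so periodic captures may update meta.json again.
    meta["send_immediately"] = False
    meta_path.write_text(json.dumps(meta))
    return True
```

Clearing the flag only after `send` returns means that a send which raises leaves the trigger pending for the next tick, at the cost of a possible duplicate transmission.
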
## Key Files Modified

- `src/display_manager.py` - Line ~1742: `_write_screenshot_meta()` protection logic
- `src/simclient.py` - Line ~727: Enhanced logging in `screenshot_service_thread()`

## Testing

Run the verification test:

```bash
./test-screenshot-meta-fix.sh
```

Expected output: `[SUCCESS] Event-triggered metadata preserved!`

## Impact

- `event_start` and `event_stop` screenshots are now transmitted via MQTT
- Dashboard now receives complete event lifecycle data
- Clearer logs help diagnose future screenshot transmission issues