infoscreen-dev/SCREENSHOT_MQTT_FIX.md
RobbStarkAustria d6090a6179 fix(screenshots): harden event-triggered MQTT screenshot flow and cleanup docs
- fix race where periodic captures could overwrite pending event_start and event_stop metadata before simclient published
- keep latest.jpg and meta.json synchronized so triggered screenshots are not lost
- add stale pending trigger self-healing to recover from old or invalid metadata states
- improve non-interactive capture reliability with DISPLAY and XAUTHORITY fallbacks
- allow periodic idle captures in development mode so dashboard previews stay fresh without active events
- add deeper simclient screenshot diagnostics for trigger and metadata handling
- add regression test script for metadata preservation and trigger delivery
- add root-cause and fix documentation for the screenshot MQTT issue
- align and deduplicate README screenshot and troubleshooting sections; update release notes to March 2026
- fix scripts/start-dev.sh .env loading to ignore comments safely and remove export invalid identifier warnings
2026-03-29 10:38:29 +02:00

Screenshot MQTT Transmission Issue - Root Cause & Fix

Issue Summary

Event-triggered screenshots (event_start, event_stop) were captured by display_manager.py but never transmitted over MQTT by simclient.py, so the dashboard showed empty or missing data.

Root Cause: Race Condition in Metadata Handling

The Problem Timeline

  1. T=06:05:33.516Z - Event starts (event_115)

    • display_manager captures screenshot_20260329_060533.jpg (event_start)
    • Writes meta.json with "send_immediately": true, "type": "event_start"
  2. T=06:05:33.517 to 06:05:47 (a window of up to 14 seconds)

    • simclient's screenshot_service_thread sleeps 1-2 seconds between checks
    • WINDOW: simclient still has not read the event_start meta.json
  3. T=06:05:47.935Z - Periodic screenshot capture

    • display_manager captures screenshot_20260329_060547.jpg (periodic)
    • BUG: Calls _write_screenshot_meta("periodic", ...) which overwrites meta.json
    • NEW meta.json: "send_immediately": false, "type": "periodic"
  4. T=06:05:48 (next tick)

    • simclient finally reads meta.json
    • Sees: send_immediately=false, type=periodic
    • Never transmits the event_start screenshot!

Result: Event-triggered screenshot lost, periodic screenshot sent late instead.
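The timeline above can be replayed in a few lines. This is a minimal sketch, not the project's code: write_meta_unguarded and replay_race are hypothetical names, and the metadata is reduced to the two fields quoted in the timeline.

```python
import json
import tempfile
from pathlib import Path


def write_meta_unguarded(meta_path: Path, capture_type: str, send_immediately: bool) -> None:
    """Hypothetical pre-fix writer: always overwrites meta.json."""
    meta = {"type": capture_type, "send_immediately": send_immediately}
    meta_path.write_text(json.dumps(meta))


def replay_race(meta_path: Path) -> dict:
    """Replay steps 1-4 of the timeline and return what the reader sees."""
    write_meta_unguarded(meta_path, "event_start", True)  # step 1: event capture
    # step 2: the reader has not looked at meta.json yet
    write_meta_unguarded(meta_path, "periodic", False)    # step 3: periodic capture
    return json.loads(meta_path.read_text())              # step 4: reader's view


with tempfile.TemporaryDirectory() as tmp:
    seen = replay_race(Path(tmp) / "meta.json")
    print(seen)
```

By the time the reader looks at the file, the event_start entry is gone, which matches the symptom that meta.json only ever showed periodic metadata.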

Symptoms Observed

  • Display manager logs show event_start/event_stop captures with correct file sizes
  • MQTT messages from simclient show no screenshot data or empty arrays
  • Dashboard receives only periodic screenshots, missing event transitions
  • meta.json only ever contains periodic metadata, never event-triggered metadata

The Fix

Part 1: display_manager.py - Protect Event Metadata

The _write_screenshot_meta() method now prevents periodic screenshots from overwriting pending event-triggered metadata:

# existing_meta: the current meta.json contents as a dict (loaded earlier in the method).
# Before writing a periodic screenshot's metadata, check whether event-triggered
# metadata is still pending (send_immediately=True).
if not send_immediately and capture_type == "periodic":
    if existing_meta.get("send_immediately"):
        # Skip writing to preserve the event-triggered metadata.
        # .get() avoids a KeyError if the pending entry has no "type" field.
        logging.debug("Skipping periodic meta to preserve pending %s", existing_meta.get("type"))
        return

Result: Once event_start metadata is written, it stays there until simclient processes it (within 1 second), uninterrupted by periodic captures.
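For readers who want to try the guard in isolation, here is a self-contained sketch of the same idea. It is not the code in display_manager.py: load_meta and write_screenshot_meta are hypothetical standalone functions, the metadata is reduced to two fields, and the write-then-rename step is an added assumption rather than something this document describes.

```python
import json
import logging
import os
import tempfile
from pathlib import Path


def load_meta(meta_path: Path) -> dict:
    """Return the current metadata, or {} if the file is missing or unreadable."""
    try:
        return json.loads(meta_path.read_text())
    except (OSError, json.JSONDecodeError):
        return {}


def write_screenshot_meta(meta_path: Path, capture_type: str, send_immediately: bool) -> bool:
    """Write meta.json unless that would overwrite a pending triggered capture.

    Returns True if the file was written, False if the write was skipped.
    """
    existing_meta = load_meta(meta_path)
    if not send_immediately and capture_type == "periodic":
        if existing_meta.get("send_immediately"):
            logging.debug("Skipping periodic meta to preserve pending %s",
                          existing_meta.get("type"))
            return False
    meta = {"type": capture_type, "send_immediately": send_immediately}
    # Assumption: write to a temp file and rename, so a reader never sees
    # a half-written meta.json.
    tmp_path = meta_path.with_name(meta_path.name + ".tmp")
    tmp_path.write_text(json.dumps(meta))
    os.replace(tmp_path, meta_path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "meta.json"
    wrote_event = write_screenshot_meta(path, "event_start", True)
    wrote_periodic = write_screenshot_meta(path, "periodic", False)
    preserved = load_meta(path)
    print(wrote_event, wrote_periodic, preserved["type"])
```

Returning a boolean instead of a bare return is a choice made for this sketch so the skip is observable without reading logs.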

Part 2: simclient.py - Enhanced Logging

Added diagnostic logging to screenshot_service_thread to show:

  • When meta.json is detected and its contents
  • When triggered screenshots are being sent
  • File information for troubleshooting

Result: Better visibility into what's happening with metadata processing.
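As an illustration of the kind of diagnostic line this describes, the sketch below builds a one-line summary of the metadata and image state. The helper name describe_screenshot_state and the message format are hypothetical; the actual log messages in simclient.py are not reproduced in this document.

```python
import json
import logging
import tempfile
from pathlib import Path


def describe_screenshot_state(meta_path: Path, image_path: Path) -> str:
    """Summarize meta.json and the image file in one log-friendly line."""
    try:
        meta = json.loads(meta_path.read_text())
    except (OSError, json.JSONDecodeError) as exc:
        return f"meta.json unreadable ({type(exc).__name__})"
    # Report the image size so an empty or missing file is visible in the log.
    size = image_path.stat().st_size if image_path.exists() else None
    return (
        f"meta type={meta.get('type')} "
        f"send_immediately={meta.get('send_immediately')} "
        f"image={image_path.name} size={size}"
    )


with tempfile.TemporaryDirectory() as tmp:
    meta_file = Path(tmp) / "meta.json"
    image_file = Path(tmp) / "latest.jpg"
    missing_line = describe_screenshot_state(meta_file, image_file)
    meta_file.write_text(json.dumps({"type": "event_start", "send_immediately": True}))
    image_file.write_bytes(b"\xff\xd8\xff")  # three placeholder bytes, not a real JPEG
    present_line = describe_screenshot_state(meta_file, image_file)
    logging.debug(present_line)
```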

Verification

Test script test-screenshot-meta-fix.sh confirms:

[PROTECTED] Not overwriting pending event_start (send_immediately=True)
Current meta.json preserved: {"type": "event_start", "send_immediately": true, ...}
[SUCCESS] Event-triggered metadata preserved!

How It Works Now

  1. display_manager captures event_start, writes meta.json with send_immediately=true
  2. Next periodic capture: _write_screenshot_meta() detects pending flag, skips updating meta.json
  3. simclient reads meta.json within 1 second, sees send_immediately=true
  4. simclient immediately calls send_screenshot_heartbeat() and transmits the event_start screenshot
  5. simclient clears the send_immediately flag
  6. On next periodic capture, meta.json is safely updated
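The reader side of steps 3 to 5 can be sketched as follows. This is an assumption-laden illustration, not the body of screenshot_service_thread: process_meta_once is a hypothetical helper, and the send callback stands in for send_screenshot_heartbeat(), whose real signature is not shown in this document.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def process_meta_once(meta_path: Path, send: Callable[[dict], None]) -> bool:
    """Send a pending triggered screenshot once, then clear the flag.

    Returns True if something was sent, False otherwise.
    """
    try:
        meta = json.loads(meta_path.read_text())
    except (OSError, json.JSONDecodeError):
        return False
    if not isinstance(meta, dict) or not meta.get("send_immediately"):
        return False
    send(dict(meta))  # pass a copy so later changes do not affect the payload
    # Clear the flag so the next periodic capture may update meta.json again.
    meta["send_immediately"] = False
    meta_path.write_text(json.dumps(meta))
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "meta.json"
    path.write_text(json.dumps({"type": "event_start", "send_immediately": True}))
    sent = []
    first = process_meta_once(path, sent.append)
    second = process_meta_once(path, sent.append)
    flag_after = json.loads(path.read_text())["send_immediately"]
    print(first, second, len(sent), flag_after)
```

Clearing the flag only after the send callback returns means a failed send leaves the trigger pending for the next tick, which is one reasonable ordering under these assumptions.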

Key Files Modified

  • src/display_manager.py - Line ~1742: _write_screenshot_meta() protection logic
  • src/simclient.py - Line ~727: Enhanced logging in screenshot_service_thread()

Testing

Run the verification test:

./test-screenshot-meta-fix.sh

Expected output: [SUCCESS] Event-triggered metadata preserved!

Impact

  • event_start and event_stop screenshots are now transmitted over MQTT
  • Dashboard now receives complete event lifecycle data
  • Clearer logs help diagnose future screenshot transmission issues