docs: refactor docs structure and tighten assistant instruction policy

- shrink root README into a landing page with a docs map and focused contributor guidance
- add TV_POWER_RUNBOOK as the canonical TV power rollout and canary runbook
- add CHANGELOG and move project history out of README-style docs
- refactor src README into a developer-focused guide (architecture, runtime files, MQTT, debugging)
- prune redundant older HDMI docs and keep a canonical HDMI_CEC_SETUP path
- update copilot instructions to a high-signal policy format with strict anti-shadow-README design rules
- align references across docs to current files, scripts, and TV power behavior

TV_POWER_RUNBOOK.md
# TV Power Runbook

Operational runbook for Phase 1 TV power coordination using MQTT power intent plus local HDMI-CEC fallback.

## Scope

This runbook covers:

- `POWER_CONTROL_MODE` rollout
- canary validation
- expected log signatures
- rollback
- common failure checks

Contract reference:

- [TV_POWER_INTENT_SERVER_CONTRACT_V1.md](TV_POWER_INTENT_SERVER_CONTRACT_V1.md)

## Topics and Runtime Files

Phase 1 topic:

- `infoscreen/groups/{group_id}/power/intent`

Telemetry topic:

- `infoscreen/{client_id}/power/state`

Runtime files:

- `src/power_intent_state.json`
- `src/power_state.json`
- `src/current_process_health.json`

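The two topic patterns above can be filled in with a pair of small helpers. This is a minimal sketch; the function names and the example IDs are hypothetical and are not taken from the project code.

```python
def power_intent_topic(group_id: str) -> str:
    """Build the Phase 1 intent topic for one group."""
    return f"infoscreen/groups/{group_id}/power/intent"


def power_state_topic(client_id: str) -> str:
    """Build the telemetry topic for one client."""
    return f"infoscreen/{client_id}/power/state"


# Example IDs are placeholders, not real group or client identifiers.
print(power_intent_topic("lobby"))
print(power_state_topic("client-01"))
```
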
## Power Control Modes

- `local`: ignore MQTT intent and use local event-time CEC logic.
- `hybrid`: prefer fresh MQTT intent and fall back to local timing when missing, stale, or invalid.
- `mqtt`: MQTT intent is authoritative; stale or missing intent triggers safe delayed-off behavior.

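The three modes can be read as a small decision table. The sketch below is illustrative only: the function name is hypothetical, `intent_usable` stands for "an intent is present, fresh, and valid", and the labels `local` and `safe_delayed_off` are assumed names (only `mqtt_intent` and `local_fallback` appear elsewhere in this runbook).

```python
def choose_power_source(mode: str, intent_usable: bool) -> str:
    """Return which logic should drive the TV for a given mode.

    intent_usable: True when an MQTT intent is present, fresh, and valid.
    """
    if mode == "local":
        # MQTT intent is ignored entirely.
        return "local"
    if mode == "hybrid":
        # Prefer the intent, otherwise fall back to local timing.
        return "mqtt_intent" if intent_usable else "local_fallback"
    if mode == "mqtt":
        # The intent is authoritative; without it, switch off safely after a delay.
        return "mqtt_intent" if intent_usable else "safe_delayed_off"
    raise ValueError(f"unknown POWER_CONTROL_MODE: {mode!r}")


for mode in ("local", "hybrid", "mqtt"):
    print(mode, choose_power_source(mode, True), choose_power_source(mode, False))
```
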
Recommended rollout path:

1. Start with `local`.
2. Canary with `hybrid`.
3. Roll out `hybrid` fleet-wide after stable observation.
4. Use `mqtt` only if you explicitly want strict server authority.

## Gate 1: Local Mode

Set in `.env`:

```bash
POWER_CONTROL_MODE=local
```

Expected startup log signature:

```text
[INFO] Power control mode: local
```

Expected behavior:

- No MQTT power intent application.
- Existing CEC behavior remains unchanged.

## Gate 2: Hybrid Canary

On one client or one canary group, set the mode in `.env` and restart the services:

```bash
# In .env:
POWER_CONTROL_MODE=hybrid

# Then restart the services:
./scripts/restart-all.sh
```

Expected startup logs:

```text
[INFO] Power state service thread started
[INFO] Subscribed to power intent topic: infoscreen/groups/<id>/power/intent
[INFO] Power control mode: hybrid
```

### Valid ON Intent

Expected sequence:

```text
[INFO] Power intent accepted: id=<uuid> desired_state=on reason=active_event ...
[INFO] Applying MQTT power intent ON id=<uuid> reason=active_event
[INFO] TV turned ON successfully
[INFO] Power state published: state=on source=mqtt_intent result=ok
```

### Valid OFF Intent

Expected sequence:

```text
[INFO] Power intent accepted: id=<uuid> desired_state=off reason=no_active_event ...
[INFO] Applying MQTT power intent OFF id=<uuid> reason=no_active_event
[INFO] Power state published: state=off source=mqtt_intent result=ok
```

### Expired Intent

Expected rejection:

```text
[WARNING] Rejected power intent: intent expired
```

### Malformed Intent

Expected rejection:

```text
[WARNING] Rejected power intent: missing required field: intent_id
```

### Retained Clear

When you clear the retained topic, the broker delivers an empty payload.

Expected log:

```text
[INFO] Power intent retained message cleared (empty payload)
```

This is normal and should not be treated as a parse error.

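The accept, reject, and retained-clear cases above can be sketched as a single check. This is not the client's actual validator: the function name is hypothetical, and the `expires_at` field name and its ISO 8601 format are assumptions (the authoritative schema is in the contract document). Only `intent_id`, `desired_state`, and the quoted rejection messages come from the log signatures in this runbook.

```python
import json
from datetime import datetime, timezone

# "expires_at" is an assumed field name; see the contract for the real schema.
REQUIRED_FIELDS = ("intent_id", "desired_state", "expires_at")


def check_intent(payload: bytes, now: datetime):
    """Classify a raw intent payload as ("cleared" | "rejected" | "accepted", detail)."""
    if not payload.strip():
        # An empty retained payload is a clear, not a parse error.
        return ("cleared", "retained message cleared (empty payload)")
    try:
        data = json.loads(payload)
    except ValueError:
        return ("rejected", "invalid JSON")
    if not isinstance(data, dict):
        return ("rejected", "payload is not a JSON object")
    for field in REQUIRED_FIELDS:
        if field not in data:
            return ("rejected", f"missing required field: {field}")
    if data["desired_state"] not in ("on", "off"):
        return ("rejected", "invalid desired_state")
    try:
        expires = datetime.fromisoformat(str(data["expires_at"]).replace("Z", "+00:00"))
    except ValueError:
        return ("rejected", "invalid expires_at")
    if expires.tzinfo is None:
        # Treat naive timestamps as UTC (an assumption for this sketch).
        expires = expires.replace(tzinfo=timezone.utc)
    if expires <= now:
        return ("rejected", "intent expired")
    return ("accepted", data["intent_id"])


now = datetime(2026, 4, 1, 6, 0, 0, tzinfo=timezone.utc)
print(check_intent(b"", now))
print(check_intent(b'{"desired_state": "on", "expires_at": "2026-04-01T06:05:00Z"}', now))
```

Handling the empty payload before any JSON parsing is what keeps a retained clear from showing up as a warning.
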
## Validation Commands

Use:

```bash
./scripts/test-power-intent.sh
./scripts/test-hdmi-cec.sh
```

Useful `test-power-intent.sh` menu options:

- Option 1: publish valid ON intent.
- Option 2: publish valid OFF intent.
- Option 3: publish stale intent.
- Option 4: publish malformed intent.
- Option 5: clear retained topic with an empty retained payload.
- Option 6: inspect runtime JSON files.
- Option 8: subscribe to the power-state topic.

Useful manual checks:

```bash
tail -f logs/display_manager.log src/simclient.log
cat src/power_intent_state.json
cat src/power_state.json
cat src/current_process_health.json
```

## Rollback

To leave canary mode, set the mode back to `local` in `.env` and restart the services:

```bash
# In .env:
POWER_CONTROL_MODE=local

# Then restart the services:
./scripts/restart-all.sh
```

Expected result:

- MQTT power intent handling becomes inactive.
- Local CEC fallback remains in place.

## Fleet Rollout Gate

Roll out `hybrid` more widely only after:

- zero unintended TV-off events between adjacent events,
- valid ON/OFF actions apply cleanly,
- duplicate refreshes are logged as `result=skipped`,
- stale and malformed intents are rejected without side effects,
- retained clear events no longer produce noisy warnings.

Suggested observation window:

- at least 7 days on a canary client or canary group.

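During the observation window, a rough tally of the log signatures listed in this runbook can help with the gate criteria above. The sketch below is a hypothetical helper, not a project script. It matches on substrings taken from the expected log lines; the exact shape of a `result=skipped` line is assumed to mirror the `result=ok` line.

```python
# Substrings taken from the expected log signatures in this runbook.
SIGNATURES = {
    "accepted": "Power intent accepted:",
    "rejected": "Rejected power intent:",
    "skipped": "result=skipped",
    "retained_clear": "Power intent retained message cleared",
    "warnings": "[WARNING]",
}


def summarize_power_log(lines):
    """Count how many lines contain each known signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, marker in SIGNATURES.items():
            if marker in line:
                counts[name] += 1
    return counts


# Sample lines for illustration; in practice, iterate over the log file,
# for example open("logs/display_manager.log", encoding="utf-8").
sample = [
    "[INFO] Power intent accepted: id=<uuid> desired_state=on reason=active_event ...",
    "[WARNING] Rejected power intent: intent expired",
    "[INFO] Power intent retained message cleared (empty payload)",
]
print(summarize_power_log(sample))
```

A tally like this only covers what the log states explicitly. Unintended TV-off events between adjacent events still need to be checked against the schedule.
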
## Common Symptoms

| Symptom | Check | Likely Action |
|---|---|---|
| Intent never arrives | `src/power_intent_state.json` missing or invalid | Check broker connectivity and group assignment |
| `intent expired` appears repeatedly | Client clock and server publish cadence | Verify NTP and server refresh interval |
| TV turns off between adjacent events | `src/power_state.json` shows `local_fallback` or stale intent at transition | Inspect server timing and boundary coverage |
| Repeated power state publishes with `skipped` | Duplicate intent refreshes only | Normal dedupe behavior |
| Clear retained intent logs warning | Old code path still running | Restart services and verify latest code |

## Dashboard Observability

`src/current_process_health.json` includes a `power_control` block similar to:

```json
"power_control": {
  "mode": "hybrid",
  "source": "mqtt_intent",
  "last_intent_id": "4a7fe3bc-...",
  "last_action": "on",
  "last_power_at": "2026-04-01T06:00:05Z"
}
```

This is the fastest local check for what the display manager last did and why.
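
For scripted checks, the block can be pulled out of the file contents with a few lines of Python. This is a sketch under assumptions: it treats `power_control` as a top-level key of a JSON object, and the function name is hypothetical. The field names are the ones shown in the example block.

```python
import json


def power_control_summary(text: str):
    """Return a one-line summary of the power_control block, or None if it is absent."""
    try:
        data = json.loads(text)
    except ValueError:
        return None
    # Assumes power_control sits at the top level of the health file.
    block = data.get("power_control") if isinstance(data, dict) else None
    if not isinstance(block, dict):
        return None
    fields = ("mode", "source", "last_action", "last_power_at")
    return " ".join(f"{name}={block.get(name)}" for name in fields)


# Sample content modeled on the example block; the intent ID is a placeholder.
sample = json.dumps({
    "power_control": {
        "mode": "hybrid",
        "source": "mqtt_intent",
        "last_intent_id": "example-id",
        "last_action": "on",
        "last_power_at": "2026-04-01T06:00:05Z",
    }
})
print(power_control_summary(sample))
```

In practice, pass in the contents of `src/current_process_health.json`. A `None` result means the file was not valid JSON or did not contain the block.
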