docs: refactor docs structure and tighten assistant instruction policy

shrink root README into a landing page with a docs map and focused contributor guidance
add TV_POWER_RUNBOOK as the canonical TV power rollout and canary runbook
add CHANGELOG and move project history out of README-style docs
refactor src README into a developer-focused guide (architecture, runtime files, MQTT, debugging)
prune redundant older HDMI docs and keep a canonical HDMI_CEC_SETUP path
update copilot instructions to a high-signal policy format with strict anti-shadow-README design rules
align references across docs to current files, scripts, and TV power behavior
This commit is contained in:
RobbStarkAustria
2026-04-01 10:01:58 +02:00
parent fb0980aa88
commit 82f43f75ba
20 changed files with 2228 additions and 2267 deletions

View File

@@ -0,0 +1,206 @@
# TV Power Coordination Task List (Server + Client)
## Goal
Prevent unintended TV power-off during adjacent events while enabling coordinated, server-driven power intent via MQTT with robust client-side fallback.
## Scope
- Server publishes explicit TV power intent and event-window context.
- Client executes HDMI-CEC power actions with timer-safe behavior.
- Client falls back to local schedule/end-time logic if server intent is missing or stale.
- Existing event playback behavior remains backward compatible.
## Ownership Proposal
- Server team: Scheduler integration, power-intent publisher, reliability semantics.
- Client team: MQTT handler, state machine, CEC execution, fallback and observability.
---
## 1. MQTT Contract (Shared Spec)
### 1.1 Topics
- Command/intent topic (retained):
- infoscreen/{client_id}/power/intent
- Optional group-wide command topic (retained):
- infoscreen/groups/{group_id}/power/intent
- Client state/ack topic:
- infoscreen/{client_id}/power/state
### 1.2 QoS and retain
- intent topics: QoS 1, retained=true
- state topic: QoS 0 or 1 (recommend QoS 0 initially), retained=false
### 1.3 Intent payload schema (v1)
```json
{
"schema_version": "1.0",
"intent_id": "uuid-or-monotonic-id",
"issued_at": "2026-03-31T12:00:00Z",
"expires_at": "2026-03-31T12:10:00Z",
"target": {
"client_id": "optional-if-group-topic",
"group_id": "optional"
},
"power": {
"desired_state": "on",
"reason": "event_window_active",
"grace_seconds": 30
},
"event_window": {
"start": "2026-03-31T12:00:00Z",
"end": "2026-03-31T13:00:00Z"
}
}
```
### 1.4 State payload schema (client -> server)
```json
{
"schema_version": "1.0",
"intent_id": "last-applied-intent-id",
"client_id": "...",
"reported_at": "2026-03-31T12:00:01Z",
"power": {
"applied_state": "on",
"source": "mqtt_intent|local_fallback",
"result": "ok|skipped|error",
"detail": "free text"
}
}
```
### 1.5 Idempotency and ordering rules
- Client applies only newest valid intent by issued_at then intent_id tie-break.
- Duplicate intent_id must be ignored after first successful apply.
- Expired intents must not trigger new actions.
- Retained intent must be immediately usable after client reconnect.
### 1.6 Safety rules
- desired_state=on cancels any pending delayed-off timer before action.
- desired_state=off may schedule delayed-off, never immediate off during an active event window.
- If payload is malformed, client logs and ignores it.
---
## 2. Server Team Task List
### 2.1 Contract + scheduler mapping
- Finalize field names and UTC timestamp format with client team.
- Define when scheduler emits on/off intents for adjacent/overlapping events.
- Ensure contiguous events produce uninterrupted desired_state=on coverage.
### 2.2 Publisher implementation
- Add publisher for infoscreen/{client_id}/power/intent.
- Support retained messages and QoS 1.
- Include expires_at for stale-intent protection.
- Emit new intent_id for every semantic state transition.
### 2.3 Reconnect and replay behavior
- On scheduler restart, republish current effective intent as retained.
- On event edits/cancellations, publish replacement retained intent.
### 2.4 Conflict policy
- Define precedence when both group and per-client intents exist.
- Recommended: per-client overrides group intent.
### 2.5 Monitoring and diagnostics
- Record publish attempts, broker ack results, and active retained payload.
- Add operational dashboard panels for intent age and last transition.
### 2.6 Server acceptance criteria
- Adjacent event windows do not produce off intent between events.
- Reconnect test: fresh client receives retained intent and powers correctly.
- Expired intent is never acted on by a conforming client.
---
## 3. Client Team Task List
### 3.1 MQTT subscription + parsing
- Subscribe to infoscreen/{client_id}/power/intent.
- Optionally subscribe to infoscreen/groups/{group_id}/power/intent.
- Parse schema_version=1.0 payload with strict validation.
### 3.2 Power state controller integration
- Add power-intent handler in display manager path that owns HDMI-CEC decisions.
- On desired_state=on:
- cancel delayed-off timer
- call CEC on only if needed
- On desired_state=off:
- schedule delayed off using configured grace_seconds (or local default)
- re-check active event before executing off
### 3.3 Fallback behavior (critical)
- If MQTT unreachable, intent missing, invalid, or expired:
- fall back to existing local event-time logic
- use event end as off trigger with existing delayed-off safety
- If local logic sees active event, enforce cancel of pending off timer.
### 3.4 Adjacent-event race hardening
- Guarantee pending off timer is canceled on any newly active event.
- Ensure event switch path never requests off while next event is active.
- Add explicit logging for timer create/cancel/fire with reason and event_id.
### 3.5 State publishing
- Publish apply results to infoscreen/{client_id}/power/state.
- Include source=mqtt_intent or local_fallback.
- Include last intent_id and result details for troubleshooting.
### 3.6 Config flags
- Add feature toggle:
- POWER_CONTROL_MODE=local|mqtt|hybrid (recommend default: hybrid)
- hybrid behavior:
- prefer valid mqtt intent
- automatically fall back to local logic
### 3.7 Client acceptance criteria
- Adjacent events: no unintended off between two active windows.
- Broker outage during event: TV remains on via local fallback.
- Broker recovery: retained intent reconciles state without oscillation.
- Duplicate/old intents do not cause repeated CEC toggles.
---
## 4. Integration Test Matrix (Joint)
## 4.1 Happy paths
- Single event start -> on intent -> TV on.
- Event end -> off intent -> delayed off -> TV off.
- Adjacent events (end==start or small gap) -> uninterrupted TV on.
## 4.2 Failure paths
- Broker down before event start.
- Broker down during active event.
- Malformed retained intent at reconnect.
- Delayed off armed, then new event starts before timer fires.
## 4.3 Consistency checks
- Client state topic reflects actual applied source and result.
- Logs include intent_id correlation across server and client.
---
## 5. Rollout Plan
### Phase 1: Contract and feature flags
- Freeze schema and topic naming.
- Ship client support behind POWER_CONTROL_MODE=hybrid.
### Phase 2: Server publisher rollout
- Enable publishing for test group only.
- Verify retained and reconnect behavior.
### Phase 3: Production enablement
- Enable hybrid mode fleet-wide.
- Observe for 1 week: off-between-adjacent-events incidents must be zero.
### Phase 4: Optional tightening
- If metrics are stable, evaluate mqtt-first policy while retaining local safety fallback.
---
## 6. Definition of Done
- Shared MQTT contract approved by both teams.
- Server and client implementations merged with tests.
- Adjacent-event regression test added and passing.
- Operational runbook updated (topics, payloads, fallback behavior, troubleshooting).
- Production monitoring confirms no unintended mid-schedule TV power-off.