2 Commits

Author SHA1 Message Date
brianhansonxyz 27cad05cd9 sessions: document baseline revert + parked-controlsd architecture
Briefing for the on-device agent covering commits 47321e3, f7e602c,
and 887b9c9: what was reverted vs kept, the UI re-wire, the parked
controlsd swap mechanism, build prerequisites, runtime verification
commands, and a troubleshooting playbook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:18:34 -05:00
brianhansonxyz 887b9c9e12 parked-controlsd mode: shut down heavy stack while ignition+park
Adds a second controlsd variant that runs while ignition is on but the
car is in Park. It only listens to CAN and publishes carState — no
model, no planner, no lateral/long control, no actuator commands — so
modeld, locationd, calibrationd, plannerd, radard, paramsd, torqued,
dmonitoring*, soundd, loggerd all stay stopped while parked.

Manager swaps between the two via mutually-exclusive predicates:
  - controlsd_parked: ignition AND not started
  - controlsd (full): started (= ignition AND not_parked)

Thermald owns the swap. It already subscribes to carState; we add a
new onroad condition `not_parked` derived from gearShifter, with a
1.5s hysteresis on going into parked (R/P/D thrash protection) and
zero hysteresis on going out (instant wake on shift to D/R/N). At
boot we assume parked so the heavy stack waits for carState to
confirm gear has actually left Park.

Manager predicates can only see persistent Params, not pandaStates,
so thermald exposes ignition as a new IgnitionOn param (edge-written).
Reverse is treated as not-parked — driver is moving.

Files:
- selfdrive/controls/controlsd_parked.py (new, ~50 lines)
- selfdrive/thermald/thermald.py: not_parked condition + IgnitionOn
- selfdrive/manager/process_config.py: parked_only predicate + entry
- selfdrive/manager/manager.py: seed IgnitionOn=False
- common/params.cc: register IgnitionOn

The full controlsd is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:13:41 -05:00
6 changed files with 460 additions and 0 deletions
+1
View File
@@ -138,6 +138,7 @@ std::unordered_map<std::string, uint32_t> keys = {
{"GsmRoaming", PERSISTENT},
{"HardwareSerial", PERSISTENT},
{"HasAcceptedTerms", PERSISTENT},
{"IgnitionOn", CLEAR_ON_MANAGER_START},
{"IMEI", PERSISTENT},
{"InstallDate", PERSISTENT},
{"IsDriverViewEnabled", CLEAR_ON_MANAGER_START},
+54
View File
@@ -0,0 +1,54 @@
#!/usr/bin/env python3
"""
CLEARPILOT: minimal controlsd variant that runs while ignition is on but the
car is in Park. Keeps CAN parsing and carState publishing alive (so thermald
can see gearShifter and decide when to swap us out for the full controlsd),
but skips all of the heavy onroad work — no model, no planner, no lateral or
longitudinal control, no actuator commands.
Manager swaps between this and the full controlsd via predicate flips:
- this runs when: ignition AND not started
- full runs when: started (which requires ignition AND not_parked)
The two are mutually exclusive — only one publishes carState at a time.
"""
from types import SimpleNamespace
from openpilot.common.realtime import Priority, config_realtime_process
from openpilot.selfdrive.car.card import CarD
def _make_default_frogpilot_variables() -> SimpleNamespace:
"""Safe defaults for fields read inside CarInterface.update / CarState.update.
We're not actuating anything here; these only need to keep the update path
from raising AttributeError. False/0 across the board is the safe baseline."""
fv = SimpleNamespace()
fv.conditional_experimental_mode = False
fv.experimental_mode_via_distance = False
fv.traffic_mode = False
fv.sport_plus = False
fv.long_pitch = False
fv.no_lat_lane_change = False
return fv
def main():
config_realtime_process(4, Priority.CTRL_HIGH)
# CarD's __init__ blocks until it sees CAN + a pandaState, then calls get_car
# to fingerprint and write CarParams. Same path the full controlsd takes.
card = CarD()
card.initialize()
fv = _make_default_frogpilot_variables()
# state_update drains CAN, parses carState, publishes carState/carOutput/carParams.
# Internally blocks via drain_sock_raw(wait_for_one=True), so the loop is
# naturally paced by CAN traffic — no extra sleep needed.
while True:
card.state_update(fv)
if __name__ == "__main__":
main()
+3
View File
@@ -363,6 +363,9 @@ def manager_init(frogpilot_functions) -> None:
params.put("GitRemote", get_origin())
params.put_bool("IsTestedBranch", is_tested_branch())
params.put_bool("IsReleaseBranch", is_release_branch())
# CLEARPILOT: thermald is the source of truth for IgnitionOn; seed False so
# the parked-controlsd predicate evaluates to False before thermald's first tick.
params.put_bool("IgnitionOn", False)
# set dongle id
reg_res = register(show_spinner=True)
+11
View File
@@ -43,6 +43,12 @@ def only_onroad(started: bool, params, CP: car.CarParams) -> bool:
def only_offroad(started, params, CP: car.CarParams) -> bool:
return not started
# CLEARPILOT: predicate for the parked controlsd variant. Runs while ignition
# is on but the car is in Park (so started=False because thermald has gated it
# off). Mutually exclusive with the full controlsd, which uses only_onroad.
def parked_only(started, params, CP: car.CarParams) -> bool:
return params.get_bool("IgnitionOn") and not started
# FrogPilot functions
def allow_logging(started, params, CP: car.CarParams) -> bool:
allow_logging = not (params.get_bool("DeviceManagement") and params.get_bool("NoLogging"))
@@ -82,6 +88,11 @@ procs = [
PythonProcess("calibrationd", "selfdrive.locationd.calibrationd", only_onroad),
PythonProcess("torqued", "selfdrive.locationd.torqued", only_onroad),
PythonProcess("controlsd", "selfdrive.controls.controlsd", only_onroad),
# CLEARPILOT: lightweight CAN listener that runs while ignition is on and the
# car is parked. Publishes carState (so thermald can see gear); does no model,
# planner, or actuator work. Manager swaps it out for the full controlsd as
# soon as gear leaves Park.
PythonProcess("controlsd_parked", "selfdrive.controls.controlsd_parked", parked_only),
PythonProcess("deleter", "system.loggerd.deleter", always_run),
PythonProcess("dmonitoringd", "selfdrive.monitoring.dmonitoringd", driverview, enabled=(not PC or WEBCAM)),
# PythonProcess("qcomgpsd", "system.qcomgpsd.qcomgpsd", qcomgps, enabled=TICI),
+36
View File
@@ -170,7 +170,17 @@ def thermald_thread(end_event, hw_queue) -> None:
onroad_conditions: dict[str, bool] = {
"ignition": False,
# CLEARPILOT: park-aware gating. When False, manager runs controlsd_parked
# (CAN listener only) instead of the full onroad stack. Latched + hysteresis
# on going-into-parked to avoid R↔P↔D thrash; flips out of parked instantly.
# Initialized False (assume parked) so the full stack waits for carState
# to confirm gear has actually left Park before spinning up.
"not_parked": False,
}
is_parked = True
parked_since: float | None = None # monotonic ts when gear first read as Park
PARKED_HYSTERESIS_S = 1.5
ignition_param_prev: bool | None = None
startup_conditions: dict[str, bool] = {}
startup_conditions_prev: dict[str, bool] = {}
@@ -247,6 +257,32 @@ def thermald_thread(end_event, hw_queue) -> None:
onroad_conditions["ignition"] = False
cloudlog.error("panda timed out onroad")
# CLEARPILOT: derive is_parked from carState gearShifter. Whichever controlsd
# variant is currently running publishes carState; we just read the gear.
# Going INTO parked has hysteresis (PARKED_HYSTERESIS_S) so brief P touches
# during low-speed parking don't kick the heavy stack off. Going OUT of
# parked is instant so the full stack starts spinning up the moment the
# driver shifts to D/R/N.
if sm.updated['carState']:
gear = sm['carState'].gearShifter
gear_is_park = gear == car.CarState.GearShifter.park
now_mono = time.monotonic()
if gear_is_park:
if parked_since is None:
parked_since = now_mono
if (not is_parked) and (now_mono - parked_since) >= PARKED_HYSTERESIS_S:
is_parked = True
else:
parked_since = None
is_parked = False
onroad_conditions["not_parked"] = not is_parked
# CLEARPILOT: expose ignition as a Param so manager predicates (which only
# see persistent Params, not pandaStates) can gate controlsd_parked.
if ignition_param_prev != onroad_conditions["ignition"]:
params.put_bool("IgnitionOn", onroad_conditions["ignition"])
ignition_param_prev = onroad_conditions["ignition"]
try:
last_hw_state = hw_queue.get_nowait()
except queue.Empty:
@@ -0,0 +1,355 @@
# Session: 2026-04-26 — Baseline Revert + Parked-Controlsd Mode
## Context
This session was driven by a regression: the steering wheel "feels like
it pulls right" during normal driving, with no clear smoking gun. The
suspicion was that one of the variable-FPS / standstill-throttling
changes (added to reduce parked-state fan noise and CPU load) bled into
on-road driving behavior in a hard-to-isolate way.
Strategy: revert all driving-relevant logic to a known-good baseline
captured in `/projects/openpilot/archive/clearpilot` (HEAD `980f0aa`,
July 2024), keep all of the ClearPilot UI/dashcam/telemetry/bench-mode
infrastructure intact on top, then attack the parked-fan-noise problem
fresh from a different angle that doesn't touch driving logic at all.
Three commits landed in this order on branch `clearpilot`:
| SHA | Title |
|---|---|
| `47321e3` | restore driving logic to pre-variable-fps baseline |
| `f7e602c` | controlsd: re-wire UI hooks on top of restored baseline |
| `887b9c9` | parked-controlsd mode: shut down heavy stack while ignition+park |
Pre-revert tip was `62a403d`. The on-device agent should treat
**`62a403d` as "the broken version"** when looking at history.
---
## Commit 1: `47321e3` — Baseline restore
Reverted the following files wholesale to their `980f0aa` archive copy:
- `selfdrive/controls/controlsd.py`
- `selfdrive/controls/lib/events.py`
- `selfdrive/controls/lib/longitudinal_planner.py`
- `selfdrive/modeld/modeld.py`
- `selfdrive/modeld/dmonitoringmodeld.py`
- `selfdrive/locationd/calibrationd.py`
- `selfdrive/locationd/paramsd.py`
- `selfdrive/locationd/torqued.py`
- `selfdrive/car/interfaces.py`
- `selfdrive/car/hyundai/carstate.py` (CAN-FD telemetry preserved as a
commented block at the bottom of `update_canfd` — re-enable by
uncommenting; the `tlog` import is also commented out).
- `selfdrive/monitoring/dmonitoringd.py`
- `selfdrive/frogpilot/controls/frogpilot_planner.py`
- `common/realtime.py`
### Intentionally NOT restored (kept as the post-`62a403d` version)
- `selfdrive/thermald/*` — fan/power tuning kept as-is.
- `selfdrive/car/hyundai/carcontroller.py` and `hyundaicanfd.py`
reviewed; the only delta vs baseline is hoisting the
`no_lat_lane_change` Params read out of the 100Hz hot path
(~5% carcontroller CPU) by passing the bit as an argument. This is
a perf-only change with no behavioral effect outside lane changes.
- `cereal/services.py`, `cereal/custom.capnp` — additive only.
`custom.capnp` adds `latRequested @3` and `noLatLaneChange @4` fields
to `FrogPilotCarControl`; capnp tag numbers are append-only so this is
safe to leave in even though baseline code doesn't write to them.
- `selfdrive/manager/*`, `common/params.cc` — heavy ClearPilot
infrastructure (bench mode, log dir, dashcamd, gpsd, ClearPilot params).
- All `selfdrive/ui/`, `selfdrive/clearpilot/`, `system/clearpilot/`.
### Things removed by the restore (no longer in the tree)
- Standstill frame skipping in modeld (was: skip GPU inference 19/20
frames at standstill, report 0 dropped frames to fool controlsd).
- Standstill frame skipping in dmonitoringmodeld.
- Model standby logic + `ModelStandby`/`ModelStandbyTs` reads in
controlsd's comm-issue suppression path.
- Parked-cycle skip in `state_control()` (10Hz vs 100Hz when in Park).
- Calibrationd validity decoupling from `sm.all_checks()`.
- Post-engage 2s commIssue/location/params suppression window.
- Per-cycle carstate write-gating for `CarSpeedLimit`/`CarIsMetric` in
carstate.py.
- The diff-based carstate telemetry calls (preserved commented out).
These were the candidates for the steering-pull regression. The on-device
agent should **not re-introduce any of these** without a deliberate plan
and a test session. The user's intent is to get baseline driving feel
confirmed first, then re-introduce optimizations one at a time and drive
each.
---
## Commit 2: `f7e602c` — UI hooks on top of baseline
Baseline `controlsd.py` doesn't have the UI plumbing the ClearPilot UI
expects. This commit re-adds only the things needed for the existing UI
to keep working — pure params-write plumbing, no actuator effect.
### Changes (all in `selfdrive/controls/controlsd.py`)
1. **Import `SpeedState`** from `openpilot.selfdrive.clearpilot.speed_logic`.
2. **`Controls.__init__`**:
- `params_memory.put_bool("ScreenDisplayMode", 0)`
`params_memory.put_int("ScreenDisplayMode", 0)` (UI reads it as int).
- Added `self.speed_state = SpeedState()`, `self.speed_state_frame = 0`,
`self.was_driving_gear = False`.
3. **SubMaster** — added `gpsLocation` to the subscriber list, with
`ignore_alive` / `ignore_avg_freq` / `ignore_valid` so missing GPS
doesn't trigger commIssue.
4. **`clearpilot_state_control(...)`** rewritten from a simple 3-state
cycle into the documented 5-state ScreenDisplayMode machine:
- Auto-wake on park→drive edge if currently in screen-off (state 3).
- LFA button transitions in drive: `0→4, 1→2, 2→3, 3→4, 4→2`.
- LFA button transitions outside drive: any except 3 → 3, state 3 → 0.
- Speed/cruise-warning overlay tick at ~2Hz (`speed_state.update(...)`)
reading `gpsLocation`, `CarSpeedLimit` param, `self.is_metric`,
`CS.cruiseState`. This is what writes
`ClearpilotSpeedDisplay`/`ClearpilotSpeedLimitDisplay`/
`ClearpilotCruiseWarning` for the UI overlay.
5. **Lane-change suppression sync** — at the existing baseline
`clearpilot_disable_lat_on_lane_change` block (around line 687), in
addition to the existing `params_memory.put_bool("no_lat_lane_change", ...)`,
also set `self.frogpilot_variables.no_lat_lane_change = True/False`.
This is required because the kept (post-`62a403d`) carcontroller
reads off `frogpilot_variables`, not Params.
### What does NOT change
No edits to lateral or longitudinal control paths. No new actuator-side
behavior. The UI features (nightrider/screen-off/auto day-night, speed
overlay, cruise warning chime) get wired back up purely through param
writes.
---
## Commit 3: `887b9c9` — Parked-controlsd mode
Architectural fix for the original problem this whole session was
chasing: while ignition is on but the car is in Park, the entire onroad
stack (modeld, planner, control, locationd, calibrationd, paramsd,
torqued, dmonitoring*, soundd, loggerd) is running and burning CPU/fan
even though none of it is needed.
Solution: redefine "onroad" as **ignition AND not parked** instead of
just **ignition**. Reuse the existing `started`-based process gating in
manager. Add a tiny second controlsd variant that runs while parked,
just to keep CAN flowing so thermald can see when gear leaves Park.
### Files
#### NEW: `selfdrive/controls/controlsd_parked.py`
Minimal entry point. Roughly:
```python
def main():
config_realtime_process(4, Priority.CTRL_HIGH)
card = CarD() # blocks until first CAN, fingerprints car
card.initialize()
fv = _make_default_frogpilot_variables() # safe False/0 SimpleNamespace
while True:
card.state_update(fv) # publishes carState/carOutput/carParams
```
`CarD.state_update` blocks via `drain_sock_raw(wait_for_one=True)`, so
the loop is paced by CAN traffic — no extra sleep, no CPU spin.
The default `frogpilot_variables` sets these to safe values so
`CarInterfaceBase.update` doesn't `AttributeError`:
`conditional_experimental_mode`, `experimental_mode_via_distance`,
`traffic_mode`, `sport_plus`, `long_pitch`, `no_lat_lane_change` — all
False.
#### `selfdrive/thermald/thermald.py`
- `onroad_conditions` now also has `"not_parked"`. Initialized to
`False` (assume parked at boot) so the heavy stack waits for carState
to confirm gear has left Park before spinning up.
- New module-level loop variables: `is_parked = True`,
`parked_since: float | None = None`, `PARKED_HYSTERESIS_S = 1.5`,
`ignition_param_prev: bool | None = None`.
- New block right after the panda-disconnect check (around line 258 in
current state) reads `sm['carState'].gearShifter`:
- Gear == Park: latch `parked_since`, flip `is_parked = True` after
1.5s of continuous Park (hysteresis).
- Gear != Park: clear `parked_since`, `is_parked = False` immediately
(no hysteresis going out).
- Reverse is treated as **not parked** — driver is moving.
- `onroad_conditions["not_parked"] = not is_parked` every tick.
- New `IgnitionOn` Params write, edge-driven (only on change of
`onroad_conditions["ignition"]`) so we don't hammer the persistent
filesystem 2x/sec.
#### `selfdrive/manager/process_config.py`
- New predicate:
```python
def parked_only(started, params, CP):
return params.get_bool("IgnitionOn") and not started
```
- New process entry directly after the existing `controlsd` entry:
```python
PythonProcess("controlsd_parked", "selfdrive.controls.controlsd_parked", parked_only),
```
- `controlsd` is unchanged (still `only_onroad`). Mutually exclusive
with `parked_only` because `started` is the negation of the relevant
condition.
#### `selfdrive/manager/manager.py`
- Single line in `manager_init()` seeding `IgnitionOn=False` so the
predicate evaluates correctly before thermald's first tick.
#### `common/params.cc`
- New entry in the alphabetical I-block:
```cpp
{"IgnitionOn", CLEAR_ON_MANAGER_START},
```
- Manager predicates can only see persistent Params (not pandaStates
or `/dev/shm/params`), which is why thermald has to expose ignition
this way.
### State machine summary
| Ignition | Gear | Predicates true | What runs |
|----------|-------------|-----------------------------|-------------------------------|
| off | any | none | always_run only (ui, thermald, pandad, deleter, …) |
| on | Park (>1.5s)| parked_only | always_run + controlsd_parked |
| on | not Park | only_onroad (=> all `started`) | full onroad stack |
The transition between rows 2 and 3 is purely manager process
swapping driven by predicate flips — no IPC handshake, no
self-termination. Thermald sees gear change, flips `not_parked`,
`should_start = all(onroad_conditions.values())` flips, manager kills
the wrong variant and spawns the right one on its next tick.
---
## What the on-device agent needs to know / debug
### Build prerequisites
A new param key was added (`IgnitionOn`), so the C++ params whitelist
needs a fresh build. Per `CLAUDE.md` "Adding New Params":
```bash
chown -R comma:comma /data/openpilot
rm -f /data/openpilot/prebuilt /data/openpilot/common/params.o /data/openpilot/common/libcommon.a
su - comma -c "bash /data/openpilot/build_only.sh"
```
`build_only.sh` already deletes `prebuilt` but **does not** delete
`params.o` / `libcommon.a` — verify those are gone before building or
the new key won't be picked up and `Params().put_bool("IgnitionOn", …)`
will throw `UnknownKeyName` in manager_init or thermald.
### How to verify the swap is working
```bash
# 1. Watch which controlsd variant is running
watch -n 1 'ps -ef | grep -E "controlsd(_parked)?" | grep -v grep'
# 2. Watch the gating signals
watch -n 1 'echo "IgnitionOn:"; cat /data/params/d/IgnitionOn 2>/dev/null; \
echo; echo "deviceState.started:" ; \
python3 -c "import cereal.messaging as m; \
s=m.sub_sock(\"deviceState\", timeout=1000); \
print(m.recv_one(s).deviceState.started)"'
# 3. Watch gear via carState
python3 -c "
import cereal.messaging as m
s = m.sub_sock('carState', timeout=2000)
while True:
msg = m.recv_one(s)
if msg: print(msg.carState.gearShifter)
"
```
Expected behavior:
- Ignition off: neither variant in `ps`. `IgnitionOn` is `0`/missing.
- Ignition on, in Park: `controlsd_parked` in `ps`, `controlsd` is not.
`started` is False. After ~1.5s of confirmed Park, modeld and friends
should have stopped.
- Shift to Drive: `controlsd_parked` disappears within ~500ms, full
`controlsd` appears, all the onroad processes spin up. There will be
a brief carState gap during the swap (~0.52s).
### What to suspect first if something breaks
1. **Manager crash on startup** with `UnknownKeyName: IgnitionOn`. Means
`params.o`/`libcommon.a` weren't rebuilt. Delete them and rebuild.
2. **`controlsd_parked` keeps respawning / dying.** Check
`/data/log2/current/controlsd_parked.log`. Most likely either:
- `CarD.__init__` is hanging on `get_one_can` because pandad isn't
up yet — should only matter on the very first boot.
- A `frogpilot_variables` attribute we missed defaulting; add it to
`_make_default_frogpilot_variables` in `controlsd_parked.py`.
3. **Full controlsd never spawns after shift to Drive.** Check
`IgnitionOn` (should be `1`), check carState.gearShifter (should not
be `park`/`unknown`), check thermald.log for `should_start` logic.
Also check that `carState` is actually being published by
`controlsd_parked`.
4. **Steering still pulls right.** The whole point of commit 1 is to
rule out the variable-FPS work as the cause. If the symptom persists
on baseline-restored driving logic, the suspect list shifts to:
- Anything still on the kept side (carcontroller's
`no_lat_lane_change` plumbing, thermald-related changes,
custom.capnp additions).
- Hardware: a calibration that drifted, an alignment issue, panda
CAN bus issue, or torque tuning that was modified outside of these
files.
- The custom driving model selected (`Params("Model")`). Confirm
which `.thneed`/`.onnx` is loaded — none of the model files
themselves were changed in this session.
5. **Panda safety alerts during park transition.** If panda logs a
"lost heartbeat" or drops to NOOUTPUT mode in the swap window, we
need controlsd_parked to issue a no-op carcontrol heartbeat to keep
panda happy. Not implemented in this session — flag for follow-up.
### Open follow-up items (NOT done in this session)
- **Cold-start latency on shift-from-Park.** Modeld load + calibration
warmup may produce a noticeable gap before lateral/long are ready.
Anticipatory wake on `CS.brakePressed && in_park` is the planned
mitigation if needed.
- **Methodically reintroduce optimizations.** Once baseline driving
feels right, the standstill optimizations (modeld 1fps, fan clamps,
etc.) can come back one at a time, each with a drive test.
- **The CAN-FD telemetry block in `selfdrive/car/hyundai/carstate.py`**
is preserved as a commented block at the bottom of `update_canfd()`.
Re-enabling requires uncommenting + restoring the `tlog` import at
the top of the file.
---
## Key file index for fast navigation
```
selfdrive/controls/controlsd.py # full controlsd, restored to baseline + UI hooks re-added
selfdrive/controls/controlsd_parked.py # NEW: parked-only CAN listener
selfdrive/controls/clearpilot_state_control # in controlsd.py, ~line 1255 — 5-state ScreenDisplayMode + speed_state tick
selfdrive/thermald/thermald.py # gear-aware not_parked + IgnitionOn writer (~line 258)
selfdrive/manager/process_config.py # parked_only predicate + new entry
selfdrive/manager/manager.py # IgnitionOn seed in manager_init
common/params.cc # IgnitionOn registered (CLEAR_ON_MANAGER_START)
selfdrive/car/card.py # CarD class — used by both controlsd variants, unchanged
selfdrive/clearpilot/speed_logic.py # SpeedState class — unchanged, called from controlsd.clearpilot_state_control
selfdrive/car/hyundai/carstate.py # restored baseline + commented telemetry block at end of update_canfd
```
## Reproducing the diff per commit
```bash
git show 47321e3 # baseline restore
git show f7e602c # UI hooks
git show 887b9c9 # parked mode
git diff 62a403d..887b9c9 # full session delta
```