13 Commits

Author SHA1 Message Date
brianhansonxyz 74e7c9e627 parked-controlsd: tester-present heartbeat; cruise-set keeps full stack; dashcam idle on park
Three independent changes for the parked-controlsd architecture.

controlsd_parked: send the Hyundai HDA2 tester-present heartbeat
("\x02\x3E\x80\x00\x00\x00\x00\x00") at 1 Hz to 0x730 on E-CAN while we're
the active controlsd variant. The full carcontroller normally sends this
to keep the ADAS ECU held in its disabled diagnostic session — when full
controlsd hands off to parked-controlsd, the heartbeat used to stop, the
ECU would time out (~5 s default-session timeout) and snap back to stock,
lighting up the LKAS / blind-spot warning icons on the cluster. Continuing
the heartbeat from the parked variant keeps the ECU disabled across the
swap. The panda safety filter only allows tester-present on 0x730 so this
is the only "graceful release" mechanism available to us.

thermald: cruise-set override on the parked check. If carState.cruiseState
.speed > 0 (engaged OR paused-with-speed-set), stay in not_parked even if
gear is in P. The user can shift to park at a stop, glance at the cluster
to verify cruise is still set, and roll forward without waiting for full
controlsd to spin up. PARKED_HYSTERESIS_S still applies for the
gear-in-park-no-cruise → parked transition.

dashcamd: close the trip immediately on gear shift to PARK (was: 10-min
idle timer before close). User wants the dashcam idle in park and a fresh
trip on every drive engagement; brief drive-thru / fuel-stop across-trip
continuity isn't valued.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:41:44 -05:00
brianhansonxyz ab9158bfb7 adopt pre-modelrevert clearpilot tree (d639e28) as the new head
Discard the modelrevert tree adoption (8b4b7e0) and the in-process park
short-circuits / cached-output / dashcam-idle work that came with it
(0dc8002, 37e095e). Restore the clearpilot tree as it stood at d639e28 —
the parked-controlsd manager-process split, the GPS-disable in locationd,
the controlsd UI hooks, the boardd ignition-edge safety_setter_thread
fix. After a full /data/params/d wipe and re-calibration drive, the
modelrevert-tree variant overcorrected on turns; reverting to the
parked-controlsd architecture (which Brian had previously vetted and
documented in 887b9c9 + 27cad05) and starting fresh.

Single new commit, no merge — file state matches d639e28 byte-for-byte.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:17:25 -05:00
brianhansonxyz 7a584a7e39 Revert "sessions: document hyundai canfd steer torque bump"
This reverts commit 5d76576a15.
2026-04-26 13:22:46 -05:00
brianhansonxyz 0dc8002c6d revert hyundai canfd torque to stock; park-publish keepalive; dashcam idle on park
Three independent changes bundled together.

Revert hyundai canfd steering torque limits to comma stock defaults
(270 / 2 / 3 / 112) in both panda safety and openpilot CarControllerParams.
The 270->324 bump caused overcorrection on turns and weaving on straights.
Web research turned up no public reports of any 4th gen Tucson NX4 owner
bumping STEER_MAX — the documented Tucson tuning effort is entirely on
lateralTuning (latAccelFactor ~2.9-3.1, friction ~0.095), not the cap.
hoomoose's EV6/Ioniq 5 PR #25723 is the canonical "raise STEER_MAX
without dropping latAccelFactor causes overcorrection" data point — and
even that change was reverted upstream. Right next move for this car is
to tune latAccelFactor / friction, not the torque ceiling.

plannerd: keep publishing longitudinalPlan at the normal cadence in park,
but skip update() compute. Skipping publishes entirely caused
longitudinalPlan to time out the alive flag at controlsd, which fired a
real commIssue ("not_alive") on park->drive. Stale published values are
fine because controlsd's own park short-circuit ignores the
longitudinalPlan content while parked. Also gate publish_ui_plan on
not-parked: it reads longitudinal_planner.a_desired_trajectory_full
which is only set inside update(), so calling it without a prior update
crashes plannerd with AttributeError (which fires "Process Not Running"
on screen). uiPlan is UI-only, not on controlsd's commIssue check list,
so going silent in park is fine.

frogpilot_process: same idea — keep publishing frogpilotPlan in park to
keep alive, skip the heavy update() compute.

dashcamd: close the trip immediately on gear shift to PARK (was: 10-min
idle timer before close). User wants the dashcam idle in park and a
fresh trip on every drive engagement; brief drive-thru / fuel-stop
across-trip continuity isn't valued.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:18:46 -05:00
brianhansonxyz 37e095eab7 controlsd: fix park short-circuit lac_log type mismatch crashing controlsd
The park short-circuit added in cea422b created lac_log as
LateralDebugState. publish_logs assigns lac_log to
controlsState.lateralControlState.<subtype> based on
CP.lateralTuning.which() — for torque-tuned cars (Hyundai Tucson here)
that's torqueState, expecting LateralTorqueState. The mismatch threw
a capnp KjException ("Value type mismatch") on the first state_control
cycle in park, killing controlsd. Manager respawned it, it crashed
again on the next park cycle, repeat — controlsState was never reliably
published, so the UI never saw the started/onroad transition and didn't
flip to the camera view.

Fix: call self.LaC.update(False, ...) which internally early-returns
when active is False (cheap) but produces the correct lac_log subtype
for whichever controller this car uses (LateralTorqueState for torque,
LateralPIDState for pid, LateralAngleState for angle). Pass through
the same sm reads the regular path uses; they're cached so the cost is
negligible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:43:08 -05:00
brianhansonxyz 5d76576a15 sessions: document hyundai canfd steer torque bump
Reference for the 270->324 change in commit 4058269. Covers what each
constant means, the two-files-must-stay-in-lockstep rule, the path to
push higher (324 -> 384, the community-consensus safe ceiling), risks
and symptoms to watch for at higher values, verification steps, and
rollback procedure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:31:10 -05:00
brianhansonxyz 4058269762 hyundai canfd: bump steer max 270->324 (~+20%), rate_up 2->3, rate_down 3->5
Tucson (and other Hyundai CAN-FD HDA2 cars) get pinned to comma's
conservative steer ceiling of 270 in the CAN-FD branch of
selfdrive/car/hyundai/values.py and the matching panda safety limit
HYUNDAI_CANFD_STEERING_LIMITS. Comma's non-CAN-FD HKG default is 384;
PR commaai/openpilot#25723 merged 384 for HDA1 EV6/Ioniq 5 with
operator quote "max steer needed to be 384 to make basic turns."

Bumping to 324 (~+20%) keeps a safe margin under the proven 384
ceiling while giving enough headroom for moderate clover-style on-ramps
without the controller hitting the limit. Rate limits also nudged up
(2->3 up, 3->5 down) to match the slightly higher ceiling and reduce
ramp time / release lag; chose 5 instead of comma's 7 for delta_down to
keep release smoother. max_rt_delta scaled proportionally (112->134) so
the 250ms real-time check tracks the new ceiling.

Both files MUST stay in lockstep — panda safety enforces independently
and would reject larger commands if only one side was bumped. Panda
firmware re-flashes automatically on next pandad start because the
safety code hash changed.

References:
  https://github.com/commaai/openpilot/pull/25723   (HDA1 270->384)
  https://github.com/commaai/openpilot/issues/24122 (HKG torque blacklist)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:29:06 -05:00
brianhansonxyz 8b4b7e04b5 adopt modelrevert tree as the new clearpilot baseline
Replaces clearpilot's working state wholesale with the modelrevert branch's
tree (modelrevert tip cea422b). Discards the parked-controlsd manager-process
split and the two session READMEs that documented it; keeps the simpler
in-process park short-circuits (controlsd state_control, plannerd, frogpilot_process)
and the cached-output decimation (modeld, dmonitoringmodeld) that achieve
the same goal with less moving parts. Also brings in the locationd GPS
ignore, the calibrationd valid=calStatus gate, and the model-revert lineage's
controlsd / paramsd / torqued / events.py / carstate.py / interfaces.py.

This is a single new commit on clearpilot (no merge), so the branch advances
linearly while the file state matches modelrevert exactly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:17:35 -05:00
brianhansonxyz d639e28057 unblock startup chain + disable GPS in locationd; document session
Four logically-related changes that together get the manager booting
cleanly end-to-end after the prior session's baseline revert + parked-mode
split, plus a session doc.

boardd: trigger safety_setter_thread on ignition rising edge, not on
IsOnroad rising edge. The parked-mode split broke the stock assumption
that ignition rising implies IsOnroad rising — IsOnroad now requires
`started`, which requires thermald to see carState != park, which
requires controlsd_parked to publish carState, which can't happen until
boardd acks OBD multiplexing. Triggering on ignition edge restores the
"set safety as soon as the bus is alive" intent for both controlsd
variants.

controlsd: drop the two `params_memory.put_bool("no_lat_lane_change",
...)` calls. The key was never registered in common/params.cc so
state_control crashed with UnknownKeyName on first cycle. The carcontroller
reads off `frogpilot_variables.no_lat_lane_change` (in-process), which
controlsd already sets; the UI reads `frogpilotCarControl.NoLatLaneChange`
from cereal, which nobody was setting in the restored controlsd. Add
`self.FPCC.noLatLaneChange = True/False` in the same lane-change branch
so the UI lane-edge indicator reflects state. No actuator change.

cereal/services.py: restore deviceState/managerState declarations to 2Hz
to match the restored DT_TRML=0.5 (thermald at 2Hz). Earlier fan-control
work bumped both to 5Hz; the realtime.py revert undid the thermald rate
bump but services.py wasn't reverted, so freq window [4.0, 6.0]Hz failed
on every cycle and controlsd fired commIssue continuously.

locationd: add a `clearpilot_disable_gps` const at the top of handle_gps
and OR it into the existing reject condition. With it true, every
gpsLocation message falls through to determine_gps_mode() — openpilot's
stock no-GPS path. last_gps_msg never updates, is_gps_ok() permanently
false. gpsd is untouched so UI / dashcamd / clock-set / night-mode auto-
switch keep working unchanged. The user's "drift right on straight roads"
symptom went away after this edit; the previous gpsd.py was hard-coding
vNED=[0,0,0] while the car was moving, feeding the Kalman contradictory
GPS-vs-IMU velocity observations that propagated into latcontrol_torque
through liveLocationKalman.angularVelocityCalibrated. Reversible by
flipping the const to false.

sessions/: a single README documenting this session. Includes the
calibrationd-still-stale investigation — liveCalibration.valid stuck at
False because of a Python SubMaster freq_ok issue with carState under
poll='cameraOdometry' (likely MSGQ NUM_READERS=12 eviction with too many
subscribers). 7ee923b already solved this exact failure mode by gating
calibrationd's publish on calStatus instead of sm.all_checks(); that
commit was reverted in 47321e3 as part of the variable-FPS rollback
but is unrelated to that family and is the natural next move.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:48:11 -05:00
brianhansonxyz 27cad05cd9 sessions: document baseline revert + parked-controlsd architecture
Briefing for the on-device agent covering commits 47321e3, f7e602c,
and 887b9c9: what was reverted vs kept, the UI re-wire, the parked
controlsd swap mechanism, build prerequisites, runtime verification
commands, and a troubleshooting playbook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:18:34 -05:00
brianhansonxyz 887b9c9e12 parked-controlsd mode: shut down heavy stack while ignition+park
Adds a second controlsd variant that runs while ignition is on but the
car is in Park. It only listens to CAN and publishes carState — no
model, no planner, no lateral/long control, no actuator commands — so
modeld, locationd, calibrationd, plannerd, radard, paramsd, torqued,
dmonitoring*, soundd, loggerd all stay stopped while parked.

Manager swaps between the two via mutually-exclusive predicates:
  - controlsd_parked: ignition AND not started
  - controlsd (full): started (= ignition AND not_parked)

Thermald owns the swap. It already subscribes to carState; we add a
new onroad condition `not_parked` derived from gearShifter, with a
1.5s hysteresis on going into parked (R/P/D thrash protection) and
zero hysteresis on going out (instant wake on shift to D/R/N). At
boot we assume parked so the heavy stack waits for carState to
confirm gear has actually left Park.

Manager predicates can only see persistent Params, not pandaStates,
so thermald exposes ignition as a new IgnitionOn param (edge-written).
Reverse is treated as not-parked — driver is moving.

Files:
- selfdrive/controls/controlsd_parked.py (new, ~50 lines)
- selfdrive/thermald/thermald.py: not_parked condition + IgnitionOn
- selfdrive/manager/process_config.py: parked_only predicate + entry
- selfdrive/manager/manager.py: seed IgnitionOn=False
- common/params.cc: register IgnitionOn

The full controlsd is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:13:41 -05:00
brianhansonxyz f7e602c00b controlsd: re-wire UI hooks on top of restored baseline
Adds back the UI plumbing that the baseline controlsd doesn't have, so
the existing UI features keep working without changing driving logic:

- ScreenDisplayMode 5-state machine (auto-normal, auto-nightrider,
  manual normal, screen-off, manual nightrider) driven by the LFA
  debug button + gear edges. Replaces baseline's simple 0..2 cycle.
  Pure params write — no actuator effect.
- Speed/cruise-warning overlay tick at ~2Hz feeding SpeedState, which
  writes ClearpilotSpeedDisplay / ClearpilotSpeedLimitDisplay /
  ClearpilotCruiseWarning for the onroad UI. Reads gpsLocation,
  CarSpeedLimit, and CS.cruiseState — no actuator effect.
- frogpilot_variables.no_lat_lane_change is now populated alongside
  the existing Params write, so the perf-optimized carcontroller (which
  takes the bit as an argument instead of re-reading Params on the
  100Hz hot path) still sees the lane-change suppression signal.
- ScreenDisplayMode init switched from put_bool to put_int (UI reads
  it as int).
- gpsLocation added to SubMaster (ignore_alive/avg_freq/valid set,
  since gpsd is a ClearPilot addition not present everywhere).

No changes to controlsd's lateral or longitudinal control paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 08:44:52 -05:00
brianhansonxyz 47321e3867 restore driving logic to pre-variable-fps baseline
Wholesale revert of driving-relevant files to the snapshot in
/projects/openpilot/archive/clearpilot (HEAD 980f0aa). Goal: get
known-good driving behavior back, then re-introduce optimizations
slowly to track down a "feels like the wheel pulls right" regression.

Files restored from baseline:
- selfdrive/controls/controlsd.py
- selfdrive/controls/lib/events.py
- selfdrive/controls/lib/longitudinal_planner.py
- selfdrive/modeld/modeld.py
- selfdrive/modeld/dmonitoringmodeld.py
- selfdrive/locationd/calibrationd.py
- selfdrive/locationd/paramsd.py
- selfdrive/locationd/torqued.py
- selfdrive/car/interfaces.py
- selfdrive/car/hyundai/carstate.py (CAN-FD telemetry preserved as a
  commented block for future re-enable)
- selfdrive/monitoring/dmonitoringd.py
- selfdrive/frogpilot/controls/frogpilot_planner.py
- common/realtime.py

Intentionally NOT restored (kept as current):
- selfdrive/thermald/* (fan/power tuning kept)
- selfdrive/car/hyundai/carcontroller.py + hyundaicanfd.py (perf-only
  hoist of no_lat_lane_change Params read; behavior-equivalent)
- cereal/services.py, cereal/custom.capnp (additive only)
- selfdrive/manager/*, common/params.cc (heavy ClearPilot
  infrastructure: bench mode, log dir, dashcamd, gpsd, params)
- All selfdrive/ui/, selfdrive/clearpilot/, system/clearpilot/

UI features will be re-wired in a follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 08:40:02 -05:00
24 changed files with 1041 additions and 413 deletions
+2 -2
View File
@@ -30,7 +30,7 @@ services: dict[str, tuple] = {
"temperatureSensor": (True, 2., 200),
"temperatureSensor2": (True, 2., 200),
"gpsNMEA": (True, 9.),
"deviceState": (True, 5., 1), # CLEARPILOT: 5Hz — thermald decimates pandaStates (10Hz) every 2 ticks
"deviceState": (True, 2., 1), # CLEARPILOT: matches DT_TRML=0.5 (thermald at 2Hz)
"can": (True, 100., 1223), # decimation gives ~5 msgs in a full segment
"controlsState": (True, 100., 10),
"pandaStates": (True, 10., 1),
@@ -70,7 +70,7 @@ services: dict[str, tuple] = {
"wideRoadEncodeIdx": (False, 20., 1),
"wideRoadCameraState": (True, 20., 20),
"modelV2": (True, 20., 40),
"managerState": (True, 5., 1), # CLEARPILOT: 5Hz — gated by deviceState arrival (see thermald)
"managerState": (True, 2., 1), # CLEARPILOT: gated by deviceState arrival, so matches thermald rate
"uploaderState": (True, 0., 1),
"navInstruction": (True, 1., 10),
"navRoute": (True, 0.),
+1
View File
@@ -138,6 +138,7 @@ std::unordered_map<std::string, uint32_t> keys = {
{"GsmRoaming", PERSISTENT},
{"HardwareSerial", PERSISTENT},
{"HasAcceptedTerms", PERSISTENT},
{"IgnitionOn", CLEAR_ON_MANAGER_START},
{"IMEI", PERSISTENT},
{"InstallDate", PERSISTENT},
{"IsDriverViewEnabled", CLEAR_ON_MANAGER_START},
+1 -1
View File
@@ -12,7 +12,7 @@ from openpilot.system.hardware import PC
# time step for each process
DT_CTRL = 0.01 # controlsd
DT_MDL = 0.05 # model
DT_TRML = 0.25 # thermald and manager — 4 Hz
DT_TRML = 0.5 # thermald and manager
DT_DMON = 0.05 # driver monitoring
+10 -2
View File
@@ -415,6 +415,7 @@ void panda_state_thread(std::vector<Panda *> pandas, bool spoofing_started) {
Panda *peripheral_panda = pandas[0];
bool is_onroad = false;
bool is_onroad_last = false;
bool ignition_last = false;
std::future<bool> safety_future;
std::vector<std::string> connected_serials;
@@ -472,8 +473,14 @@ void panda_state_thread(std::vector<Panda *> pandas, bool spoofing_started) {
is_onroad = params.getBool("IsOnroad");
// set new safety on onroad transition, after params are cleared
if (is_onroad && !is_onroad_last) {
// CLEARPILOT: trigger on ignition rising edge instead of IsOnroad rising edge.
// ClearPilot's parked-mode split breaks the stock assumption that IsOnroad
// rises with ignition: IsOnroad now requires `started`, which requires
// thermald to see carState != park, which requires controlsd_parked to
// finish CarD init, which requires this thread to ack OBD multiplexing.
// Firing on ignition restores the original "set safety as soon as the bus
// is alive" timing for both controlsd variants.
if (ignition && !ignition_last) {
if (!safety_future.valid() || safety_future.wait_for(0ms) == std::future_status::ready) {
safety_future = std::async(std::launch::async, safety_setter_thread, pandas);
} else {
@@ -482,6 +489,7 @@ void panda_state_thread(std::vector<Panda *> pandas, bool spoofing_started) {
}
is_onroad_last = is_onroad;
ignition_last = ignition;
sm.update(0);
const bool engaged = sm.allAliveAndValid({"controlsState"}) && sm["controlsState"].getControlsState().getEnabled();
+57 -27
View File
@@ -13,7 +13,6 @@ from openpilot.selfdrive.car.hyundai.hyundaicanfd import CanBus
from openpilot.selfdrive.car.hyundai.values import HyundaiFlags, CAR, DBC, CAN_GEARS, CAMERA_SCC_CAR, \
CANFD_CAR, Buttons, CarControllerParams
from openpilot.selfdrive.car.interfaces import CarStateBase
from openpilot.selfdrive.clearpilot.telemetry import tlog
PREV_BUTTON_SAMPLES = 8
CLUSTER_SAMPLE_RATE = 20 # frames
@@ -48,10 +47,6 @@ class CarState(CarStateBase):
self.is_metric = False
self.buttons_counter = 0
# CLEARPILOT: cache to avoid per-cycle atomic writes to /dev/shm (eats CPU via fsync/flock)
self._prev_car_speed_limit = None
self._prev_car_is_metric = None
self.cruise_info = {}
# On some cars, CLU15->CF_Clu_VehicleSpeed can oscillate faster than the dash updates. Sample at 5 Hz
@@ -214,15 +209,10 @@ class CarState(CarStateBase):
self.lkas_previously_enabled = self.lkas_enabled
self.lkas_enabled = cp.vl["BCM_PO_11"]["LFA_Pressed"]
# CLEARPILOT: gate on change — see same fix in update_canfd
car_speed_limit = self.calculate_speed_limit(cp, cp_cam) * speed_conv
if car_speed_limit != self._prev_car_speed_limit:
self.params_memory.put_float("CarSpeedLimit", car_speed_limit)
self._prev_car_speed_limit = car_speed_limit
if self.is_metric != self._prev_car_is_metric:
self.params_memory.put("CarIsMetric", "1" if self.is_metric else "0")
self._prev_car_is_metric = self.is_metric
# self.params_memory.put_int("CarSpeedLimitLiteral", self.calculate_speed_limit(cp, cp_cam))
self.params_memory.put_float("CarSpeedLimit", self.calculate_speed_limit(cp, cp_cam) * speed_conv)
self.params_memory.put_float("CarCruiseDisplayActual", cp_cruise.vl["SCC11"]["VSetDis"])
return ret
@@ -425,23 +415,63 @@ class CarState(CarStateBase):
# nonAdaptive = false,
# speedCluster = 0 )
# CLEARPILOT: gate on change — these writes run 100Hz, each is an atomic fsync/flock transaction
car_speed_limit = self.calculate_speed_limit(cp, cp_cam) * speed_factor
if car_speed_limit != self._prev_car_speed_limit:
self.params_memory.put_float("CarSpeedLimit", car_speed_limit)
self._prev_car_speed_limit = car_speed_limit
if self.is_metric != self._prev_car_is_metric:
self.params_memory.put("CarIsMetric", "1" if self.is_metric else "0")
self._prev_car_is_metric = self.is_metric
# print("Set limit")
# print(self.calculate_speed_limit(cp, cp_cam))
# self.params_memory.put_float("CarSpeedLimitLiteral", self.calculate_speed_limit(cp, cp_cam))
self.params_memory.put_float("CarSpeedLimit", self.calculate_speed_limit(cp, cp_cam) * speed_factor)
# CLEARPILOT: telemetry logging — disabled, re-enable when needed
# CLEARPILOT: CAN-FD telemetry — preserved but disabled. Re-enable by uncommenting (also restore the import).
# from openpilot.selfdrive.clearpilot.telemetry import tlog
#
# speed_limit_bus = cp if self.CP.flags & HyundaiFlags.CANFD_HDA2 else cp_cam
# scc = cp_cam.vl["SCC_CONTROL"] if self.CP.flags & HyundaiFlags.CANFD_CAMERA_SCC else cp.vl["SCC_CONTROL"]
# cluster = speed_limit_bus.vl["CLUSTER_SPEED_LIMIT"]
# tlog("car", { ... })
# tlog("cruise", { ... })
# tlog("speed_limit", { ... })
# tlog("buttons", { ... })
#
# tlog("car", {
# "vEgo": round(ret.vEgo, 3),
# "vEgoRaw": round(ret.vEgoRaw, 3),
# "aEgo": round(ret.aEgo, 3),
# "steeringAngleDeg": round(ret.steeringAngleDeg, 1),
# "gear": str(ret.gearShifter),
# "brakePressed": ret.brakePressed,
# "gasPressed": ret.gasPressed,
# "standstill": ret.standstill,
# "leftBlinker": ret.leftBlinker,
# "rightBlinker": ret.rightBlinker,
# })
#
# tlog("cruise", {
# "enabled": ret.cruiseState.enabled,
# "available": ret.cruiseState.available,
# "speed": round(ret.cruiseState.speed, 3),
# "standstill": ret.cruiseState.standstill,
# "accFaulted": ret.accFaulted,
# "ACCMode": scc.get("ACCMode", 0),
# "VSetDis": scc.get("VSetDis", 0),
# "aReqRaw": round(scc.get("aReqRaw", 0), 3),
# "aReqValue": round(scc.get("aReqValue", 0), 3),
# "DISTANCE_SETTING": scc.get("DISTANCE_SETTING", 0),
# "ACC_ObjDist": round(scc.get("ACC_ObjDist", 0), 1),
# })
#
# tlog("speed_limit", {
# "SPEED_LIMIT_1": cluster.get("SPEED_LIMIT_1", 0),
# "SPEED_LIMIT_2": cluster.get("SPEED_LIMIT_2", 0),
# "SPEED_LIMIT_3": cluster.get("SPEED_LIMIT_3", 0),
# "SCHOOL_ZONE": cluster.get("SCHOOL_ZONE", 0),
# "CHIME_1": cluster.get("CHIME_1", 0),
# "CHIME_2": cluster.get("CHIME_2", 0),
# "SPEED_CHANGE_BLINKING": cluster.get("SPEED_CHANGE_BLINKING", 0),
# "calculated": self.calculate_speed_limit(cp, cp_cam),
# "is_metric": self.is_metric,
# })
#
# tlog("buttons", {
# "cruise_button": self.cruise_buttons[-1],
# "main_button": self.main_buttons[-1],
# "lkas_enabled": self.lkas_enabled,
# "main_enabled": self.main_enabled,
# })
return ret
-9
View File
@@ -484,19 +484,10 @@ class CarInterfaceBase(ABC):
self.silent_steer_warning = True
events.add(EventName.steerTempUnavailableSilent)
else:
# CLEARPILOT: log once per instance of this warning
if not getattr(self, '_steer_fault_logged', False):
import sys
print(f"CLP steerTempUnavailable: steerFaultTemporary={cs_out.steerFaultTemporary} "
f"steeringPressed={cs_out.steeringPressed} standstill={cs_out.standstill} "
f"steering_unpressed={self.steering_unpressed} steeringAngleDeg={cs_out.steeringAngleDeg:.1f} "
f"steeringTorque={cs_out.steeringTorque:.1f} vEgo={cs_out.vEgo:.2f}", file=sys.stderr)
self._steer_fault_logged = True
events.add(EventName.steerTempUnavailable)
else:
self.no_steer_warning = False
self.silent_steer_warning = False
self._steer_fault_logged = False
if cs_out.steerFaultPermanent:
events.add(EventName.steerUnavailable)
+12 -11
View File
@@ -12,20 +12,20 @@
* Trip lifecycle state machine:
*
* WAITING:
* - Process starts in this state
* - Process starts in this state. Idle.
* - Waits for valid system time (year >= 2024) AND car in drive gear
* - Transitions to RECORDING when both conditions met
*
* RECORDING:
* - Actively encoding frames, car is in drive
* - Car leaves drive → start 10-min idle timer → IDLE_TIMEOUT
*
* IDLE_TIMEOUT:
* - Car left drive, still recording with timer running
* - Car re-enters drive → cancel timer → RECORDING
* - Timer expires → close trip → WAITING
* - Gear shift into PARK → close trip immediately → WAITING (idle)
* - Ignition off → close trip → WAITING
*
* IDLE_TIMEOUT (deprecated, retained for safety):
* - Was used to keep recording across brief drive-thru / fuel stops.
* - Now unreachable: drive→park transitions close the trip immediately
* and a fresh trip starts on the next drive engagement.
*
* Graceful shutdown (DashcamShutdown param):
* - thermald sets DashcamShutdown="1" before device power-off
* - dashcamd closes current segment, acks, exits
@@ -301,10 +301,11 @@ int main(int argc, char *argv[]) {
}
case RECORDING:
if (!in_drive) {
idle_timer_start = now;
state = IDLE_TIMEOUT;
LOGW("dashcamd: car left drive, starting 10-min idle timer");
// CLEARPILOT: close trip immediately on park (no idle timer). User wants
// dashcam idle in park, fresh trip on each drive engagement.
if (gear == cereal::CarState::GearShifter::PARK) {
LOGW("dashcamd: gear in park, closing trip");
close_trip();
}
break;
+90 -210
View File
@@ -37,10 +37,11 @@ from openpilot.system.version import get_short_branch
from openpilot.selfdrive.frogpilot.controls.lib.frogpilot_functions import CRUISING_SPEED, PROBABILITY, MovingAverageCalculator
from openpilot.selfdrive.frogpilot.controls.lib.model_manager import RADARLESS_MODELS
from openpilot.selfdrive.clearpilot.telemetry import tlog
from openpilot.selfdrive.clearpilot.speed_logic import SpeedState
from openpilot.selfdrive.frogpilot.controls.lib.speed_limit_controller import SpeedLimitController
# CLEARPILOT: UI plumbing for ScreenDisplayMode and the speed/cruise-warning overlay
from openpilot.selfdrive.clearpilot.speed_logic import SpeedState
SOFT_DISABLE_TIME = 3 # seconds
LDW_MIN_SPEED = 31 * CV.MPH_TO_MS
LANE_DEPARTURE_THRESHOLD = 0.1
@@ -49,7 +50,7 @@ CAMERA_OFFSET = 0.04
REPLAY = "REPLAY" in os.environ
SIMULATION = "SIMULATION" in os.environ
TESTING_CLOSET = "TESTING_CLOSET" in os.environ
IGNORE_PROCESSES = {"loggerd", "encoderd", "statsd", "telemetryd", "dashcamd"}
IGNORE_PROCESSES = {"loggerd", "encoderd", "statsd"}
ThermalStatus = log.DeviceState.ThermalStatus
State = log.ControlsState.OpenpilotState
@@ -79,11 +80,14 @@ class Controls:
self.params_storage = Params("/persist/params")
self.params_memory.put_bool("CPTLkasButtonAction", False)
# CLEARPILOT: ScreenDisplayMode is an int (5-state machine: 0..4); UI reads it via getInt
self.params_memory.put_int("ScreenDisplayMode", 0)
# ClearPilot speed processing
# CLEARPILOT: speed/cruise-warning overlay state, ticked at ~2Hz from clearpilot_state_control
self.speed_state = SpeedState()
self.speed_state_frame = 0
# CLEARPILOT: edge tracking for park->drive auto-wake of screen
self.was_driving_gear = False
self.radarless_model = self.params.get("Model", encoding='utf-8') in RADARLESS_MODELS
@@ -114,8 +118,7 @@ class Controls:
'carOutput', 'driverMonitoringState', 'longitudinalPlan', 'liveLocationKalman',
'managerState', 'liveParameters', 'radarState', 'liveTorqueParameters',
'testJoystick', 'frogpilotPlan', 'gpsLocation'] + self.camera_packets + self.sensor_packets,
ignore_alive=ignore+['gpsLocation'], ignore_avg_freq=ignore+['radarState', 'testJoystick', 'gpsLocation'],
ignore_valid=['testJoystick', 'gpsLocation'],
ignore_alive=ignore + ['gpsLocation'], ignore_avg_freq=ignore+['radarState', 'testJoystick', 'gpsLocation'], ignore_valid=['testJoystick', 'gpsLocation'],
frequency=int(1/DT_CTRL))
self.joystick_mode = self.params.get_bool("JoystickDebugMode")
@@ -170,26 +173,6 @@ class Controls:
self.current_alert_types = [ET.PERMANENT]
self.logged_comm_issue = None
self.not_running_prev = None
self.lat_requested_ts = 0 # CLEARPILOT: timestamp when lat was first requested (ramp-up delay + comm issue suppression)
self.prev_lat_requested = False
# CLEARPILOT: gate frogpilotCarControl send on change (3 static bools, was publishing 100Hz)
self._prev_fpcc = (None, None, None, None, None)
self._last_fpcc_send_frame = 0
# CLEARPILOT: hysteresis counters for transient-filter on cascade alerts. A blip
# under 50ms (5 cycles at 100Hz) is typical of MSGQ conflate + slow-polling-consumer
# freq-measurement artifacts — not a real comm issue. Persistent failure still
# alerts after 50ms, well inside a driver's reaction window.
self._hyst_commissue = 0
self._hyst_locationd_tmp = 0
self._hyst_paramsd_tmp = 0
self._hyst_posenet = 0
self.HYST_CYCLES = 5
# CLEARPILOT: parked-cycle skip — in park+ignition-on, run the full step() only
# every 10th cycle. CAN parse + CAN TX still happen every cycle; outer logic
# (events, state machine, PID/MPC, publishing) runs at 10Hz. Cached message
# bytes are re-published on skipped cycles so downstream freq_ok stays OK.
self._cached_controlsState_bytes = None
self._cached_carControl_bytes = None
self.steer_limited = False
self.desired_curvature = 0.0
self.experimental_mode = False
@@ -224,8 +207,6 @@ class Controls:
self.always_on_lateral_main = self.always_on_lateral and self.params.get_bool("AlwaysOnLateralMain")
self.drive_added = False
self.driving_gear = False
self.was_driving_gear = False
self.fcw_random_event_triggered = False
self.holiday_theme_alerted = False
self.onroad_distance_pressed = False
@@ -264,51 +245,27 @@ class Controls:
CS = self.data_sample()
cloudlog.timestamp("Data sampled")
# CLEARPILOT: handle debug-button press + display-mode transitions on every
# cycle — button edge events only live in the cycle's CS.buttonEvents and
# would otherwise be dropped on skipped cycles.
self.handle_screen_mode(CS)
self.update_events(CS)
self.update_frogpilot_events(CS)
self.update_clearpilot_events(CS)
# CLEARPILOT: in park, only run the full step() every 10th cycle. data_sample
# above still runs (CAN parse, button detection). Below, card.controls_update
# still runs (CAN TX heartbeat, counters increment). The skipped outer logic
# (events, state machine, PID/MPC, publishing) causes at most ~100ms lag on
# button→state transitions, which is fine in park. Cached message bytes are
# re-sent so downstream consumers see steady 100Hz.
parked = CS.gearShifter == car.CarState.GearShifter.park
full_cycle = (not parked) or (self.sm.frame % 10 == 0) or (self._cached_controlsState_bytes is None)
cloudlog.timestamp("Events updated")
if full_cycle:
self.update_events(CS)
self.update_frogpilot_events(CS)
self.update_clearpilot_events(CS)
if not self.CP.passive and self.initialized:
# Update control state
self.state_transition(CS)
cloudlog.timestamp("Events updated")
# Compute actuators (runs PID loops and lateral MPC)
CC, lac_log = self.state_control(CS)
CC = self.clearpilot_state_control(CC, CS)
if not self.CP.passive and self.initialized:
# Update control state
self.state_transition(CS)
# Publish data
self.publish_logs(CS, start_time, CC, lac_log)
# Compute actuators (runs PID loops and lateral MPC)
CC, lac_log = self.state_control(CS)
CC = self.clearpilot_state_control(CC, CS)
# Publish data (also sends CAN TX via card.controls_update inside)
self.publish_logs(CS, start_time, CC, lac_log)
self.CS_prev = CS
# Update FrogPilot variables
self.update_frogpilot_variables(CS)
else:
# CAN TX heartbeat: keep counters incrementing and CAN frames flowing to the car
if not self.CP.passive and self.initialized:
self.card.controls_update(self.CC, self.frogpilot_variables)
# Re-publish cached messages so downstream freq_ok checks don't trip
self.pm.send('controlsState', self._cached_controlsState_bytes)
self.pm.send('carControl', self._cached_carControl_bytes)
self.CS_prev = CS
self.CS_prev = CS
# Update FrogPilot variables
self.update_frogpilot_variables(CS)
def data_sample(self):
"""Receive data from sockets and update carState"""
@@ -494,24 +451,6 @@ class Controls:
# All events here should at least have NO_ENTRY and SOFT_DISABLE.
num_events = len(self.events)
# CLEARPILOT: compute model standby suppression early — used by multiple checks below
try:
standby_ts = float(self.params_memory.get("ModelStandbyTs") or "0")
except (ValueError, TypeError):
standby_ts = 0
now = time.monotonic()
model_suppress = (now - standby_ts) < 2.0
# CLEARPILOT: suppress commIssue/location/params errors for 2s after lat was requested
# Model FPS jumps from 4 to 20 on engage, causing transient freq check failures
# lat_requested_ts is set in state_control on the False->True transition of LatRequested
# CLEARPILOT: 2s window — long enough to cover the modeld rate transition burst
# without hiding real comms issues. Extending this further risks masking a genuine
# fault that demands immediate driver takeover.
lat_engage_suppress = (now - self.lat_requested_ts) < 2.0
# CLEARPILOT: expose suppress flags + standby_ts for the debug dumper in step()
not_running = {p.name for p in self.sm['managerState'].processes if not p.running and p.shouldBeRunning}
if self.sm.recv_frame['managerState'] and (not_running - IGNORE_PROCESSES):
self.events.add(EventName.processNotRunning)
@@ -524,17 +463,10 @@ class Controls:
self.events.add(EventName.cameraMalfunction)
elif not self.sm.all_freq_ok(self.camera_packets):
self.events.add(EventName.cameraFrameRate)
# CLEARPILOT: alert only when avg cycle time exceeds 20ms (≈50Hz effective).
# Stock rk.lagging fires at 11.1ms (90% of 10ms) which the Hyundai CAN load
# routinely crosses while driving — that's normal, not a fault. 50Hz control is
# still plenty responsive. `rk.lagging` is still used defensively elsewhere
# (lines ~479, 492) to skip secondary checks when slightly overloaded.
avg_dt = sum(self.rk._dts) / len(self.rk._dts)
alert_lagging = avg_dt > 0.020
if not REPLAY and alert_lagging and not model_suppress and not lat_engage_suppress:
if not REPLAY and self.rk.lagging:
self.events.add(EventName.controlsdLagging)
if not self.radarless_model:
if not model_suppress and (len(self.sm['radarState'].radarErrors) or (not self.rk.lagging and not self.sm.all_checks(['radarState']))):
if len(self.sm['radarState'].radarErrors) or (not self.rk.lagging and not self.sm.all_checks(['radarState'])):
self.events.add(EventName.radarFault)
if not self.sm.valid['pandaStates']:
self.events.add(EventName.usbError)
@@ -546,60 +478,34 @@ class Controls:
# generic catch-all. ideally, a more specific event should be added above instead
has_disable_events = self.events.contains(ET.NO_ENTRY) and (self.events.contains(ET.SOFT_DISABLE) or self.events.contains(ET.IMMEDIATE_DISABLE))
no_system_errors = (not has_disable_events) or (len(self.events) == num_events)
# CLEARPILOT: commIssue fires ONLY when messages actually aren't flowing. If all
# subscribers are alive and CAN isn't timing out, comms are fine — any `valid=False`
# is an upstream self-declaration that cascades from the MSGQ conflate +
# slow-polling-consumer freq_ok measurement artifact. The car drives correctly
# during these cascades (content is still usable), so we stop raising the banner.
# Genuine comms failures — missing messages or CAN RX timeout — still fire.
comms_really_broken = (not self.sm.all_alive()) or self.card.can_rcv_timeout
commissue_cond = comms_really_broken and no_system_errors and not model_suppress and not lat_engage_suppress
if commissue_cond:
self._hyst_commissue += 1
else:
self._hyst_commissue = 0
if self._hyst_commissue >= self.HYST_CYCLES:
self.events.add(EventName.commIssue)
if commissue_cond:
# Log ONCE per entry into comms-really-broken state — no hyst counter so it doesn't
# spam every cycle (hyst changes every cycle and would make logs dict always differ).
if (not self.sm.all_checks() or self.card.can_rcv_timeout) and no_system_errors:
if not self.sm.all_alive():
self.events.add(EventName.commIssue)
elif not self.sm.all_freq_ok():
self.events.add(EventName.commIssueAvgFreq)
else: # invalid or can_rcv_timeout.
self.events.add(EventName.commIssue)
logs = {
'invalid': [s for s, valid in self.sm.valid.items() if not valid],
'not_alive': [s for s, alive in self.sm.alive.items() if not alive],
'not_freq_ok': [s for s, freq_ok in self.sm.freq_ok.items() if not freq_ok],
'can_rcv_timeout': self.card.can_rcv_timeout,
}
if logs != self.logged_comm_issue:
import sys, json
print("CLP commIssue " + json.dumps(logs), file=sys.stderr, flush=True)
cloudlog.event("commIssue", error=True, **logs)
self.logged_comm_issue = logs
else:
self.logged_comm_issue = None
if not (self.CP.notCar and self.joystick_mode):
# CLEARPILOT: suppress locationd/paramsd "temporary error" when we're in the
# freq-only cascade (same root as the commIssue suppression above). These events
# cascade from upstream freq_ok artifacts — locationd's filterInitialized latches
# False if calibrationd tips invalid due to freq, which cascades here.
# Real failures still fire commIssue above via alive/valid/can_rcv_timeout gate.
freq_only_cascade = (not self.sm.all_checks()) and not comms_really_broken
# deviceFalling is real-physics (shock sensor), no hysteresis, no cascade suppression.
if not self.sm['liveLocationKalman'].posenetOK:
self.events.add(EventName.posenetInvalid)
if not self.sm['liveLocationKalman'].deviceStable:
self.events.add(EventName.deviceFalling)
posenet_bad = not self.sm['liveLocationKalman'].posenetOK and not model_suppress and not lat_engage_suppress and not freq_only_cascade
self._hyst_posenet = self._hyst_posenet + 1 if posenet_bad else 0
if self._hyst_posenet >= self.HYST_CYCLES:
self.events.add(EventName.posenetInvalid)
locd_bad = not self.sm['liveLocationKalman'].inputsOK and not model_suppress and not lat_engage_suppress and not freq_only_cascade
self._hyst_locationd_tmp = self._hyst_locationd_tmp + 1 if locd_bad else 0
if self._hyst_locationd_tmp >= self.HYST_CYCLES:
if not self.sm['liveLocationKalman'].inputsOK:
self.events.add(EventName.locationdTemporaryError)
paramsd_bad = not self.sm['liveParameters'].valid and not TESTING_CLOSET and (not SIMULATION or REPLAY) and not model_suppress and not lat_engage_suppress and not freq_only_cascade
self._hyst_paramsd_tmp = self._hyst_paramsd_tmp + 1 if paramsd_bad else 0
if self._hyst_paramsd_tmp >= self.HYST_CYCLES:
if not self.sm['liveParameters'].valid and not TESTING_CLOSET and (not SIMULATION or REPLAY):
self.events.add(EventName.paramsdTemporaryError)
# conservative HW alert. if the data or frequency are off, locationd will throw an error
@@ -613,8 +519,16 @@ class Controls:
if self.cruise_mismatch_counter > int(6. / DT_CTRL):
self.events.add(EventName.cruiseMismatch)
# CLEARPILOT: FCW disabled — car's own radar AEB works better and triggers reliably.
# The comma FCW was producing false positives and adds nothing over the stock system.
# Check for FCW
stock_long_is_braking = self.enabled and not self.CP.openpilotLongitudinalControl and CS.aEgo < -1.25
model_fcw = self.sm['modelV2'].meta.hardBrakePredicted and not CS.brakePressed and not stock_long_is_braking
planner_fcw = self.sm['longitudinalPlan'].fcw and self.enabled
if planner_fcw or model_fcw:
self.events.add(EventName.fcw)
# self.fcw_random_event_triggered = True
# elif self.fcw_random_event_triggered and self.random_events:
# self.events.add(EventName.yourFrogTriedToKillMe)
# self.fcw_random_event_triggered = False
for m in messaging.drain_sock(self.log_sock, wait_for_one=False):
try:
@@ -637,7 +551,7 @@ class Controls:
self.distance_traveled = 0
self.distance_traveled += CS.vEgo * DT_CTRL
if self.sm['modelV2'].frameDropPerc > 20 and not model_suppress:
if self.sm['modelV2'].frameDropPerc > 20:
self.events.add(EventName.modeldLagging)
@@ -724,14 +638,6 @@ class Controls:
# Check if openpilot is engaged and actuators are enabled
self.enabled = self.state in ENABLED_STATES
self.active = self.state in ACTIVE_STATES
# CLEARPILOT: engagement telemetry disabled — was running at 100Hz, causing CPU load
# tlog("engage", {
# "state": self.state.name if hasattr(self.state, 'name') else str(self.state),
# "enabled": self.enabled, "active": self.active,
# "cruise_enabled": CS.cruiseState.enabled, "cruise_available": CS.cruiseState.available,
# "brakePressed": CS.brakePressed,
# })
if self.active:
self.current_alert_types.append(ET.WARNING)
@@ -762,19 +668,8 @@ class Controls:
# Check which actuators can be enabled
standstill = CS.vEgo <= max(self.CP.minSteerSpeed, MIN_LATERAL_CONTROL_SPEED) or CS.standstill
# CLEARPILOT: lat_requested ignores momentary MDPS faults so modeld doesn't drop rate during
# brief steerFaultTemporary blips (stale predictions on recovery caused a fault feedback loop).
# lat_engaged gates the actual steering command and still respects the fault.
lat_requested = (self.active or self.FPCC.alwaysOnLateral) and self.speed_check and \
(not standstill or self.joystick_mode)
lat_engaged = lat_requested and not CS.steerFaultTemporary and not CS.steerFaultPermanent
now_ts = time.monotonic()
if lat_requested and not self.prev_lat_requested:
self.lat_requested_ts = now_ts
self.prev_lat_requested = lat_requested
self.FPCC.latRequested = lat_requested
CC.latActive = lat_engaged and (now_ts - self.lat_requested_ts) >= 0.25
CC.latActive = (self.active or self.FPCC.alwaysOnLateral) and self.speed_check and not CS.steerFaultTemporary and not CS.steerFaultPermanent and \
(not standstill or self.joystick_mode)
CC.longActive = self.enabled and not self.events.contains(ET.OVERRIDE_LONGITUDINAL) and self.CP.openpilotLongitudinalControl
actuators = CC.actuators
@@ -789,13 +684,13 @@ class Controls:
CC.leftBlinker = model_v2.meta.laneChangeDirection == LaneChangeDirection.left
CC.rightBlinker = model_v2.meta.laneChangeDirection == LaneChangeDirection.right
no_lat_lane_change = model_v2.meta.laneChangeState == LaneChangeState.laneChangeStarting and clearpilot_disable_lat_on_lane_change
if no_lat_lane_change:
if model_v2.meta.laneChangeState == LaneChangeState.laneChangeStarting and clearpilot_disable_lat_on_lane_change:
CC.latActive = False
# CLEARPILOT: noLatLaneChange now lives on frogpilotCarControl (see update_frogpilot_variables).
# Also mirrored to frogpilot_variables so in-process carcontroller reads it without IPC.
self.FPCC.noLatLaneChange = no_lat_lane_change
self.frogpilot_variables.no_lat_lane_change = no_lat_lane_change
self.frogpilot_variables.no_lat_lane_change = True
self.FPCC.noLatLaneChange = True
else:
self.frogpilot_variables.no_lat_lane_change = False
self.FPCC.noLatLaneChange = False
if CS.leftBlinker or CS.rightBlinker:
self.last_blinker_frame = self.sm.frame
@@ -1031,8 +926,6 @@ class Controls:
controlsState.lateralControlState.torqueState = lac_log
self.pm.send('controlsState', dat)
# CLEARPILOT: cache for re-publication on parked-skip cycles
self._cached_controlsState_bytes = dat.to_bytes()
# onroadEvents - logged every second or on change
if (self.sm.frame % int(1. / DT_CTRL) == 0) or (self.events.names != self.events_prev):
@@ -1047,7 +940,6 @@ class Controls:
cc_send.valid = CS.canValid
cc_send.carControl = CC
self.pm.send('carControl', cc_send)
self._cached_carControl_bytes = cc_send.to_bytes()
# copy CarControl to pass to CarInterface on the next iteration
self.CC = CC
@@ -1078,11 +970,6 @@ class Controls:
while True:
self.step()
self.rk.monitor_time()
# CLEARPILOT: accumulated debt from startup/blocking calls never recovers.
# Reset the rate keeper when catastrophically behind — prevents a one-time
# ~8s fingerprinting stall from poisoning the lag metric for the whole session.
if self.rk.remaining < -1.0:
self.rk._next_frame_time = time.monotonic() + self.rk._interval
except SystemExit:
e.set()
t.join()
@@ -1153,11 +1040,13 @@ class Controls:
elif self.sm['modelV2'].meta.turnDirection == Desire.turnRight:
self.events.add(EventName.turningRight)
# CLEARPILOT: check crash file once per second, not every cycle (stat syscall at 100Hz hurts)
if self.sm.frame % 100 == 0:
self._crashed_file_exists = os.path.isfile(os.path.join(sentry.CRASHES_DIR, 'error.txt'))
if getattr(self, '_crashed_file_exists', False) and self.crashed_timer < 10:
if os.path.isfile(os.path.join(sentry.CRASHES_DIR, 'error.txt')) and self.crashed_timer < 10:
self.events.add(EventName.openpilotCrashed)
# if self.random_events and not self.openpilot_crashed_triggered:
# self.events.add(EventName.openpilotCrashedRandomEvents)
# self.openpilot_crashed_triggered = True
self.crashed_timer += DT_CTRL
# if self.speed_limit_alert or self.speed_limit_confirmation:
@@ -1280,18 +1169,10 @@ class Controls:
self.FPCC.trafficModeActive = self.frogpilot_variables.traffic_mode and self.params_memory.get_bool("TrafficModeActive")
# CLEARPILOT: send only when any field changes, with 1Hz keepalive
# Saves ~7ms/sec of capnp+zmq work (was sending 100Hz despite static contents)
# latRequested and noLatLaneChange are edge-triggered too (rare flips)
fpcc_tuple = (self.FPCC.alwaysOnLateral, self.FPCC.speedLimitChanged, self.FPCC.trafficModeActive,
self.FPCC.latRequested, self.FPCC.noLatLaneChange)
if fpcc_tuple != self._prev_fpcc or (self.sm.frame - self._last_fpcc_send_frame) >= 100:
fpcc_send = messaging.new_message('frogpilotCarControl')
fpcc_send.valid = CS.canValid
fpcc_send.frogpilotCarControl = self.FPCC
self.pm.send('frogpilotCarControl', fpcc_send)
self._prev_fpcc = fpcc_tuple
self._last_fpcc_send_frame = self.sm.frame
fpcc_send = messaging.new_message('frogpilotCarControl')
fpcc_send.valid = CS.canValid
fpcc_send.frogpilotCarControl = self.FPCC
self.pm.send('frogpilotCarControl', fpcc_send)
if self.params_memory.get_bool("FrogPilotTogglesUpdated"):
self.update_frogpilot_params()
@@ -1364,53 +1245,52 @@ class Controls:
self.frogpilot_variables.use_ev_tables = self.params.get_bool("EVTable")
def update_clearpilot_events(self, CS):
if (len(CS.buttonEvents) > 0):
if (len(CS.buttonEvents) > 0):
print (CS.buttonEvents)
if any(be.pressed and be.type == FrogPilotButtonType.lkas for be in CS.buttonEvents):
self.events.add(EventName.clpDebug)
def handle_screen_mode(self, CS):
"""CLEARPILOT: tracks driving_gear, auto-resets display, and cycles
ScreenDisplayMode on debug-button presses. Must run every cycle so button
edge events aren't lost during parked-skip mode (edges happen in carstate
edge detection and only appear in that one cycle's CS.buttonEvents)."""
self.driving_gear = CS.gearShifter not in (GearShifter.neutral, GearShifter.park, GearShifter.reverse, GearShifter.unknown)
# Uncomment to alert when lkas button pressed
# if any(be.pressed and be.type == FrogPilotButtonType.lkas for be in CS.buttonEvents):
# self.events.add(EventName.clpDebug)
# auto-reset display when shifting into drive from screen-off
if self.driving_gear and not self.was_driving_gear:
def clearpilot_state_control(self, CC, CS):
# CLEARPILOT: pure UI plumbing — does not modify CC/actuators. Maintains
# ScreenDisplayMode (5-state machine driven by the LFA/debug button + gear
# edges) and ticks the speed/cruise-warning overlay at ~2Hz.
driving_gear = CS.gearShifter not in (GearShifter.neutral, GearShifter.park,
GearShifter.reverse, GearShifter.unknown)
# Auto-wake screen when shifting into drive from screen-off
if driving_gear and not self.was_driving_gear:
if self.params_memory.get_int("ScreenDisplayMode") == 3:
self.params_memory.put_int("ScreenDisplayMode", 0)
self.was_driving_gear = self.driving_gear
self.was_driving_gear = driving_gear
# LFA/debug button cycles ScreenDisplayMode. Onroad and offroad use
# different transition tables.
if any(be.pressed and be.type == FrogPilotButtonType.lkas for be in CS.buttonEvents):
current = self.params_memory.get_int("ScreenDisplayMode")
if self.driving_gear:
if driving_gear:
# Onroad: 0→4, 1→2, 2→3, 3→4, 4→2 (never back to auto via button)
transitions = {0: 4, 1: 2, 2: 3, 3: 4, 4: 2}
new_mode = transitions.get(current, 0)
else:
# Not in drive: any except 3 → 3, state 3 → 0
# Not in drive: anything except 3 → 3 (screen off), state 3 → 0 (auto)
new_mode = 0 if current == 3 else 3
self.params_memory.put_int("ScreenDisplayMode", new_mode)
def clearpilot_state_control(self, CC, CS):
# ClearPilot speed processing (~2 Hz at 100 Hz loop)
# Speed/cruise-warning overlay tick (~2Hz at 100Hz loop)
self.speed_state_frame += 1
if self.speed_state_frame % 50 == 0:
gps = self.sm['gpsLocation']
has_gps = self.sm.valid['gpsLocation'] and gps.hasFix
speed_ms = gps.speed if has_gps else 0.0
speed_limit_ms = self.params_memory.get_float("CarSpeedLimit")
is_metric = (self.params_memory.get("CarIsMetric", encoding="utf-8") or "0") == "1"
speed_limit_ms = self.params_memory.get_float("CarSpeedLimit") or 0.0
is_metric = self.is_metric
cruise_speed_ms = CS.cruiseState.speed
cruise_active = CS.cruiseState.enabled
cruise_standstill = CS.cruiseState.standstill
self.speed_state.update(speed_ms, has_gps, speed_limit_ms, is_metric,
cruise_speed_ms, cruise_active, cruise_standstill)
return CC
def main():
+78
View File
@@ -0,0 +1,78 @@
#!/usr/bin/env python3
"""
CLEARPILOT: minimal controlsd variant that runs while ignition is on but the
car is in Park. Keeps CAN parsing and carState publishing alive (so thermald
can see gearShifter and decide when to swap us out for the full controlsd),
but skips all of the heavy onroad work — no model, no planner, no lateral or
longitudinal control, no actuator commands.
Manager swaps between this and the full controlsd via predicate flips:
- this runs when: ignition AND not started
- full runs when: started (which requires ignition AND not_parked)
The two are mutually exclusive — only one publishes carState at a time.
We also keep the Hyundai HDA2 tester-present heartbeat alive while parked
so the ADAS ECU doesn't snap back to default-session and light up LKAS /
blind-spot warning icons on the cluster during the swap.
"""
from types import SimpleNamespace
from openpilot.common.realtime import Priority, config_realtime_process
from openpilot.selfdrive.boardd.boardd import can_list_to_can_capnp
from openpilot.selfdrive.car.card import CarD
from openpilot.selfdrive.car.hyundai.hyundaicanfd import CanBus as HyundaiCanBus
from openpilot.selfdrive.car.hyundai.values import HyundaiFlags
# UDS Tester Present, suppressPositiveResponse — same bytes the full
# carcontroller sends every 100 frames to 0x730 on E-CAN to keep the ADAS
# ECU held in its disabled diagnostic session.
TESTER_PRESENT = b"\x02\x3E\x80\x00\x00\x00\x00\x00"
TESTER_PRESENT_PERIOD_FRAMES = 100 # ~1 Hz at the CAN-paced loop rate
def _make_default_frogpilot_variables() -> SimpleNamespace:
"""Safe defaults for fields read inside CarInterface.update / CarState.update.
We're not actuating anything here; these only need to keep the update path
from raising AttributeError. False/0 across the board is the safe baseline."""
fv = SimpleNamespace()
fv.conditional_experimental_mode = False
fv.experimental_mode_via_distance = False
fv.traffic_mode = False
fv.sport_plus = False
fv.long_pitch = False
fv.no_lat_lane_change = False
return fv
def main():
config_realtime_process(4, Priority.CTRL_HIGH)
# CarD's __init__ blocks until it sees CAN + a pandaState, then calls get_car
# to fingerprint and write CarParams. Same path the full controlsd takes.
card = CarD()
card.initialize()
fv = _make_default_frogpilot_variables()
# Determine if this car wants the Hyundai HDA2 tester-present heartbeat,
# and which bus E-CAN is on for this panda configuration.
is_hda2 = card.CP.carName == "hyundai" and bool(card.CP.flags & HyundaiFlags.CANFD_HDA2.value)
ecan = HyundaiCanBus(card.CP).ECAN if is_hda2 else None
# state_update drains CAN, parses carState, publishes carState/carOutput/carParams.
# Internally blocks via drain_sock_raw(wait_for_one=True), so the loop is
# naturally paced by CAN traffic — no extra sleep needed.
frame = 0
while True:
card.state_update(fv)
if is_hda2 and frame % TESTER_PRESENT_PERIOD_FRAMES == 0:
can_sends = [[0x730, 0, TESTER_PRESENT, ecan]]
card.pm.send('sendcan', can_list_to_can_capnp(can_sends, msgtype='sendcan', valid=True))
frame += 1
if __name__ == "__main__":
main()
+1 -1
View File
@@ -778,8 +778,8 @@ EVENTS: dict[int, dict[str, Alert | AlertCallbackType]] = {
ET.SOFT_DISABLE: soft_disable_alert("Sensor Data Invalid"),
},
# CLEARPILOT: alert suppressed — event still fires for screen toggle and future actions
EventName.clpDebug: {
ET.PERMANENT: clp_debug_notice,
},
EventName.noGps: {
+2 -33
View File
@@ -143,10 +143,6 @@ class LongitudinalPlanner:
self.params = Params()
self.params_memory = Params("/dev/shm/params")
# CLEARPILOT: track model FPS for dynamic dt adjustment
self.model_fps = 20
self._dbg_prev_valid = True # CLEARPILOT: diagnostic logging on valid transitions
self.radarless_model = self.params.get("Model", encoding='utf-8') in RADARLESS_MODELS
self.release = get_short_branch() == "clearpilot"
@@ -177,19 +173,6 @@ class LongitudinalPlanner:
return x, v, a, j
def update(self, sm):
# CLEARPILOT: read actual model FPS and adjust dt accordingly
model_fps_raw = self.params_memory.get("ModelFps")
if model_fps_raw is not None:
try:
fps = int(model_fps_raw)
if fps > 0 and fps != self.model_fps:
self.model_fps = fps
self.dt = 1.0 / fps
self.v_desired_filter.dt = self.dt
self.v_desired_filter.update_alpha(2.0) # rc=2.0, same as __init__
except (ValueError, ZeroDivisionError):
pass
self.mpc.mode = 'blended' if sm['controlsState'].experimentalMode else 'acc'
v_ego = sm['carState'].vEgo
@@ -256,10 +239,7 @@ class LongitudinalPlanner:
self.j_desired_trajectory = np.interp(ModelConstants.T_IDXS[:CONTROL_N], T_IDXS_MPC[:-1], self.mpc.j_solution)
# TODO counter is only needed because radar is glitchy, remove once radar is gone
# CLEARPILOT: scale crash_cnt threshold by FPS — at 20fps need 3 frames (0.15s),
# at 4fps need 1 frame (0.25s already exceeds 0.15s window)
fcw_threshold = max(0, int(0.15 * self.model_fps) - 1) # 20fps->2, 4fps->0
self.fcw = self.mpc.crash_cnt > fcw_threshold and not sm['carState'].standstill
self.fcw = self.mpc.crash_cnt > 2 and not sm['carState'].standstill
if self.fcw:
cloudlog.info("FCW triggered")
@@ -278,18 +258,7 @@ class LongitudinalPlanner:
def publish(self, sm, pm):
plan_send = messaging.new_message('longitudinalPlan')
valid = sm.all_checks(service_list=['carState', 'controlsState'])
# CLEARPILOT: log on transition into invalid — stderr goes to our plannerd.log
if valid != self._dbg_prev_valid and not valid:
import sys
print(
"CLP longitudinalPlan valid=False: "
f"carState(a={sm.alive['carState']},v={sm.valid['carState']},f={sm.freq_ok['carState']}) "
f"controlsState(a={sm.alive['controlsState']},v={sm.valid['controlsState']},f={sm.freq_ok['controlsState']})",
file=sys.stderr, flush=True
)
self._dbg_prev_valid = valid
plan_send.valid = valid
plan_send.valid = sm.all_checks(service_list=['carState', 'controlsState'])
longitudinalPlan = plan_send.longitudinalPlan
longitudinalPlan.modelMonoTime = sm.logMonoTime['modelV2']
@@ -58,9 +58,6 @@ class FrogPilotPlanner:
self.params = Params()
self.params_memory = Params("/dev/shm/params")
# CLEARPILOT: track valid transitions so we only log when it flips, not every cycle
self._dbg_prev_valid = True
self.cem = ConditionalExperimentalMode()
self.lead_one = Lead()
self.mtsc = MapTurnSpeedController()
@@ -245,18 +242,7 @@ class FrogPilotPlanner:
def publish(self, sm, pm):
frogpilot_plan_send = messaging.new_message('frogpilotPlan')
valid = sm.all_checks(service_list=['carState', 'controlsState'])
# CLEARPILOT: log on transition into invalid — stderr goes to our plannerd.log
if valid != self._dbg_prev_valid and not valid:
import sys
print(
"CLP frogpilotPlan valid=False: "
f"carState(a={sm.alive['carState']},v={sm.valid['carState']},f={sm.freq_ok['carState']}) "
f"controlsState(a={sm.alive['controlsState']},v={sm.valid['controlsState']},f={sm.freq_ok['controlsState']})",
file=sys.stderr, flush=True
)
self._dbg_prev_valid = valid
frogpilot_plan_send.valid = valid
frogpilot_plan_send.valid = sm.all_checks(service_list=['carState', 'controlsState'])
frogpilotPlan = frogpilot_plan_send.frogpilotPlan
frogpilotPlan.accelerationJerk = A_CHANGE_COST * (float(self.jerk) if self.lead_one.status else 1)
+1 -8
View File
@@ -284,14 +284,7 @@ def main() -> NoReturn:
# 4Hz driven by cameraOdometry
if sm.frame % 5 == 0:
# CLEARPILOT: publish valid based on calibration status, not upstream sm.all_checks().
# Original openpilot gated valid on fresh inputs, but that caused a cascade:
# upstream freq glitches → liveCalibration.valid=False → locationd stays
# uninitialized → paramsd fed garbage → bogus steerRatio/stiffnessFactor → erratic
# steering. "valid" semantically means "calibration data is trustworthy"; that's a
# question about calibration convergence, not input freshness.
cal_valid = calibrator.cal_status == log.LiveCalibrationData.Status.calibrated
calibrator.send_data(pm, cal_valid)
calibrator.send_data(pm, sm.all_checks())
if __name__ == "__main__":
+7 -1
View File
@@ -308,12 +308,18 @@ void Localizer::input_fake_gps_observations(double current_time) {
}
void Localizer::handle_gps(double current_time, const cereal::GpsLocationData::Reader& log, const double sensor_time_offset) {
// CLEARPILOT: GPS disabled as a Kalman input. gpsd still publishes real GPS for
// UI / dashcam / clock / night-mode; only locationd ignores it. Falls through to
// determine_gps_mode() which is openpilot's existing no-GPS path (fake observations
// to bound position uncertainty). To re-enable, set this to false.
const bool clearpilot_disable_gps = true;
bool gps_unreasonable = (Vector2d(log.getHorizontalAccuracy(), log.getVerticalAccuracy()).norm() >= SANE_GPS_UNCERTAINTY);
bool gps_accuracy_insane = ((log.getVerticalAccuracy() <= 0) || (log.getSpeedAccuracy() <= 0) || (log.getBearingAccuracyDeg() <= 0));
bool gps_lat_lng_alt_insane = ((std::abs(log.getLatitude()) > 90) || (std::abs(log.getLongitude()) > 180) || (std::abs(log.getAltitude()) > ALTITUDE_SANITY_CHECK));
bool gps_vel_insane = (floatlist2vector(log.getVNED()).norm() > TRANS_SANITY_CHECK);
if (!log.getHasFix() || gps_unreasonable || gps_accuracy_insane || gps_lat_lng_alt_insane || gps_vel_insane) {
if (clearpilot_disable_gps || !log.getHasFix() || gps_unreasonable || gps_accuracy_insane || gps_lat_lng_alt_insane || gps_vel_insane) {
//this->gps_valid = false;
this->determine_gps_mode(current_time);
return;
+1 -10
View File
@@ -187,7 +187,6 @@ def main():
avg_offset_valid = True
total_offset_valid = True
roll_valid = True
dbg_prev_valid = True # CLEARPILOT: track valid transitions
while True:
sm.update()
@@ -257,15 +256,7 @@ def main():
liveParameters.filterState.std = P.tolist()
liveParameters.filterState.valid = True
lp_valid = sm.all_checks()
# CLEARPILOT: on transition to invalid, log per-subscriber state to paramsd.log
if lp_valid != dbg_prev_valid and not lp_valid:
import sys
bad = [s for s in sm.alive if not (sm.alive[s] and sm.valid[s] and sm.freq_ok.get(s, True))]
details = [f"{s}(a={sm.alive[s]},v={sm.valid[s]},f={sm.freq_ok[s]})" for s in bad]
print(f"CLP liveParameters valid=False: {' '.join(details)}", file=sys.stderr, flush=True)
dbg_prev_valid = lp_valid
msg.valid = lp_valid
msg.valid = sm.all_checks()
if sm.frame % 1200 == 0: # once a minute
params = {
+1 -11
View File
@@ -226,8 +226,6 @@ def main(demo=False):
with car.CarParams.from_bytes(params.get("CarParams", block=True)) as CP:
estimator = TorqueEstimator(CP)
dbg_prev_valid = True # CLEARPILOT: track valid transitions
while True:
sm.update()
if sm.all_checks():
@@ -238,15 +236,7 @@ def main(demo=False):
# 4Hz driven by liveLocationKalman
if sm.frame % 5 == 0:
tq_valid = sm.all_checks()
# CLEARPILOT: log per-sub detail on transition to invalid — goes to torqued.log
if tq_valid != dbg_prev_valid and not tq_valid:
import sys
bad = [s for s in sm.alive if not (sm.alive[s] and sm.valid[s] and sm.freq_ok.get(s, True))]
details = [f"{s}(a={sm.alive[s]},v={sm.valid[s]},f={sm.freq_ok[s]})" for s in bad]
print(f"CLP liveTorqueParameters valid=False: {' '.join(details)}", file=sys.stderr, flush=True)
dbg_prev_valid = tq_valid
pm.send('liveTorqueParameters', estimator.get_msg(valid=tq_valid))
pm.send('liveTorqueParameters', estimator.get_msg(valid=sm.all_checks()))
# Cache points every 60 seconds while onroad
if sm.frame % 240 == 0:
+3
View File
@@ -363,6 +363,9 @@ def manager_init(frogpilot_functions) -> None:
params.put("GitRemote", get_origin())
params.put_bool("IsTestedBranch", is_tested_branch())
params.put_bool("IsReleaseBranch", is_release_branch())
# CLEARPILOT: thermald is the source of truth for IgnitionOn; seed False so
# the parked-controlsd predicate evaluates to False before thermald's first tick.
params.put_bool("IgnitionOn", False)
# set dongle id
reg_res = register(show_spinner=True)
+11
View File
@@ -43,6 +43,12 @@ def only_onroad(started: bool, params, CP: car.CarParams) -> bool:
def only_offroad(started, params, CP: car.CarParams) -> bool:
return not started
# CLEARPILOT: predicate for the parked controlsd variant. Runs while ignition
# is on but the car is in Park (so started=False because thermald has gated it
# off). Mutually exclusive with the full controlsd, which uses only_onroad.
def parked_only(started, params, CP: car.CarParams) -> bool:
return params.get_bool("IgnitionOn") and not started
# FrogPilot functions
def allow_logging(started, params, CP: car.CarParams) -> bool:
allow_logging = not (params.get_bool("DeviceManagement") and params.get_bool("NoLogging"))
@@ -82,6 +88,11 @@ procs = [
PythonProcess("calibrationd", "selfdrive.locationd.calibrationd", only_onroad),
PythonProcess("torqued", "selfdrive.locationd.torqued", only_onroad),
PythonProcess("controlsd", "selfdrive.controls.controlsd", only_onroad),
# CLEARPILOT: lightweight CAN listener that runs while ignition is on and the
# car is parked. Publishes carState (so thermald can see gear); does no model,
# planner, or actuator work. Manager swaps it out for the full controlsd as
# soon as gear leaves Park.
PythonProcess("controlsd_parked", "selfdrive.controls.controlsd_parked", parked_only),
PythonProcess("deleter", "system.loggerd.deleter", always_run),
PythonProcess("dmonitoringd", "selfdrive.monitoring.dmonitoringd", driverview, enabled=(not PC or WEBCAM)),
# PythonProcess("qcomgpsd", "system.qcomgpsd.qcomgpsd", qcomgps, enabled=TICI),
+2 -14
View File
@@ -128,14 +128,10 @@ def main():
assert vipc_client.is_connected()
cloudlog.warning(f"connected with buffer size: {vipc_client.buffer_len}")
sm = SubMaster(["liveCalibration", "carState"])
sm = SubMaster(["liveCalibration"])
pm = PubMaster(["driverStateV2"])
calib = np.zeros(CALIB_LEN, dtype=np.float32)
# CLEARPILOT: cache last model output so we can republish (not re-infer) at standstill.
# Saves ~7% CPU; downstream dmonitoringd sees a steady 10Hz stream with known-good
# last readings (driver can't become distracted relative to a stopped car).
last_model_output = None
# last = 0
while True:
@@ -147,16 +143,8 @@ def main():
if sm.updated["liveCalibration"]:
calib[:] = np.array(sm["liveCalibration"].rpyCalib)
standstill = sm["carState"].standstill
t1 = time.perf_counter()
if standstill and last_model_output is not None:
# CLEARPILOT: reuse last inference at standstill
model_output = last_model_output
dsp_execution_time = 0.0
else:
model_output, dsp_execution_time = model.run(buf, calib)
last_model_output = model_output
model_output, dsp_execution_time = model.run(buf, calib)
t2 = time.perf_counter()
pm.send("driverStateV2", get_driverstate_packet(model_output, vipc_client.frame_id, vipc_client.timestamp_sof, t2 - t1, dsp_execution_time))
+2 -36
View File
@@ -134,15 +134,11 @@ def main(demo=False):
setproctitle(PROCESS_NAME)
config_realtime_process(7, 54)
import time as _time
cloudlog.warning("setting up CL context")
_t0 = _time.monotonic()
cl_context = CLContext()
_t1 = _time.monotonic()
cloudlog.warning("CL context ready in %.3fs; loading model", _t1 - _t0)
cloudlog.warning("CL context ready; loading model")
model = ModelState(cl_context)
_t2 = _time.monotonic()
cloudlog.warning("model loaded in %.3fs (total init %.3fs), modeld starting", _t2 - _t1, _t2 - _t0)
cloudlog.warning("models loaded, modeld starting")
# visionipc clients
while True:
@@ -183,9 +179,6 @@ def main(demo=False):
model_transform_main = np.zeros((3, 3), dtype=np.float32)
model_transform_extra = np.zeros((3, 3), dtype=np.float32)
live_calib_seen = False
model_standby = False
last_standby_ts_write = 0
params_memory = Params("/dev/shm/params")
nav_features = np.zeros(ModelConstants.NAV_FEATURE_LEN, dtype=np.float32)
nav_instructions = np.zeros(ModelConstants.NAV_INSTRUCTION_LEN, dtype=np.float32)
buf_main, buf_extra = None, None
@@ -240,33 +233,6 @@ def main(demo=False):
meta_extra = meta_main
sm.update(0)
# CLEARPILOT: two-state modeld — 0fps only when parked (ignition-on means
# engine running in park; ignition-off stops modeld at the manager level).
# Standstill in drive (red light) keeps running so lateral stays responsive
# and liveCalibration/paramsd observations continue.
parked = sm['carState'].gearShifter == car.CarState.GearShifter.park
should_standby = parked
if should_standby and not model_standby:
params_memory.put_bool("ModelStandby", True)
params_memory.put("ModelFps", "0")
model_standby = True
cloudlog.warning("modeld: standby ON")
elif not should_standby and model_standby:
params_memory.put_bool("ModelStandby", False)
params_memory.put("ModelFps", "20")
model_standby = False
run_count = 0
frame_dropped_filter.x = 0.
cloudlog.warning("modeld: standby OFF")
if model_standby:
now = _time.monotonic()
if now - last_standby_ts_write > 1.0:
params_memory.put("ModelStandbyTs", str(now))
last_standby_ts_write = now
last_vipc_frame_id = meta_main.frame_id
continue
desire = DH.desire
is_rhd = sm["driverMonitoringState"].isRHD
frame_id = sm["roadCameraState"].frameId
+2 -22
View File
@@ -21,7 +21,6 @@ def dmonitoringd_thread():
v_cruise_last = 0
driver_engaged = False
dbg_prev_valid = True # CLEARPILOT: track valid transitions
# 10Hz <- dmonitoringmodeld
while True:
@@ -44,18 +43,7 @@ def dmonitoringd_thread():
# Get data from dmonitoringmodeld
events = Events()
# CLEARPILOT: narrow update_states gate. The original sm.all_checks() also
# required modelV2 fresh (stops at standstill in two-state modeld) and
# liveCalibration.valid (calibrationd cascades its own freq_ok to valid, which
# flaps). Both made DM freeze pose → face_detected stuck False → awareness
# decayed to 0 within 6s of engagement. Narrow the gate to the subs
# update_states actually reads, and only to alive+valid (skip freq_ok and
# skip liveCalibration.valid). rpyCalib presence is sufficient to know
# calibration has produced output.
if (sm.alive['driverStateV2'] and sm.valid['driverStateV2'] and
sm.alive['carState'] and sm.valid['carState'] and
sm.alive['controlsState'] and sm.valid['controlsState'] and
sm.alive['liveCalibration'] and len(sm['liveCalibration'].rpyCalib) > 0):
if sm.all_checks() and len(sm['liveCalibration'].rpyCalib):
driver_status.update_states(sm['driverStateV2'], sm['liveCalibration'].rpyCalib, sm['carState'].vEgo, sm['controlsState'].enabled)
# Block engaging after max number of distrations
@@ -66,16 +54,8 @@ def dmonitoringd_thread():
# Update events from driver state
driver_status.update_events(events, driver_engaged, sm['controlsState'].enabled, sm['carState'].standstill)
# CLEARPILOT: log on transition to invalid so we can see which sub caused the cascade
dm_valid = sm.all_checks()
if dm_valid != dbg_prev_valid and not dm_valid:
import sys
bad = [s for s in sm.alive if not (sm.alive[s] and sm.valid[s] and sm.freq_ok.get(s, True))]
details = [f"{s}(a={sm.alive[s]},v={sm.valid[s]},f={sm.freq_ok[s]})" for s in bad]
print(f"CLP driverMonitoringState valid=False: {' '.join(details)}", file=sys.stderr, flush=True)
dbg_prev_valid = dm_valid
# build driverMonitoringState packet
dat = messaging.new_message('driverMonitoringState', valid=dm_valid)
dat = messaging.new_message('driverMonitoringState', valid=sm.all_checks())
dat.driverMonitoringState = {
"events": events.to_msg(),
"faceDetected": driver_status.face_detected,
+43
View File
@@ -170,7 +170,17 @@ def thermald_thread(end_event, hw_queue) -> None:
onroad_conditions: dict[str, bool] = {
"ignition": False,
# CLEARPILOT: park-aware gating. When False, manager runs controlsd_parked
# (CAN listener only) instead of the full onroad stack. Latched + hysteresis
# on going-into-parked to avoid R↔P↔D thrash; flips out of parked instantly.
# Initialized False (assume parked) so the full stack waits for carState
# to confirm gear has actually left Park before spinning up.
"not_parked": False,
}
is_parked = True
parked_since: float | None = None # monotonic ts when gear first read as Park
PARKED_HYSTERESIS_S = 1.5
ignition_param_prev: bool | None = None
startup_conditions: dict[str, bool] = {}
startup_conditions_prev: dict[str, bool] = {}
@@ -247,6 +257,39 @@ def thermald_thread(end_event, hw_queue) -> None:
onroad_conditions["ignition"] = False
cloudlog.error("panda timed out onroad")
# CLEARPILOT: derive is_parked from carState gearShifter. Whichever controlsd
# variant is currently running publishes carState; we just read the gear.
# Going INTO parked has hysteresis (PARKED_HYSTERESIS_S) so brief P touches
# during low-speed parking don't kick the heavy stack off. Going OUT of
# parked is instant so the full stack starts spinning up the moment the
# driver shifts to D/R/N.
#
# Cruise override: if cruise control has any speed set (engaged OR
# paused-with-speed-set), keep the full stack running even in park. This
# means the user can shift to park at a stop, glance at the cluster to
# verify cruise is still set, and roll forward without waiting for full
# controlsd to spin back up.
if sm.updated['carState']:
cs = sm['carState']
gear_is_park = cs.gearShifter == car.CarState.GearShifter.park
cruise_set = cs.cruiseState.speed > 0
now_mono = time.monotonic()
if gear_is_park and not cruise_set:
if parked_since is None:
parked_since = now_mono
if (not is_parked) and (now_mono - parked_since) >= PARKED_HYSTERESIS_S:
is_parked = True
else:
parked_since = None
is_parked = False
onroad_conditions["not_parked"] = not is_parked
# CLEARPILOT: expose ignition as a Param so manager predicates (which only
# see persistent Params, not pandaStates) can gate controlsd_parked.
if ignition_param_prev != onroad_conditions["ignition"]:
params.put_bool("IgnitionOn", onroad_conditions["ignition"])
ignition_param_prev = onroad_conditions["ignition"]
try:
last_hw_state = hw_queue.get_nowait()
except queue.Empty:
@@ -0,0 +1,355 @@
# Session: 2026-04-26 — Baseline Revert + Parked-Controlsd Mode
## Context
This session was driven by a regression: the steering wheel "feels like
it pulls right" during normal driving, with no clear smoking gun. The
suspicion was that one of the variable-FPS / standstill-throttling
changes (added to reduce parked-state fan noise and CPU load) bled into
on-road driving behavior in a hard-to-isolate way.
Strategy: revert all driving-relevant logic to a known-good baseline
captured in `/projects/openpilot/archive/clearpilot` (HEAD `980f0aa`,
July 2024), keep all of the ClearPilot UI/dashcam/telemetry/bench-mode
infrastructure intact on top, then attack the parked-fan-noise problem
fresh from a different angle that doesn't touch driving logic at all.
Three commits landed in this order on branch `clearpilot`:
| SHA | Title |
|---|---|
| `47321e3` | restore driving logic to pre-variable-fps baseline |
| `f7e602c` | controlsd: re-wire UI hooks on top of restored baseline |
| `887b9c9` | parked-controlsd mode: shut down heavy stack while ignition+park |
Pre-revert tip was `62a403d`. The on-device agent should treat
**`62a403d` as "the broken version"** when looking at history.
---
## Commit 1: `47321e3` — Baseline restore
Reverted the following files wholesale to their `980f0aa` archive copy:
- `selfdrive/controls/controlsd.py`
- `selfdrive/controls/lib/events.py`
- `selfdrive/controls/lib/longitudinal_planner.py`
- `selfdrive/modeld/modeld.py`
- `selfdrive/modeld/dmonitoringmodeld.py`
- `selfdrive/locationd/calibrationd.py`
- `selfdrive/locationd/paramsd.py`
- `selfdrive/locationd/torqued.py`
- `selfdrive/car/interfaces.py`
- `selfdrive/car/hyundai/carstate.py` (CAN-FD telemetry preserved as a
commented block at the bottom of `update_canfd` — re-enable by
uncommenting; the `tlog` import is also commented out).
- `selfdrive/monitoring/dmonitoringd.py`
- `selfdrive/frogpilot/controls/frogpilot_planner.py`
- `common/realtime.py`
### Intentionally NOT restored (kept as the post-`62a403d` version)
- `selfdrive/thermald/*` — fan/power tuning kept as-is.
- `selfdrive/car/hyundai/carcontroller.py` and `hyundaicanfd.py`
reviewed; the only delta vs baseline is hoisting the
`no_lat_lane_change` Params read out of the 100Hz hot path
(~5% carcontroller CPU) by passing the bit as an argument. This is
a perf-only change with no behavioral effect outside lane changes.
- `cereal/services.py`, `cereal/custom.capnp` — additive only.
`custom.capnp` adds `latRequested @3` and `noLatLaneChange @4` fields
to `FrogPilotCarControl`; capnp tag numbers are append-only so this is
safe to leave in even though baseline code doesn't write to them.
- `selfdrive/manager/*`, `common/params.cc` — heavy ClearPilot
infrastructure (bench mode, log dir, dashcamd, gpsd, ClearPilot params).
- All `selfdrive/ui/`, `selfdrive/clearpilot/`, `system/clearpilot/`.
### Things removed by the restore (no longer in the tree)
- Standstill frame skipping in modeld (was: skip GPU inference 19/20
frames at standstill, report 0 dropped frames to fool controlsd).
- Standstill frame skipping in dmonitoringmodeld.
- Model standby logic + `ModelStandby`/`ModelStandbyTs` reads in
controlsd's comm-issue suppression path.
- Parked-cycle skip in `state_control()` (10Hz vs 100Hz when in Park).
- Calibrationd validity decoupling from `sm.all_checks()`.
- Post-engage 2s commIssue/location/params suppression window.
- Per-cycle carstate write-gating for `CarSpeedLimit`/`CarIsMetric` in
carstate.py.
- The diff-based carstate telemetry calls (preserved commented out).
These were the candidates for the steering-pull regression. The on-device
agent should **not re-introduce any of these** without a deliberate plan
and a test session. The user's intent is to get baseline driving feel
confirmed first, then re-introduce optimizations one at a time and drive
each.
---
## Commit 2: `f7e602c` — UI hooks on top of baseline
Baseline `controlsd.py` doesn't have the UI plumbing the ClearPilot UI
expects. This commit re-adds only the things needed for the existing UI
to keep working — pure params-write plumbing, no actuator effect.
### Changes (all in `selfdrive/controls/controlsd.py`)
1. **Import `SpeedState`** from `openpilot.selfdrive.clearpilot.speed_logic`.
2. **`Controls.__init__`**:
- `params_memory.put_bool("ScreenDisplayMode", 0)`
`params_memory.put_int("ScreenDisplayMode", 0)` (UI reads it as int).
- Added `self.speed_state = SpeedState()`, `self.speed_state_frame = 0`,
`self.was_driving_gear = False`.
3. **SubMaster** — added `gpsLocation` to the subscriber list, with
`ignore_alive` / `ignore_avg_freq` / `ignore_valid` so missing GPS
doesn't trigger commIssue.
4. **`clearpilot_state_control(...)`** rewritten from a simple 3-state
cycle into the documented 5-state ScreenDisplayMode machine:
- Auto-wake on park→drive edge if currently in screen-off (state 3).
- LFA button transitions in drive: `0→4, 1→2, 2→3, 3→4, 4→2`.
- LFA button transitions outside drive: any except 3 → 3, state 3 → 0.
- Speed/cruise-warning overlay tick at ~2Hz (`speed_state.update(...)`)
reading `gpsLocation`, `CarSpeedLimit` param, `self.is_metric`,
`CS.cruiseState`. This is what writes
`ClearpilotSpeedDisplay`/`ClearpilotSpeedLimitDisplay`/
`ClearpilotCruiseWarning` for the UI overlay.
5. **Lane-change suppression sync** — at the existing baseline
`clearpilot_disable_lat_on_lane_change` block (around line 687), in
addition to the existing `params_memory.put_bool("no_lat_lane_change", ...)`,
also set `self.frogpilot_variables.no_lat_lane_change = True/False`.
This is required because the kept (post-`62a403d`) carcontroller
reads off `frogpilot_variables`, not Params.
### What does NOT change
No edits to lateral or longitudinal control paths. No new actuator-side
behavior. The UI features (nightrider/screen-off/auto day-night, speed
overlay, cruise warning chime) get wired back up purely through param
writes.
---
## Commit 3: `887b9c9` — Parked-controlsd mode
Architectural fix for the original problem this whole session was
chasing: while ignition is on but the car is in Park, the entire onroad
stack (modeld, planner, control, locationd, calibrationd, paramsd,
torqued, dmonitoring*, soundd, loggerd) is running and burning CPU/fan
even though none of it is needed.
Solution: redefine "onroad" as **ignition AND not parked** instead of
just **ignition**. Reuse the existing `started`-based process gating in
manager. Add a tiny second controlsd variant that runs while parked,
just to keep CAN flowing so thermald can see when gear leaves Park.
### Files
#### NEW: `selfdrive/controls/controlsd_parked.py`
Minimal entry point. Roughly:
```python
def main():
config_realtime_process(4, Priority.CTRL_HIGH)
card = CarD() # blocks until first CAN, fingerprints car
card.initialize()
fv = _make_default_frogpilot_variables() # safe False/0 SimpleNamespace
while True:
card.state_update(fv) # publishes carState/carOutput/carParams
```
`CarD.state_update` blocks via `drain_sock_raw(wait_for_one=True)`, so
the loop is paced by CAN traffic — no extra sleep, no CPU spin.
The default `frogpilot_variables` sets these to safe values so
`CarInterfaceBase.update` doesn't `AttributeError`:
`conditional_experimental_mode`, `experimental_mode_via_distance`,
`traffic_mode`, `sport_plus`, `long_pitch`, `no_lat_lane_change` — all
False.
#### `selfdrive/thermald/thermald.py`
- `onroad_conditions` now also has `"not_parked"`. Initialized to
`False` (assume parked at boot) so the heavy stack waits for carState
to confirm gear has left Park before spinning up.
- New module-level loop variables: `is_parked = True`,
`parked_since: float | None = None`, `PARKED_HYSTERESIS_S = 1.5`,
`ignition_param_prev: bool | None = None`.
- New block right after the panda-disconnect check (around line 258 in
current state) reads `sm['carState'].gearShifter`:
- Gear == Park: latch `parked_since`, flip `is_parked = True` after
1.5s of continuous Park (hysteresis).
- Gear != Park: clear `parked_since`, `is_parked = False` immediately
(no hysteresis going out).
- Reverse is treated as **not parked** — driver is moving.
- `onroad_conditions["not_parked"] = not is_parked` every tick.
- New `IgnitionOn` Params write, edge-driven (only on change of
`onroad_conditions["ignition"]`) so we don't hammer the persistent
filesystem 2x/sec.
#### `selfdrive/manager/process_config.py`
- New predicate:
```python
def parked_only(started, params, CP):
return params.get_bool("IgnitionOn") and not started
```
- New process entry directly after the existing `controlsd` entry:
```python
PythonProcess("controlsd_parked", "selfdrive.controls.controlsd_parked", parked_only),
```
- `controlsd` is unchanged (still `only_onroad`). Mutually exclusive
with `parked_only` because `started` is the negation of the relevant
condition.
#### `selfdrive/manager/manager.py`
- Single line in `manager_init()` seeding `IgnitionOn=False` so the
predicate evaluates correctly before thermald's first tick.
#### `common/params.cc`
- New entry in the alphabetical I-block:
```cpp
{"IgnitionOn", CLEAR_ON_MANAGER_START},
```
- Manager predicates can only see persistent Params (not pandaStates
or `/dev/shm/params`), which is why thermald has to expose ignition
this way.
### State machine summary
| Ignition | Gear | Predicates true | What runs |
|----------|-------------|-----------------------------|-------------------------------|
| off | any | none | always_run only (ui, thermald, pandad, deleter, …) |
| on | Park (>1.5s)| parked_only | always_run + controlsd_parked |
| on | not Park | only_onroad (=> all `started`) | full onroad stack |
The transition between rows 2 and 3 is purely manager process
swapping driven by predicate flips — no IPC handshake, no
self-termination. Thermald sees gear change, flips `not_parked`,
`should_start = all(onroad_conditions.values())` flips, manager kills
the wrong variant and spawns the right one on its next tick.
---
## What the on-device agent needs to know / debug
### Build prerequisites
A new param key was added (`IgnitionOn`), so the C++ params whitelist
needs a fresh build. Per `CLAUDE.md` "Adding New Params":
```bash
chown -R comma:comma /data/openpilot
rm -f /data/openpilot/prebuilt /data/openpilot/common/params.o /data/openpilot/common/libcommon.a
su - comma -c "bash /data/openpilot/build_only.sh"
```
`build_only.sh` already deletes `prebuilt` but **does not** delete
`params.o` / `libcommon.a` — verify those are gone before building or
the new key won't be picked up and `Params().put_bool("IgnitionOn", …)`
will throw `UnknownKeyName` in manager_init or thermald.
### How to verify the swap is working
```bash
# 1. Watch which controlsd variant is running
watch -n 1 'ps -ef | grep -E "controlsd(_parked)?" | grep -v grep'
# 2. Watch the gating signals
watch -n 1 'echo "IgnitionOn:"; cat /data/params/d/IgnitionOn 2>/dev/null; \
echo; echo "deviceState.started:" ; \
python3 -c "import cereal.messaging as m; \
s=m.sub_sock(\"deviceState\", timeout=1000); \
print(m.recv_one(s).deviceState.started)"'
# 3. Watch gear via carState
python3 -c "
import cereal.messaging as m
s = m.sub_sock('carState', timeout=2000)
while True:
msg = m.recv_one(s)
if msg: print(msg.carState.gearShifter)
"
```
Expected behavior:
- Ignition off: neither variant in `ps`. `IgnitionOn` is `0`/missing.
- Ignition on, in Park: `controlsd_parked` in `ps`, `controlsd` is not.
`started` is False. After ~1.5s of confirmed Park, modeld and friends
should have stopped.
- Shift to Drive: `controlsd_parked` disappears within ~500ms, full
`controlsd` appears, all the onroad processes spin up. There will be
a brief carState gap during the swap (~0.52s).
### What to suspect first if something breaks
1. **Manager crash on startup** with `UnknownKeyName: IgnitionOn`. Means
`params.o`/`libcommon.a` weren't rebuilt. Delete them and rebuild.
2. **`controlsd_parked` keeps respawning / dying.** Check
`/data/log2/current/controlsd_parked.log`. Most likely either:
- `CarD.__init__` is hanging on `get_one_can` because pandad isn't
up yet — should only matter on the very first boot.
- A `frogpilot_variables` attribute we missed defaulting; add it to
`_make_default_frogpilot_variables` in `controlsd_parked.py`.
3. **Full controlsd never spawns after shift to Drive.** Check
`IgnitionOn` (should be `1`), check carState.gearShifter (should not
be `park`/`unknown`), check thermald.log for `should_start` logic.
Also check that `carState` is actually being published by
`controlsd_parked`.
4. **Steering still pulls right.** The whole point of commit 1 is to
rule out the variable-FPS work as the cause. If the symptom persists
on baseline-restored driving logic, the suspect list shifts to:
- Anything still on the kept side (carcontroller's
`no_lat_lane_change` plumbing, thermald-related changes,
custom.capnp additions).
- Hardware: a calibration that drifted, an alignment issue, panda
CAN bus issue, or torque tuning that was modified outside of these
files.
- The custom driving model selected (`Params("Model")`). Confirm
which `.thneed`/`.onnx` is loaded — none of the model files
themselves were changed in this session.
5. **Panda safety alerts during park transition.** If panda logs a
"lost heartbeat" or drops to NOOUTPUT mode in the swap window, we
need controlsd_parked to issue a no-op carcontrol heartbeat to keep
panda happy. Not implemented in this session — flag for follow-up.
### Open follow-up items (NOT done in this session)
- **Cold-start latency on shift-from-Park.** Modeld load + calibration
warmup may produce a noticeable gap before lateral/long are ready.
Anticipatory wake on `CS.brakePressed && in_park` is the planned
mitigation if needed.
- **Methodically reintroduce optimizations.** Once baseline driving
feels right, the standstill optimizations (modeld 1fps, fan clamps,
etc.) can come back one at a time, each with a drive test.
- **The CAN-FD telemetry block in `selfdrive/car/hyundai/carstate.py`**
is preserved as a commented block at the bottom of `update_canfd()`.
Re-enabling requires uncommenting + restoring the `tlog` import at
the top of the file.
---
## Key file index for fast navigation
```
selfdrive/controls/controlsd.py # full controlsd, restored to baseline + UI hooks re-added
selfdrive/controls/controlsd_parked.py # NEW: parked-only CAN listener
selfdrive/controls/clearpilot_state_control # in controlsd.py, ~line 1255 — 5-state ScreenDisplayMode + speed_state tick
selfdrive/thermald/thermald.py # gear-aware not_parked + IgnitionOn writer (~line 258)
selfdrive/manager/process_config.py # parked_only predicate + new entry
selfdrive/manager/manager.py # IgnitionOn seed in manager_init
common/params.cc # IgnitionOn registered (CLEAR_ON_MANAGER_START)
selfdrive/car/card.py # CarD class — used by both controlsd variants, unchanged
selfdrive/clearpilot/speed_logic.py # SpeedState class — unchanged, called from controlsd.clearpilot_state_control
selfdrive/car/hyundai/carstate.py # restored baseline + commented telemetry block at end of update_canfd
```
## Reproducing the diff per commit
```bash
git show 47321e3 # baseline restore
git show f7e602c # UI hooks
git show 887b9c9 # parked mode
git diff 62a403d..887b9c9 # full session delta
```
@@ -0,0 +1,358 @@
# Session: 2026-04-26 — GPS disabled in locationd; calibrationd-still-stale notes
## Context
Followup session to `2026-04-26-0914-baseline-revert-and-parked-mode`.
After the baseline restore, the manager wouldn't start cleanly and the car
exhibited a "drifting right on straight roads, model rescues us mid-curve"
symptom. This session unblocked the startup chain end-to-end so the car can
boot and run, disabled GPS as an input to locationd (the actual fix that
made the drift go away), and pinned down — but did **not** solve — why
`liveCalibration.valid` is still stuck at `False` and what that latently
breaks downstream.
Pre-session tip: `27cad05`. Single combined commit covering four code
changes plus this README:
- `selfdrive/boardd/boardd.cc``safety_setter_thread` on ignition edge
- `selfdrive/controls/controlsd.py` — drop unregistered
`no_lat_lane_change` Params write, wire `FPCC.noLatLaneChange` for UI
- `cereal/services.py` — deviceState/managerState back to 2Hz to match
restored `DT_TRML`
- `selfdrive/locationd/locationd.cc` — ignore GPS as Kalman input
(reversible flag)
---
## Commit 1: boardd — safety_setter_thread on ignition edge
### Symptom
Manager started fine but `controlsd_parked` blocked indefinitely at:
```
set_obd_multiplexing (selfdrive/car/fw_versions.py:236)
fingerprint (selfdrive/car/car_helpers.py:149)
get_car (selfdrive/car/car_helpers.py:210)
__init__ (selfdrive/car/card.py:60)
main (selfdrive/controls/controlsd_parked.py:41)
```
`set_obd_multiplexing` does `params.get_bool("ObdMultiplexingChanged", block=True)`
it waits for **boardd's `safety_setter_thread`** to ack. That thread spawns
only on the **rising edge of `IsOnroad`** (`boardd.cc:476`). `IsOnroad` is set
by manager from the `started` flag (`helpers.py:49`). With the parked-mode split
from the prior session, `started` requires `not_parked`, which requires thermald
to see `carState.gearShifter != park`, which requires `controlsd_parked` to
publish carState — which it can't do until OBD multiplexing is acked.
Classic deadlock: ignition rising no longer implies IsOnroad rising.
### Fix
`selfdrive/boardd/boardd.cc:476` — change the trigger from IsOnroad rising
edge to **ignition rising edge**. Adds `bool ignition_last = false;` next to
`is_onroad_last`, swaps the gate variable, sets `ignition_last = ignition;`
after. Restores stock openpilot's intent: set safety as soon as the bus is
alive. Both controlsd variants need it; the thread's phase 2 (waiting on
`ControlsReady`) is harmless in parked mode — it just sits.
`is_onroad`/`is_onroad_last` left in place; they're now unused beyond this
gate but the read of `params.getBool("IsOnroad")` still happens, in case any
future logic wants it.
---
## Commit 2: controlsd — drop unregistered Params write
### Symptom
After the boardd fix, controlsd_parked progressed but full controlsd
crashed at first `state_control` cycle:
```
File "selfdrive/controls/controlsd.py", line 693, in state_control
self.params_memory.put_bool("no_lat_lane_change", False)
common.params_pyx.UnknownKeyName: b'no_lat_lane_change'
```
The baseline-restored controlsd unconditionally writes `no_lat_lane_change`
to memory params on every state_control cycle. That key was never
registered in `common/params.cc`. Pre-revert (`62a403d`) controlsd didn't
write the param at all — it set `self.FPCC.noLatLaneChange` (capnp field).
The baseline brought back a code path the fork's params.cc never matched.
### Fix
In `selfdrive/controls/controlsd.py:687-694`, remove the two
`params_memory.put_bool("no_lat_lane_change", ...)` calls and replace with
`self.FPCC.noLatLaneChange = True/False` (matches what the kept UI code in
`onroad.cc:897` and `ui.cc:122` is reading from cereal). The
`frogpilot_variables.no_lat_lane_change` writes were already there — those
are what the kept Hyundai carcontroller actually reads at 100Hz.
No actuator change. Pure plumbing.
---
## Commit 3: services.py — restore deviceState/managerState rates
### Symptom
After fixing the controlsd crash, controlsd booted but immediately fired
**continuous `commIssue`** with:
```
"not_freq_ok": ["deviceState", "managerState"]
```
### Diagnosis
`common/realtime.py` was reverted to `DT_TRML = 0.5` (thermald → 2Hz) in
the baseline restore. But `cereal/services.py` still declared `deviceState`
and `managerState` at 5Hz from earlier 4Hz-fan-control work. The freq window
is `[0.8 × min, 1.2 × max]` — declared 5Hz means [4.0, 6.0]Hz. Thermald
publishing at 2Hz fell well below 4.0 → freq_ok=False every cycle →
commIssue every cycle.
Already documented as a known footgun in `CLAUDE.md` ("Changing a Service's
Publish Rate").
### Fix
`cereal/services.py:33,73` — set both back to 2Hz with a comment pointing
at `DT_TRML`. After this, `not_freq_ok=[]` and the continuous commIssue
stopped — only one transient commIssue at startup remained (warmup).
---
## Commit 4: locationd — ignore GPS as Kalman input (reversible)
### Why
The car was drifting right on straight roads. Pre-revert, the user
reported this had been working for years; the only practically-new thing
is that `system/clearpilot/gpsd.py` (AT-command-based GPS) had recently
**started actually getting fixes**, where for a long time it wasn't.
Two concrete data problems with the GPS feed for selfdrive purposes:
- `gpsd.py:221` hard-codes `gps.vNED = [0.0, 0.0, 0.0]` while the user is
moving 28 m/s. locationd's `handle_gps` derives `OBSERVATION_ECEF_VEL`
from this; the Kalman gets "GPS says you're stopped" while accelerometer
says otherwise.
- `gpsd.py:216,222` populate horizontalAccuracy/verticalAccuracy from
`hdop * 5` (a rough conversion); `bearingAccuracyDeg = 10.0` and
`speedAccuracy = 1.0` are constants. None of these match what a real
GNSS chip reports.
These flow into the latcontrol_torque pipeline indirectly through
`liveLocationKalman.angularVelocityCalibrated` (used as
`actual_curvature_llk = ... / CS.vEgo` blended with steering-angle-derived
curvature) and through `liveLocationKalman.calibratedOrientationNED`
(pitch). When the Kalman has wrong velocity observations, those derived
fields go wrong — and on a straight crowned road, the controller's
"actual_curvature" picture is off-center, biasing torque output.
### What the user does NOT want disabled
gpsd publishes are still consumed by:
- UI speed indicator and the ClearPilot status overlay
- `dashcamd` for `.srt` subtitle sidecars (lat/lon per segment)
- `timed.py` for system-clock setting via `unixTimestampMillis`
- `gpsd.py` itself for sunset/sunrise → `IsDaylight` (auto night mode)
- `telemetryd.py` for the `gps` group in the CSV
So we cannot just stop publishing — only locationd should ignore.
### Fix
`selfdrive/locationd/locationd.cc:310-323` — add a `clearpilot_disable_gps`
const at the top of `Localizer::handle_gps` and OR it into the existing
reject condition. With it true, every gpsLocation message falls through to
`determine_gps_mode(current_time)` (openpilot's stock no-GPS path:
`input_fake_gps_observations` once position uncertainty grows past
`SANE_GPS_UNCERTAINTY`, otherwise nothing). `last_gps_msg` never updates,
`is_gps_ok()` returns False, `liveLocationKalman.gpsOK = false`.
Effect verified by user: drift improved noticeably. The torque controller
is no longer fed contradictory GPS-vs-IMU velocity observations through
the Kalman.
To re-enable GPS as a Kalman input: flip the `clearpilot_disable_gps`
constant to `false` and rebuild. Self-contained edit.
---
## Calibrationd: still stale, root-cause partially understood
### What is happening
`calibrationd` publishes `liveCalibration.valid = sm.all_checks()` on every
cycle, where `sm` polls `cameraOdometry` and non-polls `carState` and
`carParams`. We measured: **120 publishes in 30 seconds, every single one
`valid=False`** with `calStatus=calibrated, calPerc=100, validBlocks=50`.
The body of the calibration is converged — the validity flag is stuck off.
### Why this matters
`liveCalibration.valid=False` cascades:
1. `locationd.cc:715``filterInitialized = sm.allAliveAndValid()`.
liveCalibration is in the sub list and not in `ignore_alive` /
`ignore_valid`. So filterInitialized stays False forever.
2. With Kalman uninitialized, `liveLocationKalman` still publishes but the
body fields are empty/default. `liveLocationKalman.status = uninitialized`.
3. `paramsd.py` subscribes to `liveLocationKalman` with `poll='liveLocationKalman'`
and gates its update logic on `sm.all_checks()`. When liveLocationKalman
itself isn't in a sane state, paramsd's `roll`, `angleOffsetDeg`,
`steerRatio`, `stiffnessFactor` either never converge or converge to bad
values.
4. `latcontrol_torque.py:135-136` uses `params.angleOffsetDeg` to compute
`actual_curvature_vm` and `params.roll` for `roll_compensation`. With
`roll=0`, no compensation for a crowned/banked road. With wrong
`angleOffsetDeg`, the closed-loop "actual curvature" measurement is
biased.
So the latent risk is: even with our GPS fix, the controller is running
**without learned roll compensation and without a learned steering-angle
offset**. Symptom-free on straight, level pavement; biased on banked roads.
### Why `valid=False`
Inside calibrationd's SubMaster, `sm.all_checks() = all_alive AND all_freq_ok
AND all_valid`. We measured each:
- `cameraOdometry`: alive=True, valid=True, freq_ok=True ✓
- `carState`: alive flickers True/False, valid=True, **freq_ok=False (every cycle)**
- `carParams`: alive=True after first arrival, valid=True, freq_ok=False
but excluded from `all_freq_ok` because `_check_avg_freq` skips services
with `frequency < 0.99` Hz (carParams is 0.02 Hz declared) — so it doesn't
fail the gate.
The smoking gun is **`carState.freq_ok = False` from inside calibrationd's
`poll='cameraOdometry'` SubMaster**.
Direct measurement (`/tmp/test_subs.py`):
| poll arg | carState observed rate | freq_ok |
|---|---:|:-:|
| `None` (all polled) | 97.40 Hz | True |
| `'cameraOdometry'` | **2.24 Hz** | **False** |
| `'carState'` | 98.58 Hz | True |
carState is published at 100 Hz to a 10MB shared-memory MSGQ queue. From
inside `poll='cameraOdometry'`, our non-blocking `recv_one_or_none(carState)`
returns None ~88% of the time. `/tmp/diag_recv.py`:
```
cameraOdometry msgs received: 121
carState recv calls: 121, hits: 14 (11.6% hit rate)
avg cycle duration: 50.0ms
```
So we're calling `recv` 20× per second on carState's queue, and finding
the queue empty 9 out of 10 times — even though carState is being published
at 100 Hz to that same queue with a `conflate=True` socket option.
### The MSGQ NUM_READERS = 12 hypothesis
`cereal/messaging/msgq.h:9` defines `#define NUM_READERS 12`. When a 13th
subscriber tries to subscribe to a queue, `msgq.cc:182-197` **invalidates
ALL readers simultaneously** (`*q->read_valids[i] = false` for all i)
to "reset and re-register". On the next read, an invalidated reader's
`msgq_msg_recv` jumps to `msgq_reset_reader` (line 347) and ends up with
`read_pointer == write_pointer` (caught up to current), returning size 0.
Subscribers to `carState` in our running system include: controlsd,
plannerd, locationd, calibrationd, paramsd, dmonitoringd, frogpilot_process,
telemetryd, statsd. That's already nine. The introspection scripts
(`/tmp/check_*.py`, `/tmp/measure_freq.py`, `/tmp/diag_recv.py`,
`/tmp/cal_view.py`) and any UI-side subscribers add more, can easily
push past 12 and trigger global eviction. Once evicted, a long-running
subscriber stays in the "find empty queue / reset / try again" loop, which
is what we measured.
This is **partially confirmed** but not proven definitively. The
investigation was paused before instrumenting the queue header to count
slot churn. The smoking-gun would be: print `q->num_readers` from inside
calibrationd at boot vs. during steady state and watch it tick up to 12+.
### Things to consider for the actual fix
1. **`7ee923b` already solved this exact problem.** It changed
calibrationd's publish to:
```python
# was: calibrator.send_data(pm, sm.all_checks())
# to: calibrator.send_data(pm, calibrator.cal_status == log.LiveCalibrationData.Status.calibrated)
```
The commit message documented the exact failure mode (cascade through
locationd uninitialized → paramsd steerRatio≈0 / stiffnessFactor≈0 →
nonsense curvature commands). It was reverted by `47321e3` as part of
"restore driving logic to pre-variable-fps baseline" — but that revert
was about isolating the drift cause, and the calibrationd change here
is *not* in the variable-FPS family. **Re-applying `7ee923b` is
probably the right next move**, narrowly scoped to calibrationd.py.
2. Less attractive alternatives: bump `NUM_READERS`; switch
carState to `poll=None` in calibrationd (more cycles per update,
higher CPU); add `ignore_average_freq=['carState']` to calibrationd's
SubMaster (treats freq glitches as benign, but keeps the cascade for
alive/valid).
---
## Steer-fault alert investigation (separate symptom)
User saw "Steering Temporarily Unavailable" alerts during a test drive
even though we hadn't touched lateral-control code. Captured in
`realdata/00000081--528e2aa03a--0/rlog`:
- Faults occur in **brief 50100 ms pulses** clustered while moving slowly
(`vEgo` 2.75.2 m/s ≈ 612 mph).
- Each pulse correlates with **large driver wheel torque** (-250, +273,
-187, +119 Nm) — i.e. the user actively turning the wheel during a
parking-lot maneuver.
- `cruise.enabled = False` throughout — openpilot was not engaged.
- The car's MDPS sets `LKA_FAULT` at low speeds when torque is high; that
bit maps directly to `cs_out.steerFaultTemporary` (`carstate.py:259`),
which fires `steerTempUnavailableSilent` regardless of engagement
(`ET.WARNING` displays unconditionally).
User reports they've "never had this issue before" — implying earlier
ClearPilot revisions either gated the fault on speed or used a different
no-lateral path. **Not confirmed which.** Open follow-up.
---
## Open follow-ups (ordered by likely return)
1. **Re-apply `7ee923b`** — gate `liveCalibration.valid` on `calStatus`,
not `sm.all_checks()`. Unblocks locationd init → paramsd convergence →
real `params.roll` and `params.angleOffsetDeg` for the torque
controller. Latent benefit beyond what GPS-disable alone gave us.
2. **Investigate the persistent low-speed `steerTempUnavailable` alert.**
Either (a) gate `steerFaultTemporary` on `vEgo > ~8 m/s` in
`carstate.py:259`, or (b) find what the previous fork did — possibly
stopped sending tester CAN messages on park, possibly suppressed the
alert specifically during a park transition window.
3. **Suppress LKAS fault display when shifting drive → park.** The user
reports the car shows an LKAS-fault icon when openpilot keeps publishing
tester-present CAN messages after entering park. Investigation needed
in `selfdrive/car/hyundai/carcontroller.py` and `hyundaicanfd.py` to
gate tester messages on gear ≠ park.
4. **Wheel-torque headroom edit.** User mentioned a community-known edit
that allows slightly higher steering torque on the Hyundai panda safety
model. Research target: panda safety code for HYUNDAI_CANFD safety
model and the `MAX_TORQUE` / per-cycle delta limits.
5. **Single startup `commIssue` event.** Even with all our fixes, controlsd
logs one transient commIssue right after `controlsd.initialized`
(timeout=true after 6s). The `invalid` set at that moment is
downstream services still warming up (liveCalibration, liveLocationKalman,
liveParameters, liveTorqueParameters, frogpilotPlan, longitudinalPlan,
driverMonitoringState). Most should clear once the calibrationd issue
is fixed; remaining ones are normal warmup.
6. **gpsd.py vNED / accuracy fields.** Out of scope for this session
(we disabled GPS in locationd instead), but if GPS is ever re-enabled,
`gpsd.py:216,221-224` need real values: vNED from
`speed × {cos(bearing), sin(bearing), 0}`, and accuracy fields from
actual modem reports rather than hard-coded constants.