Commit Graph

86 Commits

Author SHA1 Message Date
brianhansonxyz 12da9acfdd revert modeld/controlsd/plannerd/locationd to first-commit baseline
Starting point for rebuilding self-drive from a known-good baseline.
Reverts the following to their state at f46339c:
- selfdrive/modeld/modeld.py (constant 20fps, no variable-rate / standby skip)
- selfdrive/modeld/dmonitoringmodeld.py (no carState sub, no standstill skip)
- selfdrive/controls/controlsd.py (no parked-cycle skip, no FPCC hoisting, no MDPS split)
- selfdrive/controls/lib/longitudinal_planner.py
- selfdrive/locationd/calibrationd.py (valid = sm.all_checks again)
- selfdrive/locationd/paramsd.py
- selfdrive/locationd/torqued.py

All non-self-drive features (thermald fan control, speed limit controller,
cruise warning signs, UI state transitions, GPS fixes, ClearPilot menu,
dashcamd, telemetry, etc.) remain as-is on this branch — only the 4 core
self-drive processes are reverted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 13:26:36 -05:00
brianhansonxyz 62a403d0f1 UI: shift speed-limit/cruise numbers down; modeld: keep 20fps at standstill-in-drive; health: FPS instead of LAG
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
- onroad: speed-limit sign and cruise over/under sign — shift the number
  ~10% further down inside the inner box (adjusted top inset 42→86).
- modeld: narrow the standby condition. Previously 0fps when
  standstill-or-parked; now 0fps only when parked. Standstill in drive
  (red light) continues to run at 20fps so lateral can engage/stay
  responsive and liveCalibration/paramsd keep seeing observations.
  Ignition-off still stops modeld at the manager level.
- Health overlay: replace LAG row with FPS (modeld framerate read from
  ModelFps memory param, which modeld already writes only on standby
  transition — no per-frame writes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 12:54:56 -05:00
brianhansonxyz 7ee923b0e6 calibrationd: publish valid based on calStatus, not sm.all_checks
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
stale / stale (push) Has been cancelled
release / build master-ci (push) Has been cancelled
Previous behavior gated liveCalibration.valid on calibrationd's own
sm.all_checks(). Upstream freq glitches (e.g. carState polling-pattern
artifacts) flapped liveCalibration.valid to False, which cascaded into
locationd: its filterInitialized check requires sm.allAliveAndValid(),
so flapped valid kept locationd uninitialized. While uninitialized,
locationd still published liveLocationKalman but with empty/garbage
angularVelocityCalibrated fields. paramsd's Kalman drank the garbage
and converged to steerRatio ≈ 0, stiffnessFactor ≈ 0 — which
controlsd clamped to 0.1 each and fed into VM.calc_curvature,
producing nonsense curvature commands and visibly jerky steering.

"valid" semantically asks whether the calibration data is
trustworthy — that's a question about convergence (calStatus ==
calibrated), not about input freshness. Switching the gate removes
the cascade: once calibration completes, liveCalibration.valid stays
True stably, locationd initializes, paramsd gets clean observations,
steerRatio converges to the real value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:40:02 -05:00
brianhansonxyz 6a79996a14 park CPU savings + fix early shutdown on virtual battery capacity
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
controlsd: in park+ignition-on, run full step() only every 10th cycle.
data_sample still runs every cycle (CAN parse, button detection) and
card.controls_update still runs every cycle (CAN TX heartbeat, counter
increments). Skipped cycles re-send cached controlsState/carControl so
downstream freq_ok stays OK. Button edge handling + display-mode
transitions extracted to handle_screen_mode() and called every cycle so
debug-button presses aren't dropped. controlsd: 55% → 30% CPU in park.

dmonitoringmodeld: subscribe to carState; at standstill, skip model.run
and re-publish last inference. driverStateV2 continues flowing at 10Hz
with known-good last face data (driver can't become distracted relative
to a stopped car). ~5% CPU saved.

power_monitoring: remove the `car_battery_capacity_uWh <= 0` shutdown
trigger. That virtual capacity counter floor-limits to 3e6 µWh on boot
and drains in ~12 min at typical device power, so a short drive (that
doesn't fully recharge the 30e6 µWh virtual cap) followed by a quick
store stop would trip shutdown well before the 30-min idle timer. The
real car-battery-voltage protection (low_voltage_shutdown at 11.8V with
60s debounce) is kept.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:46:33 -05:00
brianhansonxyz 426382960a dmonitoringd: narrow update_states gate to fix stuck face_detected
Original gate was sm.all_checks() which required modelV2 fresh and
liveCalibration.valid. Both fail spuriously in our setup:
- modelV2 stops at standstill/parked (two-state modeld)
- calibrationd propagates its own freq_ok glitches into liveCalibration.valid

Either condition froze DM pose updates — face_detected stuck False →
maybe_distracted True → awareness decayed to 0 within ~6s of engagement
("TAKE OVER" driver-distraction alert the moment controlsd engaged).

Narrow the gate to only the subs update_states actually consumes
(driverStateV2, carState, controlsState, liveCalibration), check only
alive+valid (skip freq_ok), and skip liveCalibration.valid — rpyCalib
presence is sufficient proof that calibration has produced output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 13:21:07 -05:00
brianhansonxyz 54a2a3afc5 add ghost_frame1/2 sprites for manual editing
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
Generated from xscreensaver left-facing wag pair (pacman.c frames 5 & 6)
and recolored blue at 2x our bg.jpg sprite size. Saved in source
orientation (portrait, same as bg.jpg) for hand-editing later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:47:01 -05:00
brianhansonxyz 4c8ef93b2b modeld two-state; UI: immediate blank on ignition off; READY splash on wake
- modeld: simplified to 0fps (standstill or parked) or 20fps. Removed
  4/10fps reduced-rate path, republish caching, FPCC/liveCalibration
  reads, and ModelFps per-cycle param writes.
- ui.cc updateWakefulness: ignition on→off now resets interactive_timeout
  to 0 for immediate screen blank. Tap still wakes via existing handler.
- home.cc offroadTransition(false): reset ready->has_driven to false so
  the READY text appears on fresh ignition, not the textless post-drive
  splash.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:20:48 -05:00
brianhansonxyz ba4176ffd0 controlsd: suppress freq-only cascade; modeld: variable rate w/ republish cache
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
release / build master-ci (push) Has been cancelled
- commIssue now fires only on real comms failure (not_alive or CAN RX
  timeout), not on self-declared valid=False cascades from MSGQ-conflate +
  slow-polling freq_ok artifacts. Car drives fine through these and the
  banner was false-positive.
- Add 5-cycle hysteresis (50ms) on commIssue / posenetInvalid /
  locationdTemporaryError / paramsdTemporaryError.
- Cascade-aware suppression: skip posenet/locationd/paramsd temporary
  errors when the only problem is freq_only_cascade (all alive, just
  freq/valid tripped).
- Remove debug _dbg_dump_alert_triggers helper and EVENTS/EVENT_NAME
  imports.
- Re-enable variable-rate modeld (4/10fps idle, 20fps when lat_active /
  lane_changing / calibrating) with republish caching so consumers get
  constant publish rate.
- Split lat_requested (modeld signal) from lat_engaged (actuator gate).
  Momentary steerFaultTemporary no longer drops modeld rate, preventing
  stale-prediction feedback loop on MDPS recovery. CC.latActive still
  respects the fault.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 21:11:52 -05:00
brianhansonxyz e86eafde15 diag: per-publisher valid=False logging; 30min shutdown; daylight fix; UI tweaks
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
stale / stale (push) Has been cancelled
CLAUDE.md: added a "Logging" rule — never use cloudlog (upstream cloud
pipeline, effectively a black hole for us), always use
print(..., file=sys.stderr, flush=True). Manager redirects each process's
stderr to /data/log2/current/{proc}.log. Prefix our lines with "CLP ".

Diagnostic logging — when a publisher sets its own msg.valid=False, log
which specific subscriber tripped the check. Only fires on transition
(True→False) so we don't spam. Covers the services whose cascades we've
been chasing:
  - frogpilot_planner (frogpilotPlan)
  - longitudinal_planner (longitudinalPlan)
  - paramsd (liveParameters)
  - calibrationd (liveCalibration)
  - torqued (liveTorqueParameters)
  - dmonitoringd (driverMonitoringState)

gpsd.is_daylight: fixed a day-boundary bug where the function would flip
to "night" at UTC midnight regardless of actual local sunset. At 85W
sunset is ~00:20 UTC next day, so between local 8pm and actual sunset
the function used *tomorrow's* sunrise/sunset and said night. Now checks
yesterday/today/tomorrow windows with UTC-day offsets.

ui/onroad.cc: nightrider tire-path outline is now light blue (#99CCFF)
at 3px (was white/status-tinted at 6px); lane lines 5% thinner (float
pen width).

thermald/power_monitoring: auto-shutdown timer 10min → 30min.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 19:31:09 -05:00
brianhansonxyz cf2b3fc637 modeld: revert to constant 20fps (except standby at standstill)
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
Disabled the variable-rate (4/10/20fps) logic — it caused persistent
downstream freq/valid cascades that broke paramsd's angle-offset
learner, calibrationd, and locationd's one-shot filterInitialized latch.
Symptom: user's "TAKE CONTROL IMMEDIATELY / Communication Issue"
banner during engaged driving, plus subtle right-pull (paramsd's fast
angle-offset adaptation was frozen all session).

Simpler model now:
  standstill → standby (0fps, paused)
  otherwise  → 20fps (no variable rate)

Republish-caching and FPCC.latRequested/noLatLaneChange plumbing left in
place so re-enabling variable rate later is a one-line change
(full_rate = True → the original full_rate expression).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 18:21:46 -05:00
brianhansonxyz 5b67d4798b fix: update services.py rates to match new thermald/gpsd publish rates
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
Root cause of "TAKE CONTROL IMMEDIATELY / Communication Issue Between
Processes" during driving: our 4Hz thermald and 2Hz gpsd changes pushed
deviceState, managerState, and gpsLocation above their declared rate
ceilings in cereal/services.py. SubMaster's freq_ok check requires
observed rate within [0.8*min, 1.2*max]; publishing faster than max also
fails the check — which trips all_freq_ok(), raises EventName.commIssue,
and fires the banner.

- deviceState: 2Hz → 4Hz  (matches thermald DT_TRML=0.25)
- gpsLocation: 1Hz → 2Hz  (matches gpsd 2Hz polling)
- managerState: 2Hz → 4Hz (gated on deviceState arrival, inherits rate)

Also:
- Re-enabled dashcamd in process_config.py (was disabled for the
  diagnostic test — dashcamd was innocent).
- Added a CLAUDE.md section documenting this class of bug so the next
  rate change updates services.py too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 16:28:55 -05:00
brianhansonxyz d64a0f6420 fix: thermald crash on startup — CarState is in cereal.car, not cereal.log
The new fan control plumbing referenced log.CarState.GearShifter.park,
but CarState is defined in car.capnp and only DeviceState/PandaState
live in log.capnp. Thermald crashed immediately on first carState tick,
which meant deviceState stopped publishing entirely. The UI then read
stale/default values — freeSpacePercent=0 → "100% full" → "Out of Storage"
alert despite 65 GB actually free. The fan also stopped updating.

from cereal import car, log  (was just log)
car.CarState.GearShifter.park (was log.CarState...)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 16:12:35 -05:00
brianhansonxyz 4dae5804ab feat: 4Hz fan control with gear/cruise-aware clamps; move hot signals to cereal
Fan control rework (thermald → 4Hz):
- DT_TRML 0.5s → 0.25s (thermald loop + fan PID now at 4Hz)
- New clamp rules based on (gear, cruise_engaged, standstill):
    parked                                → 0-100%
    in drive + cruise engaged (any speed) → 30-100%
    in drive + cruise off + standstill    → 10-100%
    in drive + cruise off + moving        → 30-100%
- thermald now reads gearShifter (via carState) and controlsState.enabled,
  passes them to fan_controller.update()
- Removed BENCH_MODE special case — new rules cover bench automatically
- Removed ignition-based branches — gear is the correct signal

System health overlay:
- Subscribed UI to peripheralState so we can read fanSpeedRpm
- Added FAN row: actual fan% (RPM / 65) to sit alongside LAG/DROP/TEMP/CPU/MEM.
  Shows the real fan output vs. what the PID is asking for.

Migrate hot signals from paramsMemory to cereal (frogpilotCarControl):
- Added latRequested @3 and noLatLaneChange @4 to FrogPilotCarControl schema
- controlsd sets FPCC.latRequested / FPCC.noLatLaneChange (send-on-change
  already gates the IPC)
- modeld reads from sm['frogpilotCarControl'] (added to its subscribers)
  instead of paramsMemory (saves ~20 file-read syscalls/sec)
- carcontroller reads from frogpilot_variables (set in-process by controlsd)
  instead of paramsMemory (saves ~100 file-read syscalls/sec in 100Hz path).
  Dropped carcontroller's now-unused Params instance and import.
- UI (ui.cc, onroad.cc) reads from sm['frogpilotCarControl'].noLatLaneChange
- Removed LatRequested and no_lat_lane_change param registrations + defaults

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 16:01:39 -05:00
brianhansonxyz 25b4672fda feat: raise controlsdLagging alert threshold to 20ms avg
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
release / build master-ci (push) Has been cancelled
Stock rk.lagging fires at 11.1ms (90% of 10ms interval), which the
Hyundai CAN load routinely crosses during normal driving as carstate
parses 50-200 msgs/sec. That's normal, not a fault — the same code
behaved the same way at the fork baseline; we just made it visible with
the LAG overlay.

50Hz effective control is still tighter than any human reaction loop.
rk.lagging remains wired to the defensive skip paths (secondary camera
and radar checks) at the original tighter threshold, so we still avoid
false-alarming dependent events when briefly overloaded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:46:07 -05:00
brianhansonxyz 7a0854387e ui: shift health overlay labels 50px left
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:35:22 -05:00
brianhansonxyz 83aed16a35 perf: suppress controlsdLagging on engage + gate frogpilotCarControl send
Two related fixes for the post-engage lag spike:

1. controlsdLagging suppressed during lat_engage_suppress window.
   Ratekeeper.lagging triggers when avg cycle duration over 100 cycles
   exceeds 11.1ms (90% of 10ms budget). The modeld 10→20fps ramp causes
   a legitimate transient where downstream services (plannerd, locationd,
   calibrationd, paramsd) each drain 2x the message rate, briefly pushing
   avg cycle time past the threshold. The underlying system isn't broken —
   it's correctly absorbing a scheduled workload transition.

2. frogpilotCarControl now sends only on change (+ 1Hz keepalive) instead
   of every 10ms. The message has 3 bool fields, of which speedLimitChanged
   code is entirely commented out, trafficModeActive flips only on UI
   button press, and alwaysOnLateral changes only on cruise/gear/brake
   edges. plannerd doesn't include frogpilotCarControl in its all_checks
   list so stale-freq detection isn't a concern. Saves ~7ms/sec of
   capnp build + zmq send work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:35:16 -05:00
brianhansonxyz 6dede984dc perf: pin dashcamd to cores 0-3 (little cluster)
Previously ran unpinned (affinity mask 0xff) across all 8 cores. When it
landed on core 4 (controlsd) or 5 (plannerd/radard) or 7 (modeld), its
70MB/s frame copies and MP4 muxing caused cache/memory-bandwidth
contention with the RT-pinned processes. SCHED_FIFO prevented direct
preemption but not the cache thrash.

OMX offloads actual H.264 work to hardware so the main thread is
lightweight — fine on the little cluster.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:35:02 -05:00
brianhansonxyz 2ddb7fc764 feat: onroad health overlay + 2x tire path in nightrider
System health overlay:
- Lower-right 5-metric panel: LAG (controlsState.cumLagMs), DROP
  (modelV2.frameDropPerc), TEMP (deviceState.maxTempC), CPU (max core of
  deviceState.cpuUsagePercent), MEM (deviceState.memoryUsagePercent)
- Color-coded white→yellow→red by severity (LAG: 50/200ms, DROP: 5/15%,
  TEMP: 75/88°C, CPU: 75/90%, MEM: 70/85%)
- Toggle in ClearPilot → Debug → "System Health Overlay"
- New param ClearpilotShowHealthMetrics, PERSISTENT (disk, survives
  reboots), default false — re-polled every ~2s so toggle takes effect
  without process restart
- InterFont(90, Bold) to match speed limit numeric styling, 30px margin,
  40px between rows, black rounded background

Nightrider center lane path (the "tire track" polygon from
scene.track_vertices) is now drawn at 2x the width of other lines —
highlights the planned path distinctly against the otherwise stark
outline-only rendering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:03:30 -05:00
brianhansonxyz 22ced0c558 cleanup: remove dead CarCruiseDisplayActual param
Audit of post-fork param additions found CarCruiseDisplayActual was
written every CAN cycle (gated) but only consumed by hyundaicanfd.py::
create_buttons_alt, which has `return` on line 1 and no active callers.
Write was pure waste. Removed registration, write path, cache field,
and the dead read.

Also dropped the now-unused `from openpilot.common.params import Params`
in hyundaicanfd.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 22:03:01 -05:00
brianhansonxyz ef4e02e354 feat: 10fps daytime / 4fps night modeld + 2Hz GPS
Reduced-rate modeld path now branches on IsDaylight:
- daylight: skip 1/2 frames → 10fps (better model responsiveness when
  lighting gives the net more signal)
- night: skip 4/5 frames → 4fps (unchanged, conservative for power)

IsDaylight is already in /dev/shm (memory) via gpsd.py. Gated the
IsDaylight write on change — it flips twice a day, no reason to rewrite
every 30s. GPS polling bumped from 1Hz → 2Hz.

ModelFps publishes "10" / "4" / "20" so longitudinal_planner's dt and
FCW-threshold scaling (if re-enabled) still track actual rate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:41:29 -05:00
brianhansonxyz 02f25f83c4 perf: hoist Params read out of create_steering_messages 100Hz path
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
create_steering_messages was constructing a new Params("/dev/shm/params")
object and reading no_lat_lane_change on every CAN steering message build
— i.e. 100 allocations + 100 file reads per second. Now the Params
instance lives on CarController, and the value is read once per update()
cycle and passed as a parameter.

Audited all other hyundai CAN-FD integration code for similar patterns:
- carstate.py — already fixed (previous commit)
- carcontroller.py — other Params references are all in commented-out code
- hyundaicanfd.py::create_buttons_alt — dead code (early return), so the
  Params read there never executes; left as-is

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:34:52 -05:00
brianhansonxyz ceb3481cdc feat: add low-speed tier to cruise warning, gate param writes on change
Speed limit ≤25 mph now allows +8 over before warning (e.g. limit 25 →
33 is ok, warn at 34). Existing tiers unchanged: ≥50 → +9 ok warn at +10,
26-49 → +6 ok warn at +7.

Also gate all 7 speed_logic param writes on change (same pattern as the
earlier carstate/controlsd perf fix). Called at 2Hz so not as hot, but
unit/is_metric never change mid-drive and the cruise warning rarely
flips — no reason to thrash /dev/shm on every update.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:28:05 -05:00
brianhansonxyz bae022db43 fix: thermald crash blocking shutdown (getBool → get_bool)
Python Params uses snake_case; the C++ camelCase call raised
AttributeError, killing thermald_thread at the exact moment of shutdown.
Result: DoShutdown never got set, the 10-minute timer "worked" once (set
DashcamShutdown=True) and then thermald died silently. Device kept
draining the battery instead of powering down.

Caught because CLAUDE.md specifically flags this pattern as a common
source of silent failures between C++ and Python.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:20:53 -05:00
brianhansonxyz 1e36d7ec23 feat: disable FCW — stock AEB handles it better
Tucson's radar-based collision warning is more reliable than the comma
model/planner FCW and was producing false positives. Single-user fork in
a single car, so no need to keep both.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:11:39 -05:00
brianhansonxyz 6f3b1b1d2f docs: add py-spy profiling recipe to CLAUDE.md
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
stale / stale (push) Has been cancelled
Captures how we found the hot Params writes in carstate.py so the same
technique can be repeated. Includes the awk aggregators for per-callsite
and per-file-line breakdowns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:35:03 -05:00
brianhansonxyz 7a1e157c9c perf: gate hot /dev/shm writes on change — controlsd 69%→28% CPU
py-spy showed per-cycle atomic param writes were the dominant cost. Each
put() is mkstemp+fsync+flock+rename+fsync_dir — fine when rare, ruinous at
100Hz. At park with no state changes, these writes were running anyway and
the flock contention was poisoning the whole system.

carstate.py (update + update_canfd): CarSpeedLimit, CarIsMetric,
  CarCruiseDisplayActual were written every CAN update. Now cached and
  written only on change.
controlsd.py: same fix for LatRequested and no_lat_lane_change. Also
  throttle the sentry crash-file stat() from 100Hz to 1Hz.

Also: suppress locationdTemporaryError/paramsdTemporaryError/posenetInvalid
on lat engage (same 2s window as commIssue), and tie suppression to the
LatRequested edge instead of CC.latActive (fires immediately, not after
the 250ms ramp-up delay).

Also: reset Ratekeeper when it falls >1s behind — the ~6s fingerprinting
stall at startup was poisoning the lag metric for the entire session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:34:07 -05:00
brianhansonxyz 2d819c784b feat: ramp-up delay on lat engagement to prevent commIssue flash
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
Decouples "tell modeld to go fast" from "steering actually active":
- New LatRequested memory param — controlsd writes when lat would be active
- modeld reads LatRequested (not carControl.latActive) for FPS decision,
  so it switches to 20fps immediately on engage request
- controlsd delays CC.latActive becoming true by 250ms (5 frames @ 20fps)
  after LatRequested goes true, giving downstream services
  (longitudinalPlan, liveCalibration, etc.) time to stabilize at the new rate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 16:46:37 -05:00
brianhansonxyz 1eb8d41454 fix: fan 100% on overheat, FCW fps-aware, commIssue suppress, 10min shutdown
- Fan controller: allow full 100% fan when offroad temp >= 75°C (startup cooling)
- ModelFps memory param: modeld publishes actual FPS (20 or 4) so downstream
  consumers can adjust frame-rate-dependent logic
- Longitudinal planner: dynamically adjusts dt and v_desired_filter based on
  ModelFps; FCW crash_cnt threshold scales with FPS to maintain consistent
  0.15s trigger window at both 20fps and 4fps
- controlsd: suppress commIssue alerts for 2s after lateral control engages
  (FPS transition from 4->20 causes transient freq check failures)
- Shutdown timer: hardcoded to 10 minutes (was 45min via FrogPilot param),
  screen taps reset the countdown via ShutdownTouchReset memory param,
  removed Shutdown Timer UI selector from ClearPilot menu

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 16:29:42 -05:00
brianhansonxyz 2331aa00a0 feat: dashcamd trip lifecycle, status indicator, CLAUDE.md updates
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
release / build master-ci (push) Has been cancelled
dashcamd now waits for valid system time + GPS fix + drive gear before
starting a trip. Returns to waiting state on 10-min park timeout or
ignition off. Publishes DashcamState and per-trip DashcamFrames to
memory params. Status window shows stopped/waiting/recording states.

Updated CLAUDE.md with current display mode behavior, OmxEncoder port
details, speed limit warning thresholds, and dashcam param docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 01:26:58 -05:00
brianhansonxyz dfb7b7404f fix: port OmxEncoder safety fixes from upstream FrogPilot
- OMX_Init/OMX_Deinit managed per encoder instance lifecycle
- Proper error handling in constructor, encoder_open, encoder_close
- Null guards on done_out.pop() and handle in destructor
- Codec config written directly to codecpar (no codec_ctx)
- ffmpeg faststart remux on segment close
- Crash handler in dashcamd for diagnostics
- DashcamFrames param for live frame count in status window

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 01:13:41 -05:00
brianhansonxyz 9ac334b7cf fix: dashcamd OMX crash on restart, add dashcam status indicator
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
- Reset OMX subsystem (Deinit/Init) on dashcamd startup to clear stale
  encoder state from previous unclean exits
- Validate OMX output buffers before memcpy to prevent segfault
- Validate VisionBuf frame data before encoding
- Add dashcam row to status window showing recording state and disk usage

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 00:44:13 -05:00
brianhansonxyz 3cbb81f9f1 fix: always show splash in park, nightrider transitions to screen off
Removed nightrider exception that kept onroad UI visible in park.
Shifting to park from nightrider mode now auto-switches to screen off.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 00:12:01 -05:00
brianhansonxyz 3b47412100 feat: tap screen to wake from screen-off mode
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
Tapping the touchscreen while in display mode 3 (screen off) resets
ScreenDisplayMode to 0 (auto-normal) and wakes the display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:59:52 -05:00
brianhansonxyz 5e7c8bed52 fix: speed limit rounding, controlsd crashes, modeld calibration rate
- Speed limit display uses round() instead of floor() to fix off-by-one
- Initialize driving_gear in controlsd __init__ to prevent AttributeError
- Fix gpsLocation capnp access (attribute style, not getter methods)
- Fix sm.valid dict access (brackets not parentheses)
- modeld runs full 20fps during calibration instead of throttled 4fps
- Over-speed warning threshold below 50mph changed from 5 to 7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:55:19 -05:00
brianhansonxyz 6cf21a3698 fix: debug button screen-off mode now works in park state
updateWakefulness was overriding display power every frame when ignition
was on, fighting the screen-off set by home.cc. Now respects
ScreenDisplayMode 3 unconditionally. Also auto-resets to mode 0 when
shifting into drive from screen-off.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 04:21:37 +00:00
brianhansonxyz adcffad276 feat: speed limit ding sound when cruise warning sign appears
Plays a ding via soundd when the cruise warning sign becomes visible
(cruise set speed out of range vs speed limit) or when the speed limit
changes while the warning sign is already showing. Max 1 ding per 30s.

Ding is mixed independently into soundd output at max volume without
interrupting alert sounds. bench_cmd ding available for manual trigger.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 03:49:02 +00:00
brianhansonxyz 6e7117b177 feat: cruise warning signs and speed limit sign sizing
Cruise warning sign appears above speed limit sign when cruise set
speed is too far from the speed limit:
- Red (over): cruise >= limit + 10 (if limit >= 50) or + 5 (if < 50)
- Green (under): cruise <= limit - 5
- Only when cruise active (not paused/disabled) and limit >= 20
- Nightrider mode: colored text/border on black background

Speed limit sign enlarged 5%. 20px gap between signs. Bench mode
gains cruiseactive command (0=disabled, 1=active, 2=paused).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 03:33:27 +00:00
brianhansonxyz 8ccdb47c88 feat: speed limit sign UI and speed processing pipeline
New speed_logic.py module converts raw CAN speed limit and GPS speed
into display-ready params. Called from controlsd (live) and
bench_onroad (bench) at ~2Hz. UI reads params to render:
- Current speed (top center, hidden when 0 or no GPS)
- MUTCD speed limit sign (lower-left, normal + nightrider styles)
- Unit-aware display (mph/kph based on CAN DISTANCE_UNIT)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 03:15:07 +00:00
brianhansonxyz b84c268b6e fix: standstill fan clamp 0-30% and bench mode always 30-100%
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
release / build master-ci (push) Has been cancelled
stale / stale (push) Has been cancelled
Standstill quiet mode was clamped to 0-10% which is dangerously low
under sustained load. Raised to 0-30%. Bench mode now forces 30-100%
onroad fan range regardless of standstill to prevent overheating
during bench testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 03:09:41 +00:00
brianhansonxyz ffa9da2f97 add root SSH config for git.hanson.xyz to on_start.sh
Ensures git push works without GIT_SSH_COMMAND override. Idempotent —
skips if Host entry already exists in /root/.ssh/config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 03:00:55 +00:00
brianhansonxyz 5e7911e599 move SSH key decryption from provision.sh to on_start.sh
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
Keys now install to /root/.ssh/ (for root git operations) instead of
/data/ssh/.ssh/. Runs every boot via on_start.sh so keys are available
even without a full provision cycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:59:34 +00:00
brianhansonxyz 6fcd4b37ba fix: use wrapper script for claude instead of PATH modification
Create /usr/local/bin/claude wrapper that remounts / read-write before
calling the real binary. Removes PATH append to ~/.bashrc which was
duplicating on every provision run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:57:40 +00:00
brianhansonxyz 8d5903b945 fix: pass log_path to athenad launcher to fix crash loop
manage_athenad was calling launcher() with only 2 args but the
per-process logging changes added a required log_path parameter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:48:47 +00:00
brianhansonxyz cea8926604 fix: correct git remote repo name to clearpilot.git
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:44:55 +00:00
brianhansonxyz e98ae2f9d1 fix git remote: use SSH URL, add remote fixup step to provision.sh
Provision script now checks and corrects the git origin URL to the
SSH remote before fetching updates. Also fixed CLAUDE.md to reflect
the correct hostname (git.hanson.xyz, not git.internal.hanson.xyz).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:44:11 +00:00
brianhansonxyz 531b3edcd2 fix: decrypt SSH keys to tmpdir instead of repo, gitignore ed25519 keys
The decrypt step in provision.sh was writing decrypted private keys
directly into the source tree (system/clearpilot/dev/), leaving them
as untracked files in the repo. Now decrypts to a mktemp dir, copies
to the SSH dir, and cleans up. Also added ed25519 key paths to
.gitignore to match the existing id_rsa entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:42:52 +00:00
brianhansonxyz f46339c949 switch SSH keys to ed25519, encrypt with hardware serial instead of DongleId
prebuilt / build prebuilt (push) Has been cancelled
badges / create badges (push) Has been cancelled
- Generate new ed25519 keypair (replaces old RSA keys)
- Encrypt with device serial from /proc/cmdline (always available, no manager needed)
- Update decrypt/encrypt tools and provision.sh to use serial
- Remove dependency on DongleId param for SSH key provisioning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:32:51 +00:00
brianhansonxyz 21f71cbc37 Merge branch 'clearpilot' of https://git.hanson.xyz/brianhansonxyz/clearpilot into clearpilot 2026-04-15 01:26:46 +00:00
brianhansonxyz 9d5c838fe3 fix fresh-install build: add preflight script, restore third_party binaries, remove WebEngine dep
- Add build_preflight.sh to create obj/ dirs that git can't track (body/board/obj, panda/board/obj)
- Wire preflight into build_only.sh and launch_chffrplus.sh
- Restore missing third_party binaries (libyuv, snpe, acados, maplibre) that were text pointers
- Remove dead Qt5WebEngineWidgets dependency from UI SConscript

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:26:40 +00:00
brianhansonxyz 4283a3d3f7 provision: add ccrypt, nodejs, claude, ssh identity keys, fix scons obj dirs
stale / stale (push) Has been cancelled
- Install ccrypt, nodejs 18, npm, claude code in provision
- Decrypt id_rsa/id_rsa.pub via dongle ID and install to /data/ssh/.ssh/
- Run provision directly instead of through qt_shell wrapper
- Fix panda and body SConscripts to mkdir obj/ before writing gitversion.h
- Add sudo to su - comma build call
- Remount / rw at top of provision

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 20:18:36 -05:00