Files
clearpilot/CLAUDE.md
T
brianhansonxyz f6516eb4cc CLAUDE.md: bring up to date with current state
Reorganized 'Current State' to enumerate what's actually active in the
tree now (boot/build infra, suppressed logging, gpsd, UI port, display
modes, speed_logicd, sound, dashcamd) instead of the brief summary of
just the early commits.

Trimmed 'Pending' to remove items that are now ported (UI, gpsd, sound,
speed_logic, display modes, dashcamd, memory-param scaffolding). Kept
the still-pending items: HDA2 driving-behavior tweaks, CarSpeedLimit
publisher, telemetry, bench mode, power/thermal, session logging.

Notes added:
- Memory-param keys currently registered (so the next port doesn't
  re-register existing keys by accident)
- camerad gating change to always_run (and the power tradeoff)
- Branch policy: clearpilot is the canonical/main branch; do not
  abandon it for another
- pkill _text by comm in the launch chain
- speed-limit widget shows 0 until carstate.py CAN-FD writes
  CarSpeedLimit
2026-05-03 23:19:41 -05:00

267 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# ClearPilot — CLAUDE.md
## Project Overview
ClearPilot is a custom fork of **FrogPilot** (itself a fork of comma.ai's openpilot), purpose-built for Brian Hanson's **Hyundai Tucson** (HDA2 equipped). The vehicle's HDA2 system has specific quirks around how it synchronizes driving state with openpilot that require careful handling.
The fork was previously in a state where many features were layered on top but the driving model behavior had regressed in ways that couldn't be traced. On **2026-05-03** the working tree was reset back to a known-clean baseline so features can be re-introduced one at a time with proper testing. Most non-driving-math features have since been ported back.
## Current State
`build_only.sh` succeeds and `launch_openpilot.sh` boots the manager. The following customizations are active in this tree:
**Boot / build / device infra**
- launch chain (`launch_openpilot.sh`, `launch_chffrplus.sh`) with stale-process kill (including `_text` by comm), `on_start.sh` SSH/WiFi setup, OpenVPN auto-connect (`vpn-monitor.sh` + `vpn.ovpn`), `nice-monitor.sh`, `build_only.sh` + `build_preflight.sh`
- DongleId-keyed dev SSH identity (`system/clearpilot/dev/id_ed25519.{cpt,pub.cpt}` + `tools/{encrypt,decrypt}`)
- Custom startup logo (`bg.jpg`) + custom Qt spinner (`selfdrive/ui/qt/spinner{,.cc,.h}`) — `build.py` atomically swaps the prebuilt spinner binary after each successful build
- `BUILD_ONLY=1` env path in `build.py` spawns the failure TextWindow fully detached (own session, /dev/null stdio) so `build_only.sh` exits with a non-blocking error window
**Logging suppressed**
- `loggerd`, `encoderd`, `stream_encoderd` disabled in `process_config.py` — no `rlog`/`qlog` segments or `.hevc` files written to `/data/media/0/realdata/`
- `save_bootlog()` skipped in `manager_init()` — no boot logs
**GPS (Quectel modem replacement)**
- `system/clearpilot/gpsd.py` polls Quectel EC25 modem via `mmcli` AT commands at 2 Hz, publishes `gpsLocation`, sets system clock on first fix
- NOAA solar-position calc → `IsDaylight` memory param + auto `ScreenDisplayMode` 0↔1 switch (sunrise/sunset)
- **`locationd` patched to NOT subscribe to GPS** — `liveLocationKalman.gpsOK` stays false permanently. Self-driving sees GPS as not-present; downstream consumers (controlsd, paramsd, torqued, frogpilot_planner) handle that case naturally. GPS data is still published and used by speed_logicd / UI / dashcamd, just kept out of the kalman filter.
**UI (full C++ port)**
- New ready/splash screen rendered by Qt directly; shown when started but gear=park
- ClearPilot offroad menu (Home / Dashcam / Debug panels) replacing the stock home — Status window with live system stats (temp, fan, storage, RAM, WiFi, VPN, GPS, telemetry, dashcam)
- Onroad widgets: speed (from `gpsLocation` via `speed_logicd`), speed-limit, cruise-vs-limit warning sign
- Nightrider mode: camera suppressed, lane lines/path drawn as 2px outlines (`ScreenDisplayMode` 1 or 4)
- Display power: `ScreenDisplayMode` 3 → screen off; ignition off blanks immediately
- Qt RPC widget-tree dump server at `ipc:///tmp/clearpilot_ui_rpc`
- Crash handler in `main.cc` with stack-trace dump for SIGSEGV/SIGABRT
- `screenDisplayMode` is a member of `AnnotatedCameraWidget`, accessed from `OnroadWindow::updateState` as `nvg->screenDisplayMode`
- QtWebEngine dependency removed entirely
**Display modes (LFA debug button)**
- 5-state `ScreenDisplayMode` machine in `controlsd.clearpilot_state_control()`:
- Drive: 0→4, 1→2, 2→3, 3→4, 4→2 (button never goes back to auto)
- Not drive: anything except 3 → 3 (off), 3 → 0 (auto)
- Auto-wake from screen-off (mode 3 → 0) on park→drive edge
- "Clearpilot Debug Function Executed" alert removed (`clp_debug_notice` and its event registration deleted from `events.py`)
**Speed / cruise**
- `speed_logicd` standalone managed process (`selfdrive/clearpilot/speed_logicd.py` wrapping `speed_logic.SpeedState`) — subscribes to `gpsLocation` + `carState`, ticks at 2 Hz, writes `Clearpilot{Speed,SpeedLimit,CruiseWarning,...}Display` memory params
- Cruise-vs-limit warning thresholds: +10 over (limit ≥ 50), +7 over (26-49), +9 over (≤ 25); -5 under
- On warning transitions / speed-limit change: writes `ClearpilotPlayDing="1"` (consumed by soundd)
- Decoupled from `controlsd` so self-driving timing is unaffected
- **Note**: speed-limit widget shows 0 until `CarSpeedLimit` publisher in `hyundai/carstate.py` CAN-FD decode is ported (not done yet). Speed widget works.
**Sound**
- `soundd.py` mixes a single-shot ding alongside the alert stream (`MAX_VOLUME`, independent of alert volume map). Polls `ClearpilotPlayDing` at ~2 Hz, clears on read. `selfdrive/clearpilot/sounds/ding.wav` is the asset.
**Dashcam**
- `selfdrive/clearpilot/dashcamd` (native C++) — VisionIPC frames from camerad → OMX H.264 hardware encoder → 3-min MP4 segments + SRT GPS subtitle sidecars in `/data/media/0/videos/<trip>/`
- Trip lifecycle (WAITING → RECORDING → IDLE_TIMEOUT). 10-min idle close from drive→park; immediate close on ignition off; graceful close on `DashcamShutdown`
- Publishes `DashcamState` ("waiting"/"recording"/"stopped") and `DashcamFrames` (per-trip count) every 5 s
- `omx_encoder.cc` ports broken's version (adds `encode_frame_nv12()`, NEON-only libyuv paths)
- **`camerad` gating changed to `always_run`** so the dashcam can record the moment ignition+drive arrives without waiting for camera startup
- `system/loggerd/deleter.py` trip-aware cleanup: drops oldest entire trip dir first, falls back to oldest segment within active trip; also enforces 4 GB cap on `/data/log2`
## Where the Old Code Lives
| Location | What it is |
|---|---|
| `/data/openpilot/` | This repo. **Active.** |
| `/data/openpilot-broken-2026-05-03/` | Full snapshot (with `.git`) of the prior modified-but-broken tree. Reference for porting features. |
| `/data/clearpilot-baseline/` | The original baseline source we copied in. Kept for safety; do not modify. |
| `/data/openpilot-features-broken/` | Pre-existing snapshot from an earlier reset attempt — **unverified**, leave alone. |
| Tag | Commit | What |
|---|---|---|
| `pre-reset-2026-05-03` | `f7e602c` | Last commit of the pre-reset broken-but-feature-complete tree. |
| `working-baseline-2` | `b287fd0` | First commit after the reset where the bare baseline built and launched. |
Both tags are pushed to `origin/clearpilot`.
## Pending Feature Port Roadmap
Items still in `/data/openpilot-broken-2026-05-03/` that haven't been ported. Port in small, testable batches.
**Driving behavior (HDA2 specifics)** — these touch self-driving code and warrant extra scrutiny:
- Lateral disabled (car's radar cruise handles steering; openpilot longitudinal only)
- Brief disengage when turn signal is active during lane changes (note: `no_lat_lane_change` infrastructure already wired through `controlsd``paramsMemory` → UI/canfd, but the actual disengage trigger may need work)
- Driver-monitoring timeout adjustments
- Custom driving models — model files (`duck-amigo.thneed`, `farmville.onnx`, `wd-40.thneed`) live in `selfdrive/clearpilot/models/`; selection logic not ported
**Speed-limit publisher**
- `hyundai/carstate.py update_canfd()` writes `CarSpeedLimit` from CAN — until ported, the speed-limit widget shows 0 and cruise warnings never trigger
**Telemetry**
- `selfdrive/clearpilot/telemetry.py` (client) + `telemetryd.py` (collector) — diff-based CSV logger over ZMQ
- Toggleable via `TelemetryEnabled` memory param from Debug panel
- Auto-disabled if `/data` free < 5 GB; auto-disabled on every manager start
- Hyundai CAN-FD data logged from `update_canfd()` groups (car/cruise/speed_limit/buttons)
**Bench mode (UI testing without a car)**
- `--bench` flag → `BENCH_MODE=1` in `launch_openpilot.sh` (already wired) → enables `bench_onroad.py`, blocks real car processes
- `bench_cmd.py` for setting fake vehicle state via params; UI dump RPC wired (`bench_cmd dump`)
- Param keys (`Bench*`, `ClpUiState`) not yet registered
**Power/thermal**
- Standstill power saving: `modeld` and `dmonitoringmodeld` throttled to 1 fps when stopped
- Fan controller uses offroad clamps at standstill
- Park CPU savings + virtual battery shutdown fix
**Session logging**
- `/data/log2/current/` per-process stderr capture; aggregate `session.log` of major events
- Time-resolved log dir rename via GPS/NTP; 30-day rotation
- See `selfdrive/manager/process.py` and `manager.py` changes in the broken tree
- `LogDirInitialized` param not yet registered
## Working Rules
### CRITICAL: Change Control
This is self-driving software. All changes must be deliberate and well-understood.
- **NEVER make changes outside of what is explicitly requested**
- **Always explain proposed changes first** — describe the change, the logic, and the architecture; let Brian review and approve before writing any code
- **Brian is an expert on this software** — do not override his judgment or assume responsibility for changes he doesn't understand
- **Every line must be understood** — work slowly and deliberately
- **Test everything thoroughly** — Brian must always be in the loop
- **Do not refactor, clean up, or "improve" code beyond the specific request**
### Logging
**NEVER use `cloudlog`.** It's comma.ai's cloud telemetry pipeline, not ours — writes go to a publisher that's effectively a black hole for us (and the only thing it could do if ever reachable is bother the upstream FrogPilot developer). Our changes must always use **file logging** instead.
Use `print(..., file=sys.stderr, flush=True)`. Once session-logging is re-ported, manager will redirect each managed process's stderr to `/data/log2/current/{process}.log`. Prefix custom log lines with `CLP ` so they're easy to filter from upstream noise.
```python
import sys
print(f"CLP gpsd: ...", file=sys.stderr, flush=True)
```
Do not use `cloudlog.warning`, `cloudlog.info`, `cloudlog.error`, `cloudlog.event`, or `cloudlog.exception` in any CLEARPILOT-added code. Existing upstream/FrogPilot `cloudlog` calls can stay untouched.
### File Ownership
We operate as `root` on this device, but openpilot runs as the `comma` user (uid=1000, gid=1000). After any code changes that touch multiple files or before testing:
```bash
chown -R comma:comma /data/openpilot
```
### Git
- Remote: `git@git.hanson.xyz:brianhansonxyz/clearpilot.git`
- Branch: `clearpilot` — this is the canonical/main branch and is what fresh device flashes provision from. **Do not abandon it for another branch.**
- Large model files are tracked in git (intentional — this is a backup)
- The `clearpilot` branch was force-pushed on 2026-05-03 as part of the reset; the prior history is reachable via the `pre-reset-2026-05-03` tag.
### Samba Share
- Share name: `openpilot` (e.g. `\\comma-3889765b\openpilot`)
- Path: `/data/openpilot`
- Username: `comma`
- Password: `i-like-to-drive-cars`
- Runs as `comma:comma` via force user/group — files created over SMB are owned correctly
- Enabled at boot (`smbd` + `nmbd`)
### Testing Changes
Use `build_only.sh` to compile, then start the manager separately. Never compile individual targets with scons directly — always use the full build script. Always start the manager after a successful build — don't wait for the user to ask.
```bash
# 1. Fix ownership
chown -R comma:comma /data/openpilot
# 2. Build (kills running manager, removes prebuilt, compiles, exits)
# build_only.sh tees output to /tmp/build.log and propagates the build's
# exit code via PIPESTATUS. On failure: error text window stays on screen
# fully detached; the script exits non-zero and stderr has the compile error.
su - comma -c "bash /data/openpilot/build_only.sh"
# 3. If build succeeded ($? == 0), start openpilot
su - comma -c "bash /data/openpilot/launch_openpilot.sh"
# 4. Inspect logs
ls /data/log2/current/
cat /data/log2/current/session.log
tail /tmp/build.log # last build's output
```
### Adding New Params
The params system uses a C++ whitelist. Adding a new param name without registering it will crash with `UnknownKeyName`. To add one:
1. Register the key in `common/params.cc` (alphabetically, with `PERSISTENT` or `CLEAR_ON_*` flag)
2. For memory params, add a default in `manager_init()`'s memory-params loop (after the persistent default loop)
3. Remove `prebuilt`, `common/params.o`, and `common/libcommon.a` to force rebuild
### Memory Params (paramsMemory)
ClearPilot uses memory params (`/dev/shm/params/d/`) for transient state that should reset on boot. Conventions:
- **Registration**: register in `common/params.cc` as `PERSISTENT` (the registration flag does NOT control which path the param lives at — `/dev/shm` is tmpfs and clears on reboot regardless)
- **C++ access**: `Params{"/dev/shm/params"}` (the Params class appends `/d/` internally — `Params("/dev/shm/params/d")` would resolve to `/dev/shm/params/d/d/`)
- **Python access**: `Params("/dev/shm/params")`
- **UI toggles**: use `ToggleControl` with manual `toggleFlipped` lambda, not `ParamControl` (which only handles persistent params)
- **IMPORTANT — method names differ between C++ and Python**: C++ uses camelCase (`putBool`, `getBool`, `getInt`), Python uses snake_case (`put_bool`, `get_bool`, `get_int`). This is a common source of silent failures.
Currently registered memory-param keys (in `params.cc`): `ScreenDisplayMode`, `CPTLkasButtonAction`, `CarIsMetric`, `Clearpilot{CruiseWarning,CruiseWarningSpeed,HasSpeed,IsMetric,PlayDing,ShowHealthMetrics,SpeedDisplay,SpeedLimitDisplay,SpeedUnit}`, `Dashcam{Frames,Shutdown,State}`, `IsDaylight`, `ModelFps`, `ModelStandby`, `ModelStandbyTs`, `ScreenRecorderDebug`, `ShutdownTouchReset`, `TelemetryEnabled`, `VpnEnabled`, `no_lat_lane_change`. Bench-mode and `LogDirInitialized` keys not registered yet.
### Changing a Service's Publish Rate
SubMaster's `freq_ok` check requires observed rate to fall within `[0.8 × min_freq, 1.2 × max_freq]` of the value declared in `cereal/services.py`. Publishing *faster* than declared trips `commIssue` just as surely as too slow. If you change how often a process publishes, update the rate in `cereal/services.py` to match.
## Device: comma 3x
- Qualcomm Snapdragon SoC (aarch64), serial `comma-3889765b`
- Storage: WDC SDINDDH4-128G, 128 GB UFS 2.1
- Ubuntu 20.04.6 LTS on AGNOS 9.7
- Kernel 4.9.103+ (custom comma.ai PREEMPT build, vendor-patched Qualcomm)
- Python 3.11.4 via pyenv at `/usr/local/pyenv/versions/3.11.4/` (system python 3.8 — do not use)
- Display: Weston (Wayland) on tty1
- Hardware encoding: OMX (`OMX.qcom.video.encoder.avc` / `.hevc`); V4L2 VIDC exists but is not usable from ffmpeg subprocess
### Filesystem mount quirks
| Mount | Device | Type | Notes |
|---|---|---|---|
| `/` | /dev/sda7 | ext4 | rw |
| `/data` | /dev/sda12 | ext4 | **persistent** — openpilot lives here |
| `/home` | overlay | overlayfs | **volatile** (upper on tmpfs) — changes lost on reboot |
| `/tmp` | tmpfs | tmpfs | volatile |
| `/persist` | /dev/sda2 | ext4 | persistent config/certs, noexec |
| `/dsp` | /dev/sde26 | ext4 | **read-only** Qualcomm DSP firmware |
| `/firmware` | /dev/sde4 | vfat | **read-only** firmware blobs |
### GPS
The device has **no u-blox chip** (`/dev/ttyHS0` does not exist). GPS is the **Quectel EC25 LTE modem**'s built-in GPS, accessed via AT commands through `mmcli`. The original `qcomgpsd` is broken on this device because the diag interface hangs after setup. `system/clearpilot/gpsd.py` is the replacement; locationd is patched not to consume `gpsLocation` so the kalman filter isn't tainted.
### Sidebar / Process Notes
- `camerad` runs `always_run` (was `driverview`) — this is needed for dashcamd. Costs steady-state camera power even when ignition is off.
- `loggerd`, `encoderd`, `stream_encoderd` are commented out in `process_config.py` — no segment or video logging
- `qcomgpsd` is commented out; `system.clearpilot.gpsd` is registered in its place (gated by `qcomgps` callback = `not ublox_available()`, which is always true on this device)
- `speed_logicd` is `only_onroad`; `dashcamd` is `always_run` (manages own trip lifecycle)
- `uploader` already commented out in baseline
## Boot Sequence
```
Power On
→ systemd: comma.service (runs as comma user)
→ /usr/comma/comma.sh (waits for Weston, handles factory reset)
→ /data/continue.sh
→ /data/openpilot/launch_openpilot.sh
→ kill stale instances (launch_openpilot, launch_chffrplus, manager.py, ./ui, _text by comm)
→ bash system/clearpilot/on_start.sh (SSH, WiFi, run provision.sh)
→ background system/clearpilot/vpn-monitor.sh
→ background system/clearpilot/nice-monitor.sh
→ exec ./launch_chffrplus.sh
→ source launch_env.sh
→ run agnos_init
→ set PYTHONPATH
→ if no `prebuilt`: run build.py (spinner + scons)
→ exec selfdrive/manager/manager.py
→ manager_init() sets default params (persistent + memory)
→ ensure_running() loop starts managed processes
```