add CLAUDE.md documenting current state, snapshot locations, and port roadmap

Records: where the broken-but-feature-complete tree lives
(/data/openpilot-broken-2026-05-03 + tag pre-reset-2026-05-03), where
the baseline source still sits (/data/clearpilot-baseline), the
working-baseline-2 tag, and a categorized roadmap of features in the
broken tree that still need to be ported (driving behavior, onroad UI,
speed/cruise logic, dashcamd, gpsd, telemetry, bench mode, power/
thermal, memory params, session logging).

Carries forward the operational guardrails from the previous CLAUDE.md
(no cloudlog, chown to comma, samba/git/test workflow, params/cereal
gotchas, device specifics) so future sessions don't re-learn them.
2026-05-03 21:58:42 -05:00
parent b287fd094e
commit b57f2d8d70
+241
@@ -0,0 +1,241 @@
# ClearPilot — CLAUDE.md
## Project Overview
ClearPilot is a custom fork of **FrogPilot** (itself a fork of comma.ai's openpilot), purpose-built for Brian Hanson's **Hyundai Tucson** (HDA2-equipped). The vehicle's HDA2 system has specific quirks in how it synchronizes driving state with openpilot, and these require careful handling.
The fork was previously in a state where many features were layered on top but the driving model behavior had regressed in ways that couldn't be traced. On **2026-05-03** the working tree was reset back to a known-clean baseline so features can be re-introduced one at a time with proper testing.
## Current State (post-reset)
This tree currently contains:
- **The pristine pre-modification baseline** (commit `c2ab0fa`, "reset to pre-modification baseline").
- **Startup-chain customizations** (commit `5624898`): launch scripts, on_start/provision flow, OpenVPN auto-connect, nice-monitor, build helpers, custom logo + spinner, encrypted dev SSH keys.
- **Build/launch fixes** (commit `b287fd0`, tagged `working-baseline-2`): the baseline now compiles cleanly (QtWebEngine removed; `screenDisplayMode` reference fixed), and `build_only.sh` exits on failure with a detached error window.
`build_only.sh` succeeds and `launch_openpilot.sh` boots the full manager process set.
## Where the Old Code Lives
| Location | What it is |
|---|---|
| `/data/openpilot/` | This repo. Working baseline + startup port. **Active.** |
| `/data/openpilot-broken-2026-05-03/` | Full snapshot (with `.git`) of the prior modified-but-broken tree. Reference for porting features. |
| `/data/clearpilot-baseline/` | The original baseline source we copied in. Kept for safety; do not modify. |
| `/data/openpilot-features-broken/` | Pre-existing snapshot from an earlier reset attempt; **unverified**, leave alone. |

| Tag | Commit | What |
|---|---|---|
| `pre-reset-2026-05-03` | `f7e602c` | Last commit of the broken-but-feature-complete tree. |
| `working-baseline-2` | `b287fd0` | Current head after the reset + startup port + compile fixes. |
Both tags are pushed to `origin/clearpilot`.
## Pending Feature Port Roadmap
Everything below is present in `/data/openpilot-broken-2026-05-03/` and **not** in this tree. Port them in small, testable batches — one feature area per commit, build + launch test in between.
**Driving behavior (HDA2 specifics):**
- Lateral disabled (car's radar cruise handles steering; openpilot longitudinal only)
- Brief disengage when turn signal is active during lane changes
- Driver-monitoring timeout adjustments
- Custom driving models in `selfdrive/clearpilot/models/`: `duck-amigo.thneed`, `farmville.onnx`, `wd-40.thneed`
**Onroad UI:**
- Onroad layout changes (number positions, sidebar hidden during drive)
- New ready/splash screen and home/offroad menu (the "ClearPilot menu" — sidebar settings panel replacing stock home; General / Network / Dashcam / Debug panels)
- Status window with live system stats (temp, fan, storage, RAM, WiFi, VPN, GPS, telemetry, dashcam)
- Crash handler in UI with stack-trace dump for SIGSEGV/SIGABRT
- Display modes via the LFA steering-wheel button (`ScreenDisplayMode`: auto-normal, nightrider, manual normal, screen off, manual nightrider) — including auto day/night switching driven by GPS sunrise/sunset
**Speed / cruise logic:**
- `selfdrive/clearpilot/speed_logic.py` — speed-limit display, cruise-vs-limit warning signs (different thresholds above/below 50 mph), ding sound on warning transitions
- New params: `ClearpilotSpeedDisplay`, `ClearpilotSpeedLimitDisplay`, `ClearpilotCruiseWarning`, `ClearpilotPlayDing`, etc.
**Dashcam:**
- `selfdrive/clearpilot/dashcamd` (native C++) — VisionIPC frames → OMX H.264 hardware encoder, 3-min MP4 segments + SRT GPS subtitles in `/data/media/0/videos/`
- Trip lifecycle (waits for time + GPS + drive gear; closes on park/ignition off)
- `system/loggerd/deleter.py` trip-aware storage rotation
- Disables `encoderd` / `stream_encoderd`; reuses upstream `omx_encoder.cc`
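The SRT side of the dashcam output can be illustrated with a short Python sketch. SRT is a plain-text format: a cue number, a `start --> end` time range with millisecond precision, then the cue text. `dashcamd` itself is native C++; the function names and the cue text layout (lat, lon, speed) below are assumptions for illustration, not what the broken tree writes.

```python
def srt_timestamp(seconds: float) -> str:
    """Format an offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start_s: float, end_s: float,
            lat: float, lon: float, speed_mph: float) -> str:
    """Build one SRT cue. The text line layout is hypothetical."""
    text = f"{lat:.6f}, {lon:.6f}  {speed_mph:.0f} mph"
    return f"{index}\n{srt_timestamp(start_s)} --> {srt_timestamp(end_s)}\n{text}\n"
```

With one GPS fix per second, a 3-minute segment would hold roughly 180 such cues, each separated by a blank line in the `.srt` file.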
**GPS:**
- `system/clearpilot/gpsd.py` — replacement for broken `qcomgpsd` diag interface; polls Quectel modem via `mmcli` AT commands at 1Hz, publishes `gpsLocation`
- NOAA solar-position calc for sunrise/sunset (drives display auto day/night)
**Telemetry:**
- `selfdrive/clearpilot/telemetry.py` (client) + `telemetryd.py` (collector) — diff-based CSV logger over ZMQ
- Toggleable via `TelemetryEnabled` memory param from Debug panel
- Auto-disabled if `/data` free < 5 GB; auto-disabled on every manager start
- Hyundai CAN-FD data logged from `selfdrive/car/hyundai/carstate.py update_canfd()`
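As a rough illustration of what "diff-based" means here, the sketch below writes a CSV row only for fields whose value changed since the previous call. The class name, the `timestamp,key,value` column layout, and the in-process design are assumptions; the ZMQ transport between `telemetry.py` and `telemetryd.py` is left out entirely.

```python
import csv
import time
from typing import Optional


class DiffCsvLogger:
    """Write a CSV row only for fields whose value changed since the last call."""

    def __init__(self, stream):
        self._writer = csv.writer(stream)
        self._last = {}  # last value seen per key
        self._writer.writerow(["timestamp", "key", "value"])

    def log(self, values: dict, now: Optional[float] = None) -> int:
        """Record changed fields; return how many rows were written."""
        ts = time.time() if now is None else now
        written = 0
        for key, value in values.items():
            if key not in self._last or self._last[key] != value:
                self._writer.writerow([ts, key, value])
                self._last[key] = value
                written += 1
        return written
```

Calling `log()` at a fixed rate with mostly static values produces very few rows, which is the point of logging diffs rather than full snapshots.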
**Bench mode (UI testing without a car):**
- `--bench` flag → `BENCH_MODE=1` → enables `bench_onroad.py`, blocks real car processes
- `bench_cmd.py` for setting fake vehicle state via params
- UI introspection RPC at `ipc:///tmp/clearpilot_ui_rpc` (widget-tree dump)
**Power/thermal:**
- Standstill power saving: `modeld` and `dmonitoringmodeld` throttled to 1fps when stopped
- Fan controller uses offroad clamps at standstill
- Park CPU savings + virtual battery shutdown fix
**Memory params (`/dev/shm/params`):**
Lots of new keys for runtime UI state — `TelemetryEnabled`, `VpnEnabled`, `ModelStandby`, `ScreenDisplayMode`, `DashcamState`, `DashcamFrames`, `DashcamShutdown`, `LogDirInitialized`, plus the speed/cruise display set.
**Session logging:**
- `/data/log2/current/` per-process stderr capture; aggregate `session.log` of major events
- Time-resolved log dir rename via GPS/NTP; 30-day rotation
- See `selfdrive/manager/process.py` and `manager.py` changes
**Already in this tree (just listing for reference, do NOT re-port):**
- `system/clearpilot/vpn-monitor.sh` + `vpn.ovpn` — OpenVPN auto-connect to `vpn.hanson.xyz`
- `system/clearpilot/nice-monitor.sh`
- `system/clearpilot/provision.sh` (apt installs, Claude Code installer, git remote fix, fast-forward)
- `system/clearpilot/on_start.sh` (SSH keys, ssh.service, git.hanson.xyz Host config, WiFi radio on)
- `system/clearpilot/dev/id_ed25519.{cpt,pub.cpt}` (DongleId-keyed)
- `system/clearpilot/startup_logo/bg.jpg` + scripts
- `selfdrive/ui/qt/spinner` + `spinner.cc/.h` (custom logo)
- `build_only.sh`, `build_preflight.sh`
## Working Rules
### CRITICAL: Change Control
This is self-driving software. All changes must be deliberate and well-understood.
- **NEVER make changes outside of what is explicitly requested**
- **Always explain proposed changes first** — describe the change, the logic, and the architecture; let Brian review and approve before writing any code
- **Brian is an expert on this software** — do not override his judgment or assume responsibility for changes he doesn't understand
- **Every line must be understood** — work slowly and deliberately
- **Test everything thoroughly** — Brian must always be in the loop
- **Do not refactor, clean up, or "improve" code beyond the specific request**
### Logging
**NEVER use `cloudlog`.** It's comma.ai's cloud telemetry pipeline, not ours — writes go to a publisher that's effectively a black hole for us (and the only thing it could do if ever reachable is bother the upstream FrogPilot developer). Our changes must always use **file logging** instead.
Use `print(..., file=sys.stderr, flush=True)`. `manager.py` redirects each managed process's stderr to `/data/log2/current/{process}.log` (once that feature is re-ported), so these lines land in the per-process log we already grep. Prefix custom log lines with `CLP ` so they're easy to filter out from upstream noise.
Example:
```python
import sys
print(f"CLP frogpilotPlan valid=False: carState_freq_ok={sm.freq_ok['carState']}", file=sys.stderr, flush=True)
```
Do not use `cloudlog.warning`, `cloudlog.info`, `cloudlog.error`, `cloudlog.event`, or `cloudlog.exception` in any CLEARPILOT-added code. Existing upstream/FrogPilot `cloudlog` calls can stay untouched.
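If the same pattern is needed in many places, a tiny wrapper keeps the `CLP ` prefix and the flush consistent. This is a minimal sketch; `clp_log` is a hypothetical helper name, not something that exists in the tree.

```python
import sys


def clp_log(message: str, stream=None) -> None:
    """Write one 'CLP '-prefixed line to stderr (or the given stream) and flush."""
    out = sys.stderr if stream is None else stream
    print(f"CLP {message}", file=out, flush=True)
```

Once the per-process log capture is re-ported, lines written this way should be findable with something like `grep '^CLP ' /data/log2/current/*.log`.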
### File Ownership
We operate as `root` on this device, but openpilot runs as the `comma` user (uid=1000, gid=1000). Restore ownership after any code change that touches multiple files, or before testing:
```bash
chown -R comma:comma /data/openpilot
```
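To check whether the `chown` is actually needed, a sketch like the following lists paths whose owner or group is not `comma`. The function name is hypothetical; the uid/gid defaults are the values stated above.

```python
import os
from typing import List


def files_not_owned_by(root: str, uid: int = 1000, gid: int = 1000) -> List[str]:
    """Return paths under root whose owner or group differs from uid/gid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat reports the link itself, not its target
            if st.st_uid != uid or st.st_gid != gid:
                mismatched.append(path)
    return mismatched
```

An empty result for `files_not_owned_by("/data/openpilot")` would suggest ownership is already correct. The root directory itself is not checked by this sketch.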
### Git
- Remote: `git@git.hanson.xyz:brianhansonxyz/clearpilot.git`
- Branch: `clearpilot`
- Large model files are tracked in git (intentional — this is a backup)
- The `clearpilot` branch was force-pushed on 2026-05-03 as part of the reset; the prior history is reachable via the `pre-reset-2026-05-03` tag.
### Samba Share
- Share name: `openpilot` (e.g. `\\comma-3889765b\openpilot`)
- Path: `/data/openpilot`
- Username: `comma`
- Password: `i-like-to-drive-cars`
- Runs as `comma:comma` via force user/group — files created over SMB are owned correctly
- Enabled at boot (`smbd` + `nmbd`)
### Testing Changes
Use `build_only.sh` to compile, then start the manager separately. Never compile individual targets with scons directly — always use the full build script. Always start the manager after a successful build — don't wait for the user to ask.
```bash
# 1. Fix ownership
chown -R comma:comma /data/openpilot
# 2. Build (kills running manager, removes prebuilt, compiles, exits)
# build_only.sh tees output to /tmp/build.log and propagates the build's
# exit code via PIPESTATUS. On failure: error text window stays on screen
# fully detached; the script exits non-zero and stderr has the compile error.
su - comma -c "bash /data/openpilot/build_only.sh"
# 3. If build succeeded ($? == 0), start openpilot
su - comma -c "bash /data/openpilot/launch_openpilot.sh"
# 4. Inspect logs
ls /data/log2/current/
cat /data/log2/current/session.log
```
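The "launch only if the build succeeded" rule from steps 2 and 3 can be made explicit in a small driver. This is a sketch under assumptions: `build_then_launch` and its injectable `run` argument are hypothetical, the commands are copied from the steps above, and the ownership fix from step 1 is not included.

```python
import subprocess
from typing import Callable, Sequence

BUILD_CMD = ["su", "-", "comma", "-c", "bash /data/openpilot/build_only.sh"]
LAUNCH_CMD = ["su", "-", "comma", "-c", "bash /data/openpilot/launch_openpilot.sh"]


def build_then_launch(run: Callable[[Sequence[str]], int] = subprocess.call) -> int:
    """Run the build; launch openpilot only if the build exit code is 0."""
    build_rc = run(BUILD_CMD)
    if build_rc != 0:
        return build_rc  # failed build: do not launch, inspect /tmp/build.log
    return run(LAUNCH_CMD)
```

Because the launch script eventually execs into the manager, the second call is expected to block for as long as openpilot runs.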
### Adding New Params
The params system uses a C++ whitelist. Adding a new param name without registering it will crash with `UnknownKeyName`. To add one:
1. Register the key in `common/params.cc` (in alphabetical order, with a `PERSISTENT` or `CLEAR_ON_*` flag)
2. Set the default in `selfdrive/manager/manager.py` `manager_init()`
3. Remove `prebuilt`, `common/params.o`, and `common/libcommon.a` to force rebuild
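Step 3 is easy to get partly wrong by hand, so it may help to script it. The sketch below removes the three artifacts named above if they exist; the function name and the `repo_root` default are assumptions.

```python
from pathlib import Path
from typing import List

# Artifacts listed in step 3, relative to the repo root.
STALE = ["prebuilt", "common/params.o", "common/libcommon.a"]


def force_params_rebuild(repo_root: str = "/data/openpilot") -> List[str]:
    """Delete the stale artifacts that exist; return the ones removed."""
    removed = []
    for rel in STALE:
        path = Path(repo_root) / rel
        if path.is_file():
            path.unlink()
            removed.append(rel)
    return removed
```

The return value shows which files were actually present, which is a quick way to confirm that the next `build_only.sh` run will recompile the params code.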
### Memory Params (paramsMemory)
Once re-ported, ClearPilot will use memory params (`/dev/shm/params/d/`) for UI toggles that should reset on boot. Conventions:
- **Registration**: register in `common/params.cc` as `PERSISTENT` (the registration flag does NOT control which path the param lives at)
- **C++ access**: `Params{"/dev/shm/params"}` — the Params class appends `/d/` internally, so `Params("/dev/shm/params/d")` would resolve to `/dev/shm/params/d/d/`
- **Python access**: `Params("/dev/shm/params")`
- **UI toggles**: use `ToggleControl` with a manual `toggleFlipped` lambda, not `ParamControl` (which only handles persistent params)
- **IMPORTANT — method names differ between C++ and Python**: C++ uses camelCase (`putBool`, `getBool`, `getInt`), Python uses snake_case (`put_bool`, `get_bool`, `get_int`). This is a common source of silent failures.
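The path rule in the access bullets can be modeled in a few lines. This is only an illustration of the stated behavior (keys live under `<base>/d/<key>`), not the real `Params` implementation; `params_key_path` is a hypothetical name.

```python
import posixpath


def params_key_path(base: str, key: str) -> str:
    """Model of the stated rule: the class appends 'd' to the base path."""
    return posixpath.join(base, "d", key)


# Correct base: resolves to the intended memory-params directory.
good = params_key_path("/dev/shm/params", "TelemetryEnabled")

# Wrong base: passing the 'd' directory yourself doubles it.
bad = params_key_path("/dev/shm/params/d", "TelemetryEnabled")
```

Here `good` is `/dev/shm/params/d/TelemetryEnabled` and `bad` is `/dev/shm/params/d/d/TelemetryEnabled`, which is the mistake the C++ bullet warns about.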
### Changing a Service's Publish Rate
SubMaster's `freq_ok` check requires the observed rate to fall within `[0.8 × min_freq, 1.2 × max_freq]`, where both bounds come from the rate declared in `cereal/services.py`. Publishing *faster* than declared trips `commIssue` just as surely as publishing too slowly. If you change how often a process publishes, update the rate in `cereal/services.py` to match.
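The band check itself is simple arithmetic. The sketch below encodes the rule exactly as stated here; how SubMaster derives `min_freq` and `max_freq` from the declared rate is not covered, and `rate_in_band` is a hypothetical name.

```python
def rate_in_band(observed_hz: float, min_freq: float, max_freq: float) -> bool:
    """True if observed_hz lies within [0.8 * min_freq, 1.2 * max_freq]."""
    return 0.8 * min_freq <= observed_hz <= 1.2 * max_freq
```

For a service declared at 20 Hz with `min_freq == max_freq == 20`, the accepted band is 16 to 24 Hz, so publishing at 25 Hz fails the check just as publishing at 15 Hz does.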
## Device: comma 3x
- Qualcomm Snapdragon SoC (aarch64), serial `comma-3889765b`
- Storage: WDC SDINDDH4-128G, 128 GB UFS 2.1
- Ubuntu 20.04.6 LTS on AGNOS 9.7
- Kernel 4.9.103+ (custom comma.ai PREEMPT build, vendor-patched Qualcomm)
- Python 3.11.4 via pyenv at `/usr/local/pyenv/versions/3.11.4/` (system python 3.8 — do not use)
- Display: Weston (Wayland) on tty1
- Hardware encoding: OMX (`OMX.qcom.video.encoder.avc` / `.hevc`); V4L2 VIDC exists but is not usable from ffmpeg subprocess
### Filesystem mount quirks
| Mount | Device | Type | Notes |
|---|---|---|---|
| `/` | /dev/sda7 | ext4 | rw |
| `/data` | /dev/sda12 | ext4 | **persistent** — openpilot lives here |
| `/home` | overlay | overlayfs | **volatile** (upper on tmpfs) — changes lost on reboot |
| `/tmp` | tmpfs | tmpfs | volatile |
| `/persist` | /dev/sda2 | ext4 | persistent config/certs, noexec |
| `/dsp` | /dev/sde26 | ext4 | **read-only** Qualcomm DSP firmware |
| `/firmware` | /dev/sde4 | vfat | **read-only** firmware blobs |
### GPS
The device has **no u-blox chip** (`/dev/ttyHS0` does not exist). GPS is the **Quectel EC25 LTE modem**'s built-in GPS, accessed via AT commands through `mmcli`. The original `qcomgpsd` is broken on this device because the diag interface hangs after setup. Once re-ported, `system/clearpilot/gpsd.py` replaces it.
## Boot Sequence
```
Power On
→ systemd: comma.service (runs as comma user)
→ /usr/comma/comma.sh (waits for Weston, handles factory reset)
→ /data/continue.sh
→ /data/openpilot/launch_openpilot.sh
→ kill stale instances (launch_openpilot, launch_chffrplus, manager.py, ./ui, selfdrive/ui/text)
→ bash system/clearpilot/on_start.sh (SSH, WiFi, run provision.sh)
→ background system/clearpilot/vpn-monitor.sh
→ background system/clearpilot/nice-monitor.sh
→ exec ./launch_chffrplus.sh
→ source launch_env.sh
→ run agnos_init
→ set PYTHONPATH
→ if no `prebuilt`: run build.py (spinner + scons)
→ exec selfdrive/manager/manager.py
→ manager_init() sets default params
→ ensure_running() loop starts managed processes
```