clearpilot/CLAUDE.md
Brian Hanson 3edc1972c8 crash handler, model guard, status temp/fan, timeout fix
- SIGSEGV/SIGABRT crash handler in ui/main.cc prints stack trace to stderr
- Fixed onroad crash: guard update_model() against empty model position data
  (was dereferencing end()-1 on empty list when modeld not running in bench)
- Status window: added device temperature and fan speed
- Interactive timeout returns to splash/onroad (not ClearPilotPanel)
- bench_cmd dump detects crash loops via UI process uptime check
- bench_cmd wait_ready timeout increased to 20s
- Restored camerad to bench ignore list (not needed for UI testing)
- Updated CLAUDE.md with crash debugging procedures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 02:46:08 +00:00


# ClearPilot — CLAUDE.md
## Project Overview
ClearPilot is a custom fork of **FrogPilot** (itself a fork of comma.ai's openpilot), based on a 2024 release. It is purpose-built for Brian Hanson's **Hyundai Tucson** (HDA2 equipped). The vehicle's HDA2 system has specific quirks around how it synchronizes driving state with openpilot that require careful handling.
### Key Customizations in This Fork
- **UI changes** to the onroad driving interface
- **Lane change behavior**: brief disengage when turn signal is active during lane changes
- **Lateral control disabled**: the car's own radar cruise control handles lateral; openpilot handles longitudinal only
- **Driver monitoring timeouts**: modified safety timeouts for the driver monitoring system
- **Custom driving models**: `duck-amigo.thneed`, `farmville.onnx`, `wd-40.thneed` in `selfdrive/clearpilot/models/`
- **ClearPilot service**: Node.js service at `selfdrive/clearpilot/` with behavior scripts for lane change and longitudinal control
- **Native dashcamd**: C++ process capturing raw camera frames via VisionIPC with OMX H.264 hardware encoding
- **Standstill power saving**: model inference throttled to 1fps and fan quieted when car is stopped
- **Clean offroad UI**: grid launcher replacing stock home screen
- **Debug button (LFA)**: steering wheel button repurposed for screen toggle and future UI actions
See `GOALS.md` for feature roadmap.
## Working Rules
### CRITICAL: Change Control
This is self-driving software. All changes must be deliberate and well-understood.
- **NEVER make changes outside of what is explicitly requested**
- **Always explain proposed changes first** — describe the change, the logic, and the architecture; let Brian review and approve before writing any code
- **Brian is an expert on this software** — do not override his judgment or assume responsibility for changes he doesn't understand
- **Every line must be understood** — work slowly and deliberately
- **Test everything thoroughly** — Brian must always be in the loop
- **Do not refactor, clean up, or "improve" code beyond the specific request**
### File Ownership
We operate as `root` on this device, but openpilot runs as the `comma` user (uid=1000, gid=1000). After any code changes that touch multiple files, or before testing, reset ownership:
```bash
chown -R comma:comma /data/openpilot
```
### Git
- Remote: `git@git.internal.hanson.xyz:brianhansonxyz/comma.git`
- Branch: `clearpilot`
- Large model files are tracked in git (intentional — this is a backup)
### Samba Share
- Share name: `openpilot` (e.g. `\\comma-3889765b\openpilot`)
- Path: `/data/openpilot`
- Username: `comma`
- Password: `i-like-to-drive-cars`
- Runs as `comma:comma` via force user/group — files created over SMB are owned correctly
- Enabled at boot (`smbd` + `nmbd`)
### Testing Changes
Always use `build_only.sh` to compile, then start the manager separately. Never compile individual targets with scons directly — always use the full build script. Always use full paths with `su - comma` — the login shell drops into `/home/comma` (volatile tmpfs), not `/data/openpilot`. **Always start the manager after a successful build** — don't wait for the user to ask.
```bash
# 1. Fix ownership (we edit as root, openpilot runs as comma)
chown -R comma:comma /data/openpilot
# 2. Build (kills running manager, removes prebuilt, compiles, exits)
# Shows progress spinner on screen. On failure, shows error on screen
# and prints to stderr. Does NOT start the manager.
su - comma -c "bash /data/openpilot/build_only.sh"
# 3. If build succeeded ($? == 0), start openpilot
su - comma -c "bash /data/openpilot/launch_openpilot.sh"
# 4. Review the aggregate session log for errors
cat /data/log2/$(ls -t /data/log2/ | head -1)/session.log
# 5. Check per-process stderr logs if needed
ls /data/log2/$(ls -t /data/log2/ | head -1)/
```
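Steps 4 and 5 above rely on `ls -t | head -1` to find the newest session directory. A minimal Python sketch of the same lookup follows, assuming "newest" means most recently modified; `latest_session_dir` is a hypothetical helper, not part of the repo.

```python
import os
from typing import Optional


def latest_session_dir(log_root: str = "/data/log2") -> Optional[str]:
    """Return the most recently modified session directory, or None if there is none.

    Hypothetical helper: mirrors `ls -t /data/log2/ | head -1` from the steps above.
    """
    candidates = [
        os.path.join(log_root, name)
        for name in os.listdir(log_root)
        if os.path.isdir(os.path.join(log_root, name))
    ]
    if not candidates:
        return None
    # Newest modification time wins, like `ls -t`.
    return max(candidates, key=os.path.getmtime)
```

The returned path can then be joined with `session.log` or a per-process `{name}.log`.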
### Adding New Params
The params system uses a C++ whitelist. Adding a new param name in `manager.py` alone will crash with `UnknownKeyName`. You must:
1. Register the key in `common/params.cc` (alphabetically, with `PERSISTENT` or `CLEAR_ON_*` flag)
2. Add the default value in `selfdrive/manager/manager.py` in `manager_init()`
3. Remove `prebuilt`, `common/params.o`, and `common/libcommon.a` to force rebuild
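A pre-flight check can catch a missing registration before the manager crashes with `UnknownKeyName`. The sketch below assumes entries in `common/params.cc` look like `{"KeyName", PERSISTENT},` (an assumption about the file layout, not verified here); `registered_keys` and `unregistered` are hypothetical names.

```python
import re

# Assumed entry shape in common/params.cc:  {"KeyName", PERSISTENT},
KEY_PATTERN = re.compile(r'\{\s*"([A-Za-z0-9_]+)"\s*,')


def registered_keys(params_cc_source):
    """Collect every key name that appears in the assumed whitelist format."""
    return set(KEY_PATTERN.findall(params_cc_source))


def unregistered(defaults, params_cc_source):
    """Return names from (name, value) default pairs that are missing from the whitelist."""
    known = registered_keys(params_cc_source)
    return [name for name, _value in defaults if name not in known]
```

Feed it the text of `common/params.cc` and the list of defaults you plan to add in `manager_init()`; any name it returns still needs step 1.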
### Building Native (C++) Processes
- SCons is the build system. Static libraries (`common`, `messaging`, `cereal`, `visionipc`) must be imported as SCons objects, not `-l` flags
- The `--as-needed` linker flag can cause link order issues with static libs — disable it in your SConscript if needed
- OMX encoder object (`omx_encoder.o`) is compiled by the UI build — reference the pre-built `.o` file rather than recompiling (avoids "two environments" scons error)
- `prebuilt` is recreated after every successful build — always remove it before rebuilding
## Bench Mode (Onroad UI Testing)
Bench mode allows testing the onroad UI without a car connected. It runs a fake vehicle simulator (`bench_onroad.py`) as a managed process and disables real car processes (pandad, thermald, controlsd, etc.).
### Usage
**IMPORTANT**: Do NOT use `echo` to write bench params — `echo` appends a newline which causes param parsing to fail silently (e.g. gear stays in park). Always use the `bench_cmd.py` tool.
```bash
# 1. Build first
chown -R comma:comma /data/openpilot
su - comma -c "bash /data/openpilot/build_only.sh"
# 2. Start in bench mode
su - comma -c "bash /data/openpilot/launch_openpilot.sh --bench"
# 3. Wait for UI to be ready (polls RPC every 1s, up to 20s)
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd wait_ready"
# 4. Control vehicle state
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd gear D"
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd speed 20"
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd speedlimit 45"
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd cruise 55"
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd engaged 1"
# 5. Inspect UI widget tree (RPC call, instant response)
su - comma -c "PYTHONPATH=/data/openpilot python3 -m selfdrive.clearpilot.bench_cmd dump"
```
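The `echo` warning above comes down to exact byte comparison: `D\n` is not `D`. A minimal sketch of a newline-free write follows, assuming one file per param key; the `BenchGear` key name and both helpers are hypothetical and are not the `bench_cmd.py` implementation.

```python
import os


def write_param(params_dir, key, value):
    """Write exactly the bytes of `value`, with no trailing newline."""
    with open(os.path.join(params_dir, key), "wb") as f:
        f.write(value.encode("utf-8"))


def read_param(params_dir, key):
    """Read the raw bytes back, as an exact-match parser would see them."""
    with open(os.path.join(params_dir, key), "rb") as f:
        return f.read()


# What `echo D > file` stores versus what a reader comparing against b"D" expects.
ECHO_RESULT = b"D\n"
EXPECTED = b"D"
```

A reader that compares the stored bytes against `EXPECTED` rejects `ECHO_RESULT`, which is why the gear would stay in park.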
### Debugging Crashes
The UI has a SIGSEGV/SIGABRT crash handler (`selfdrive/ui/main.cc`) that prints a stack trace to stderr, captured in the per-process log:
```bash
# Check for crash traces
grep -A 30 "CRASH" /data/log2/$(ls -t /data/log2/ | head -1)/ui.log
# Resolve addresses to source lines
addr2line -e /data/openpilot/selfdrive/ui/ui -f 0xADDRESS
# bench_cmd dump detects crash loops automatically:
# if UI process uptime < 5s, it reports "likely crash looping"
```
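The crash-loop heuristic mentioned in the comments above is a simple uptime comparison. A sketch under that reading follows; how `bench_cmd` actually obtains the UI process start time is not shown in this document, so `process_start` is taken as an input.

```python
import time

CRASH_LOOP_UPTIME_S = 5.0  # threshold quoted above


def likely_crash_looping(process_start, now=None):
    """Return True when the process has been up for less than the threshold.

    `process_start` and `now` are timestamps in seconds on the same clock.
    """
    now = time.time() if now is None else now
    return (now - process_start) < CRASH_LOOP_UPTIME_S
```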
### UI Introspection RPC
The UI process runs a ZMQ REP server at `ipc:///tmp/clearpilot_ui_rpc`. Send `"dump"` to get a recursive widget tree showing class name, visibility, geometry, and stacked layout current indices. This is the primary debugging tool for understanding what the UI is displaying.
- `bench_cmd dump` — queries the RPC and prints the widget tree
- `bench_cmd wait_ready` — polls the RPC every second until `ReadyWindow` is visible (up to 20s)
- `ui_dump.py` — standalone dump tool (same as `bench_cmd dump`)
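The polling behaviour of `wait_ready` can be sketched as a loop with a deadline. This is an illustration only: the dump text format is not specified here, so both the RPC call (`query_dump`) and the readiness check (`is_ready`) are passed in by the caller. In practice `query_dump` would send `"dump"` over a ZMQ REQ socket to the endpoint above.

```python
import time


def wait_ready(query_dump, is_ready, timeout_s=20.0, interval_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll until is_ready(dump) is true or the timeout expires.

    query_dump: callable returning the widget tree dump as text (hypothetical).
    is_ready:   callable deciding whether ReadyWindow is visible in that text.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            dump = query_dump()
        except OSError:
            dump = ""  # RPC not reachable yet; keep polling
        if is_ready(dump):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without waiting in real time.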
### Architecture
- `launch_openpilot.sh --bench` sets `BENCH_MODE=1` env var
- `manager.py` reads `BENCH_MODE`, blocks real car processes, enables `bench_onroad` process
- `bench_onroad.py` publishes fake `deviceState`, `pandaStates`, `carState`, `controlsState` at correct frequencies
- The UI receives these messages identically to real car data
- Blocked processes in bench mode: pandad, thermald, controlsd, radard, plannerd, calibrationd, torqued, paramsd, locationd, sensord, ubloxd, pigeond, dmonitoringmodeld, dmonitoringd, modeld, soundd, camerad, loggerd, micd, dashcamd
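The gating described in the list above can be sketched as a single predicate. This is not the actual `manager.py` logic; `should_run` is a hypothetical name, and the blocked set is copied from the list.

```python
import os

# Copied from the "Blocked processes in bench mode" list above.
BENCH_BLOCKED = {
    "pandad", "thermald", "controlsd", "radard", "plannerd", "calibrationd",
    "torqued", "paramsd", "locationd", "sensord", "ubloxd", "pigeond",
    "dmonitoringmodeld", "dmonitoringd", "modeld", "soundd", "camerad",
    "loggerd", "micd", "dashcamd",
}


def should_run(name, env=None):
    """Decide whether a managed process may start, based on BENCH_MODE."""
    env = os.environ if env is None else env
    bench = env.get("BENCH_MODE") == "1"
    if name == "bench_onroad":
        return bench  # the simulator only exists in bench mode
    return not (bench and name in BENCH_BLOCKED)
```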
### Key Files
| File | Role |
|------|------|
| `selfdrive/clearpilot/bench_onroad.py` | Fake vehicle state publisher |
| `selfdrive/clearpilot/bench_cmd.py` | Command tool for setting bench params and querying UI |
| `selfdrive/clearpilot/ui_dump.py` | Standalone UI widget tree dump |
| `selfdrive/manager/process_config.py` | Registers bench_onroad as managed process (enabled=BENCH_MODE) |
| `selfdrive/manager/manager.py` | Blocks conflicting processes in bench mode |
| `launch_openpilot.sh` | Accepts `--bench` flag, exports BENCH_MODE env var |
| `selfdrive/ui/qt/window.cc` | UI RPC server (`ipc:///tmp/clearpilot_ui_rpc`), widget tree dump |
### Resolved Issues
- **SIGSEGV in onroad view (fixed)**: `update_model()` in `ui.cc` dereferenced empty model position data when modeld wasn't running. Fixed by guarding against empty `plan_position.getX()`. The root cause was found using the crash handler + `addr2line`.
- **`showDriverView` overriding transitions (fixed)**: was forcing `slayout` to onroad/home every frame at 20Hz, overriding park/drive logic. Fixed to only act when not in started state.
- **Sidebar appearing during onroad transition (fixed)**: `MainWindow::closeSettings()` was re-enabling the sidebar. Fixed by not calling `closeSettings` during `offroadTransition`.
## Session Logging
Per-process stderr and an aggregate event log are captured in `/data/log2/{session}/`.
### Log Directory
- Created at manager import time with timestamp: `/data/log2/YYYY-MM-DD-HH-MM-SS/`
- If system clock is invalid (cold boot, no WiFi, RTC stuck at 1970): uses `/data/log2/boot-{monotonic}/`, renamed to real timestamp once GPS/NTP resolves the time
- Session directories older than 30 days are deleted on manager startup
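The naming and rotation rules above can be sketched as follows. Several details are assumptions made for illustration: the year cutoff used to detect an invalid clock, formatting the monotonic value as an integer, and judging age by directory mtime.

```python
import os
import shutil
from datetime import datetime

MAX_AGE_S = 30 * 24 * 3600  # 30 days, per the rule above
MIN_VALID_YEAR = 2024       # assumption: earlier years mean the clock is not set


def session_dir_name(now, monotonic_s):
    """Timestamped name when the clock looks valid, boot-{monotonic} otherwise."""
    if now.year < MIN_VALID_YEAR:
        return f"boot-{int(monotonic_s)}"
    return now.strftime("%Y-%m-%d-%H-%M-%S")


def remove_old_sessions(log_root, now_s, max_age_s=MAX_AGE_S):
    """Delete session directories whose mtime is older than max_age_s; return their names."""
    removed = []
    for name in sorted(os.listdir(log_root)):
        path = os.path.join(log_root, name)
        if os.path.isdir(path) and now_s - os.path.getmtime(path) > max_age_s:
            shutil.rmtree(path)
            removed.append(name)
    return removed
```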
### Per-Process Logs
- Every `PythonProcess` and `NativeProcess` has stderr redirected to `{name}.log` in the session directory
- `DaemonProcess` (athenad) redirects both stdout+stderr (existing behavior)
- Stderr is redirected via `os.dup2` inside the forked child process
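The `os.dup2` redirection works at the file-descriptor level, so it also captures output from native code in the same process. A minimal sketch follows; `redirect_stderr_to` is a hypothetical name and append mode is an assumption.

```python
import os


def redirect_stderr_to(path):
    """Point file descriptor 2 at `path`; intended to run in the forked child."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    os.dup2(fd, 2)  # fd 2 now refers to the log file
    os.close(fd)    # the duplicate on fd 2 keeps the file open
```

Because the call happens after `fork`, the manager's own stderr is unaffected.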
### Aggregate Session Log (`session.log`)
A single `session.log` in each session directory records major events:
- Manager start/stop/crash
- Process starts, deaths (with exit codes), watchdog restarts
- Onroad/offroad transitions
### Key Files
| File | Role |
|------|------|
| `selfdrive/manager/process.py` | Log directory creation, stderr redirection, session_log logger |
| `selfdrive/manager/manager.py` | Log rotation cleanup, session event logging |
| `build_only.sh` | Build-only script (no manager start) |
## Dashcam (dashcamd)
### Architecture
`dashcamd` is a native C++ process that captures raw camera frames directly from `camerad` via VisionIPC and encodes to MP4 using the Qualcomm OMX H.264 hardware encoder. This replaces the earlier FrogPilot screen recorder approach (`QWidget::grab()` -> OMX).
- **Codec**: H.264 AVC (hardware accelerated via `OMX.qcom.video.encoder.avc`)
- **Resolution**: 1928x1208 (full camera resolution, no downscaling)
- **Bitrate**: 4 Mbps
- **Container**: MP4 (remuxed via libavformat)
- **Segment length**: 3 minutes per file
- **Save path**: `/data/media/0/videos/YYYYMMDD-HHMMSS.mp4`
- **Standstill timeout**: suspends recording after 10 minutes of standstill, resumes when car moves
- **Storage**: ~90 MB per 3-minute segment, ~43 hours of footage in 78 GB free space
- **Storage device**: WDC SDINDDH4-128G UFS 2.1 — automotive grade, ~384 TB write endurance, no concern for continuous writes
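The storage figures above follow from the bitrate and segment length. A quick check, assuming decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) and ignoring container overhead:

```python
BITRATE_BPS = 4_000_000   # 4 Mbps
SEGMENT_S = 3 * 60        # 3-minute segments
FREE_BYTES = 78 * 10**9   # 78 GB free


def segment_bytes(bitrate_bps=BITRATE_BPS, seconds=SEGMENT_S):
    """Bytes of video per segment at a constant bitrate."""
    return bitrate_bps * seconds // 8


def hours_of_footage(free_bytes=FREE_BYTES):
    """Hours of recording that fit in free_bytes."""
    segments = free_bytes / segment_bytes()
    return segments * SEGMENT_S / 3600
```

`segment_bytes()` gives 90,000,000 bytes (90 MB), and `hours_of_footage()` comes out a little over 43 hours.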
### Key Differences from Old Screen Recorder
| | Old (screen recorder) | New (dashcamd) |
|---|---|---|
| Source | `QWidget::grab()` screen capture | Raw NV12 from VisionIPC |
| Resolution | 1440x720 | 1928x1208 |
| Works with screen off | No (needs visible widget) | Yes (independent of UI) |
| Process type | Part of UI process | Standalone native process |
| Encoder input | RGBA -> NV12 conversion | NV12 direct (added `encode_frame_nv12`) |
### Key Files
| File | Role |
|------|------|
| `selfdrive/clearpilot/dashcamd.cc` | Main dashcam process — VisionIPC -> OMX encoder |
| `selfdrive/clearpilot/SConscript` | Build config for dashcamd |
| `selfdrive/frogpilot/screenrecorder/omx_encoder.cc` | OMX encoder (added `encode_frame_nv12` method) |
| `selfdrive/frogpilot/screenrecorder/omx_encoder.h` | Encoder header |
| `selfdrive/manager/process_config.py` | dashcamd registered as NativeProcess, encoderd disabled |
| `system/loggerd/deleter.py` | Storage rotation (9 GB threshold, oldest videos deleted first) |
### Params
- `DashcamDebug` — when `"1"`, dashcamd runs even without car connected (for bench testing)
- `IsDriverViewEnabled` — must be `"1"` to start camerad on bench (no car ignition)
## Standstill Power Saving
When `carState.standstill` is true:
- **modeld**: skips GPU inference on 19/20 frames (1fps vs 20fps), reports 0 frame drops to avoid triggering `modeldLagging` in controlsd
- **dmonitoringmodeld**: same 1fps throttle, added `carState` subscription
- **Fan controller**: uses offroad clamps (0-30%) instead of onroad (30-100%) at standstill; thermal protection still active via feedforward if temp > 60°C
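The throttling rules above can be sketched as two small functions. This is an illustration of the described behaviour, not the code in `modeld.py` or `fan_controller.py`; the function names are hypothetical.

```python
MODEL_FPS = 20  # normal inference rate


def should_run_inference(frame_index, standstill):
    """Run every frame while moving; run 1 frame in 20 (about 1fps) at standstill."""
    if not standstill:
        return True
    return frame_index % MODEL_FPS == 0


def fan_bounds(onroad, standstill):
    """Fan duty clamps in percent: onroad range only while actually moving."""
    if onroad and not standstill:
        return (30, 100)
    return (0, 30)
```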
### Key Files
| File | Role |
|------|------|
| `selfdrive/modeld/modeld.py` | Standstill frame skip logic |
| `selfdrive/modeld/dmonitoringmodeld.py` | Standstill frame skip logic |
| `selfdrive/thermald/fan_controller.py` | Standstill-aware fan clamps |
| `selfdrive/thermald/thermald.py` | Passes standstill to fan controller via carState |
## Debug Function Button (LFA/LKAS Steering Wheel Button)
The Hyundai Tucson's LFA (Lane Following Assist) steering wheel button is repurposed as a general-purpose UI control button. It has no driving function in ClearPilot since lateral control is disabled.
### Signal Chain
```
Steering wheel LFA button press
  -> CAN-FD message: cruise_btns_msg_canfd["LFA_BTN"]
     [selfdrive/car/hyundai/carstate.py:332-339]
  -> Edge detection: lkas_enabled vs lkas_previously_enabled
  -> create_button_events() -> ButtonEvent(type=FrogPilotButtonType.lkas)
     [selfdrive/car/hyundai/interface.py:168]
  -> controlsd.update_clearpilot_events(CS)
     [selfdrive/controls/controlsd.py:1235-1239]
     -> events.add(EventName.clpDebug)
  -> controlsd.clearpilot_state_control(CC, CS)
     [selfdrive/controls/controlsd.py:1241-1258]
     -> Toggles ScreenDisaplayMode param (0=on, 1=off) in /dev/shm/params
  -> UI reads ScreenDisaplayMode in drawHud()
     [selfdrive/ui/qt/onroad.cc:390-403]
     -> mode=1 and no alert: Hardware::set_display_power(false)
     -> mode=0 or alert visible: Hardware::set_display_power(true)
### Current Behavior
- Each press toggles the display on/off instantly (debug alert suppressed)
- `ScreenDisaplayMode` is in-memory params (`/dev/shm/params`), resets on reboot
- `max_display_mode = 1` — currently only two states (on/off); can be extended for future modes
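The toggle and the display-power decision can be sketched as below. This mirrors the behaviour described in this section, not the actual `controlsd.py` or `onroad.cc` code; the wrap-around rule for future modes is an assumption.

```python
MAX_DISPLAY_MODE = 1  # 0 = on, 1 = off


def next_display_mode(mode):
    """Advance to the next mode on each button press, wrapping back to 0."""
    return (mode + 1) % (MAX_DISPLAY_MODE + 1)


def display_power(mode, alert_visible):
    """Screen is on in mode 0, or whenever an alert needs to be shown."""
    return mode == 0 or alert_visible
```

Raising `MAX_DISPLAY_MODE` would add states to the cycle without changing the toggle logic.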
### Key Files
| File | Role |
|------|------|
| `selfdrive/car/hyundai/carstate.py` | Reads LFA_BTN from CAN-FD |
| `selfdrive/car/hyundai/interface.py` | Creates ButtonEvent with FrogPilotButtonType.lkas |
| `selfdrive/controls/controlsd.py` | Fires clpDebug event, toggles ScreenDisaplayMode |
| `selfdrive/controls/lib/events.py` | clpDebug event definition (alert suppressed) |
| `selfdrive/ui/qt/onroad.cc` | Reads ScreenDisaplayMode, controls display power |
## Screen Timeout / Display Power
Display power is managed by `Device::updateWakefulness()` in `selfdrive/ui/ui.cc`.
- **Ignition off (offroad)**: screen blanks after `ScreenTimeout` seconds (default 120) of no touch. Tapping wakes it.
- **Ignition on (onroad)**: screen stays on unconditionally — `setAwake(s.scene.ignition || interactive_timeout > 0)` means ignition=true always keeps the screen awake. The FrogPilot `ScreenTimeoutOnroad` param (default 10s) has no effect because ignition being true short-circuits the timeout check.
- **Debug button (LFA)**: the only way to turn off the screen while driving. Toggles `ScreenDisaplayMode` param which is checked in `drawHud()` (onroad) and `updateState()` (park splash). Independent of the timeout system.
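The short-circuit in the `setAwake` expression quoted above is easy to see when written out. A minimal sketch of that boolean:

```python
def should_be_awake(ignition, interactive_timeout):
    """Mirror of `s.scene.ignition || interactive_timeout > 0`."""
    return bool(ignition) or interactive_timeout > 0
```

With `ignition=True` the result is `True` for any timeout value, including zero, which is why `ScreenTimeoutOnroad` cannot blank the screen while driving.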
## Offroad UI
The offroad home screen (`selfdrive/ui/qt/home.cc`) was replaced with a clean grid launcher. Stock FrogPilot widgets (date, version, update/alert notifications) were removed.
- **Settings button**: opens the original comma/FrogPilot settings (backdoor to all original settings)
- **Dashcam button**: placeholder for future dashcam footage viewer
- Tapping the splash screen (ReadyWindow) goes directly to the grid launcher (no sidebar)
- Sidebar with metrics (TEMP, VEHICLE, CONNECT) is hidden but still accessible via settings path
## Device: comma 3x
### Hardware
- Qualcomm Snapdragon SoC (aarch64)
- Serial: comma-3889765b
- Storage: WDC SDINDDH4-128G, 128 GB UFS 2.1
- Connects to the car via comma panda (CAN bus interface)
### Operating System
- **Ubuntu 20.04.6 LTS (Focal Fossa)** on aarch64
- **Kernel**: 4.9.103+ (custom comma.ai PREEMPT build, Feb 2024) — very old, vendor-patched Qualcomm kernel
- **Python**: 3.11.4 via pyenv at `/usr/local/pyenv/versions/3.11.4/` (system python is 3.8, do not use)
- **AGNOS version**: 9.7 (comma's custom OS layer on top of Ubuntu)
- **Display server**: Weston (Wayland compositor) on tty1
- **SELinux**: mounted (enforcement status varies)
### Users
- `comma` (uid=1000) — the user openpilot runs as; member of root, sudo, disk, gpu, gpio groups
- `root` — what we SSH in as; files must be chowned back to comma before running openpilot
### Filesystem / Mount Quirks
| Mount | Device | Type | Notes |
|-------------|-------------|---------|-------|
| `/` | /dev/sda7 | ext4 | Root filesystem, read-write |
| `/data` | /dev/sda12 | ext4 | **Persistent**. Openpilot lives here. Survives reboots. |
| `/home` | overlay | overlayfs | **VOLATILE** — upperdir on tmpfs, changes lost on reboot |
| `/tmp` | tmpfs | tmpfs | Volatile, 150 MB |
| `/var` | tmpfs | tmpfs | Volatile, 128 MB (fstab) / 1.5 GB (actual) |
| `/systemrw` | /dev/sda10 | ext4 | Writable system overlay, noexec |
| `/persist` | /dev/sda2 | ext4 | Persistent config/certs, noexec |
| `/cache` | /dev/sda11 | ext4 | Android-style cache partition |
| `/dsp` | /dev/sde26 | ext4 | **Read-only** Qualcomm DSP firmware |
| `/firmware` | /dev/sde4 | vfat | **Read-only** firmware blobs |
### Hardware Encoding
- **OMX**: `OMX.qcom.video.encoder.avc` (H.264) and `OMX.qcom.video.encoder.hevc` — used by dashcamd and screen recorder
- **V4L2**: Qualcomm VIDC at `/dev/v4l/by-path/platform-aa00000.qcom_vidc-video-index1` — used by encoderd (now disabled). Not accessible from ffmpeg due to permission/driver issues
- **ffmpeg**: v4.2.2, has `h264_v4l2m2m` and `h264_omx` listed but neither works from ffmpeg subprocess (OMX port issues, V4L2 device not found). Use OMX directly via the C++ encoder
### Fan Control
Software-controlled via `thermald` -> `fan_controller.py` -> panda USB -> PWM. Target temp 70°C, PI+feedforward controller. See Standstill Power Saving section for standstill-aware clamps.
## Boot Sequence
```
Power On
  -> systemd: comma.service (runs as comma user)
    -> /usr/comma/comma.sh (waits for Weston, handles factory reset)
      -> /data/continue.sh (exec bridge to openpilot)
        -> /data/openpilot/launch_openpilot.sh
          -> Kills other instances of itself and manager.py
          -> Runs on_start.sh (logo, reverse SSH)
          -> exec launch_chffrplus.sh
            -> Sources launch_env.sh (thread counts, AGNOS_VERSION)
            -> Runs agnos_init (marks boot slot, GPU perms, checks OS update)
            -> Sets PYTHONPATH, symlinks /data/pythonpath
            -> Runs build.py if no `prebuilt` marker
            -> Launches selfdrive/manager/manager.py
              -> manager_init() sets default params
              -> ensure_running() loop starts all managed processes
```
## Openpilot Architecture
### Process Manager
`selfdrive/manager/manager.py` orchestrates all processes defined in `selfdrive/manager/process_config.py`.
### Always-Running Processes (offroad + onroad)
- `thermald` — thermal management and fan control
- `pandad` — panda CAN bus interface
- `ui` — Qt-based onroad/offroad UI
- `deleter` — storage cleanup (9 GB threshold)
- `statsd`, `timed`, `logmessaged`, `tombstoned` — telemetry/logging
- `manage_athenad` — comma cloud connectivity
- `fleet_manager`, `frogpilot_process` — FrogPilot additions
### Onroad-Only Processes (when driving)
- `controlsd` — main vehicle control loop
- `plannerd` — path planning
- `radard` — radar processing
- `modeld` — driving model inference (throttled to 1fps at standstill)
- `dmonitoringmodeld` — driver monitoring model (throttled to 1fps at standstill)
- `locationd`, `calibrationd`, `paramsd`, `torqued` — localization and calibration
- `sensord` — IMU/sensor data
- `soundd` — alert sounds
- `camerad` — camera capture
- `loggerd` — CAN/sensor log recording (video encoding disabled)
### ClearPilot Processes
- `dashcamd` — raw camera dashcam recording (runs onroad or with DashcamDebug flag)
### GPS
- `ubloxd` + `pigeond` for u-blox GPS hardware
- `qcomgpsd`, `ugpsd`, `navd` currently **commented out** in process_config
### Key Dependencies
- **Python 3.11** (via pyenv) with: numpy, casadi, onnx/onnxruntime, pycapnp, pyzmq, sentry-sdk, sympy, Cython
- **capnp (Cap'n Proto)** — IPC message serialization between all processes
- **ZeroMQ** — IPC transport layer
- **Qt 5** — UI framework (WebEngine is available but not used, because of display rotation issues)
- **OpenMAX (OMX)** — hardware video encoding
- **libavformat** — MP4 container muxing
- **libyuv** — color space conversion
- **SCons** — build system for native C++ components