diff --git a/CLAUDE.md b/CLAUDE.md
index d9b6ba2..4787d30 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -195,6 +195,43 @@ The UI process runs a ZMQ REP server at `ipc:///tmp/clearpilot_ui_rpc`. Send `"d
 - **`showDriverView` overriding transitions (fixed)**: was forcing `slayout` to onroad/home every frame at 20Hz, overriding park/drive logic. Fixed to only act when not in started state.
 - **Sidebar appearing during onroad transition (fixed)**: `MainWindow::closeSettings()` was re-enabling the sidebar. Fixed by not calling `closeSettings` during `offroadTransition`.
 
+## Performance Profiling
+
+Use `py-spy` to find CPU hotspots in any Python process. It's installed at `/home/comma/.local/bin/py-spy`. (If missing: `su - comma -c "/usr/local/pyenv/versions/3.11.4/bin/pip install py-spy"`.)
+
+```bash
+# Find the target pid
+ps -eo pid,cmd | grep -E "selfdrive.controls.controlsd" | grep -v grep
+
+# Record 10s of stacks at 200Hz, raw (folded) format
+/home/comma/.local/bin/py-spy record -o /tmp/ctrl.txt --pid <PID> --duration 10 --rate 200 --format raw
+
+# Aggregate: which line of step() is consuming the most samples
+awk -F';' '{
+  for(i=1;i<=NF;i++) if ($i ~ /step \(selfdrive\/controls\/controlsd.py/) step_line=i;
+  if (step_line && step_line < NF) {
+    n=split($NF, parts, " "); count=parts[n];
+    caller = $(step_line+1);
+    sum[caller] += count;
+  }
+  step_line=0;
+} END { for (c in sum) printf "%6d %s\n", sum[c], c }' /tmp/ctrl.txt | sort -rn | head -15
+
+# Aggregate by a source file: shows the hottest lines in that file
+awk -F';' '{
+  for(i=1;i<=NF;i++) if ($i ~ /carstate\.py:/) {
+    match($i, /:[0-9]+/); ln = substr($i, RSTART+1, RLENGTH-1);
+    n=split($NF, parts, " "); count=parts[n];
+    sum[ln] += count;
+  }
+} END { for (l in sum) printf "%5d line %s\n", sum[l], l }' /tmp/ctrl.txt | sort -rn | head -15
+
+# Quick stack dump (single sample, no recording)
+/home/comma/.local/bin/py-spy dump --pid <PID>
+```
+
+**Known performance trap (hot `Params` writes)**: `Params.put()` does `mkstemp` + `fsync` + `flock` + `rename` + `fsync_dir`. At 100Hz, even on tmpfs, the `flock` contention is ruinous. Cache the last-written value and skip writes when unchanged. This pattern was found in `carstate.py` and `controlsd.py`; controlsd went from 69% → 28% CPU after gating writes.
+
 ## Session Logging
 
 Per-process stderr and an aggregate event log are captured in `/data/log2/current/`.
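
The write-gating advice in the `Params` performance trap above (cache the last-written value, skip the write when unchanged) can be sketched roughly as follows. This is an illustration only, not the code that landed in `carstate.py` or `controlsd.py`: `GatedParamWriter`, `put_if_changed`, `FakeParams`, `simulate_loop`, and the key `ExampleFlag` are hypothetical names, and the only thing assumed about the real store is that it exposes `put(key, value)`.

```python
class GatedParamWriter:
    """Wrap a params-like store and skip writes whose value is unchanged."""

    def __init__(self, params):
        self._params = params  # assumed: any object exposing put(key, value)
        self._last = {}        # key -> last value this process actually wrote

    def put_if_changed(self, key, value):
        """Write only when value differs from the cached one.

        Returns True if a write was issued, False if it was skipped.
        """
        if key in self._last and self._last[key] == value:
            return False
        self._params.put(key, value)
        self._last[key] = value
        return True


class FakeParams:
    """Stand-in for the real store (hypothetical); records every put() call."""

    def __init__(self):
        self.calls = []

    def put(self, key, value):
        self.calls.append((key, value))


def simulate_loop(iterations=100):
    """Mimic a hot loop that publishes a flag every iteration.

    The flag flips once, halfway through, so only two writes should occur.
    """
    fake = FakeParams()
    writer = GatedParamWriter(fake)
    for i in range(iterations):
        value = "1" if i >= iterations // 2 else "0"
        writer.put_if_changed("ExampleFlag", value)
    return len(fake.calls)


if __name__ == "__main__":
    print("writes issued:", simulate_loop())
```

Note that the cache here is per process: if another process writes the same key, the cached value goes stale and the next identical write is still skipped. A sketch like this only suits keys with a single writer.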