When a self-driving car stops unexpectedly, who do you ask why?
With a neural end-to-end system, the honest answer is: nobody knows. The weights decided. We built something different.
This post documents selfdriving-simulation — a Rich Learning V2 domain implementation where every driving decision is a traversable graph query with a named reason, a typed source object, and a human-readable audit line. The agent learns from experience and replays past decisions without re-computing them. And when it's wrong, you can read exactly why.
The Problem With Opaque Driving
Modern end-to-end autonomous driving systems make decisions the same way a large language model generates tokens: by passing inputs through millions of parameters and producing an output. The process is fast. The result is often correct. But the reasoning is invisible.
Ask a neural system "why did you brake?" and the truthful answer is a 70-million-parameter tensor. That's not an explanation. In safety-critical domains — and autonomous driving is as safety-critical as it gets — that's a liability.
There's a second problem: amnesia. Each drive is a cold start. Yesterday's experience with a red light at 18 metres on the northbound lane is not remembered. The agent re-derives the same decision, at the same cost, every time.
What we wanted: A driving agent where every decision traces to a named reason, and where known situations are recalled from memory — not recomputed from scratch.
What Rich Learning Replaces (And What It Doesn't)
Let's be precise. This project does not replace deep learning for computer vision. YOLO still identifies the pedestrian. A lidar pipeline still clusters the point cloud. Neural perception still runs — and in Phase 3, it runs locally via a CARLA bridge.
What Rich Learning replaces is high-level behavioural decision-making and memory storage. Once perception has classified the scene — "red signal at 18 metres, obstacle 12 metres ahead, visibility 45 metres" — the decision layer takes over. That layer is a graph, not a neural network. Every decision is a typed edge. Every reason is a named field. Every replay is a graph lookup.
How It Works: From Sensor to Decision to Memory
The architecture has three clean layers, separated by typed contracts. No layer reaches across the boundary above or below it.
Layer 1 — The Domain (F#)
The pure decision types live in F#. This is not a stylistic preference; it is a safety guarantee. F# discriminated unions give us exhaustiveness checking on pattern matches at compile time. If you add a new signal state (say, Flashing), every match expression that neither handles it nor has a wildcard case is flagged as an incomplete match (warning FS0025). With warnings treated as errors, that becomes a build error, and the CI system rejects the code before it can reach production.

    type SignalState = Green | Yellow | Red

    type PlannerDecision =
        | Continue of reason: string
        | Slow of reason: string
        | Stop of reason: string * emergencyBrake: bool

The validation logic is just as explicit. Each rule is one match case, and the final wildcard accepts whatever the rules above did not veto:

    let validateAction (scene: DrivingScene) (action: PlannerDecision) =
        // NOTE: the third tuple element is scene.VisibilityMeters, so `d` in the first
        // rule binds visibility. If the rule is meant to test distance to the signal,
        // match on that field instead.
        match scene.SignalState, scene.HasObstacleAhead, scene.VisibilityMeters, action with
        | Red, _, d, Continue _ when d <= 30.0 -> Invalid "Continue illegal near red signal"
        | _, true, _, Continue _ -> Invalid "Continue illegal with obstacle ahead"
        | _, _, v, Continue _ when v < 25.0 -> Invalid "Continue illegal in low visibility"
        | _ -> Valid

Layer 2 — The Integration (C#)
C# owns everything that crosses into the Rich Learning V2 kernel: typed graph object wrappers, the perception adapter, the graph persistence layer, and the planning workflow. The F# types cross cleanly — discriminated unions compile to sealed class hierarchies that C# can switch on.
Every proposed action passes through IExecutableBoundaryObject.Validate, which calls the F# constraint layer. If a proposed action is vetoed, it is rejected immediately and an alternative must be generated.
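The veto-then-alternative behaviour can be sketched in a few lines of F#. This is a minimal illustration, not the project's code: the `DrivingScene` shape, the `SignalDistanceMeters` field, the `ValidationResult` type and the `firstLegal` helper are all assumptions made for the example, and the rules simply mirror the three shown earlier.

```fsharp
// Sketch only: types mirror the post's snippets; anything not in the post is an assumption.
type SignalState = Green | Yellow | Red

type PlannerDecision =
    | Continue of reason: string
    | Slow of reason: string
    | Stop of reason: string * emergencyBrake: bool

type ValidationResult =
    | Valid
    | Invalid of string

// Assumed scene shape; SignalDistanceMeters is a hypothetical field.
type DrivingScene =
    { SignalState: SignalState
      SignalDistanceMeters: float
      HasObstacleAhead: bool
      VisibilityMeters: float }

// The three legality rules from the post, written against the assumed scene shape.
let validateAction (scene: DrivingScene) (action: PlannerDecision) =
    match action with
    | Continue _ when scene.SignalState = Red && scene.SignalDistanceMeters <= 30.0 ->
        Invalid "Continue illegal near red signal"
    | Continue _ when scene.HasObstacleAhead ->
        Invalid "Continue illegal with obstacle ahead"
    | Continue _ when scene.VisibilityMeters < 25.0 ->
        Invalid "Continue illegal in low visibility"
    | _ -> Valid

// Try candidates in preference order. Return the first accepted action (if any)
// together with the veto reasons collected along the way.
let firstLegal (scene: DrivingScene) (candidates: PlannerDecision list) =
    let rec loop vetoes remaining =
        match remaining with
        | [] -> None, List.rev vetoes
        | action :: rest ->
            match validateAction scene action with
            | Valid -> Some action, List.rev vetoes
            | Invalid why -> loop (why :: vetoes) rest
    loop [] candidates

let scene =
    { SignalState = Red
      SignalDistanceMeters = 18.0
      HasObstacleAhead = true
      VisibilityMeters = 45.0 }

let chosen, vetoes =
    firstLegal scene [ Continue "keep speed"; Stop ("Red signal 18m ahead", false) ]
```

In this sketch `Continue` is vetoed by the red-signal rule and `Stop` is accepted, so the veto reason survives alongside the chosen action and can be written to an audit line.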
Layer 3 — Perception (The CARLA Seam)
Sensor frames are adapters, not first-class graph objects. The IPerceptionProvider seam accepts Camera, Lidar, and Radar sub-records and produces the typed context and boundary objects the planner consumes. In Phase 1 and 2, a SyntheticPerceptionProvider satisfies this contract. In Phase 3, a CarlaPerceptionProvider wires real CARLA payloads into the same typed structs — no planner changes required.
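A rough F# sketch of what such a seam could look like follows. Only the names IPerceptionProvider and SyntheticPerceptionProvider come from the post; the `Perceive` member, the sensor sub-record fields and the `SceneContext` shape are hypothetical, chosen to show how a planner can depend on the interface alone.

```fsharp
// Hypothetical sensor sub-records; the real payload shapes are not given in the post.
type CameraReading = { SignalColour: string; SignalRangeMeters: float; VisibleRangeMeters: float }
type LidarReading = { NearestObstacleMeters: float option }
type RadarReading = { ClosingSpeedMps: float }

type SensorFrame =
    { Camera: CameraReading
      Lidar: LidarReading
      Radar: RadarReading }

// Hypothetical typed context handed to the planner.
type SceneContext =
    { Signal: string
      SignalMeters: float
      HasObstacleAhead: bool
      VisibilityMeters: float }

// The seam: the planner sees only this interface. The member name is an assumption.
type IPerceptionProvider =
    abstract member Perceive: SensorFrame -> SceneContext

// Synthetic provider: maps scripted sensor values straight into the typed context.
type SyntheticPerceptionProvider() =
    interface IPerceptionProvider with
        member _.Perceive(frame: SensorFrame) : SceneContext =
            { Signal = frame.Camera.SignalColour
              SignalMeters = frame.Camera.SignalRangeMeters
              HasObstacleAhead = Option.isSome frame.Lidar.NearestObstacleMeters
              VisibilityMeters = frame.Camera.VisibleRangeMeters }

// Planner-side code takes the interface, so swapping providers needs no change here.
let describeScene (provider: IPerceptionProvider) (frame: SensorFrame) =
    let ctx = provider.Perceive frame
    sprintf "signal=%s@%.0fm obstacle=%b visibility=%.0fm"
        ctx.Signal ctx.SignalMeters ctx.HasObstacleAhead ctx.VisibilityMeters
```

A CARLA-backed provider would implement the same interface and fill the same record, which is the property the seam is meant to guarantee.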
The Results: 22 Green Tests, Two Episodes, One Audit Trail
The vertical slice covers two driving episodes against four named scenarios. Episode 1 (morning commute) actively computes decisions. Episode 2 (afternoon replay) recalls them from graph memory.
What the Audit Trail Actually Says
Every decision produces a structured audit log. The rows below come from the demo run, not from hand-crafted examples: they are the entries written to demo-output/audit.json, shown here as a table:
| Scene | Episode | Source | Decision | Reason |
|---|---|---|---|---|
| sim-stop-red-obstacle | ep-morning-commute | ActiveComputed | Stop | Red signal 18m ahead within legal stop distance. |
| sim-slow-low-visibility | ep-morning-commute | ActiveComputed | Slow | Visibility 15m below safe threshold. |
| sim-continue-clear | ep-morning-commute | ActiveComputed | Continue | All clear: green signal, no obstacle, visibility 80m. |
| sim-stop-red-obstacle | ep-afternoon-replay | PassiveRecalled EXACT | Stop | Recalled from graph memory — zero recomputation. |
| sim-stop-red-obstacle-near | ep-afternoon-replay | PassiveRecalled SIMILAR (d=0.022) | Stop | Nearest-neighbour match from similar scene in memory. |
The passive recall path labels every decision as EXACT (distance = 0.0) or SIMILAR (nearest-neighbour with a reported distance). The system never silently falls through — the audit line always tells you how the decision was produced.
What a Recalled Decision Looks Like
Here's the full audit trail for the sim-stop-red-obstacle-near scenario — a scene the agent has never seen before, handled by structural similarity lookup:

    scene=sim-stop-red-obstacle-near episode=ep-afternoon-replay
    lane=northbound-through signal=red@20m obstacle=11m visibility=45m
    passive-recall SIMILAR (d=0.022): decision=stop
    from scene=sim-stop-red-obstacle
    recalled reason="Red signal 18m ahead within legal stop distance."
    constraint-revalidation: accepted (recalled decision still legal in current scene)
    boundary-validator: accepted

The agent saw a red light at 20 metres with an obstacle at 11 metres. It had learned from a scene with a red light at 18 metres and an obstacle at 12 metres. Structural distance: 0.022 (scale: 0.0 = exact, 1.0 = maximum distance). Decision: Stop. Revalidated. Accepted. No neural inference required.
Notice the constraint-revalidation line. The recalled decision isn't just replayed blindly — it's validated against the legality rules of the new scene. If the recalled action is no longer legal, the graph rejects it and falls back to active planning.
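The recall path described above (nearest-neighbour lookup, EXACT or SIMILAR labelling, revalidation, fallback) can be sketched as follows. Everything here is an assumption for illustration: the post does not show its distance metric, so the `distance` function below is a made-up normalised average and will not reproduce the reported 0.022, and the `threshold` parameter, the record shapes and the `isLegal` rules are hypothetical.

```fsharp
type SignalState = Green | Yellow | Red
type Decision = Continue | Slow | Stop

type Scene =
    { Signal: SignalState
      SignalMeters: float
      ObstacleMeters: float
      VisibilityMeters: float }

type Memory =
    { SceneId: string
      Scene: Scene
      Decision: Decision
      Reason: string }

type RecallResult =
    | Exact of memory: Memory
    | Similar of memory: Memory * distance: float
    | ActiveFallback of why: string

// Hypothetical metric in [0, 1]: average of a signal mismatch term and
// range differences normalised by 100 m. Not the project's actual metric.
let distance (a: Scene) (b: Scene) =
    let norm x y = min 1.0 (abs (x - y) / 100.0)
    let signalTerm = if a.Signal = b.Signal then 0.0 else 1.0
    let terms =
        [ signalTerm
          norm a.SignalMeters b.SignalMeters
          norm a.ObstacleMeters b.ObstacleMeters
          norm a.VisibilityMeters b.VisibilityMeters ]
    List.average terms

// Simplified legality rules used to revalidate a recalled decision in the new scene.
let isLegal (scene: Scene) (decision: Decision) =
    match decision with
    | Continue ->
        not (scene.Signal = Red && scene.SignalMeters <= 30.0)
        && scene.VisibilityMeters >= 25.0
    | Slow | Stop -> true

// Look up the nearest remembered scene, revalidate its decision, and label the
// outcome. Anything that cannot be recalled safely is handed back to active planning.
let recall (threshold: float) (memory: Memory list) (scene: Scene) =
    match memory with
    | [] -> ActiveFallback "memory is empty"
    | _ ->
        let nearest = memory |> List.minBy (fun m -> distance m.Scene scene)
        let d = distance nearest.Scene scene
        if d > threshold then ActiveFallback "no remembered scene within threshold"
        elif not (isLegal scene nearest.Decision) then ActiveFallback "recalled decision no longer legal"
        elif d = 0.0 then Exact nearest
        else Similar (nearest, d)
```

The design point is that every branch returns a labelled result, so there is no silent path: the caller always knows whether a decision was recalled exactly, recalled by similarity with a stated distance, or needs to be computed.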
The Neo4j Export: Reasoning as a Queryable Graph
Every decision is also written to demo-output/decision-log.cypher, a Cypher script of MERGE statements that turns the entire decision history into a graph once it is loaded into Neo4j. When Phase 3 lands a live Neo4j backend, researchers can query: "show me every scene where the agent recalled a Stop from a structurally similar episode."
Why this matters for safety: In neural systems, failure analysis requires re-running the model on the offending input and hoping the attention maps are interpretable. In this system, failure analysis is a graph query. The scene object, the constraint that rejected the action, and the recalled memory that influenced the decision are all first-class nodes. You don't need an interpreter — you need a browser.
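To make the export idea concrete, here is a small F# sketch that emits MERGE statements for one decision. The node labels, relationship type and property names are invented for the example; the real decision-log.cypher schema is not shown in this post.

```fsharp
// Escape a value for use inside a single-quoted Cypher string literal.
let cypherString (value: string) =
    "'" + value.Replace("\\", "\\\\").Replace("'", "\\'") + "'"

// Hypothetical graph shape: (:Scene)-[:DECIDED]->(:Decision), with the decision
// node keyed by scene and episode so re-running the script does not duplicate it.
let mergeDecision (sceneId: string) (episodeId: string) (decision: string) (reason: string) =
    [ sprintf "MERGE (s:Scene {id: %s})" (cypherString sceneId)
      sprintf "MERGE (d:Decision {scene: %s, episode: %s})" (cypherString sceneId) (cypherString episodeId)
      "MERGE (s)-[:DECIDED]->(d)"
      sprintf "SET d.kind = %s, d.reason = %s;" (cypherString decision) (cypherString reason) ]
    |> String.concat "\n"

let script =
    mergeDecision
        "sim-stop-red-obstacle"
        "ep-morning-commute"
        "Stop"
        "Red signal 18m ahead within legal stop distance."
```

Using MERGE rather than CREATE keeps the script idempotent, which matters if the same episode log is loaded more than once.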
Why Two Languages?
The project uses F# for the domain layer and C# for the integration layer. This is worth explaining, because it's a deliberate engineering choice — not a preference.
Add a new SignalState variant and every unhandled match becomes a build error. CI rejects the code. No runtime miss possible.
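The exhaustiveness argument can be seen in a small F# sketch. `Flashing` is the hypothetical new case mentioned earlier, and the function names and advice strings are purely illustrative. Two caveats are worth stating: the F# compiler reports an incomplete match as warning FS0025, so the "build error" behaviour assumes the project treats warnings as errors, and a wildcard arm opts a match out of the check entirely.

```fsharp
// Flashing is the hypothetical new case; the advice strings are illustrative only.
type SignalState = Green | Yellow | Red | Flashing

let approachAdvice (signal: SignalState) =
    match signal with
    | Green -> "proceed"
    | Yellow -> "prepare to stop"
    | Red -> "stop"
    // Delete this case and the compiler reports FS0025 (incomplete pattern matches).
    // That is a warning by default; it fails the build only when warnings are
    // treated as errors, e.g. <TreatWarningsAsErrors>true</TreatWarningsAsErrors>.
    | Flashing -> "proceed with caution"

// A wildcard arm silences the check: a new case falls into it with no diagnostic.
let isStopSignal (signal: SignalState) =
    match signal with
    | Red -> true
    | _ -> false
```

For safety-relevant matches this suggests listing every case explicitly and reserving wildcards for matches where a default really is correct for any future case.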
What's Next
Phase 1 and Phase 2 are shipped and green. Here's what comes next:
| # | Item | Effort | Value |
|---|---|---|---|
| 1 | Live Neo4j backend — swap the .cypher file export for a queryable Neo4j graph | Medium | High — strongest external demo surface |
| 2 | DapsaEngine wiring — thread the workflow through the full consonance → active → fossilize lifecycle | Medium | Full integration to unlock continuous active-to-fossil learning cycles in complex driving scenarios |
| 3 | CARLA adapter — implement IPerceptionProvider against a live CARLA Python gRPC bridge | Large | High — the seam is ready, this is only external integration |
What This Experiment Doesn't Prove Yet
- Real sensors. Perception is synthetic. A CARLA bridge is the next gate. The seam exists; the real-world data doesn't yet.
- Multi-obstacle scenes. The current slice handles single-obstacle scenarios. Multi-vehicle interactions and lane changes are Phase 3 scope.
- Full fossilization. Passive recall currently uses graph-memory lookup. The full DAPSA consonance → fossilize lifecycle is wired but not yet exercised end-to-end in a driving context.
- External benchmarks. No comparison against CARLA leaderboard baselines yet. That comes when the live simulator bridge lands.