Configuration
Rich Learning supports two storage backends and several tuneable parameters that control exploration, novelty detection, and experience replay.
Storage Backends
LiteDB (Embedded — Default)
Zero-configuration embedded database. Data is stored in a single file. Best for local experimentation, CI pipelines, and environments where you can't run a database server.
dotnet run -- SplitMnist --litedb
Pros: No setup, no external dependencies, fast startup, portable.
Cons: No graph visualisation UI, in-process BFS for path queries.
Neo4j (Server)
Full graph database with Cypher query language and browser-based visualisation. Best for production, large state spaces, and when you want to visually inspect the topology.
# Environment variables
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=richlearning
# Run without --litedb flag
dotnet run -- SplitMnist
Quick start with Docker:
docker run -d --name neo4j \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/richlearning \
-v neo4j_data:/data \
neo4j:5
Then open http://localhost:7474 to explore the graph visually.
| Feature | LiteDB | Neo4j |
|---|---|---|
| Setup | Zero-config | Server required |
| Path queries | In-process BFS | Cypher shortestPath() |
| Visualisation | None | Neo4j Browser |
| Bottleneck analysis | Not available | GetBottleneckLandmarksAsync() |
| Stale node queries | Not available | GetStaleLandmarksAsync() |
| Cluster statistics | Not available | GetClusterStatsAsync() |
| Deployment | Single file | Docker / VM / Cloud |
Cartographer Parameters
The Cartographer planner exposes several tuneable properties:
| Parameter | Default | Description |
|---|---|---|
| NoveltyThreshold | 0.3 | Minimum cosine distance to nearest landmark before a new landmark is created. Lower = more landmarks, finer resolution. |
| ValueEmaAlpha | 0.1 | Exponential moving average smoothing factor for landmark value estimates. |
| NoveltyDecayRate | 0.05 | Rate at which novelty score decays per visit. Higher = novelty fades faster. |
| TrajectoryWindowSize | 20 | Number of recent landmarks kept for loop detection. Larger window catches longer cycles. |
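
To make the role of NoveltyThreshold concrete, here is a minimal sketch of a novelty check, not the Cartographer's actual code. `CosineDistance` and `IsNovel` are hypothetical helpers, and the choice of `>=` for the comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine distance = 1 - cosine similarity (0 = same direction, 1 = orthogonal).
static double CosineDistance(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    // Assumption: a zero vector has no direction, so treat it as distance 1.
    if (normA == 0 || normB == 0) return 1.0;
    return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical helper: true when the embedding is at least noveltyThreshold
// away from every known landmark, i.e. a new landmark would be created.
static bool IsNovel(double[] embedding, IReadOnlyList<double[]> landmarks, double noveltyThreshold)
{
    if (landmarks.Count == 0) return true; // empty map: everything is novel
    double nearest = landmarks.Min(l => CosineDistance(embedding, l));
    return nearest >= noveltyThreshold;
}

var knownLandmarks = new List<double[]> { new[] { 1.0, 0.0 }, new[] { 0.0, 1.0 } };
var observation = new[] { 1.0, 1.0 }; // cosine distance to each landmark is about 0.293

Console.WriteLine(IsNovel(observation, knownLandmarks, 0.3)); // False: 0.293 < 0.3
Console.WriteLine(IsNovel(observation, knownLandmarks, 0.1)); // True: 0.293 >= 0.1
```

The same observation is absorbed into an existing landmark at the default threshold but becomes a new landmark at 0.1, which is what "lower = more landmarks" means in practice.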
Tuning Guidelines
- Dense state spaces (e.g., pixel observations): Lower NoveltyThreshold to ~0.1 for finer discretisation
- Sparse state spaces (e.g., grid worlds): Raise NoveltyThreshold to ~0.5 to avoid redundant landmarks
- Long episodes: Increase TrajectoryWindowSize to 50+ for better cycle detection
- Volatile rewards: Lower ValueEmaAlpha to ~0.05 for smoother value estimates
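
The effect of ValueEmaAlpha on volatile rewards can be seen with a small sketch. It assumes the standard exponential moving average update, `new = (1 - alpha) * old + alpha * sample`; the source does not state the exact update rule, and `UpdateValue` is a hypothetical helper.

```csharp
using System;

// Standard EMA update (assumed form): new = (1 - alpha) * old + alpha * sample.
static double UpdateValue(double current, double sample, double alpha)
    => (1.0 - alpha) * current + alpha * sample;

// Illustrative volatile reward sequence, not real data.
double[] rewards = { 1.0, 0.0, 1.0, 0.0, 1.0, 0.0 };

double withDefault = 0.5; // tracked with the default alpha of 0.1
double withLower = 0.5;   // tracked with the suggested alpha of 0.05

foreach (double reward in rewards)
{
    withDefault = UpdateValue(withDefault, reward, 0.1);
    withLower = UpdateValue(withLower, reward, 0.05);
    Console.WriteLine($"reward={reward:F1}  alpha=0.10 -> {withDefault:F4}  alpha=0.05 -> {withLower:F4}");
}
```

With the smaller alpha each sample moves the estimate half as far, so the estimate swings less between alternating rewards at the cost of reacting more slowly to genuine changes.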
Replay Scoring
Prioritised experience replay ranks transitions using:
priority = (|TdError| + ε) / (TransitionCount + 1) × staleness
Where:
- ε = 0.01: ensures minimum priority for all transitions
- staleness = currentTimestep - LastTrainedTimestep: favours under-trained transitions
- TransitionCount: normalises priority by experience
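
As a sketch of the priority formula above, the function below evaluates it left to right, that is, dividing by (TransitionCount + 1) and then multiplying by staleness. That reading of the operator order is an assumption, and `ReplayPriority` is a hypothetical name rather than part of the library.

```csharp
using System;

// priority = (|TdError| + epsilon) / (TransitionCount + 1) * staleness
static double ReplayPriority(double tdError, int transitionCount, long currentTimestep, long lastTrainedTimestep)
{
    const double epsilon = 0.01; // keeps the priority above zero when TdError is 0
    long staleness = currentTimestep - lastTrainedTimestep;
    return (Math.Abs(tdError) + epsilon) / (transitionCount + 1) * staleness;
}

// Two transitions with the same TD error and count; only the last-trained timestep differs.
double recentlyTrained = ReplayPriority(tdError: 0.5, transitionCount: 9, currentTimestep: 1000, lastTrainedTimestep: 990);
double longUntrained = ReplayPriority(tdError: 0.5, transitionCount: 9, currentTimestep: 1000, lastTrainedTimestep: 500);

Console.WriteLine($"recently trained: {recentlyTrained:F3}, long untrained: {longUntrained:F3}");
```

Because staleness enters as a plain multiplier, a transition last trained 500 steps ago outranks an otherwise identical one trained 10 steps ago by a factor of 50.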
Frontier Scoring
Frontier landmarks are ranked for exploration:
score = NoveltyScore / (1 + VisitCount) × 1 / (1 + OutDegree)
High-novelty, low-visit, low-connectivity nodes are prioritised, since they represent the boundary of the known map.
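
The frontier ranking can be sketched as follows. `FrontierScore` is a hypothetical helper that follows the formula above, and the three candidate landmarks are made-up values for illustration only.

```csharp
using System;
using System.Linq;

// score = NoveltyScore / (1 + VisitCount) * 1 / (1 + OutDegree)
static double FrontierScore(double noveltyScore, int visitCount, int outDegree)
    => noveltyScore / (1 + visitCount) * (1.0 / (1 + outDegree));

// Illustrative candidates, not real data.
var candidates = new[]
{
    (Id: "hub",      Novelty: 0.9, Visits: 12, OutDegree: 6),
    (Id: "corridor", Novelty: 0.6, Visits: 3,  OutDegree: 2),
    (Id: "edge",     Novelty: 0.8, Visits: 0,  OutDegree: 0),
};

// Highest score first: these are the landmarks to explore from next.
foreach (var node in candidates.OrderByDescending(x => FrontierScore(x.Novelty, x.Visits, x.OutDegree)))
{
    Console.WriteLine($"{node.Id}: {FrontierScore(node.Novelty, node.Visits, node.OutDegree):F4}");
}
```

Note that "hub" has the highest raw novelty but ranks last, because its visit count and out-degree both divide the score down.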
State Encoder
The IStateEncoder interface converts raw observations into fixed-length
embeddings for nearest-neighbour comparison:
public interface IStateEncoder
{
int EmbeddingDimension { get; }
double[] Encode(double[] rawState);
double Distance(double[] a, double[] b); // default: cosine
}
DefaultStateEncoder, the default implementation, is an identity encoder with 64-dimensional
output and cosine distance. For domain-specific applications, implement this interface
with a proper feature extractor (e.g., PCA, random projection, or a frozen pretrained encoder).
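
As one possible starting point, here is a minimal sketch of a random-projection encoder. `RandomProjectionEncoder` and its constructor parameters are hypothetical and not part of the library. The interface is repeated only so that the snippet compiles on its own; in a real project you would implement the library's IStateEncoder instead.

```csharp
using System;

// Usage: project a 784-dimensional observation down to 64 dimensions.
var encoder = new RandomProjectionEncoder(inputDimension: 784, embeddingDimension: 64, seed: 42);
double[] embedding = encoder.Encode(new double[784]);
Console.WriteLine(embedding.Length); // 64

// Repeated from the section above so this snippet is self-contained.
public interface IStateEncoder
{
    int EmbeddingDimension { get; }
    double[] Encode(double[] rawState);
    double Distance(double[] a, double[] b);
}

// Hypothetical encoder: multiplies the raw state by a fixed Gaussian random matrix.
public sealed class RandomProjectionEncoder : IStateEncoder
{
    private readonly double[,] _projection; // [embeddingDimension, inputDimension]
    private readonly int _inputDimension;

    public int EmbeddingDimension { get; }

    public RandomProjectionEncoder(int inputDimension, int embeddingDimension, int seed)
    {
        _inputDimension = inputDimension;
        EmbeddingDimension = embeddingDimension;
        _projection = new double[embeddingDimension, inputDimension];

        var rng = new Random(seed); // fixed seed keeps embeddings reproducible across runs
        double scale = 1.0 / Math.Sqrt(embeddingDimension);
        for (int r = 0; r < embeddingDimension; r++)
            for (int c = 0; c < inputDimension; c++)
                _projection[r, c] = scale * NextGaussian(rng);
    }

    public double[] Encode(double[] rawState)
    {
        if (rawState.Length != _inputDimension)
            throw new ArgumentException($"Expected {_inputDimension} values, got {rawState.Length}.", nameof(rawState));

        var result = new double[EmbeddingDimension];
        for (int r = 0; r < EmbeddingDimension; r++)
        {
            double sum = 0;
            for (int c = 0; c < _inputDimension; c++)
                sum += _projection[r, c] * rawState[c];
            result[r] = sum;
        }
        return result;
    }

    // Cosine distance, matching the default metric named in the interface comment.
    public double Distance(double[] a, double[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 1.0; // assumption: zero vectors are maximally distant
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Box-Muller transform: turns two uniform samples into one standard normal sample.
    private static double NextGaussian(Random rng)
    {
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], avoids Log(0)
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
}
```

Random projection needs no training data, which makes it a cheap baseline before investing in PCA or a pretrained encoder. The matrix must stay fixed (same seed) for the lifetime of a stored graph, since landmarks encoded under a different projection would not be comparable.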