Configuration

Rich Learning supports two storage backends and several tuneable parameters that control exploration, novelty detection, and experience replay.

Storage Backends

LiteDB (Embedded — Default)

Zero-configuration embedded database. Data is stored in a single file. Best for local experimentation, CI pipelines, and environments where you can't run a database server.

dotnet run -- SplitMnist --litedb

Pros: No setup, no external dependencies, fast startup, portable.
Cons: No graph visualisation UI, in-process BFS for path queries.

Neo4j (Server)

Full graph database with Cypher query language and browser-based visualisation. Best for production, large state spaces, and when you want to visually inspect the topology.

# Environment variables
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=richlearning

# Run without --litedb flag
dotnet run -- SplitMnist
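
How Rich Learning reads these variables internally is not shown here. As a minimal sketch, assuming they are read with the standard Environment.GetEnvironmentVariable call, a host application could validate them up front like this. Neo4jSettings and FromEnvironment are hypothetical names used for illustration, not part of the library.

```csharp
using System;

// Hypothetical settings holder (an assumption for illustration, not a Rich Learning type).
public sealed record Neo4jSettings(string Uri, string User, string Password)
{
    // Reads the three variables shown above and fails fast when one is missing.
    public static Neo4jSettings FromEnvironment()
    {
        return new Neo4jSettings(
            Require("NEO4J_URI"),
            Require("NEO4J_USER"),
            Require("NEO4J_PASSWORD"));
    }

    private static string Require(string name)
    {
        var value = Environment.GetEnvironmentVariable(name);
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Environment variable {name} is not set.");
        return value;
    }
}
```

Failing fast on a missing variable surfaces a misconfigured shell before any connection attempt is made.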

Quick start with Docker:

docker run -d --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/richlearning \
  -v neo4j_data:/data \
  neo4j:5

Then open http://localhost:7474 to explore the graph visually.

| Feature | LiteDB | Neo4j |
|---|---|---|
| Setup | Zero-config | Server required |
| Path queries | In-process BFS | Cypher shortestPath() |
| Visualisation | None | Neo4j Browser |
| Bottleneck analysis | Not available | GetBottleneckLandmarksAsync() |
| Stale node queries | Not available | GetStaleLandmarksAsync() |
| Cluster statistics | Not available | GetClusterStatsAsync() |
| Deployment | Single file | Docker / VM / Cloud |
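
To make the "in-process BFS" entry concrete, here is a minimal sketch of a breadth-first shortest-path search over an in-memory adjacency map. It is illustrative only: PathSearch, ShortestPath, and the integer node ids are assumptions, and the actual LiteDB backend may structure its search differently.

```csharp
using System.Collections.Generic;

// Hypothetical helper (not a Rich Learning type): fewest-hop path search.
public static class PathSearch
{
    // Returns the fewest-hop path from start to goal, or an empty list when goal is unreachable.
    public static List<int> ShortestPath(
        IReadOnlyDictionary<int, List<int>> edges, int start, int goal)
    {
        // parent[n] records the node from which n was first discovered.
        var parent = new Dictionary<int, int> { [start] = start };
        var queue = new Queue<int>();
        queue.Enqueue(start);

        while (queue.Count > 0)
        {
            int node = queue.Dequeue();
            if (node == goal)
            {
                // Walk parent links back to the start, then reverse.
                var path = new List<int>();
                for (int cur = goal; cur != start; cur = parent[cur])
                    path.Add(cur);
                path.Add(start);
                path.Reverse();
                return path;
            }

            if (!edges.TryGetValue(node, out var neighbours))
                continue; // node has no outgoing edges

            foreach (int next in neighbours)
            {
                if (parent.ContainsKey(next)) continue; // already discovered
                parent[next] = node;
                queue.Enqueue(next);
            }
        }

        return new List<int>();
    }
}
```

Because BFS expands nodes in order of hop count, the first time the goal is dequeued the recovered path has the fewest edges, which is the same guarantee Cypher's shortestPath() gives for unweighted paths.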

Cartographer Parameters

The Cartographer planner exposes several tuneable properties:

| Parameter | Default | Description |
|---|---|---|
| NoveltyThreshold | 0.3 | Minimum cosine distance to nearest landmark before a new landmark is created. Lower = more landmarks, finer resolution. |
| ValueEmaAlpha | 0.1 | Exponential moving average smoothing factor for landmark value estimates. |
| NoveltyDecayRate | 0.05 | Rate at which novelty score decays per visit. Higher = novelty fades faster. |
| TrajectoryWindowSize | 20 | Number of recent landmarks kept for loop detection. Larger window catches longer cycles. |
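
The sketch below illustrates how NoveltyThreshold and ValueEmaAlpha could act, based only on the descriptions above. It is not the Cartographer's actual code: NoveltySketch and its methods are hypothetical, the strict "greater than" comparison is an assumption, and the value update uses the textbook exponential moving average.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a Rich Learning type).
public static class NoveltySketch
{
    // Cosine distance = 1 - cosine similarity; 0 means the vectors point the same way.
    public static double CosineDistance(double[] a, double[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Assumption: a zero vector is treated as maximally distant.
        if (normA == 0 || normB == 0) return 1.0;
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // True when the nearest known landmark is farther away than the threshold.
    // Assumption: strict comparison; an empty landmark set always counts as novel.
    public static bool IsNovel(
        double[] embedding, IEnumerable<double[]> landmarks, double noveltyThreshold = 0.3)
    {
        double nearest = double.PositiveInfinity;
        foreach (var landmark in landmarks)
            nearest = Math.Min(nearest, CosineDistance(embedding, landmark));
        return nearest > noveltyThreshold;
    }

    // Textbook exponential moving average: alpha weights the newest sample.
    public static double UpdateValue(double current, double sample, double alpha = 0.1)
        => (1 - alpha) * current + alpha * sample;
}
```

With the default alpha of 0.1, a single sample moves the estimate only 10% of the way toward the new value, so landmark values change smoothly rather than tracking every reward.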

Tuning Guidelines

Replay Scoring

Prioritised experience replay ranks transitions using:

priority = (|TdError| + ε) / (TransitionCount + 1) × staleness

Where:
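
The priority formula can be written directly in code. This is a minimal sketch: ReplayScoring and Priority are hypothetical names, the default value for ε is an assumption, and staleness is taken as an input because its computation is not specified here.

```csharp
using System;

// Hypothetical helper (not a Rich Learning type).
public static class ReplayScoring
{
    // priority = (|TdError| + epsilon) / (TransitionCount + 1) * staleness
    // Assumption: epsilon defaults to 1e-6; the library's actual value is not stated.
    public static double Priority(
        double tdError, int transitionCount, double staleness, double epsilon = 1e-6)
        => (Math.Abs(tdError) + epsilon) / (transitionCount + 1) * staleness;
}
```

For example, with TdError = -2, TransitionCount = 3, staleness = 1.5, and ε = 0, the priority is (2 / 4) × 1.5 = 0.75. The ε term keeps transitions with zero TD error from receiving a priority of exactly zero.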

Frontier Scoring

Frontier landmarks are ranked for exploration:

score = NoveltyScore / (1 + VisitCount) × 1 / (1 + OutDegree)

High-novelty, low-visit, low-connectivity nodes are prioritised — they represent the boundary of the known map.
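
As a minimal sketch of this ranking, the following computes the score and sorts candidates from highest to lowest. FrontierLandmark, FrontierScoring, and the Id field are hypothetical names introduced for illustration. Only the formula itself comes from the text above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical record carrying the three quantities the formula needs.
public sealed record FrontierLandmark(string Id, double NoveltyScore, int VisitCount, int OutDegree);

// Hypothetical helper (not a Rich Learning type).
public static class FrontierScoring
{
    // score = NoveltyScore / (1 + VisitCount) * 1 / (1 + OutDegree)
    public static double Score(FrontierLandmark l)
        => l.NoveltyScore / (1 + l.VisitCount) * (1.0 / (1 + l.OutDegree));

    // Highest-scoring landmarks first, i.e. the most promising exploration targets.
    public static List<FrontierLandmark> Rank(IEnumerable<FrontierLandmark> landmarks)
        => landmarks.OrderByDescending(l => Score(l)).ToList();
}
```

For two landmarks with the same NoveltyScore of 0.9, one unvisited with no outgoing edges scores 0.9, while one visited twice with a single outgoing edge scores 0.9 / 3 × 1 / 2 = 0.15, so the unvisited one is explored first.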

State Encoder

The IStateEncoder interface converts raw observations into fixed-length embeddings for nearest-neighbour comparison:

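public interface IStateEncoder
{
    // Length of every embedding returned by Encode.
    int EmbeddingDimension { get; }

    // Converts a raw observation into a fixed-length embedding.
    double[] Encode(double[] rawState);

    // Distance between two embeddings (default: cosine).
    double Distance(double[] a, double[] b);
}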

The default DefaultStateEncoder is an identity encoder with 64-dimensional output and cosine distance. For domain-specific applications, implement this interface with a proper feature extractor (e.g., PCA, random projection, or a frozen pretrained encoder).
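
As one illustration of a custom encoder, the sketch below implements IStateEncoder with a Gaussian random projection. RandomProjectionEncoder, its constructor parameters, and the seed default are assumptions made for this example, and the interface is repeated only so the block compiles on its own.

```csharp
using System;

// Interface as given above, repeated so the sketch is self-contained.
public interface IStateEncoder
{
    int EmbeddingDimension { get; }
    double[] Encode(double[] rawState);
    double Distance(double[] a, double[] b);
}

// Hypothetical encoder: multiplies the input by a fixed random Gaussian matrix.
public sealed class RandomProjectionEncoder : IStateEncoder
{
    private readonly double[,] _projection; // shape: [EmbeddingDimension, inputDimension]

    public int EmbeddingDimension { get; }

    public RandomProjectionEncoder(int inputDimension, int embeddingDimension, int seed = 42)
    {
        EmbeddingDimension = embeddingDimension;
        _projection = new double[embeddingDimension, inputDimension];
        var rng = new Random(seed); // a fixed seed gives the same matrix on every construction
        double scale = 1.0 / Math.Sqrt(embeddingDimension);
        for (int i = 0; i < embeddingDimension; i++)
            for (int j = 0; j < inputDimension; j++)
                _projection[i, j] = NextGaussian(rng) * scale;
    }

    public double[] Encode(double[] rawState)
    {
        if (rawState.Length != _projection.GetLength(1))
            throw new ArgumentException("Unexpected input length.", nameof(rawState));
        var embedding = new double[EmbeddingDimension];
        for (int i = 0; i < EmbeddingDimension; i++)
            for (int j = 0; j < rawState.Length; j++)
                embedding[i] += _projection[i, j] * rawState[j];
        return embedding;
    }

    // Cosine distance, matching the default noted on the interface.
    public double Distance(double[] a, double[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 1.0; // assumption: zero vectors are maximally distant
        return 1.0 - dot / Math.Sqrt(normA * normB);
    }

    // Box-Muller transform: one standard normal sample from two uniform samples.
    private static double NextGaussian(Random rng)
    {
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], avoids log(0)
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
}
```

A call such as new RandomProjectionEncoder(784, 64) would map flattened 28×28 MNIST images to 64-dimensional embeddings. Random projection is attractive as a baseline because it needs no training and approximately preserves distances between inputs, but a learned feature extractor will usually separate states better.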