Configuration

Rich Learning supports two storage backends and several tuneable parameters that control exploration, novelty detection, and experience replay.

Storage Backends

LiteDB (Embedded — Default)

Zero-configuration embedded database. All data is stored in a single file. Best for local experimentation, CI pipelines, and environments where you can't run a database server.

dotnet run -- SplitMnist --litedb

Pros: No setup, no external dependencies, fast startup, portable.
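Cons: No graph visualisation UI; path queries run as an in-process BFS; bottleneck, stale-node, and cluster-statistics queries are not available (see the comparison table below).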

Neo4j (Server)

Full graph database with Cypher query language and browser-based visualisation. Best for production, large state spaces, and when you want to visually inspect the topology.

# Environment variables
export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=richlearning

# Run without --litedb flag
dotnet run -- SplitMnist
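For illustration, the three variables above could be read on the application side roughly as follows. This is a minimal sketch, not the project's actual startup code: `Neo4jSettings` and `Neo4jSettingsLoader` are hypothetical names, and failing fast on a missing variable is an assumption (the real application may fall back to defaults instead).

```csharp
#nullable enable
using System;

// Hypothetical settings type; the project's real configuration class may differ.
public sealed record Neo4jSettings(string Uri, string User, string Password);

public static class Neo4jSettingsLoader
{
    // Reads the variables through a lookup delegate so the logic can be
    // exercised without touching the real process environment.
    public static Neo4jSettings Load(Func<string, string?> lookup)
    {
        // Assumption: a missing or empty variable is treated as an error.
        string Require(string name) =>
            lookup(name) is { Length: > 0 } value
                ? value
                : throw new InvalidOperationException(
                    $"Environment variable {name} is not set.");

        return new Neo4jSettings(
            Require("NEO4J_URI"),
            Require("NEO4J_USER"),
            Require("NEO4J_PASSWORD"));
    }

    // Convenience overload backed by the real process environment.
    public static Neo4jSettings LoadFromEnvironment() =>
        Load(Environment.GetEnvironmentVariable);
}
```

Passing the lookup as a delegate keeps the sketch easy to test; production code would normally call `LoadFromEnvironment()`.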

Quick start with Docker:

docker run -d --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/richlearning \
  -v neo4j_data:/data \
  neo4j:5

Then open http://localhost:7474 to explore the graph visually.

Feature	LiteDB	Neo4j
Setup	Zero-config	Server required
Path queries	In-process BFS	Cypher `shortestPath()`
Visualisation	None	Neo4j Browser
Bottleneck analysis	Not available	`GetBottleneckLandmarksAsync()`
Stale node queries	Not available	`GetStaleLandmarksAsync()`
Cluster statistics	Not available	`GetClusterStatsAsync()`
Deployment	Single file	Docker / VM / Cloud

Cartographer Parameters

The Cartographer planner exposes several tuneable properties:

Parameter	Default	Description
`NoveltyThreshold`	0.3	Minimum cosine distance to nearest landmark before a new landmark is created. Lower = more landmarks, finer resolution.
`ValueEmaAlpha`	0.1	Exponential moving average smoothing factor for landmark value estimates.
`NoveltyDecayRate`	0.05	Rate at which novelty score decays per visit. Higher = novelty fades faster.
`TrajectoryWindowSize`	20	Number of recent landmarks kept for loop detection. Larger window catches longer cycles.
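
To make the role of `NoveltyThreshold` concrete, the following sketch shows one way a novelty check could be written: compute the cosine distance from a new embedding to every known landmark and create a landmark only when the nearest one is at least the threshold away. The `NoveltyCheck` class and its method names are hypothetical, and the handling of zero vectors and of the exact boundary case is an assumption rather than the Cartographer's documented behaviour.

```csharp
using System;
using System.Collections.Generic;

public static class NoveltyCheck
{
    // Cosine distance = 1 - cosine similarity; 0 means identical direction.
    public static double CosineDistance(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Embeddings must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // Assumption: a zero vector has no direction, so treat it as distance 1.
        if (normA == 0 || normB == 0) return 1.0;
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // True when no existing landmark lies closer than noveltyThreshold.
    // An empty landmark list always counts as novel.
    public static bool IsNovel(
        double[] embedding,
        IReadOnlyList<double[]> landmarks,
        double noveltyThreshold = 0.3)
    {
        double nearest = double.PositiveInfinity;
        foreach (var landmark in landmarks)
            nearest = Math.Min(nearest, CosineDistance(embedding, landmark));
        return nearest >= noveltyThreshold;
    }
}
```

With a single landmark at `[1, 0]`, the embedding `[0, 1]` (distance 1) would be novel at the default threshold, while `[1, 0.1]` (distance of about 0.005) would not. Lowering the threshold makes more embeddings qualify, which is why it yields more landmarks.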

Tuning Guidelines

Dense state spaces (e.g., pixel observations): Lower NoveltyThreshold to ~0.1 for finer discretisation
Sparse state spaces (e.g., grid worlds): Raise NoveltyThreshold to ~0.5 to avoid redundant landmarks
Long episodes: Increase TrajectoryWindowSize to 50+ for better cycle detection
Volatile rewards: Lower ValueEmaAlpha to ~0.05 for smoother value estimates
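
The last two guidelines are easier to reason about with the underlying mechanics in view. The sketch below shows a standard exponential moving average update and a simple sliding-window revisit check. Both classes (`ValueEstimate`, `TrajectoryWindow`) are hypothetical illustrations; the Cartographer's actual loop detection may use a different criterion than "the same landmark appears twice within the window".

```csharp
using System;
using System.Collections.Generic;

public static class ValueEstimate
{
    // Standard EMA: moves the estimate a fraction alpha towards the observation.
    // Smaller alpha means each new reward changes the estimate less.
    public static double Update(double current, double observed, double alpha = 0.1)
    {
        if (alpha < 0 || alpha > 1)
            throw new ArgumentOutOfRangeException(nameof(alpha));
        return current + alpha * (observed - current);
    }
}

public sealed class TrajectoryWindow
{
    private readonly Queue<string> _recent = new();
    private readonly int _size;

    public TrajectoryWindow(int size = 20)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        _size = size;
    }

    // Returns true if the landmark is already among the last `size` visits,
    // then records the visit and evicts the oldest entry if the window is full.
    public bool VisitAndCheckLoop(string landmarkId)
    {
        bool revisit = _recent.Contains(landmarkId);
        _recent.Enqueue(landmarkId);
        if (_recent.Count > _size) _recent.Dequeue();
        return revisit;
    }
}
```

With a window of 3, the visit sequence A, B, C, D, A does not flag the second A, because A was evicted when D arrived; with the default window of 20 the same sequence is flagged. This is the sense in which a larger window catches longer cycles.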

Replay Scoring

Prioritised experience replay ranks transitions using:

priority = (|TdError| + ε) / (TransitionCount + 1) × staleness

Where:

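ε = 0.01: keeps the numerator positive, so a transition with zero TD error still receives a non-zero priority whenever its staleness is above zero
staleness = currentTimestep - LastTrainedTimestep: favours transitions that have not been trained on recently; a transition trained at the current timestep has staleness 0 and therefore priority 0
TransitionCount: dividing by TransitionCount + 1 lowers the priority of transitions that have already been experienced many times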
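
As a worked example, the priority formula above can be transcribed directly into code. The `ReplayPriority` class and its parameter names are hypothetical; only the arithmetic is taken from the formula.

```csharp
using System;

public static class ReplayPriority
{
    public const double Epsilon = 0.01;

    // priority = (|TdError| + ε) / (TransitionCount + 1) × staleness
    public static double Compute(
        double tdError,
        int transitionCount,
        long currentTimestep,
        long lastTrainedTimestep)
    {
        long staleness = currentTimestep - lastTrainedTimestep;
        return (Math.Abs(tdError) + Epsilon) / (transitionCount + 1) * staleness;
    }
}
```

For a transition with a TD error of 0.5, a `TransitionCount` of 4, last trained at timestep 90 and evaluated at timestep 100, this gives (0.5 + 0.01) / 5 × 10, or about 1.02. Doubling the staleness doubles the priority, while a transition evaluated at the same timestep it was trained on scores 0.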

Frontier Scoring

Frontier landmarks are ranked for exploration:

score = NoveltyScore / ((1 + VisitCount) × (1 + OutDegree))

High-novelty, low-visit, low-connectivity nodes are prioritised — they represent the boundary of the known map.
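
The ranking can be sketched in a few lines. `FrontierLandmark` and `FrontierScoring` are hypothetical names used only for this example; the property names mirror the terms in the formula above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal landmark shape for illustration.
public sealed record FrontierLandmark(
    string Id, double NoveltyScore, int VisitCount, int OutDegree);

public static class FrontierScoring
{
    // score = NoveltyScore / (1 + VisitCount) × 1 / (1 + OutDegree)
    public static double Score(FrontierLandmark l) =>
        l.NoveltyScore / (1 + l.VisitCount) * (1.0 / (1 + l.OutDegree));

    // Highest-scoring (most promising) frontier landmarks first.
    public static IEnumerable<FrontierLandmark> Rank(
        IEnumerable<FrontierLandmark> landmarks) =>
        landmarks.OrderByDescending(l => Score(l));
}
```

Given A (novelty 0.9, 0 visits, out-degree 1), B (novelty 0.9, 9 visits, out-degree 4) and C (novelty 0.4, 1 visit, out-degree 0), the scores come out at roughly 0.45, 0.018 and 0.2, so the ranking is A, C, B. B has the same novelty as A but is penalised for being well visited and well connected.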

State Encoder

The IStateEncoder interface converts raw observations into fixed-length embeddings for nearest-neighbour comparison:

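public interface IStateEncoder
{
    // Length of every embedding returned by Encode.
    int EmbeddingDimension { get; }

    // Maps a raw observation to a fixed-length embedding.
    double[] Encode(double[] rawState);

    // Distance between two embeddings; the default encoder uses cosine distance.
    double Distance(double[] a, double[] b);
}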

The default DefaultStateEncoder is an identity encoder with 64-dimensional output and cosine distance. For domain-specific applications, implement this interface with a proper feature extractor (e.g., PCA, random projection, or a frozen pretrained encoder).
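
As one possible starting point, here is a sketch of a Gaussian random projection encoder implementing the interface. `RandomProjectionEncoder` is a hypothetical class, not part of the library; the interface is repeated only so the example compiles on its own, and the constructor parameters (`inputDimension`, `seed`) are assumptions made for illustration.

```csharp
using System;

// Repeated from the documentation above so the sketch is self-contained.
public interface IStateEncoder
{
    int EmbeddingDimension { get; }
    double[] Encode(double[] rawState);
    double Distance(double[] a, double[] b);
}

public sealed class RandomProjectionEncoder : IStateEncoder
{
    private readonly double[,] _projection; // [EmbeddingDimension, inputDimension]
    private readonly int _inputDimension;

    public int EmbeddingDimension { get; }

    public RandomProjectionEncoder(int inputDimension, int embeddingDimension = 64, int seed = 0)
    {
        _inputDimension = inputDimension;
        EmbeddingDimension = embeddingDimension;
        _projection = new double[embeddingDimension, inputDimension];

        // A fixed seed makes the projection reproducible across runs.
        var rng = new Random(seed);
        double scale = 1.0 / Math.Sqrt(embeddingDimension);
        for (int i = 0; i < embeddingDimension; i++)
            for (int j = 0; j < inputDimension; j++)
                _projection[i, j] = SampleGaussian(rng) * scale;
    }

    // Multiplies the raw state by the fixed random matrix.
    public double[] Encode(double[] rawState)
    {
        if (rawState.Length != _inputDimension)
            throw new ArgumentException($"Expected {_inputDimension} values.");

        var embedding = new double[EmbeddingDimension];
        for (int i = 0; i < EmbeddingDimension; i++)
            for (int j = 0; j < _inputDimension; j++)
                embedding[i] += _projection[i, j] * rawState[j];
        return embedding;
    }

    // Cosine distance, matching the default encoder's metric.
    public double Distance(double[] a, double[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Assumption: zero vectors are treated as maximally distant.
        if (normA == 0 || normB == 0) return 1.0;
        return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Box-Muller transform: one standard normal sample from two uniforms.
    private static double SampleGaussian(Random rng)
    {
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], avoids Log(0)
        double u2 = rng.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
}
```

For example, `new RandomProjectionEncoder(inputDimension: 784)` would map flattened 28×28 images to 64-dimensional embeddings. How such an encoder is registered with the planner is not covered here and depends on the library's own wiring.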