The Judges

Three AI agents evaluate candidate changes. Each brings a distinct perspective so that the garden grows in a balanced way.

The Gardener

Stewardship & Continuity

Optimizes for compounding value and continuity. The Gardener asks: "Does this change build on what came before? Will it make tomorrow's changes easier and more valuable? Does it maintain the health of the whole system?"

Model:

The Visitor

Usefulness & Clarity

Optimizes for usefulness and clarity. The Visitor asks: "Would someone arriving for the first time understand this? Is it immediately useful? Does it reduce confusion rather than add complexity?"

Model:

The Explorer

Novelty & Surprise

Optimizes for novelty and surprise. The Explorer asks: "Does this try something we haven't done before? Will it make the garden more interesting? Does it take a creative risk that could pay off?"

Model:

Scoring Dimensions

Each candidate is evaluated across seven dimensions, each scored from 0 to 100. The final ranking is a weighted sum of those scores, where the weights are percentages that sum to 100.

Dimension | Weight | What it measures | Primary Judge
Compounding Value | 30% | Prefer moves that create new surfaces future days can build on | The Gardener
Usefulness | 20% | Reward concrete user value: is this immediately useful to visitors? | The Visitor
Feasibility | 20% | Can this be built correctly in a single daily run? | The Gardener
Artifact Clarity | 15% | Will visitors understand the result from the published artifacts? | The Visitor
Novelty | 5% | Does this try something new or explore an untested direction? | The Explorer
Feedback Pull | 5% | Does this respond to or incorporate community feedback? | The Visitor
Shareability | 5% | Is this naturally legible and discussable in a social feed? | The Explorer
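
The weighted sum over the table above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code: the dictionary keys, the function name `weighted_total`, and the example scores are all hypothetical. Only the weights come from the table.

```python
# Weights from the scoring table, as percentages.
# The key names are hypothetical; only the numbers come from the table.
WEIGHTS = {
    "compounding_value": 30,
    "usefulness": 20,
    "feasibility": 20,
    "artifact_clarity": 15,
    "novelty": 5,
    "feedback_pull": 5,
    "shareability": 5,
}


def weighted_total(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (0-100) into a single 0-100 total."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the seven dimensions")
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} score {value} is outside 0-100")
    # Weights are percentages, so divide by 100 to stay on the 0-100 scale.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Made-up scores for a single candidate, for illustration only.
example_scores = {
    "compounding_value": 80,
    "usefulness": 70,
    "feasibility": 90,
    "artifact_clarity": 60,
    "novelty": 40,
    "feedback_pull": 50,
    "shareability": 30,
}

print(weighted_total(example_scores))  # 71.0
```

Because Compounding Value, Usefulness, and Feasibility together carry 70% of the weight, a candidate that scores poorly on Novelty or Shareability can still rank well, as the example shows.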

A note on process: The final ranking is guided by structured criteria, not mystical AI intuition. Each judge scores independently, scores are combined with the weights above, and the candidate with the highest total wins. All scores and rationale are published in the daily decision log for full transparency.

How a day unfolds

Candidates are generated

The pipeline analyzes the current state of the site, recent feedback, and the backlog to propose 3-5 candidate changes.

Judges evaluate and score

Each judge persona independently scores all candidates across the seven dimensions. Scores are combined using the weighted formula.
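
The text does not say how the three judges' independent scores are merged into one score per dimension before weighting. The sketch below assumes a plain unweighted mean across judges, which is a guess; the "Primary Judge" column could equally mean that one judge's score counts for more. Candidate names, judge keys, and all scores are invented for illustration.

```python
from statistics import mean

# Weights (percentages) from the scoring table; key names are hypothetical.
WEIGHTS = {
    "compounding_value": 30, "usefulness": 20, "feasibility": 20,
    "artifact_clarity": 15, "novelty": 5, "feedback_pull": 5, "shareability": 5,
}


def combine_judges(judge_scores: dict[str, dict[str, float]]) -> dict[str, float]:
    """ASSUMPTION: average each dimension across judges with equal say."""
    return {
        dim: mean(scores[dim] for scores in judge_scores.values())
        for dim in WEIGHTS
    }


def weighted_total(scores: dict[str, float]) -> float:
    """Weighted sum of per-dimension scores, kept on the 0-100 scale."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS) / 100


def rank(candidates: dict[str, dict[str, dict[str, float]]]) -> list[tuple[str, float]]:
    """Return (candidate, total) pairs, highest total first."""
    totals = {
        name: weighted_total(combine_judges(judge_scores))
        for name, judge_scores in candidates.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def flat(value: float) -> dict[str, float]:
    """Helper for the demo: the same score on every dimension."""
    return {dim: value for dim in WEIGHTS}


# Invented candidates and scores; real judges would differ per dimension.
candidates = {
    "add-search": {"gardener": flat(60), "visitor": flat(90), "explorer": flat(30)},
    "refactor-nav": {"gardener": flat(80), "visitor": flat(70), "explorer": flat(60)},
}

ranking = rank(candidates)
winner = ranking[0][0]
print(ranking)  # [('refactor-nav', 70.0), ('add-search', 60.0)]
```

Tie-breaking is not described in the text either; in this sketch a tie simply keeps the candidates in their original order.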

Winner is built and tested

The top-scoring candidate is implemented, tested automatically, and reviewed for quality and safety.

Changes are shipped

If tests pass and the review is clean, the change goes live. All artifacts are published as the daily decision log.
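
The shipping condition in the last step amounts to a simple gate: both the tests and the review must succeed. A minimal sketch follows, assuming a hypothetical `DayResult` record and a JSON log entry. The field names and the log format are not taken from the project, and the sketch assumes an entry is written whether or not the change ships.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DayResult:
    """Hypothetical summary of one daily run."""
    winner: str
    total_score: float
    tests_passed: bool
    review_clean: bool

    @property
    def shipped(self) -> bool:
        # The change goes live only if tests pass AND the review is clean.
        return self.tests_passed and self.review_clean


def decision_log_entry(result: DayResult) -> str:
    """Serialize the day's outcome as JSON (format is an assumption)."""
    entry = asdict(result)
    entry["shipped"] = result.shipped
    return json.dumps(entry, indent=2)


# Invented outcome: tests passed, but the review raised an issue.
today = DayResult("refactor-nav", 70.0, tests_passed=True, review_clean=False)
print(decision_log_entry(today))
```

Keeping the gate as a derived property means the log can never record a change as shipped when either check failed.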