v8 ForkScore#
The Merkle-AGI v8 selection protocol’s scoring half. Pure
function over a (parent, child) bench-result pair → a single
scalar with ACCEPT / MARGINAL / REJECT verdict. Phase 1a of
ticket #000012.
The complementary canonicalization half (validator state machine, acceptance protocol, slashing, fork-choice rule) is reserved for the v8 paper itself; Phase 1a ships only the function ForkScore without the consensus surface around it.
Formula#
ForkScore =
α · Δ5S
+ β · Δ5T
+ γ · Δ5F (incl. efficiency-aware bonus)
+ δ · SelfModelCalibrationGain
+ ε · AuditCompleteness
+ ζ · ValidatorDiversity
- η · RegressionPenalty
- θ · CapitalCostPenalty
- ι · SecurityRiskPenalty (reserved; Phase 1a = 0)
- κ · ComplexityPenalty (reserved; Phase 1a = 0)
- λ · MemoryInvalidationPenalty
Each term consumes the metrics every 5S/5T/5F sub-battery emits in
its BatteryResult.metrics dict. The Δ-rate per battery is
the mean of per-sub-battery rate deltas (child — parent).
Δ5F additionally consumes the inf-aware efficiency aggregations
landed under the 2026-05-08 fbd99a8 review:
adaptation_efficiency_mean_finite,
adaptation_efficiency_infinite_count,
adaptation_efficiency_neg_infinite_count,
feedback_efficiency_mean_finite,
feedback_efficiency_infinite_count.
Verdict thresholds#
Score / flags |
Verdict |
CLI exit |
|---|---|---|
|
ACCEPT |
|
|
MARGINAL |
|
|
REJECT |
|
hard-regression flag |
REJECT |
|
|
REJECT |
|
SIGNAL_FLOOR defaults to 0.05 (5pp; matches
docs/bench-maxing.md’s noise floor).
Hard-regression flag fires if any single sub-battery rate drops by
≥ HARD_REGRESSION_FLOOR (default 5pp), regardless of net score.
NEG_INF_REGRESSION flag fires when
*_efficiency_neg_infinite_count increases parent → child:
free regression is unsafe regardless of other gains.
Default weights#
WeightSet(
alpha=1.0, # Δ5S
beta=1.0, # Δ5T
gamma=1.0, # Δ5F
delta=0.5, # SelfModelCalibrationGain
epsilon=0.3, # AuditCompleteness
zeta=0.0, # ValidatorDiversity (off in single-validator)
eta=2.0, # RegressionPenalty (heavy by design)
theta=0.5, # CapitalCostPenalty
iota=0.0, # SecurityRiskPenalty (reserved)
kappa=0.0, # ComplexityPenalty (reserved)
lambda_=0.5, # MemoryInvalidationPenalty
)
Override via JSON file passed to --weights:
{
"alpha": 1.5,
"eta": 5.0,
"lambda": 1.0
}
The JSON key "lambda" round-trips into WeightSet.lambda_
because lambda is a Python reserved word.
CLI#
arborist substrate score \
--parent parent-bench.json \
--child child-bench.json \
[--weights weights.json] \
[--capital-delta N] \
[--memory-invalidation-count N] \
[--audit-completeness 0..1] \
[--selfmodel-calibration-gain N] \
[--out path/to/fork_score_report.json]
parent-bench.json and child-bench.json are the JSON output
of bench.batteries.runner --all.
--out mirrors stdout to a file; CI / mesh peers / downstream
graders ingest the artifact without parsing pipe output.
Output: arborist.substrate.fork_score.ScoredFork with
score, verdict, per-term breakdown, flags list,
and the active weights echoed back.
Make harness (Phase 1b)#
make bench-fork-baseline # one-shot: pin current bench-suite output
# as the ForkScore parent (writes
# bench/results/baseline-suite.json)
<hack hack hack>
make bench-fork-score # rerun bench, score child vs pinned parent,
# write bench/results/fork_score_report.json.
# Exits 1 on REJECT verdict — CI-gateable.
Override defaults via env-vars: FORK_PARENT, FORK_CHILD,
FORK_REPORT.
What’s NOT in Phase 1a#
Validator state machine (bonding / signing / slashing).
Acceptance protocol (proposal / quorum / finalization).
Challenge protocol (audit-replay disagreement).
Fork-choice rule (which of two competing finalizations wins).
Mesh wire format extensions for validator gossip.
Stake mechanics + economic incentives.
Cross-validator ZK proof exchange.
These are commissioned by the v8 paper itself
(docs/merkle-agi-v8-consensus.rst, still open under #000012).
Phase 1a’s scoring function is the substrate the paper consumes;
landing it now lets the paper cite measured values instead of
stipulated ones.
Permacomputer Preamble — License: AGPL-3.0-only
This is free software for the public good of a permacomputer hosted at permacomputer.com, an always-on computer by the people, for the people. Durable, easy to repair, & distributed like tap water for machine learning intelligence.
Our permacomputer is community-owned infrastructure optimized around four values:
TRUTH — First principles, math & science, open source code freely distributed.
FREEDOM — Voluntary partnerships, freedom from tyranny & corporate control.
HARMONY — Minimal waste, self-renewing systems with diverse thriving connections.
LOVE — Be yourself without hurting others, cooperation through natural law.
NO WARRANTY. Software is provided “AS IS” without warranty of any kind. Full text: License.