arborist — answers your AI can prove
=====================================

.. rst-class:: center

*one-page summary · permacomputer.com · AGPL-3.0*

----

Most retrieval-augmented question-answering systems hand a language
model some context and ship whatever the model says. There is no way to
tell whether the answer faithfully reflects the source or whether the
model embroidered it. **arborist closes that gap.** Every answer is
verified against its source *after* the language model finishes, by a
mechanical checker — not by another AI grading the first one. The
checker labels each answer **grounded** (every claim was found in the
source), **partly grounded** (some claims, not all), or
**ungrounded**. The label travels with the answer and is stored in a
tamper-evident chain back to the source bytes.
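
One common way to make a record chain tamper-evident is to hash each record together with the hash of the one before it. The sketch below illustrates that idea only. The record fields, the all-zero genesis hash, and the helper names ``append_record`` and ``verify_chain`` are assumptions made for illustration, not arborist's actual storage format.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for an empty chain


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _record_hash(prev: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return sha256_hex((prev + body).encode("utf-8"))


def append_record(chain: list[dict], payload: dict) -> dict:
    """Append a payload, linking it to the hash of the previous record."""
    prev = chain[-1]["hash"] if chain else GENESIS
    record = {"prev": prev, "payload": payload, "hash": _record_hash(prev, payload)}
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited payload or broken link fails."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev:
            return False
        if record["hash"] != _record_hash(prev, record["payload"]):
            return False
        prev = record["hash"]
    return True
```

In this sketch the first record would carry the SHA-256 of the source bytes and later records the answer and its label, so changing either one after the fact makes ``verify_chain`` return ``False``.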

**Fabricated citations become impossible.** When the model answers, it
never types the quoted text. arborist tags each candidate source
chunk with a short label — ``E1``, ``E2`` — and asks the model to
answer using those labels. The model might write *"Jupiter is the
largest planet [E1], with a radius of about 70 000 km [E2]"*; arborist
renders the actual chunk text at display time. A model cannot fabricate
a quote it never types.
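
The label substitution described above can be sketched in a few lines. This is a minimal illustration, assuming labels appear in the answer as ``[E1]``, ``[E2]`` and that the chunk texts are held in a plain dictionary. The function name ``render`` and the output format are hypothetical.

```python
import re

# Matches citation labels such as [E1] or [E12] in the model's answer.
LABEL = re.compile(r"\[(E\d+)\]")


def render(answer: str, chunks: dict[str, str]) -> str:
    """Expand each [En] label into the stored chunk text.

    The quoted text always comes from `chunks`, never from the model,
    so a label that does not exist is an error rather than a quote.
    """
    def expand(match: re.Match) -> str:
        label = match.group(1)
        if label not in chunks:
            raise KeyError(f"answer cites unknown label {label}")
        return f'[{label}: "{chunks[label]}"]'

    return LABEL.sub(expand, answer)
```

A label the model invents (say ``[E9]`` when only ``E1`` and ``E2`` were offered) has no chunk behind it, so in this sketch it raises instead of rendering.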

**The proof path is cheap and mechanical.** Verification is text
comparison, not embeddings, not similarity, not another model. It runs
on a laptop. Optional smart-ranking layers (cross-encoder rerankers,
entailment models) exist on the side; they help arborist find better
evidence but do not influence whether an answer is certified.
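
As a rough illustration of what a purely textual check can look like, the sketch below labels an answer by testing whether each claim appears verbatim in the chunk it cites. It assumes the answer has already been split into ``(claim_text, cited_label)`` pairs and uses whitespace- and case-normalized substring matching. Both the claim splitting and the matching rule are assumptions here, and arborist's real checker may compare text differently.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case before comparing."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def label_answer(claims: list[tuple[str, str]], chunks: dict[str, str]) -> str:
    """Return 'grounded', 'partly grounded', or 'ungrounded'.

    Each claim is a (claim_text, cited_label) pair. A claim counts as
    found only if its text occurs inside the chunk it cites.
    """
    if not claims:
        return "ungrounded"
    found = sum(
        1
        for text, label in claims
        if label in chunks and normalize(text) in normalize(chunks[label])
    )
    if found == len(claims):
        return "grounded"
    return "partly grounded" if found else "ungrounded"
```

Nothing in this sketch calls a model or computes an embedding, which is why a check of this kind runs comfortably on a CPU.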

**Same question, same document, same answer.** Answers are
content-addressed: the cache key folds in the document, the question,
the model identity, and the policy under which the answer was checked.
Two users asking the same question of the same document under the same
policy hit the same record. Reproducible. Replayable. Shareable.
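
A content-addressed key of this kind can be sketched as a single hash over the four inputs. The function name ``cache_key``, the use of SHA-256, and the length-prefixed encoding are assumptions for illustration, since the summary does not specify how arborist derives its keys.

```python
import hashlib


def cache_key(document: bytes, question: str, model_id: str, policy_id: str) -> str:
    """Derive one key from document, question, model identity, and policy.

    Each part is length-prefixed so that different splits of the same
    bytes (e.g. "ab" + "c" versus "a" + "bc") cannot collide.
    """
    h = hashlib.sha256()
    parts = (
        document,
        question.encode("utf-8"),
        model_id.encode("utf-8"),
        policy_id.encode("utf-8"),
    )
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

Because the key depends only on its inputs, two users with identical inputs compute the same key, and changing any one input (for example, the policy) yields a different record.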

**What it has measured on real traffic.**

- **100% misattribution catch at 0% false positives** — every answer
  where the cited source was unrelated to the claim was flagged,
  without a single grounded answer wrongly demoted, across the full
  pooled test bed.
- **55–65% topic-deflection catch at 0–0.4% false positives** — picks
  off-topic answers out of the stream while leaving on-topic answers
  untouched.
- **100% citation coverage on the curated textbook corpus** — every
  cited claim resolves to a chain of evidence ending at a public-domain
  or open-licensed source.
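
For readers unfamiliar with the two figures quoted in each bullet, the sketch below shows how a catch rate and a false-positive rate are conventionally computed from labelled outcomes. The function name ``rates`` and the input shape are hypothetical, and the summary does not describe how arborist's evaluation harness works.

```python
def rates(results: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Compute (catch_rate, false_positive_rate).

    Each result is (is_bad, was_flagged): whether the answer really had
    the defect, and whether the checker flagged it.
    """
    flagged_bad = [flagged for is_bad, flagged in results if is_bad]
    flagged_good = [flagged for is_bad, flagged in results if not is_bad]
    if not flagged_bad or not flagged_good:
        raise ValueError("need at least one bad and one good answer")
    catch_rate = sum(flagged_bad) / len(flagged_bad)
    false_positive_rate = sum(flagged_good) / len(flagged_good)
    return catch_rate, false_positive_rate
```

Under this definition, a 100% catch rate at 0% false positives means every defective answer was flagged and no sound answer was.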

**What it costs.** Python 3.12. SQLite, one file per shard, each
designed around ~10 GB. A Wikipedia-class corpus lands across a
handful of shards (the live deployment is four shards, ~38 GB on
disk, holding 3.5 M documents / 6.2 M chunks). No GPU for the proof
path. Use any OpenAI-compatible inference endpoint; the free
reference endpoint is `hermes.ai.unturf.com/v1
<https://hermes.ai.unturf.com/v1/models>`_ (the ``/v1/models`` link
returns the live model card). Source:
`git.unturf.com/engineering/unturf/arborist
<https://git.unturf.com/engineering/unturf/arborist>`_. Full
whitepaper: `unfirehose.com/merkle-providence-reverse-rag.html
<https://unfirehose.com/merkle-providence-reverse-rag.html>`_.
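
A quick way to check an endpoint before pointing arborist at it is to ask for its model list. The sketch below uses only the standard library and assumes the response follows the usual OpenAI-style shape, a JSON object with a ``data`` list whose entries carry an ``id``. It also assumes no API key is required, which may not hold for other endpoints. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://hermes.ai.unturf.com/v1"


def models_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    return urllib.request.Request(
        f"{base_url}/models",
        headers={"Accept": "application/json"},
    )


def list_model_ids(base_url: str = BASE_URL, timeout: float = 10.0) -> list[str]:
    """Fetch the model list and return the model ids.

    Assumes an OpenAI-style payload: {"data": [{"id": "..."}, ...]}.
    """
    with urllib.request.urlopen(models_request(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["id"] for model in payload.get("data", [])]
```

Calling ``list_model_ids()`` performs a network request, and passing a different ``base_url`` points the same check at any other OpenAI-compatible endpoint.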
