arborist — answers your AI can prove


one-page summary · permacomputer.com · AGPL-3.0


Most retrieval-augmented question-answering systems hand a language model some context and ship whatever the model says. There is no way to tell whether the answer faithfully reflects the source or whether the model embroidered it. arborist closes that gap. Every answer is verified against its source after the language model finishes, by a mechanical checker — not by another AI grading the first one. The checker labels each answer grounded (every claim was found in the source), partly grounded (some claims, not all), or ungrounded. The label travels with the answer and is stored in a tamper-evident chain back to the source bytes.
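One way to picture a tamper-evident chain back to the source bytes is a plain hash chain. The sketch below is an illustration under assumptions, not arborist's actual storage format: the record fields, the choice of SHA-256, and the names link, build_chain, and verify_chain are all hypothetical.

```python
import hashlib
import json


def link(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def build_chain(source: bytes, records: list[dict]) -> list[str]:
    """Anchor the chain at the digest of the source bytes, then link each record."""
    hashes = [hashlib.sha256(source).hexdigest()]
    for record in records:
        hashes.append(link(hashes[-1], record))
    return hashes


def verify_chain(source: bytes, records: list[dict], hashes: list[str]) -> bool:
    """Recompute the chain; any edit to the source or to a record changes it."""
    return build_chain(source, records) == hashes


# Hypothetical records: a chunk cut from the source, then a labelled answer.
source = b"Jupiter is the largest planet in the Solar System."
records = [
    {"kind": "chunk", "label": "E1", "start": 0, "end": 50},
    {"kind": "answer", "text": "Jupiter is the largest planet [E1]", "verdict": "grounded"},
]
stored = build_chain(source, records)
```

Because each hash folds in the one before it, changing the verdict on the stored answer, or a single byte of the source, makes verify_chain return False.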

Fabricated quotes become impossible. When the model answers, it never types the quoted text. arborist tags each candidate source chunk with a short label (E1, E2, and so on) and asks the model to answer using those labels. The model might write “Jupiter is the largest planet [E1], with a radius of about 70 000 km [E2]”; arborist substitutes the actual chunk text at display time. A model cannot fabricate a quote it never types.
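The label-then-render idea can be sketched in a few lines. This is a minimal illustration, not arborist's renderer: the bracketed [E1] syntax is taken from the example above, while the chunk texts, the output layout, and the function names cited_labels and render are assumptions.

```python
import re

LABEL = re.compile(r"\[(E\d+)\]")


def cited_labels(answer: str) -> list[str]:
    """Return the distinct labels the model cited, in order of first use."""
    return list(dict.fromkeys(LABEL.findall(answer)))


def render(answer: str, chunks: dict[str, str]) -> str:
    """Show the answer followed by the stored text of every cited chunk.

    The quoted text comes from the chunk store, never from the model.
    """
    labels = cited_labels(answer)
    unknown = [label for label in labels if label not in chunks]
    if unknown:
        raise KeyError(f"answer cites labels with no chunk: {unknown}")
    lines = [answer, ""]
    lines += [f'{label}: "{chunks[label]}"' for label in labels]
    return "\n".join(lines)


# Hypothetical chunk texts, made up for this illustration.
chunks = {
    "E1": "Jupiter is the largest planet in the Solar System.",
    "E2": "Its radius is about 70 000 km.",
}
answer = "Jupiter is the largest planet [E1], with a radius of about 70 000 km [E2]"
print(render(answer, chunks))
```

A label that matches no stored chunk raises an error instead of rendering, so an invented label cannot turn into an invented quote.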

The proof path is cheap and mechanical. Verification is text comparison, not embeddings, not similarity, not another model. It runs on a laptop. Optional smart-ranking layers — cross-encoder rerankers, entailment models — exist on the side; they help arborist find better evidence, they do not influence whether an answer is certified.
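To make "text comparison" concrete, here is a deliberately simple sketch that maps per-claim checks onto the three labels. It assumes a claim counts as found when its whitespace- and case-normalized text occurs verbatim in the cited chunk; arborist's real matching rules are not described here and are likely richer. The names normalize and grade are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def grade(claims: list[tuple[str, str]], chunks: dict[str, str]) -> str:
    """Label an answer from its (claim text, cited label) pairs.

    Assumption: a claim is found only if it is non-empty and its normalized
    text is a substring of the normalized chunk it cites.
    """
    if not claims:
        return "ungrounded"
    found = 0
    for claim, label in claims:
        needle = normalize(claim)
        if needle and needle in normalize(chunks.get(label, "")):
            found += 1
    if found == len(claims):
        return "grounded"
    return "partly grounded" if found else "ungrounded"


# Hypothetical chunk text, made up for this illustration.
chunks = {"E1": "Jupiter is the largest planet in the Solar System."}
print(grade([("Jupiter is the largest planet", "E1")], chunks))  # grounded
```

Nothing in this path calls a model or computes a similarity score, which is why it can run on modest hardware and give the same verdict every time.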

Same question, same document, same answer. Answers are content-addressed: the cache key folds in the document, the question, the model identity, and the policy under which the answer was checked. Two users asking the same question of the same document under the same policy hit the same record. Reproducible. Replayable. Shareable.
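A content-addressed key of this kind can be sketched with a single hash over the four inputs. The field order, the length-prefix framing, the use of SHA-256, and the name cache_key are assumptions made for illustration; the model and policy identifiers are placeholders.

```python
import hashlib


def cache_key(document: bytes, question: str, model_id: str, policy_id: str) -> str:
    """Derive a deterministic key from everything that determines the answer."""
    digest = hashlib.sha256()
    parts = (document, question.encode("utf-8"), model_id.encode("utf-8"), policy_id.encode("utf-8"))
    for part in parts:
        # Length prefix keeps field boundaries unambiguous ("ab"+"c" != "a"+"bc").
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return digest.hexdigest()


# Placeholder identifiers, not real arborist model or policy names.
doc = b"Jupiter is the largest planet in the Solar System."
key_a = cache_key(doc, "Which planet is largest?", "example-model", "policy-v1")
key_b = cache_key(doc, "Which planet is largest?", "example-model", "policy-v1")
key_c = cache_key(doc, "Which planet is largest?", "example-model", "policy-v2")
print(key_a == key_b, key_a == key_c)  # True False
```

Two callers with identical inputs compute the same key and so reach the same stored record; changing any one input, such as the policy, yields a different key.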

What it has measured on real traffic.

  • 100% misattribution catch at 0% false positives — every answer where the cited source was unrelated to the claim was flagged, without a single grounded answer wrongly demoted, across the full pooled test bed.

  • 55–65% topic-deflection catch at 0–0.4% false positives — picks off-topic answers out of the stream while leaving on-topic answers untouched.

  • 100% citation coverage on the curated textbook corpus — every cited claim resolves to a chain of evidence ending at a public-domain or open-licensed source.
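For readers checking the arithmetic behind figures like these, the sketch below shows how a catch rate and a false-positive rate are conventionally computed from counts. It assumes the catch rate is taken over the answers that should be flagged and the false-positive rate over the answers that should not. The counts are hypothetical and are not arborist's measurements.

```python
def rates(flagged_bad: int, total_bad: int, flagged_good: int, total_good: int) -> tuple[float, float]:
    """Return (catch rate, false-positive rate) as fractions between 0 and 1."""
    if total_bad <= 0 or total_good <= 0:
        raise ValueError("both pools must be non-empty")
    return flagged_bad / total_bad, flagged_good / total_good


# Hypothetical counts, chosen only to illustrate the calculation.
catch, false_pos = rates(flagged_bad=60, total_bad=100, flagged_good=1, total_good=500)
print(f"catch {catch:.0%}, false positives {false_pos:.1%}")
# prints: catch 60%, false positives 0.2%
```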

What it costs. Python 3.12. SQLite, one file per shard, each shard designed around ~10 GB. A Wikipedia-class corpus spans a handful of shards (the live deployment is four shards, ~38 GB on disk, holding 3.5 M documents and 6.2 M chunks). No GPU is needed for the proof path. Any OpenAI-compatible inference endpoint works; the free reference endpoint is hermes.ai.unturf.com/v1, and the /v1/models route on that host returns the live model card. Source: git.unturf.com/engineering/unturf/arborist. Full whitepaper: unfirehose.com/merkle-providence-reverse-rag.html.


Permacomputer Preamble — License: AGPL-3.0-only

This is free software for the public good of a permacomputer hosted at permacomputer.com, an always-on computer by the people, for the people. Durable, easy to repair, & distributed like tap water for machine learning intelligence.

Our permacomputer is community-owned infrastructure optimized around four values:

  • TRUTH — First principles, math & science, open source code freely distributed.

  • FREEDOM — Voluntary partnerships, freedom from tyranny & corporate control.

  • HARMONY — Minimal waste, self-renewing systems with diverse thriving connections.

  • LOVE — Be yourself without hurting others, cooperation through natural law.

NO WARRANTY. Software is provided “AS IS” without warranty of any kind. Full text: License.