Makefile reference

Every arborist workflow lives behind a make target. This page is auto-generated from the project Makefile’s ## description annotations at Sphinx build time, so it stays in sync with the source.

Run make help locally for a flat alphabetized listing.
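
The annotation convention described above can be illustrated with a short sketch. It assumes the common single-line form `target: prerequisites ## description`. The Sphinx extension's actual parser is not shown on this page, so `parse_targets`, `format_help`, and the sample Makefile text are hypothetical.

```python
import re

# Assumed convention: "target: [prerequisites] ## description" on one line.
# The (?!=) lookahead skips variable assignments such as "VENV := .venv".
TARGET_RE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)[^#]*##\s*(.+?)\s*$")


def parse_targets(makefile_text: str) -> dict:
    """Map each annotated target name to its '## description' text."""
    targets = {}
    for line in makefile_text.splitlines():
        match = TARGET_RE.match(line)
        if match:
            targets[match.group(1)] = match.group(2)
    return targets


def format_help(targets: dict) -> str:
    """Render a flat, alphabetized listing with aligned descriptions."""
    width = max(map(len, targets), default=0)
    return "\n".join(
        f"make {name.ljust(width)}  {desc}" for name, desc in sorted(targets.items())
    )


# Hypothetical Makefile excerpt, used only to exercise the parser.
SAMPLE = (
    "VENV := .venv\n"
    "clean-db: ## drop the arborist db (keeps fetched data and venv)\n"
    "\trm -f arborist.db\n"
    "bootstrap: ## create venv and install editable package\n"
    "\tpython -m venv $(VENV)\n"
    ".PHONY: bootstrap clean-db\n"
)

parsed = parse_targets(SAMPLE)
help_text = format_help(parsed)
print(help_text)
```

Unannotated rules, recipe lines, and variable assignments fall through the regular expression, so only documented targets appear in the listing.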

Note

Targets are grouped below by workflow phase, in the order a new operator typically runs them. The first row of each table is the most common entry point for that phase.

Setup

One-time installation. Creates the venv and, optionally, installs the crawler’s heavy extras. Re-running is a no-op when everything is up to date.

| Target | Description |
| --- | --- |
| `make bootstrap` | create venv and install editable package |
| `make bootstrap-crawler` | install [crawler] extras into the venv |
| `make all` | bootstrap → fetch cur → ingest cur → verify → stats |

Fetch

Download corpus snapshots into data/. Idempotent — curl skips files already present.

| Target | Description |
| --- | --- |
| `make fetch` | download all 3 files (cur + old.1 + old.2 + concat) |
| `make fetch-cur` | download cur table dump (~82 MB) |
| `make fetch-old` | download old (revision history) parts and concatenate (~893 MB) |
| `make fetch-xml` | download Phase IV XML cur dump (default: enwiki 20101011, 6.2 GB) |
| `make fetch-abstract` | download Phase IV abstract.xml (default: enwiki 20101011, ~3 GB) |
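
The skip-if-present behaviour that makes the fetch targets idempotent can be sketched as follows. This is an illustration of the pattern only: how the Makefile actually guards its curl calls is not shown on this page, and `fetch_once`, the injected `download` callable, the URL, and the file name are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable


def fetch_once(url: str, dest: Path, download: Callable[[str, Path], None]) -> bool:
    """Download url to dest unless dest already exists.

    Returns True when a download happened, False when it was skipped.
    """
    if dest.exists():
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    partial = dest.with_name(dest.name + ".part")
    download(url, partial)
    # Rename only after the download finished, so an interrupted run
    # never leaves a truncated file that looks complete.
    partial.replace(dest)
    return True


# Demo with a stand-in downloader, so the sketch runs without network access.
with tempfile.TemporaryDirectory() as tmp:
    calls = []

    def fake_download(url: str, path: Path) -> None:
        calls.append(url)
        path.write_bytes(b"dump")

    dest = Path(tmp) / "data" / "example.dump"
    first = fetch_once("https://example.invalid/example.dump", dest, fake_download)
    second = fetch_once("https://example.invalid/example.dump", dest, fake_download)

print(first, second, len(calls))
```

The second call returns without touching the network, which is the property the section relies on when a fetch target is re-run.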

Ingest

Parse a source into Merkle-committed shards. -attached variants are the canonical sharded path (one SQLite per shard, no WAL contention). Single-DB variants are for experiments.

| Target | Description |
| --- | --- |
| `make ingest` | default ingest = cur (use ingest-old or *-parallel for full) |
| `make ingest-cur` | ingest INGEST_LIMIT cur articles |
| `make ingest-cur-attached` | sharded ingest, no WAL contention (Phase 2) |
| `make ingest-old` | ingest INGEST_LIMIT old (history) revisions |
| `make ingest-old-attached` | sharded ingest of old history (Phase 2) |
| `make ingest-xml` | ingest INGEST_LIMIT pages from $(WP_XML) |
| `make ingest-xml-history` | ingest every revision (multi-revision mode); set WP_XML to a pages-meta-history file |
| `make ingest-xml-attached` | sharded XML ingest, one process per shard |
| `make ingest-abstract` | ingest INGEST_LIMIT abstract docs from $(WP_ABSTRACT) |
| `make ingest-grok-attached` | ingest Grok conversations into $(GROK_SHARD) |
| `make ingest-grok-media-attached` | ingest Grok media prompts into $(GROK_SHARD) |
| `make ingest-self` | ingest this repo’s HEAD into a dedicated shard |
| `make ingest-self-providence` | promote STRICT live providence records into the document corpus [KG_SECONDS=3600] |
| `make ingest-git` | ingest `GIT_REPO=<path>` into its own shard |
| `make ingest-hg` | ingest `HG_REPO=<path>` (mercurial) into its own shard |
| `make crawl-ingest` | crawl URL=https://x.com [DEPTH=2 MAX=0 FAST=1] into `$(SHARDS_DIR)/crawl_<domain>.db` |
| `make recrawl-check` | conditional HEAD per ingested doc [DOMAIN=x.com LIMIT=100 CRAWL_SHARD=path] |
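
The one-SQLite-file-per-shard layout can be illustrated with the standard `sqlite3` module. This is a minimal sketch under assumptions: the `documents` table, its columns, and the helper names are hypothetical, and reading the "-attached" suffix as SQLite's `ATTACH DATABASE` is a guess rather than something this page states.

```python
import sqlite3
import tempfile
from pathlib import Path


def write_shard(path: Path, docs: list) -> None:
    """Write one shard to its own SQLite file, so writers never share a journal."""
    con = sqlite3.connect(path)
    with con:  # commits on success, rolls back on error
        con.execute("CREATE TABLE IF NOT EXISTS documents (title TEXT, body TEXT)")
        con.executemany("INSERT INTO documents VALUES (?, ?)", docs)
    con.close()


def count_across_shards(paths: list) -> int:
    """ATTACH every shard to one connection and count rows through UNION ALL."""
    con = sqlite3.connect(":memory:")
    selects = []
    for i, path in enumerate(paths):
        # SQLite's default build allows up to 10 attached databases.
        con.execute(f"ATTACH DATABASE ? AS shard{i}", (str(path),))
        selects.append(f"SELECT title FROM shard{i}.documents")
    query = f"SELECT COUNT(*) FROM ({' UNION ALL '.join(selects)})"
    total = con.execute(query).fetchone()[0]
    con.close()
    return total


# Demo: three shards holding 1, 2, and 3 documents respectively.
with tempfile.TemporaryDirectory() as tmp:
    shard_paths = [Path(tmp) / f"shard_{i}.db" for i in range(3)]
    for i, path in enumerate(shard_paths):
        write_shard(path, [(f"doc-{i}-{j}", "text") for j in range(i + 1)])
    total_docs = count_across_shards(shard_paths)

print(total_docs)
```

Because each writer owns its file, parallel ingest processes do not queue on a single database lock, while a reader can still see all shards through one connection.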

Distill

Surface → core distillation. Cores are Merkle-bound back to their source chunks via inclusion proofs.

| Target | Description |
| --- | --- |
| `make distill-shards-parallel` | one distill process per shard (parallel) |
| `make distill-shards-tfidf-parallel` | TF-IDF cores per shard, in parallel |
| `make backfill-concepts` | backfill all concept extractors in parallel across shards |
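
An inclusion proof ties one chunk to a Merkle root by listing the sibling hashes on the path from the leaf to the root. The sketch below shows the general technique only. The hash function (SHA-256), the leaf and node prefixes, and the duplicate-last-node padding are assumptions, since this page does not describe how arborist builds its trees.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # A distinct prefix keeps leaf hashes from colliding with inner-node hashes.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_tree(leaves: list) -> list:
    """Return every tree level, bottom-up; odd levels duplicate their last node."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [leaf_hash(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def inclusion_proof(levels: list, index: int) -> list:
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the root from the leaf and its siblings, then compare."""
    node = leaf_hash(leaf)
    for sibling, sibling_is_left in proof:
        node = node_hash(sibling, node) if sibling_is_left else node_hash(node, sibling)
    return node == root


chunks = [b"chunk-0", b"chunk-1", b"chunk-2"]
levels = build_tree(chunks)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
ok = verify_inclusion(b"chunk-2", proof, root)
tampered = verify_inclusion(b"chunk-X", proof, root)
print(ok, tampered)
```

The verifier needs only the chunk, the proof, and the root, so the proof size grows with the logarithm of the number of chunks rather than with the corpus.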

Query

Ask the corpus a question. query-dry skips the LLM call and returns the assembled context — useful for prompt iteration.

| Target | Description |
| --- | --- |
| `make query` | ask the corpus a question [JSON=1 BURN=1 REPAIR=1 REPROMPTS=N K="extra retrieval keywords" ANSWER_MODE=claim_lattice\|claim_lattice_pointer\|quote BROAD=1 REJECT_BROAD=1 ALLOW_BROAD=1 WITNESS=1 XLANG=1 XLANG_MT=1]; JSON by default |
| `make query-dry` | like 'make query' but skip the LLM call (dry-run) [JSON=1 BURN=1 ANSWER_MODE=… BROAD=1 REJECT_BROAD=1 ALLOW_BROAD=1 XLANG=1 XLANG_MT=1] |
| `make search` | keyword search; override SEARCH_Q (or pass Q=…) |

Verify and inspect

Round-trip Merkle proofs, check audit-chain integrity, and run sidecar diagnostics on cached answers.

| Target | Description |
| --- | --- |
| `make verify` | round-trip Merkle proofs for VERIFY_N random documents |
| `make verify-shards` | cross-shard Merkle round-trip on a random sample |
| `make chain-check` | audit-chain integrity count for $(DB) (0 = intact: linear chain, one genesis) |
| `make chain-check-shards` | audit-chain integrity count for every *.db in $(SHARDS_DIR) |
| `make analyze-shards` | cross-shard compression spectrum + audit integrity |
| `make stats` | counts: documents, chunks, edges, audit chain |
| `make stats-shards` | cross-shard stats via UNION views over $(SHARDS_DIR) |
| `make activity` | recent Q&A + freshly cached docs (agent timeline) |
| `make inspect` | sidecar diagnose unverified spans for a cache_key: make inspect KEY=hex [JSON=1] |
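
The "0 = intact: linear chain, one genesis" reading of the chain-check count can be sketched as a violation counter. This is one plausible interpretation, not the project's implementation: `AuditEvent`, its fields, and the way violations are tallied are hypothetical, and event ids are assumed to be unique.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    event_id: str
    prev_id: Optional[str]  # None marks a genesis event


def chain_violations(events: list) -> int:
    """Count deviations from a single linear chain; 0 means intact."""
    if not events:
        return 0
    genesis = [e for e in events if e.prev_id is None]
    children = {}
    for e in events:
        if e.prev_id is not None:
            children.setdefault(e.prev_id, []).append(e.event_id)
    # A linear chain gives every event at most one successor.
    forks = sum(len(kids) - 1 for kids in children.values())
    # Walk forward from each genesis; events never reached are orphaned
    # (their predecessor is missing) or sit on a cycle.
    reached = set()
    stack = [e.event_id for e in genesis]
    while stack:
        node = stack.pop()
        if node in reached:
            continue
        reached.add(node)
        stack.extend(children.get(node, []))
    unreachable = sum(1 for e in events if e.event_id not in reached)
    return abs(len(genesis) - 1) + forks + unreachable


intact = [AuditEvent("g", None), AuditEvent("a", "g"), AuditEvent("b", "a")]
forked = intact + [AuditEvent("c", "a")]
orphaned = [AuditEvent("g", None), AuditEvent("x", "missing")]

intact_count = chain_violations(intact)
forked_count = chain_violations(forked)
orphaned_count = chain_violations(orphaned)
print(intact_count, forked_count, orphaned_count)
```

Any non-zero count signals that the chain has a second genesis, a fork, or an event that cannot be traced back to the genesis.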

Operations on cached records

Mark a record falsified (audit-preserving) or burn it from the database (refuses if it has children unless FORCE=1).

| Target | Description |
| --- | --- |
| `make falsify` | mark a cached answer wrong: make falsify KEY=hex REASON='why' |
| `make burn` | delete a leaf with no children. providence: `KEY=<cache_key>`; document/core: `KIND=document\|core ROOT=<hex>`. REASON='why' [FORCE=1] |
| `make burn-kindergarten` | bust providence rows < SECONDS old [SECONDS=3600 FORCE=1 DRY_RUN=1 REASON='why'] |
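
The difference between the two operations can be sketched with an in-memory store: falsifying keeps the record and adds an audit entry, while burning deletes it and refuses when children exist unless forced. All names here (`Record`, `Store`, the audit tuple layout) are hypothetical, and what `FORCE=1` does to the children is not described on this page, so the sketch simply leaves them in place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    key: str
    parent: Optional[str] = None
    falsified_reason: Optional[str] = None


class Store:
    def __init__(self) -> None:
        self.records = {}
        self.audit = []  # (action, key, reason) tuples, append-only

    def add(self, key: str, parent: Optional[str] = None) -> None:
        self.records[key] = Record(key, parent)

    def children_of(self, key: str) -> list:
        return [r.key for r in self.records.values() if r.parent == key]

    def falsify(self, key: str, reason: str) -> None:
        """Keep the record, flag it as wrong, and log why."""
        self.records[key].falsified_reason = reason
        self.audit.append(("falsify", key, reason))

    def burn(self, key: str, reason: str, force: bool = False) -> None:
        """Delete the record; refuse when it has children unless forced."""
        kids = self.children_of(key)
        if kids and not force:
            raise ValueError(f"{key} has children {kids}; pass force=True to override")
        del self.records[key]
        self.audit.append(("burn", key, reason))


store = Store()
store.add("a")
store.add("b", parent="a")
store.falsify("b", "wrong date")

try:
    store.burn("a", "cleanup")
    refused = False
except ValueError:
    refused = True

store.burn("b", "cleanup")  # leaf: allowed
store.burn("a", "cleanup")  # now childless: allowed
remaining = sorted(store.records)
actions = [action for action, _, _ in store.audit]
print(refused, remaining, actions)
```

Burning leaves first, as in the demo, avoids the need for a forced delete.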

Tests and benches

Default test suite excludes opt-in crawler tests. bench-qa runs the full QA-quality sweep; bench-qa-smoke is the 5-question fast loop.

| Target | Description |
| --- | --- |
| `make test` | run pytest suite (excludes opt-in crawler tests) |
| `make test-crawler` | run only the lifted crawler tests |
| `make test-live` | live QA quality tests against Hermes (gated; -n auto parallel) |
| `make bench` | benchmark serial vs parallel-shared vs attached at $(BENCH_DOCS) docs |
| `make bench-qa` | QA-quality sweep: questions × modes × N samples [BENCH_QA_N=3 BENCH_QA_LIMIT=N BENCH_QA_MODES=… BENCH_QA_CONCURRENCY=4] |
| `make bench-qa-smoke` | quick 5-question smoke (all anchor classes; ~30s) |
| `make bench-emergent` | blue-moon emergent stress test (3-word triangulation; N=10) |
| `make bench-emergent-pending` | print log entries awaiting teacher review |

Docs

Render diagrams (graphviz) and build the Sphinx API reference. RTD rebuilds on push; these targets are for local previews.

| Target | Description |
| --- | --- |
| `make docs` | render docs/diagrams/*.dot -> .png + .svg via graphviz |
| `make docs-api` | generate Sphinx API reference from docstrings (output: docs/_source/_build/html/) |
| `make docs-api-clean` | remove Sphinx build artifacts |

Clean

Each clean target is reversible by re-running the matching bootstrap / fetch / ingest target. clean-data deletes the largest payload (the downloaded dumps).

| Target | Description |
| --- | --- |
| `make clean` | remove venv + caches (keeps fetched data and db) |
| `make clean-db` | drop the arborist db (keeps fetched data and venv) |
| `make clean-data` | remove fetched dumps |
| `make help` | show this help |

Uncategorized

Targets not yet placed in a workflow phase. If you see one here, add it to docs/_source/_ext/makefile_targets.py PHASES.
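
How a phase map with an "Uncategorized" fallback might work can be sketched as follows. The real structure of `PHASES` in `makefile_targets.py` is not shown on this page, so the dictionary shape, the `group_by_phase` helper, and the sample data are assumptions for illustration.

```python
# Hypothetical shape: phase title -> ordered list of target names.
# The real PHASES in makefile_targets.py may be structured differently.
PHASES = {
    "Setup": ["bootstrap", "bootstrap-crawler", "all"],
    "Clean": ["clean", "clean-db", "clean-data", "help"],
}


def group_by_phase(targets: dict, phases: dict) -> dict:
    """Group (name, description) rows by phase; leftovers go to 'Uncategorized'."""
    grouped = {}
    placed = set()
    for phase, names in phases.items():
        # Keep the declared order, so the first row can be the usual entry point.
        rows = [(name, targets[name]) for name in names if name in targets]
        grouped[phase] = rows
        placed.update(name for name, _ in rows)
    leftovers = sorted(
        (name, desc) for name, desc in targets.items() if name not in placed
    )
    if leftovers:
        grouped["Uncategorized"] = leftovers
    return grouped


sample_targets = {
    "bootstrap": "create venv and install editable package",
    "clean": "remove venv + caches (keeps fetched data and db)",
    "bench-5f": "5F battery (Function+Finetuning+Falsification+Formulate+Feedback Loop)",
}
grouped = group_by_phase(sample_targets, PHASES)
for phase, rows in grouped.items():
    print(phase, [name for name, _ in rows])
```

Under this scheme, adding a target name to one of the phase lists is enough to move it out of the fallback group on the next build.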

| Target | Description |
| --- | --- |
| `make bench-5f` | 5F battery (Function+Finetuning+Falsification+Formulate+Feedback Loop) |
| `make bench-5f-falsification-hard` | #000046 — HARD Falsification near-miss pack (rate < 1.0 at HEAD by design) |
| `make bench-5f-falsification-live` | 5F Falsification via real arborist.qa.verify.verify_quotes (Phase 1b.2) |
| `make bench-5f-feedback-loop-live` | 5F Feedback Loop via live arborist memory + audit chain (Phase 1b.2) |
| `make bench-5f-finetuning-live` | 5F Finetuning via real selfmodel store/claims_for round-trip (Phase 1b.2) |
| `make bench-5f-finetuning-shardchain` | #000025 §10.11 — Finetuning over the two latest chain snapshots (run bench-5f-selfmodel-snapshot >=2x first) |
| `make bench-5f-formulate-hard` | #000046 Phase 2 / #000048 step 2.4 — HARD Formulate clause-segmentation pack (12/12 after step 2.4) |
| `make bench-5f-formulate-live` | 5F Formulate via live arborist.qa.parse_claims (Phase 1b.2) |
| `make bench-5f-function-live` | 5F Function via live arborist.qa.parse_claims (Phase 1b.2) |
| `make bench-5f-harvest` | harvest live-shard falsification proposals → 5F fixture pack |
| `make bench-5f-live` | all 5F Phase-1b.2 live wire-ups |
| `make bench-5f-selfmodel-snapshot` | #000025 §10.11 — append one SelfModel snapshot (rates as claims) to the chain shard |
| `make bench-5f-threshold-calibration` | #000025 §10.14 — 5S/5T/5F → ForkScore threshold calibration report |
| `make bench-5r` | 5R battery (React+Rearrange+Restore+Replicate+Resonate) |
| `make bench-5r-live` | all 5R Phase-1b.2 live wire-ups |
| `make bench-5r-react-live` | 5R React via real audit_events on a temp shard (Phase 1b.2) |
| `make bench-5r-restore-live` | 5R Restore via real audit chain query (Phase 1b.2) |
| `make bench-5s` | 5S battery (Syntax+Semantics+Syllogism+Synthesis+Semiotics) |
| `make bench-5s-algebra` | 5S algebra-symbolic π* (ticket #000030 Phase 1; SymPy expand+srepr canonicalizer) |
| `make bench-5s-arithmetic` | 5S arithmetic π* (SQD §14.1; rational arithmetic canonicalizer) |
| `make bench-5s-calculus-limit` | 5S calculus-limit π* (#000030 Phase 4) |
| `make bench-5s-calculus-series` | 5S calculus-series π* (#000030 Phase 5) |
| `make bench-5s-code` | 5S code-carrier (Syntax + Semantics through code-py-ast@v1) |
| `make bench-5s-combinatorics` | 5S combinatorics π* (ticket #000032; pure-integer counting kernel) |
| `make bench-5s-function-sampled` | 5S function-sampled π* (#000030 Phase 7; SymPy → time-series bridge) |
| `make bench-5s-linear-algebra` | 5S linear-algebra π* (#000030 Phase 6) |
| `make bench-5s-logic-kernel` | 5S logic-kernel π* (SQD §14.3; CNF canonicalizer) |
| `make bench-5s-math` | complete math π* surface (arithmetic + logic-kernel) |
| `make bench-5s-tabular` | 5S tabular-pinned π* (#000030/Phase tabular; declared-schema 2D structured-data canonicalizer) |
| `make bench-5s-time-series` | 5S time-series-quantized π* (SQD §13.5; quantized integer-vector canonicalizer) |
| `make bench-5s5t` | 5S + 5T (Phase-1b vocabulary) |
| `make bench-5s5t5f` | 5S + 5T + 5F (operational triad) |
| `make bench-5t` | 5T battery (Transfer Learning+Triangulation+Truthtables+Transitivity+Time) |
| `make bench-5t-legacy` | 5T legacy SQD-name (transfer-v1) for Phase-1a digest stability |
| `make bench-fork-baseline` | pin current bench-suite output as ForkScore parent |
| `make bench-fork-baseline-hard` | #000046 — pin the below-ceiling hard-Falsification rate as a ForkScore parent |
| `make bench-fork-score` | #000012 Phase 1b — score current bench output vs $(FORK_PARENT) |
| `make bench-jaggedness` | #000060: deterministic retrieval jaggedness across surface perturbations [JAGGED_LIMIT / _K / _CLASSES …] |
| `make bench-nli-backends` | #000049 §7 #28 — torch vs onnx-int8 vs tinygrad agreement+latency A/B |
| `make bench-nli-shadow` | #000049 Phase 2 — NLI shadow would-demote sweep (INPUT="f1.jsonl f2.jsonl" optional) |
| `make bench-qa-progressive-and` | retrieval-side fixture exercising progressive-AND + DF filter [BENCH_PROGRESSIVE_N=3] |
| `make bench-real-shard` | #000026 Phase 2 — real-shard workload baseline (latency, audit, primary-source use) |
| `make bench-suite` | complete Dav1DPrometheus suite (5S + 5T + 5F + 5R) |
| `make bench-witness-divergence` | extract LLM-divergence events as 5F Falsification fixtures |
| `make bench-witness-sweep` | fire witness mode on canonical-question corpus against real shards |
| `make bootstrap-math` | install [math] extras (sympy) into the venv |
| `make bootstrap-nli` | install [nli] extras + warm the pinned NLI checkpoint |
| `make bootstrap-nli-only` | #000049 §7 #22 — minimal venv + [nli] only (GPU/CPU compute-node setup; no [dev]) |
| `make control-ab` | #000057 v1: Hermes-solo vs Arborist, blinded Opus judge [N=12 FIXTURE=…] |
| `make control-sweep` | #000057: model×framing control sweep [CONTROL_SWEEP_N / _WORKERS / _RESUME=jsonl …] |
| `make crawl-textbooks` | BFS-crawl every textbook with a crawl_url field, one shard per host (in $(CRAWL_SHARDS_DIR)) |
| `make crawl-textbooks-stats` | docs-per-shard summary across textbook crawl shards |
| `make demo-plot` | function-sampled@v1 demo: make demo-plot Q='sin(x)' [PNG=/tmp/out.png] |
| `make docs-one-pager` | render docs/_build/arborist-one-pager.pdf |
| `make docs-pagers` | render both 1-pager and 2-pager PDFs |
| `make docs-pagers-clean` | remove generated pager PDFs |
| `make docs-two-pager` | render docs/_build/arborist-two-pager.pdf |
| `make export-nli-onnx` | #000049 §3 — export the pinned shadow-NLI checkpoint to ONNX (int8) |
| `make fetch-textbooks` | fetch + ingest manifested textbooks → $(TEXTBOOK_DB) |
| `make judge-self-test` | verify the #000057 external judge on known-verdict triples (gate before control-ab) |
| `make monitor-access` | proxy access-log real-client-IP tally + SVG: make monitor-access LOG=path |
| `make monitor-graph` | render stored load samples to SVG [MONITOR_HOURS=6] |
| `make monitor-poll` | poll LLM endpoint /metrics (+GPU power/util) into SQLite [MONITOR_INTERVAL=10 MONITOR_GPU=name=host] |
| `make prometheus-sweep-dryrun` | #000037 Phase 3 sleep-sweep dry-run → markdown report |
| `make prometheus-trigger-probe` | #000037 §12 measured-pressure probe → markdown report |
| `make rapl-access` | [root] make Intel RAPL energy_uj readable for watt_bench CPU power |
| `make rapl-access-revoke` | [root] remove the RAPL read-access udev rule |
| `make test-ci` | CI gate: full suite minus crawler + wikipedia ingest |
| `make textbook` | ingest one textbook by id: make textbook ID=bogart-ctgd-2017 |
| `make textbook-aristotle-posterior` | Aristotle Posterior Analytics (PD, Wikisource) |
| `make textbook-aristotle-prior` | Aristotle Prior Analytics (PD, Wikisource) |
| `make textbook-bogart` | Bogart Combinatorics Through Guided Discovery (GFDL) |
| `make textbook-boole` | Boole Laws of Thought (PD, PG TeX) |
| `make textbook-cantor` | Cantor Contributions to Transfinite Numbers (PD, Jourdain transl. 1915, Wikisource) |
| `make textbook-dedekind` | Dedekind Essays on the Theory of Numbers 1901 (PD, PG #21016 TeX) |
| `make textbook-demorgan` | De Morgan First Notions of Logic 1839 (PD, PG #67017) |
| `make textbook-grinstead-snell` | Grinstead-Snell Introduction to Probability (GFDL, LibreTexts mirror) |
| `make textbook-hilbert` | Hilbert Foundations of Geometry (PD, PG TeX) |
| `make textbook-judson` | Judson Abstract Algebra: Theory and Applications (GFDL) |
| `make textbook-keller-trotter` | Keller-Trotter Applied Combinatorics (CC-BY-SA, slow ~25min crawl-delay) |
| `make textbook-laplace` | Laplace Philosophical Essay on Probabilities 1814 / 1902 (PD, PG #58881) |
| `make textbook-levin` | Levin Discrete Mathematics (CC-BY-SA) |
| `make textbook-list` | list ingestable textbook ids (entries with urls or crawl_url) |
| `make textbook-morin` | Morin Open Data Structures (CC-BY) |
| `make textbook-newton` | Newton Principia (PD, Wikisource) |
| `make textbook-peano` | Peano Arithmetices Principia 1889 (CC-BY-SA, Verheyen+Nahas LaTeX from GitHub) |
| `make textbook-plfa` | PLFA - Programming Language Foundations in Agda (CC-BY-4.0) |
| `make textbook-pm` | Whitehead-Russell Principia Mathematica Vol 1 (PD, PG #78050 — preface+intro only) |
| `make textbook-russell-imp` | Russell Introduction to Mathematical Philosophy 1919 (PD, PG #41654) |
| `make textbook-russell-pom` | Russell Principles of Mathematics 1903 (PD content + CC-BY-SA-4.0 typesetting, Klement HTML) |
| `make textbook-sf-lf` | Software Foundations Vol 1 Logical Foundations (MIT, Pierce et al.) |
| `make textbooks-base-knowledge` | ingest the four 2026-05-09 base-knowledge additions (pillars I/II/III/IX) |
| `make textbooks-stats` | stats of the textbook shard |
| `make textbooks-summary` | list manifest entries with license + URL counts |
| `make textbooks-tex` | ingest every manifest entry that declares a tex_url (Hilbert + Boole) |
| `make textbooks-urls` | emit one textbook URL per line to stdout |
| `make textbooks-verify` | sample-Merkle-verify the textbook shard |


Permacomputer Preamble — License: AGPL-3.0-only

This is free software for the public good of a permacomputer hosted at permacomputer.com, an always-on computer by the people, for the people. Durable, easy to repair, & distributed like tap water for machine learning intelligence.

Our permacomputer is community-owned infrastructure optimized around four values:

  • TRUTH — First principles, math & science, open source code freely distributed.

  • FREEDOM — Voluntary partnerships, freedom from tyranny & corporate control.

  • HARMONY — Minimal waste, self-renewing systems with diverse thriving connections.

  • LOVE — Be yourself without hurting others, cooperation through natural law.

NO WARRANTY. Software is provided “AS IS” without warranty of any kind. Full text: License.