Makefile reference

Every arborist workflow lives behind a make target. This page is auto-generated from the project Makefile’s ## description annotations at Sphinx build time, so it stays in sync with the source.

Run make help locally for a flat alphabetized listing.
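
As a rough illustration of how `## description` annotations can be turned into a target listing, here is a minimal sketch. It assumes annotations follow the common `target: [prerequisites] ## description` shape; the regular expression, the `parse_targets` name, and the sample Makefile text are hypothetical and are not taken from the project's Sphinx extension.

```python
import re

# Assumed annotation shape: "target: [prerequisites] ## description".
# The project's real extension may accept a different syntax.
ANNOTATION = re.compile(r"^([A-Za-z0-9][A-Za-z0-9_.-]*)\s*:[^#=]*##\s*(.+?)\s*$")


def parse_targets(makefile_text: str) -> dict[str, str]:
    """Map each annotated target name to its description."""
    targets: dict[str, str] = {}
    for line in makefile_text.splitlines():
        match = ANNOTATION.match(line)
        if match:
            targets[match.group(1)] = match.group(2)
    return targets


# Hypothetical Makefile excerpt used only for the demonstration below.
SAMPLE = (
    "bootstrap: ## create venv and install editable package\n"
    "fetch-cur: bootstrap ## download cur table dump\n"
    "internal-helper:\n"
    "\t@echo not documented\n"
)

# Print a flat alphabetized listing, similar in spirit to `make help`.
for name, description in sorted(parse_targets(SAMPLE).items()):
    print(f"make {name}\t{description}")
```

Targets without an annotation (such as `internal-helper` in the sample) are skipped, so only documented targets would reach a generated page.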

Note

Targets are grouped below by workflow phase, in the order a new operator typically runs them. The first row of each table is the most common entry point for that phase.

Setup

One-time installation. Creates the venv and, optionally, installs the crawler’s heavy extras. Re-running is a no-op when everything is up to date.

Target	Description
`make bootstrap`	create venv and install editable package
`make bootstrap-crawler`	install [crawler] extras into the venv
`make all`	bootstrap → fetch cur → ingest cur → verify → stats

Fetch

Download corpus snapshots into data/. Idempotent — curl skips files already present.

Target	Description
`make fetch`	download all 3 files (cur + old.1 + old.2) and concatenate the old parts
`make fetch-cur`	download cur table dump (~82 MB)
`make fetch-old`	download old (revision history) parts and concatenate (~893 MB)
`make fetch-xml`	download Phase IV XML cur dump (default: enwiki 20101011, 6.2 GB)
`make fetch-abstract`	download Phase IV abstract.xml (default: enwiki 20101011, ~3 GB)
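
The skip-if-present behaviour described for the fetch targets can be sketched in Python as follows. This is an analogue of the idea, not the Makefile's actual curl invocation: `fetch_once`, the injected `download` callable, and the `.part` temporary-file convention are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Callable


def fetch_once(dest: Path, download: Callable[[Path], None]) -> bool:
    """Download to `dest` unless it already exists; return True if work was done."""
    if dest.exists():
        return False  # already present: re-running is a no-op
    # Write to a temporary sibling first so an interrupted download
    # never leaves a partial file at the final path (illustrative choice).
    partial = dest.with_name(dest.name + ".part")
    download(partial)
    partial.replace(dest)
    return True
```

Passing the downloader in as a callable keeps the sketch independent of any particular HTTP client.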

Ingest

Parse a source into Merkle-committed shards. The -attached variants are the canonical sharded path (one SQLite database per shard, no WAL contention); the single-DB variants are for experiments.

Target	Description
`make ingest`	default ingest = cur (use ingest-old or *-parallel for full)
`make ingest-cur`	ingest INGEST_LIMIT cur articles
`make ingest-cur-attached`	sharded ingest, no WAL contention (Phase 2)
`make ingest-old`	ingest INGEST_LIMIT old (history) revisions
`make ingest-old-attached`	sharded ingest of old history (Phase 2)
`make ingest-xml`	ingest INGEST_LIMIT pages from $(WP_XML)
`make ingest-xml-history`	ingest every revision (multi-revision mode); set WP_XML to a pages-meta-history file
`make ingest-xml-attached`	sharded XML ingest, one process per shard
`make ingest-abstract`	ingest INGEST_LIMIT abstract docs from $(WP_ABSTRACT)
`make ingest-grok-attached`	ingest Grok conversations into $(GROK_SHARD)
`make ingest-grok-media-attached`	ingest Grok media prompts into $(GROK_SHARD)
`make ingest-self`	ingest this repo’s HEAD into a dedicated shard
`make ingest-self-providence`	promote STRICT live providence records into the document corpus [KG_SECONDS=3600]
`make ingest-git`	ingest GIT_REPO=<path> into its own shard
`make ingest-hg`	ingest HG_REPO=<path> (mercurial) into its own shard
`make crawl-ingest`	crawl URL=https://x.com [DEPTH=2 MAX=0 FAST=1] into $(SHARDS_DIR)/crawl_<domain>.db
`make recrawl-check`	conditional HEAD per ingested doc [DOMAIN=x.com LIMIT=100 CRAWL_SHARD=path]

Distill

Surface → core distillation. Cores are Merkle-bound back to their source chunks via inclusion proofs.

Target	Description
`make distill-shards-parallel`	one distill process per shard (parallel)
`make distill-shards-tfidf-parallel`	TF-IDF cores per shard, in parallel
`make backfill-concepts`	backfill all concept extractors in parallel across shards
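
To make the idea of an inclusion proof concrete, the sketch below builds a small Merkle tree over chunk bytes and checks that one chunk belongs under a given root. It is a generic illustration only: the SHA-256 hash, the 0x00/0x01 domain-separation prefixes, the duplicate-last-node padding, and every function name are assumptions, and arborist's actual tree layout may differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return tree levels from leaf hashes up to the single-node root level."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(b"\x00" + leaf) for leaf in leaves]  # 0x00 marks a leaf
    levels = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels (illustrative choice)
        levels.append(level)
        level = [
            _h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks an inner node
            for i in range(0, len(level), 2)
        ]
    levels.append(level)
    return levels


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_inclusion(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a leaf and its proof, then compare."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


chunks = [b"chunk-0", b"chunk-1", b"chunk-2"]  # hypothetical source chunks
levels = merkle_levels(chunks)
root = levels[-1][0]
print(verify_inclusion(chunks[2], inclusion_proof(levels, 2), root))
```

A verifier needs only the chunk, the proof, and the root, which is what allows a derived record to be checked against its source without rereading the whole shard.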

Query

Ask the corpus a question. query-dry skips the LLM call and returns the assembled context — useful for prompt iteration.

Target	Description
`make query`	ask the corpus a question [JSON=1 BURN=1 REPAIR=1 REPROMPTS=N K="extra retrieval keywords" ANSWER_MODE=claim_lattice|claim_lattice_pointer|quote BROAD=1 REJECT_BROAD=1 ALLOW_BROAD=1 WITNESS=1 XLANG=1 XLANG_MT=1]; JSON by default
`make query-dry`	like 'make query' but skip the LLM call (dry-run) [JSON=1 BURN=1 ANSWER_MODE=… BROAD=1 REJECT_BROAD=1 ALLOW_BROAD=1 XLANG=1 XLANG_MT=1]
`make search`	keyword search; override SEARCH_Q (or pass Q=…)

Verify and inspect

Round-trip Merkle proofs, check audit-chain integrity, and run sidecar diagnostics on cached answers.

Target	Description
`make verify`	round-trip Merkle proofs for VERIFY_N random documents
`make verify-shards`	cross-shard Merkle round-trip on a random sample
`make chain-check`	audit-chain integrity count for $(DB) (0 = intact: linear chain, one genesis)
`make chain-check-shards`	audit-chain integrity count for every *.db in $(SHARDS_DIR)
`make analyze-shards`	cross-shard compression spectrum + audit integrity
`make stats`	counts: documents, chunks, edges, audit chain
`make stats-shards`	cross-shard stats via UNION views over $(SHARDS_DIR)
`make activity`	recent Q&A + freshly cached docs (agent timeline)
`make inspect`	sidecar diagnose unverified spans for a cache_key: make inspect KEY=hex [JSON=1]
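
The chain-check description ("0 = intact: linear chain, one genesis") suggests a count of integrity violations. A minimal sketch of one way to compute such a count is shown below. The `(event_hash, prev_hash)` record shape, the `chain_violations` name, and the exact counting rule are assumptions; the real target may count differently.

```python
from typing import Optional


def chain_violations(events: list[tuple[str, Optional[str]]]) -> int:
    """Count events that are not on a single unbroken path from the genesis.

    Each event is (event_hash, prev_hash); the genesis has prev_hash None.
    A result of 0 means one genesis and a strictly linear chain.
    """
    children: dict[Optional[str], list[str]] = {}
    for event_hash, prev_hash in events:
        children.setdefault(prev_hash, []).append(event_hash)

    reached = 0
    seen: set[str] = set()
    cursor: Optional[str] = None  # start at the genesis slot
    # Follow links only while exactly one successor exists.
    while len(children.get(cursor, [])) == 1:
        cursor = children[cursor][0]
        if cursor in seen:  # duplicate hash would loop forever
            break
        seen.add(cursor)
        reached += 1
    return len(events) - reached


intact = [("g", None), ("a", "g"), ("b", "a")]
forked = [("g", None), ("a", "g"), ("b", "g")]
print(chain_violations(intact), chain_violations(forked))
```

Under this rule a fork, a dangling `prev_hash`, or a second genesis all leave some events unreachable, so each produces a non-zero count.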

Operations on cached records

Mark a record falsified (audit-preserving) or burn it from the database (refuses if it has children unless FORCE=1).

Target	Description
`make falsify`	mark a cached answer wrong: make falsify KEY=hex REASON='why'
`make burn`	delete a leaf with no children. providence: KEY=<cache_key>; document/core: KIND=document|core ROOT=<hex>. REASON='why' [FORCE=1]
`make burn-kindergarten`	bust providence rows < SECONDS old [SECONDS=3600 FORCE=1 DRY_RUN=1 REASON='why']

Tests and benches

Default test suite excludes opt-in crawler tests. bench-qa runs the full QA-quality sweep; bench-qa-smoke is the 5-question fast loop.

Target	Description
`make test`	run pytest suite (excludes opt-in crawler tests)
`make test-crawler`	run only the lifted crawler tests
`make test-live`	live QA quality tests against Hermes (gated; -n auto parallel)
`make bench`	benchmark serial vs parallel-shared vs attached at $(BENCH_DOCS) docs
`make bench-qa`	QA-quality sweep: questions × modes × N samples [BENCH_QA_N=3 BENCH_QA_LIMIT=N BENCH_QA_MODES=… BENCH_QA_CONCURRENCY=4]
`make bench-qa-smoke`	quick 5-question smoke (all anchor classes; ~30s)
`make bench-emergent`	blue-moon emergent stress test (3-word triangulation; N=10)
`make bench-emergent-pending`	print log entries awaiting teacher review

Docs

Render diagrams (graphviz) and build the Sphinx API reference. RTD rebuilds on push; these targets are for local previews.

Target	Description
`make docs`	render docs/diagrams/*.dot -> .png + .svg via graphviz
`make docs-api`	generate Sphinx API reference from docstrings (output: docs/_source/_build/html/)
`make docs-api-clean`	remove Sphinx build artifacts

Clean

Each clean target is reversible by re-running the matching bootstrap / fetch / ingest target. clean-data deletes the largest payload (the downloaded dumps).

Target	Description
`make clean`	remove venv + caches (keeps fetched data and db)
`make clean-db`	drop the arborist db (keeps fetched data and venv)
`make clean-data`	remove fetched dumps
`make help`	show this help

Uncategorized

Targets not yet placed in a workflow phase. If you see one here, add it to PHASES in docs/_source/_ext/makefile_targets.py.
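
For orientation, the grouping with an Uncategorized fallback might look like the sketch below. The shape of `PHASES` (phase title mapped to target names in display order) and the `group_by_phase` helper are hypothetical; check makefile_targets.py for the real structure before editing it.

```python
# Hypothetical shape for PHASES: phase title -> target names in display order.
PHASES: dict[str, list[str]] = {
    "Setup": ["bootstrap", "bootstrap-crawler", "all"],
    "Fetch": ["fetch", "fetch-cur"],
}


def group_by_phase(descriptions: dict[str, str]) -> dict[str, list[tuple[str, str]]]:
    """Group documented targets by phase; unplaced ones land in 'Uncategorized'."""
    grouped: dict[str, list[tuple[str, str]]] = {}
    placed: set[str] = set()
    for phase, names in PHASES.items():
        rows = [(name, descriptions[name]) for name in names if name in descriptions]
        if rows:
            grouped[phase] = rows
        placed.update(names)
    # Anything documented but not listed in PHASES is sorted alphabetically.
    leftovers = sorted(set(descriptions) - placed)
    if leftovers:
        grouped["Uncategorized"] = [(name, descriptions[name]) for name in leftovers]
    return grouped


result = group_by_phase({
    "bootstrap": "create venv and install editable package",
    "fetch": "download all 3 files",
    "bench-5f": "5F battery",
})
for phase, rows in result.items():
    print(phase, [name for name, _ in rows])
```

With a structure like this, moving a target out of Uncategorized is a one-line change: add its name to the list for the appropriate phase.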

Target	Description
`make bench-5f`	5F battery (Function+Finetuning+Falsification+Formulate+Feedback Loop)
`make bench-5f-falsification-hard`	#000046 — HARD Falsification near-miss pack (rate < 1.0 at HEAD by design)
`make bench-5f-falsification-live`	5F Falsification via real arborist.qa.verify.verify_quotes (Phase 1b.2)
`make bench-5f-feedback-loop-live`	5F Feedback Loop via live arborist memory + audit chain (Phase 1b.2)
`make bench-5f-finetuning-live`	5F Finetuning via real selfmodel store/claims_for round-trip (Phase 1b.2)
`make bench-5f-finetuning-shardchain`	#000025 §10.11 — Finetuning over the two latest chain snapshots (run bench-5f-selfmodel-snapshot >=2x first)
`make bench-5f-formulate-hard`	#000046 Phase 2 / #000048 step 2.4 — HARD Formulate clause-segmentation pack (12/12 after step 2.4)
`make bench-5f-formulate-live`	5F Formulate via live arborist.qa.parse_claims (Phase 1b.2)
`make bench-5f-function-live`	5F Function via live arborist.qa.parse_claims (Phase 1b.2)
`make bench-5f-harvest`	harvest live-shard falsification proposals → 5F fixture pack
`make bench-5f-live`	all 5F Phase-1b.2 live wire-ups
`make bench-5f-selfmodel-snapshot`	#000025 §10.11 — append one SelfModel snapshot (rates as claims) to the chain shard
`make bench-5f-threshold-calibration`	#000025 §10.14 — 5S/5T/5F → ForkScore threshold calibration report
`make bench-5r`	5R battery (React+Rearrange+Restore+Replicate+Resonate)
`make bench-5r-live`	all 5R Phase-1b.2 live wire-ups
`make bench-5r-react-live`	5R React via real audit_events on a temp shard (Phase 1b.2)
`make bench-5r-restore-live`	5R Restore via real audit chain query (Phase 1b.2)
`make bench-5s`	5S battery (Syntax+Semantics+Syllogism+Synthesis+Semiotics)
`make bench-5s-algebra`	5S algebra-symbolic π* (ticket #000030 Phase 1; SymPy expand+srepr canonicalizer)
`make bench-5s-arithmetic`	5S arithmetic π* (SQD §14.1; rational arithmetic canonicalizer)
`make bench-5s-calculus-limit`	5S calculus-limit π* (#000030 Phase 4)
`make bench-5s-calculus-series`	5S calculus-series π* (#000030 Phase 5)
`make bench-5s-code`	5S code-carrier (Syntax + Semantics through code-py-ast@v1)
`make bench-5s-combinatorics`	5S combinatorics π* (ticket #000032; pure-integer counting kernel)
`make bench-5s-function-sampled`	5S function-sampled π* (#000030 Phase 7; SymPy → time-series bridge)
`make bench-5s-linear-algebra`	5S linear-algebra π* (#000030 Phase 6)
`make bench-5s-logic-kernel`	5S logic-kernel π* (SQD §14.3; CNF canonicalizer)
`make bench-5s-math`	complete math π* surface (arithmetic + logic-kernel)
`make bench-5s-tabular`	5S tabular-pinned π* (#000030/Phase tabular; declared-schema 2D structured-data canonicalizer)
`make bench-5s-time-series`	5S time-series-quantized π* (SQD §13.5; quantized integer-vector canonicalizer)
`make bench-5s5t`	5S + 5T (Phase-1b vocabulary)
`make bench-5s5t5f`	5S + 5T + 5F (operational triad)
`make bench-5t`	5T battery (Transfer Learning+Triangulation+Truthtables+Transitivity+Time)
`make bench-5t-legacy`	5T legacy SQD-name (transfer-v1) for Phase-1a digest stability
`make bench-fork-baseline`	pin current bench-suite output as ForkScore parent
`make bench-fork-baseline-hard`	#000046 — pin the below-ceiling hard-Falsification rate as a ForkScore parent
`make bench-fork-score`	#000012 Phase 1b — score current bench output vs $(FORK_PARENT)
`make bench-jaggedness`	#000060: deterministic retrieval jaggedness across surface perturbations [JAGGED_LIMIT / _K / _CLASSES …]
`make bench-nli-backends`	#000049 §7 #28 — torch vs onnx-int8 vs tinygrad agreement+latency A/B
`make bench-nli-shadow`	#000049 Phase 2 — NLI shadow would-demote sweep (INPUT="f1.jsonl f2.jsonl" optional)
`make bench-qa-progressive-and`	retrieval-side fixture exercising progressive-AND + DF filter [BENCH_PROGRESSIVE_N=3]
`make bench-real-shard`	#000026 Phase 2 — real-shard workload baseline (latency, audit, primary-source use)
`make bench-suite`	complete Dav1DPrometheus suite (5S + 5T + 5F + 5R)
`make bench-witness-divergence`	extract LLM-divergence events as 5F Falsification fixtures
`make bench-witness-sweep`	fire witness mode on canonical-question corpus against real shards
`make bootstrap-math`	install [math] extras (sympy) into the venv
`make bootstrap-nli`	install [nli] extras + warm the pinned NLI checkpoint
`make bootstrap-nli-only`	#000049 §7 #22 — minimal venv + [nli] only (GPU/CPU compute-node setup; no [dev])
`make control-ab`	#000057 v1: Hermes-solo vs Arborist, blinded Opus judge [N=12 FIXTURE=…]
`make control-sweep`	#000057: model×framing control sweep [CONTROL_SWEEP_N / _WORKERS / _RESUME=jsonl …]
`make crawl-textbooks`	BFS-crawl every textbook with a crawl_url field, one shard per host (in $(CRAWL_SHARDS_DIR))
`make crawl-textbooks-stats`	docs-per-shard summary across textbook crawl shards
`make demo-plot`	function-sampled@v1 demo: make demo-plot Q='sin(x)' [PNG=/tmp/out.png]
`make docs-one-pager`	render docs/_build/arborist-one-pager.pdf
`make docs-pagers`	render both 1-pager and 2-pager PDFs
`make docs-pagers-clean`	remove generated pager PDFs
`make docs-two-pager`	render docs/_build/arborist-two-pager.pdf
`make export-nli-onnx`	#000049 §3 — export the pinned shadow-NLI checkpoint to ONNX (int8)
`make fetch-textbooks`	fetch + ingest manifested textbooks → $(TEXTBOOK_DB)
`make judge-self-test`	verify the #000057 external judge on known-verdict triples (gate before control-ab)
`make monitor-access`	proxy access-log real-client-IP tally + SVG: make monitor-access LOG=path
`make monitor-graph`	render stored load samples to SVG [MONITOR_HOURS=6]
`make monitor-poll`	poll LLM endpoint /metrics (+GPU power/util) into SQLite [MONITOR_INTERVAL=10 MONITOR_GPU=name=host]
`make prometheus-sweep-dryrun`	#000037 Phase 3 sleep-sweep dry-run → markdown report
`make prometheus-trigger-probe`	#000037 §12 measured-pressure probe → markdown report
`make rapl-access`	[root] make Intel RAPL energy_uj readable for watt_bench CPU power
`make rapl-access-revoke`	[root] remove the RAPL read-access udev rule
`make test-ci`	CI gate: full suite minus crawler + wikipedia ingest
`make textbook`	ingest one textbook by id: make textbook ID=bogart-ctgd-2017
`make textbook-aristotle-posterior`	Aristotle Posterior Analytics (PD, Wikisource)
`make textbook-aristotle-prior`	Aristotle Prior Analytics (PD, Wikisource)
`make textbook-bogart`	Bogart Combinatorics Through Guided Discovery (GFDL)
`make textbook-boole`	Boole Laws of Thought (PD, PG TeX)
`make textbook-cantor`	Cantor Contributions to Transfinite Numbers (PD, Jourdain transl. 1915, Wikisource)
`make textbook-dedekind`	Dedekind Essays on the Theory of Numbers 1901 (PD, PG #21016 TeX)
`make textbook-demorgan`	De Morgan First Notions of Logic 1839 (PD, PG #67017)
`make textbook-grinstead-snell`	Grinstead-Snell Introduction to Probability (GFDL, LibreTexts mirror)
`make textbook-hilbert`	Hilbert Foundations of Geometry (PD, PG TeX)
`make textbook-judson`	Judson Abstract Algebra: Theory and Applications (GFDL)
`make textbook-keller-trotter`	Keller-Trotter Applied Combinatorics (CC-BY-SA, slow ~25min crawl-delay)
`make textbook-laplace`	Laplace Philosophical Essay on Probabilities 1814 / 1902 (PD, PG #58881)
`make textbook-levin`	Levin Discrete Mathematics (CC-BY-SA)
`make textbook-list`	list ingestable textbook ids (entries with urls or crawl_url)
`make textbook-morin`	Morin Open Data Structures (CC-BY)
`make textbook-newton`	Newton Principia (PD, Wikisource)
`make textbook-peano`	Peano Arithmetices Principia 1889 (CC-BY-SA, Verheyen+Nahas LaTeX from GitHub)
`make textbook-plfa`	PLFA - Programming Language Foundations in Agda (CC-BY-4.0)
`make textbook-pm`	Whitehead-Russell Principia Mathematica Vol 1 (PD, PG #78050 — preface+intro only)
`make textbook-russell-imp`	Russell Introduction to Mathematical Philosophy 1919 (PD, PG #41654)
`make textbook-russell-pom`	Russell Principles of Mathematics 1903 (PD content + CC-BY-SA-4.0 typesetting, Klement HTML)
`make textbook-sf-lf`	Software Foundations Vol 1 Logical Foundations (MIT, Pierce et al.)
`make textbooks-base-knowledge`	ingest the four 2026-05-09 base-knowledge additions (pillars I/II/III/IX)
`make textbooks-stats`	stats of the textbook shard
`make textbooks-summary`	list manifest entries with license + URL counts
`make textbooks-tex`	ingest every manifest entry that declares a tex_url (Hilbert + Boole)
`make textbooks-urls`	emit one textbook URL per line to stdout
`make textbooks-verify`	sample-Merkle-verify the textbook shard

Permacomputer Preamble — License: AGPL-3.0-only

This is free software for the public good of a permacomputer hosted at permacomputer.com, an always-on computer by the people, for the people. Durable, easy to repair, & distributed like tap water for machine learning intelligence.

Our permacomputer is community-owned infrastructure optimized around four values:

TRUTH — First principles, math & science, open source code freely distributed.
FREEDOM — Voluntary partnerships, freedom from tyranny & corporate control.
HARMONY — Minimal waste, self-renewing systems with diverse thriving connections.
LOVE — Be yourself without hurting others, cooperation through natural law.

NO WARRANTY. Software is provided “AS IS” without warranty of any kind. Full text: License.
