Distillation: surface → core compression#

Surface-to-core extraction and recursive distillation.

Distillation: surface docs -> core docs, Merkle-signed back.

class arborist.distill.DistillationResult(core, contributing_chunk_indices)[source]#

Bases: object

What a Distiller returns for one source.

contributing_chunk_indices: 0-based indices into source_chunks that fed the core’s content. The runner Merkle-proves each one against the source’s document_root and stores those proofs in derivations.proof_blob.

Parameters:
core: Document#
contributing_chunk_indices: list[int]#
class arborist.distill.Distiller[source]#

Bases: ABC

Pure function: surface Document + its chunks -> core DistillationResult.

Distillers must be deterministic — same source bytes produce the same core bytes. Bumping a distiller’s algorithm requires bumping its name.

name: str#
abstractmethod distill(source, source_chunks)[source]#
Parameters:
Return type:

DistillationResult

class arborist.distill.FirstSentenceDistiller(max_chars=4096)[source]#

Bases: Distiller

Parameters:

max_chars (int)

name: str = 'first-sentence-v1'#
distill(source, source_chunks)[source]#
Parameters:
Return type:

DistillationResult

class arborist.distill.TfidfKeywordDistiller(top_k=16)[source]#

Bases: Distiller

Parameters:

top_k (int)

name: str = 'tfidf-keywords-v1'#
distill(source, source_chunks)[source]#
Parameters:
Return type:

DistillationResult


Permacomputer Preamble — License: AGPL-3.0-only

This is free software for the public good of a permacomputer hosted at permacomputer.com, an always-on computer by the people, for the people. Durable, easy to repair, & distributed like tap water for machine learning intelligence.

Our permacomputer is community-owned infrastructure optimized around four values:

  • TRUTH — First principles, math & science, open source code freely distributed.

  • FREEDOM — Voluntary partnerships, freedom from tyranny & corporate control.

  • HARMONY — Minimal waste, self-renewing systems with diverse thriving connections.

  • LOVE — Be yourself without hurting others, cooperation through natural law.

NO WARRANTY. Software is provided “AS IS” without warranty of any kind. Full text: License.