
Architecture handbook · July 2026

Metnos

From the whole system to its component contracts

One tutorial, two levels of detail. Start with the ideas, follow a real request, then zoom into the components that make each promise executable.

Status: pre-1.0
Self-contained HTML — printable as PDF

Audience: curious readers first, implementers second.
Simple language, exact contracts, diagrams that can be read without the code.

5 learning stages

15+ visual explanations

26 component deep dives

1 continuous architecture

One architecture, three levels of zoom

Why. The principles and limits that make Metnos a particular kind of assistant.

What happens. The complete path of a turn, including denial, recovery and undo.

How it is built. Canonical component contracts: schemas, calls, authority and error conditions.

How to use this handbook. Read straight through once. On a second pass, use the component links as trapdoors into implementation detail. The overview explains the promise; the component page defines the contract that keeps it.

1. What Metnos is

Metnos is a self-hosted architecture for a governed agent. Its core does not define an application domain: it plans, applies policy, remembers, synthesizes, isolates and audits. The admitted executor set defines what a concrete instance can actually do. Executors are small signed programs, generated or imported only inside a closed vocabulary and orchestrated by a local-first LLM planner. Frontier models are optional consults, not the residence of the system.

The name comes from mētis (cunning intelligence) + noûs (mind). It lives on a machine under your physical and legal control. The reference instance is reached through Telegram and the browser (port 8770), and its current executor catalog covers files, mail, photos, calendars, the web, GitHub and host operations. Those are uses of that catalog, not limits of Metnos. Change the admitted executors and enabled skills/backends, and the operational domain changes without changing the governing architecture.

Metnos core. Planning, policy, memory, synthesis, sandbox, placement and audit. It governs action but does not choose a domain.

Admitted executor set. The signed verbs and objects the instance can execute, together with the skills and backends that are enabled.

Concrete Metnos instance. Home operations, GitHub maintenance, research, remote systems, or another bounded domain expressed by its catalog.

Scope rule. Mail, photos and calendars do not belong to Metnos itself. They belong to executors in the reference catalog. Maintaining Metnos through GitHub is already a different use of the same architecture.

Figure 1 — Metnos at a glance. The process lives on your machine: channels receive, the mind plans through configured tiers, guards filter, and executors act on enabled backends. Remote providers and the frontier tier are explicit choices.

The identity card

Item	Actual state
Shape	Python ≥ 3.11 process, executor-based microarchitecture; ReAct runtime with one-shot planning (the Mētis engine, ch. 5).
Tools	Signed executors in the catalog, plus those synthesized on the fly and those imported behind a gate (ch. 7). All are vectorized: list in, list out. The domain reference provides the user view; current names and counts live in the generated catalog.
Brain	Compatible local or remote LLM endpoints; four abstract tiers `fast / middle / wise / frontier` (ch. 8). Frontier = cloud opt-in.
Channels	Telegram (outbound long-poll, no open ports) + web on port 8770 (chat and admin dashboards), ch. 11.
Devices	A controlled part of the catalog can run on registered PCs in the same network through `metnos-client`: the server remains the point of policy, selection, signing and audit; the device runs only executors declared compatible. Detail: remote executors.
Senses	In-process image pipeline: semantics + faces + EXIF in one unified index (ch. 10).
Language	i18n by construction: every string and prompt is per-language data. IT + EN validated; other languages = drop-in translation packs (not yet tested).
License / status	AGPL-3.0; pre-1.0. Public repo: github.com/brunialti/metnos — a deterministic export-subset of the daily-driven instance.

Pre-1.0, not a polished product. Metnos is installable and operational, but interfaces and defaults may still change. Contracts, security, tests and explicit failures take priority over promises of stability.

2. The three bets

The whole project rests on three architectural bets. They are deliberate positions, not optimizations: each one reverses a widespread habit of agent frameworks.

Figure 2 — The three bets. Each one reverses an agent-framework habit: skills imported on trust, cloud-first design, the LLM as an oracle re-rolled every turn.

The comparison, with no discounts

	Typical agent framework	Metnos
Tools	Hand-written, imported or generated free-form, then run as-is with the assistant's privileges	Synthesized at runtime too — but from a closed, audited vocabulary: signed, aged, smoke-tested and screened before they can ever run
Safety	Trust the author of the package	Don't trust the package: the package must pass the checks (7-layer gate, ch. 7)
LLM	Often cloud-first	Local first; frontier opt-in
Routing	The model picks a tool each turn — non-reproducible	Deterministic by construction: seed-pinned local inference, ties broken by curated affinity (ch. 8)
Output	Free-form, different per tool	Uniform: list in / list out, pipeable between steps (ch. 6)
Undo	Rare or best-effort	First-class: a closed catalog of reverse patterns, moves = COPY-then-DELETE, honest `ok_count` (ch. 12)
Language	English only, strings in the code	i18n by construction: strings and prompts are per-language data

Why determinism pays off. Most agents treat the LLM as an oracle to re-roll: ask twice, get two different plans. Metnos makes the opposite bet: a local planner, constrained to a closed vocabulary, can be made reproducible. Routing can then be measured, put under regression tests, audited — like ordinary software. And it compounds: a request that has been solved once is replayed by a fast path with no LLM call at all (ch. 9).

3. The key concepts, in seven cards

Seven words carry the whole document. Defining them now saves you half an hour of confusion thirty lines from here; each one has its implementation contract in the component atlas below.

executor — an executable capability: a small program that does one thing well (read files, send an email, move messages, search photos). It accepts lists as input and produces lists as output, carries a manifest that describes it, an Ed25519 signature that authenticates it and a sandbox profile that confines it. It is the only class of things that act in the system.

closed vocabulary — every executor is named verb_object[_qualifier[_descriptor]], composing governed sets of canonical actions and objects plus qualifiers in four families. It is not an aesthetic convention: it is the boundary of what the system can name — and therefore synthesize. New terms enter only through explicit governance (necessary · general · understandable).

manifest — the TOML identity card of an executor: a description in prescriptive chapters (SCOPE / PATTERN / NOT / OUT), the argument schema, affinity keywords, the reversibility pattern, the code digest. It is not documentation for humans: it is the tool's prompt, written so that a mid-size LLM uses it well (ch. 6).

synt — the process that brings into existence what the pool cannot do yet: a cascade of strategies ordered by cost that first composes existing executors and only as a documented exception generates new code, in five stages plus a semantic check (ch. 7). It proposes; the human approves.

vaglio — (Italian for «sifting») the filter that always sits before execution: a deterministic guard (forbidden paths, unrecoverable commands) followed by a judge that weighs grey-zone operations and, above threshold, asks the user for explicit confirmation with buttons on the channel (ch. 12).

mnest · mnestome — a mnest is the thread linking two executors that were activated together: it is born from context, reinforced by use, and decays if not reused. The mnestome is the graph of all mnests: the system's associative memory, on SQLite, curated by a nightly process (the ager). It gives the planner the intuition of «which executor usually follows which» (ch. 9).

skill ↔ backend — two orthogonal axes: a skill decides whether a group of capabilities is active, trusted and configured (dormant until its prerequisite appears); a backend decides how an action runs against a concrete service (calendar = local ICS or Google), chosen by configuration — never by the LLM. The planner never sees the provider.

The anatomy of a name

The closed vocabulary is the project's most fertile idea: it makes names composable (the planner can predict what a capability it has never seen is called), filterable (the prefilter reasons over verb and object) and synthesizable (synt cannot name anything outside the grammar).

Figure 3 — The anatomy of a name. Four positional levels, the last two optional; the five producer verbs are distinguished by their primary input, so the planner never has to choose among synonyms.
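The grammar can be made concrete with a small parser. This is a sketch under stated assumptions, not the Metnos implementation: `parse_name` and `ExecutorName` are hypothetical helpers, and the `VERBS` and `OBJECTS` sets are small illustrative samples taken from executor names that appear in this handbook, not the governed sets themselves.

```python
from typing import NamedTuple, Optional

# Illustrative subsets only (assumption); the governed sets live in the real vocabulary.
VERBS = {"find", "move", "filter", "classify", "get", "list"}
OBJECTS = {"files", "messages", "entries", "images", "dirs"}


class ExecutorName(NamedTuple):
    verb: str
    obj: str
    qualifier: Optional[str] = None
    descriptor: Optional[str] = None


def parse_name(name: str) -> ExecutorName:
    """Split verb_object[_qualifier[_descriptor]]; reject anything outside the grammar."""
    parts = name.split("_")
    if not 2 <= len(parts) <= 4 or not all(parts):
        raise ValueError(f"{name!r}: expected 2 to 4 non-empty segments")
    verb, obj, *rest = parts
    if verb not in VERBS:
        raise ValueError(f"{name!r}: unknown verb {verb!r}")
    if obj not in OBJECTS:
        raise ValueError(f"{name!r}: unknown object {obj!r}")
    return ExecutorName(verb, obj, *rest)
```

A name such as `find_images_indices` parses into verb `find`, object `images` and qualifier `indices`, while a name built from an unlisted verb raises `ValueError`. That failure is the point of the boundary: what cannot be named cannot be synthesized.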

4. The layered architecture

Metnos is an onion: the outside talks to the world, the inside executes. Each layer trusts only the one beneath it, and privileges shrink as you move toward the core. A request — whether from a user or from a scheduled task — crosses all of them, in order.

Figure 4 — The seven real layers, with the modules that implement them. The cognitive engine (layer 3) is the heart of chapter 5; the guards (layer 4) sit always between the plan and the effect.

Channels — a channel is an adapter: it converts an external interface (Telegram, browser) into messages and replies. Adding one does not touch the core (ch. 11).
Turn runtime — the shell that measures and orchestrates: per-phase telemetry (intent_ms, prefilter_ms, vaglio_ms, exec_ms), safety caps, logs.
The Mētis engine — plans once, executes deterministically, recovers with judgment, and when there is no way out, says so (ch. 5).
Guards — no bare subprocess, ever: every effect passes through policy, vaglio and sandbox (ch. 12).
Executors and backends — who acts and against what: the skill↔backend separation keeps the provider out of the planner's head; placement then decides whether the executor stays on the server or runs on a registered device (ch. 3 and 6).
Tissues — what survives between turns: associative memory, learned shortcuts, undo history, audit.

5. Anatomy of a multitool turn

If you read only one chapter, read this one. We follow a real request, «find the spam mails and move them to the trash», from entry to answer: four tools chained together, a single planning call to the model, every step measured and annotated.

5.1 The cascade, step by step

The ground rule: the model is the last resort, not the first. Memory is tried first (zero LLM, milliseconds); if the request is new, the model is asked once, for the whole plan; the execution that follows is pure deterministic mechanics.

Figure 5 — The anatomy of a multitool turn. Shortcuts (0 LLM) are tried first; if the request is new, the Proposer asks the model for the whole plan in a single constrained call; execution is deterministic, with the vaglio in front of the only state-changing step. On the right, the two error paths: targeted recovery and the honest dead end.

1. Literal shortcuts. A closed table recognizes the most common phrases («what time is it») in microseconds. Here: no match.
2. Intent. One call to the fast tier (reasoning off, ~0.4 s) extracts the canonical verb, the object and keywords. Compound requests become an ordered list of clauses, each with its own pool.
3. Plan memory. Fastpath (shortcuts you approved with the ★ button) and Autopath (plans learned on their own) answer without the model if they recognize the request. Here: miss, it's the first time.
4. Prefilter. The catalog shrinks to the relevant pool for the clause: verb+object match, qualifier bonus, and, to break ties among siblings, the curated affinity bonus (cap +3). All deterministic: same query, same pool, same order.
5. Mētis Proposer. ONE call to the wise tier produces the whole plan: steps, links, final message. It generates up to N candidates (adaptive, with early-stop), each physically constrained by the pool's GBNF grammar; a teleological ranking picks the best.
6. Validator. A typecheck of the plan before running it: existing tools, well-formed args, real references. A trivial error costs one re-proposal, not one wrong execution.
7. Execution. Pure mechanics: for every step the runtime resolves the placeholders, passes through the vaglio, invokes in the sandbox, accumulates the observation. Caps: 12 steps per turn, same executor max 3 times in a row.
8. Closing. The final message is a template filled with the real results. If the turn succeeds, Autopath records it: next time we jump straight to point 3.
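The prefilter step can be sketched as a pure scoring function followed by a stable sort. Only the affinity cap of +3 comes from the text; the other weights, the `Candidate` type and the `prefilter` signature are assumptions made for illustration.

```python
from dataclasses import dataclass, field

AFFINITY_CAP = 3  # the "+3" cap comes from the text; every other weight is an assumption


@dataclass(frozen=True)
class Candidate:
    name: str                                               # verb_object[_qualifier]
    affinity: frozenset = field(default_factory=frozenset)  # curated keywords


def score(c: Candidate, verb: str, obj: str, keywords: set) -> int:
    parts = c.name.split("_")
    s = 0
    if parts[0] == verb:
        s += 10                                  # hypothetical verb weight
    if len(parts) > 1 and parts[1] == obj:
        s += 10                                  # hypothetical object weight
    if len(parts) > 2 and parts[2] in keywords:
        s += 2                                   # hypothetical qualifier bonus
    s += min(len(c.affinity & keywords), AFFINITY_CAP)
    return s


def prefilter(catalog, verb, obj, keywords, size=5):
    scored = [(score(c, verb, obj, keywords), c.name) for c in catalog]
    # Score descending, then name ascending: no randomness, so same query, same pool.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [name for s, name in scored if s > 0][:size]
```

Because the sort key is total (score, then name), the input order of the catalog does not influence the result, which is what makes the pool suitable for regression tests.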

5.2 The plan: what the model actually proposes

The Proposer does not produce prose: it produces a structured object — steps, slots to fill (fillers), final message. This is the real plan for our request:

{
  "steps": [
    {"tool": "find_messages",
     "args": {"folder": "INBOX", "query": "is:unread"}},
    {"tool": "classify_entries",
     "args": {"from_step": 1, "dimension": "spam"}},
    {"tool": "filter_entries",
     "args": {"from_step": 2, "where_field": "spam", "where_value": "spam"}},
    {"tool": "move_messages",
     "args": {"from_step": 3, "dst_folder": "${FILLER:trash_folder}"}}
  ],
  "fillers": {
    "trash_folder": {
      "prompt": "What is the trash folder called for this account?",
      "default": "Trash",
      "tier": "fast"
    }
  },
  "final_message": "Moved ${step4.ok_count} mails to the trash."
}

Worth noting: the model does not know the account's trash folder name — and does not make one up. It declares a slot (${FILLER:trash_folder}) that the runtime will fill at the right moment with a cheap micro-call (cached) or with the default.
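The Validator's typecheck of such a plan can be sketched as follows. The function name, the error strings and the exact set of checks are assumptions; the sketch covers the three checks named in the cascade (existing tools, well-formed args, real references) plus declared fillers.

```python
import re

FILLER_RE = re.compile(r"\$\{FILLER:(\w+)\}")


def validate_plan(plan: dict, pool: set) -> list:
    """Return a list of problems; an empty list means the plan may run."""
    errors = []
    fillers = plan.get("fillers", {})
    for i, step in enumerate(plan.get("steps", []), start=1):   # steps are 1-based
        tool, args = step.get("tool"), step.get("args")
        if tool not in pool:
            errors.append(f"step {i}: unknown tool {tool!r}")
        if not isinstance(args, dict):
            errors.append(f"step {i}: args must be an object")
            continue
        src = args.get("from_step")
        if src is not None and not (isinstance(src, int) and 1 <= src < i):
            errors.append(f"step {i}: from_step {src!r} is not an earlier step")
        for value in args.values():
            if isinstance(value, str):
                for name in FILLER_RE.findall(value):
                    if name not in fillers:
                        errors.append(f"step {i}: undeclared filler {name!r}")
    return errors
```

Returning all problems at once, rather than stopping at the first, gives a re-proposal the full picture in a single round.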

5.3 Data piping: how the steps talk to each other

Placeholder	What it does
`from_step: N`	Take the entries produced by step N (1-based) and pass them whole to this step. Lists travel only this way: never pasted back into the prompt.
`${stepN.field}`	Extract a scalar field from step N's result (nested paths supported). Used mostly in the final message.
`${FILLER:name}`	A slot filled on the fly by a micro-call to the `fast` tier (cached) or by the declared default.
`${RUNTIME:key}`	Turn context, resolved by the runtime: `actor` (who is speaking), `lang`, `channel`.

Figure 6 — The plan of Figure 5 seen as a data flow. Lists stream between steps via from_step; scalars, slots and context pass through typed placeholders that the runtime resolves deterministically.
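The three scalar placeholder kinds from the table can be resolved with plain regular expressions. This is a minimal sketch: `resolve` and `dig` are hypothetical names, and filler values are assumed to be already obtained (from the cached micro-call or the default) before substitution.

```python
import re

STEP_RE = re.compile(r"\$\{step(\d+)\.([\w.]+)\}")
FILLER_RE = re.compile(r"\$\{FILLER:(\w+)\}")
RUNTIME_RE = re.compile(r"\$\{RUNTIME:(\w+)\}")


def dig(value, path: str):
    """Follow a dotted path through nested dicts."""
    for key in path.split("."):
        value = value[key]
    return value


def resolve(text: str, observations: dict, fillers: dict, runtime: dict) -> str:
    """Substitute step, filler and runtime placeholders in a string.

    observations maps 1-based step numbers to that step's result dict.
    """
    text = STEP_RE.sub(lambda m: str(dig(observations[int(m.group(1))], m.group(2))), text)
    text = FILLER_RE.sub(lambda m: str(fillers[m.group(1)]), text)
    return RUNTIME_RE.sub(lambda m: str(runtime[m.group(1)]), text)
```

An unknown step, filler or key raises `KeyError` instead of leaving the placeholder in the output, so a broken reference surfaces as an explicit failure.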

When a cap bites, you see it. If a limit truncates a result (entries, bytes, steps), the executor declares it in the fields (truncated: true, used, available_total) and the runtime says so in the reply — offering to widen only if technically possible, and never widening on its own. A partial result presented as complete is considered a bug, not an optimization.

6. Executors: vectorized by construction

Every executor accepts a list and returns a list — even when the list has zero or one element. There is no *_batch anywhere: the batch version is the executor. It is the decision that keeps plans short and results composable.

Figure 7 — The vectorized contract. Zero, one or a thousand elements cross the same code; caps are explicit arguments and truncation is declared in the fields, never hidden.

Three conventions follow from the contract, and you will see them everywhere:

entries vs results — whatever enriches or reads a list returns entries (the record schema is preserved, the pipeline can continue); whatever transforms (move, write, delete) returns results (the schema changes: outcomes, not records).
Robustness at the natural-language boundary — 0 as a placeholder means «no limit»; comparisons are case-insensitive by default; on open text domains values with */? are globs, on closed domains (ids, slugs, scopes) matching is strict and exact. LLM biases never turn into silent failures.
Honest counting — ok_count counts the elements that were actually processed. Never declare an outcome that does not match reality.
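A vectorized executor that follows these conventions might look like the sketch below. The name `filter_entries` and its `where_field` / `where_value` arguments appear in the plan of chapter 5; the body, the `limit` argument and the use of `fnmatch` for globs are assumptions.

```python
from fnmatch import fnmatchcase


def filter_entries(entries: list, where_field: str, where_value: str, limit: int = 0) -> dict:
    """List in, list out. limit=0 means no limit; truncation is declared, never hidden."""
    pattern = where_value.lower()                      # case-insensitive by default
    matched = [
        e for e in entries
        # * and ? in the value act as globs on this open text field
        if fnmatchcase(str(e.get(where_field, "")).lower(), pattern)
    ]
    kept = matched if limit == 0 else matched[:limit]
    return {
        "entries": kept,                               # record schema preserved
        "truncated": len(kept) < len(matched),
        "used": len(kept),
        "available_total": len(matched),
    }
```

The same code path serves zero, one or many entries: an empty input yields an empty `entries` list with `truncated` set to false, and there is no separate batch variant to keep in sync.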

The manifest: the tool's prompt

Every executor carries a TOML manifest. It is not courtesy documentation: it is what the planner reads when it decides whether and how to use the tool, written for a mid-size local LLM, not for a frontier model. Short sentences, literal examples, defaults spelled out; the description follows four prescriptive chapters: SCOPE, PATTERN, NOT and OUT.

The same manifest also declares where the executor may run. The platforms and [placement] fields prevent sending to Windows a tool written only for Linux, or running on a PC an executor that has not been audited for the device. When the chat names a paired PC, the runtime uses those declarations to choose server or device execution; if the target is not reachable, the outcome is honest, not a silent fallback.

[description]
en = """
SCOPE: search files by pattern in directory.
PATTERN: find_files(base_path="/", patterns=["*.jpg"]).
NOT: list_dirs+filter_entries; get_files (ID lookup).
OUT: entries=[{path,name,type,mime,kind,size,mtime}]."""

Figure 8 — One manifest, four consumers: prefilter, planner pool, grammar and undo each read different fields of the same TOML. The digest binds the manifest to the signed code.

One execution policy

Every executor call, local or remote, crosses the same execution engine. In one place the runtime applies metrics, backpressure, per-resource limits and a cap derived from the hardware. The default deliberately remains serial and the cross-executor pool is off: adopting the infrastructure does not change the order, inputs, outputs or capabilities of existing executors.

[execution]
effect = "unknown"
parallelism_class = 0
resource_class = "default"
concurrency_key = "none"
equivalence_gate = "unverified"

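Class	Requested budget	Admission
0	No cross-executor thread.	Default for every existing and generated executor.
1	Moderate concurrency.	Only after verified equivalence; always within engine and hardware limits.
2	High concurrency.	Only after verified equivalence; always within engine and hardware limits.
3	Controlled maximum.	Only after verified equivalence; always within engine and hardware limits.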

The class measures only a budget: it grants no authority and does not mean read-only. A future executor that creates or mutates objects may run concurrently, but it must declare an isolation key, provide the resource identity, and pass equivalence, collision, idempotency and postcondition tests. If any evidence is missing, the loader reduces it to class 0.

All three executor-generation paths — Synt proposals, reactive synthesis and skill generation — also consume one central contract. The local model may design a rich implementation, but it cannot rewrite identity, lifecycle, I/O or the initial execution policy. When it parallelizes independent entries, the worker count comes from the engine and results must return in input order.

Preservation rule. An executor stays serial until it explicitly declares parallel eligibility and passes equivalence testing. Changing the central policy propagates limits and observability to all executors; it never promotes one implicitly.
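The reduction to class 0 can be sketched as a pure function over the `[execution]` fields. The field names and defaults come from the block above; the values `"verified"` and `"read"`, the `passed_tests` argument and the function name are assumptions.

```python
from dataclasses import dataclass

EVIDENCE = ("equivalence", "collision", "idempotency", "postcondition")


@dataclass(frozen=True)
class ExecutionPolicy:
    effect: str = "unknown"
    parallelism_class: int = 0
    resource_class: str = "default"
    concurrency_key: str = "none"
    equivalence_gate: str = "unverified"


def effective_class(policy: ExecutionPolicy, passed_tests: frozenset = frozenset()) -> int:
    """Return the class a loader would grant; any missing evidence means class 0."""
    if policy.parallelism_class not in (1, 2, 3):
        return 0
    if policy.equivalence_gate != "verified":          # hypothetical gate value
        return 0
    if policy.effect != "read":                        # hypothetical effect value
        # Mutating or unknown effects also need an isolation key and the full test set.
        if policy.concurrency_key == "none" or not set(EVIDENCE) <= passed_tests:
            return 0
    return policy.parallelism_class
```

The function only ever lowers the declared class. It never raises it, which mirrors the preservation rule: a change in central policy cannot promote an executor implicitly.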

Remote authority: declared once, consumed three times

A provider name in an argument is data, not permission. For a conforming executor, remote access exists only when the manifest declares an effective provider:access capability. A closed when condition can make that capability active only for the selected backend. Invalid or non-matching conditions grant nothing.

Final typed invocation: client = "google_workspace"
The planner may select a value; it cannot create authority.

Effective manifest capability: provider:access
when.client = google_workspace
Exact schema validation, known binding, fail closed.

One decision, three effects: mount the provider home read-write, enable network, keep execution on the server.

Authority rule. The same effective binding governs credentials, network and placement. Names, suffixes and arbitrary arguments are never independent permission paths for conforming executors. Legacy inference exists only as a temporary migration bridge.
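A fail-closed reading of the capability can be sketched like this. The dict shapes, `KNOWN_CLIENTS`, `ALLOWED_WHEN_KEYS` and both function names are assumptions; only `provider:access`, the `when.client` condition and the three effects come from the text.

```python
KNOWN_CLIENTS = {"google_workspace"}          # hypothetical registry of known bindings
ALLOWED_WHEN_KEYS = {"client"}                # closed set of condition keys (assumed)


def effective_access(capability: dict, invocation: dict) -> bool:
    """True only if the manifest capability is active for this typed invocation."""
    if capability.get("name") != "provider:access":
        return False
    when = capability.get("when")
    if when is None:
        return True                            # unconditional capability
    if not isinstance(when, dict) or set(when) - ALLOWED_WHEN_KEYS:
        return False                           # invalid condition grants nothing
    client = when.get("client")
    if client not in KNOWN_CLIENTS:
        return False                           # unknown binding: fail closed
    return invocation.get("client") == client


def authority(capability: dict, invocation: dict) -> dict:
    """One decision, three effects: credentials mount, network, placement."""
    granted = effective_access(capability, invocation)
    return {
        "mount_provider_home_rw": granted,
        "network": granted,
        "server_only": granted,
    }
```

All three effects derive from the single `granted` value, so there is no code path in which network is enabled while the credentials mount is refused, or the reverse.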

Execution on the server or on a registered PC

The normal shape remains simple: the plan selects an executor, the runtime sends it through policy and vaglio, then invokes it in the server sandbox. Remote executors add one controlled detour: for selected executors declared portable, the execution point may be a registered PC in the same LAN or overlay network.

This is not a new channel and not a generic backend. The channel is still Telegram or web; the backend is still files, mail, calendar or another service. The remote executor is the place where the small signed program runs. Metnos keeps on the server the executor choice, policy checks, device registry, payload signing, timeout and audit.

Figure 8b — A remote executor does not move the mind: it moves only the execution of an admitted executor. The server remains the authority that decides, signs, waits and records.

The choice does not depend on the browser IP address. In the web UI, the machine opening the page may be the server, another PC on the network, or a browser behind a proxy; from Telegram there is no local browser at all. Metnos therefore uses the name of the paired device and anchors it in the request language: “on the laptop” is a target, “the laptop” alone is not.

The remote client does not receive general freedom. It polls the server instead of exposing ports; verifies the server signature before execution; downloads only signed and compatible executors; writes the result to a local spool before delivery. If the server is not reachable, it retries delivery without rerunning the work already done.
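The spool-then-deliver behaviour can be illustrated with an in-memory stand-in. `Spool`, `run_once` and `deliver` are hypothetical names, and a real spool would persist to disk; the sketch shows only the property stated above, that a failed delivery is retried without rerunning the work.

```python
class Spool:
    """Keeps finished results until the server accepts them."""

    def __init__(self):
        self.pending = {}                      # job_id -> result (on disk in a real client)

    def run_once(self, job_id: str, work):
        if job_id not in self.pending:         # never rerun work that is already spooled
            self.pending[job_id] = work()
        return self.pending[job_id]

    def deliver(self, send) -> int:
        """Try to deliver every spooled result; keep what could not be sent."""
        delivered = 0
        for job_id in list(self.pending):
            try:
                send(job_id, self.pending[job_id])
            except ConnectionError:
                continue                       # server unreachable: retry on the next poll
            del self.pending[job_id]
            delivered += 1
        return delivered
```

Separating execution from delivery is what lets a mutating executor remain safe across a network outage: the effect happens once, and only the report is repeated.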

The remote client is contained by construction: on Windows the Job Object bounds duration, memory and process trees; on Linux the sandbox uses bwrap when present. On this foundation remote executors are no longer read-only. Since C7 (ADR 0183), mutating executors (file write, move and delete) run on the device too, made safe by the policy they have long required: idempotency, audit, and device-aware reversibility, meaning deterministic reverse patterns and blob backups queued to the same device for undo. One gap is known: blob-restore is not remotable. Executors whose dependencies cannot be resolved on the device stay server-only.

The operational details — PC pairing, UI installation, heartbeat, signed queue, per-OS sandboxing and current limits — live in the remote executors component contract.

7. Synt: the tool factory

When the pool cannot do something, the planner does not improvise code in the middle of the turn: it hands over to synt, the process that brings into existence what is missing. It first tries to compose existing executors; only as a documented exception does it generate a new one — in five stages, each with its own contract.

Figure 9 — The synthesis pipeline: four procedural stages on the middle tier, the code on the wise tier, then independent semantic verification, signature and birth tests. The multi-stage design converges where the single prompt failed.

Two triggers, one cascade

Mode	Trigger	Timing
Reactive	During a turn: the planner finds no executor that satisfies the request.	Synchronous — the user is waiting; composition of existing executors is tried first.
Introvert	At night: the ager walks the mnestome and finds recurrences, overlapping traces, families with the same shape.	Asynchronous, in homeostasis: it proposes merges, generalizations, specializations.

In both cases the same rule holds: synt proposes, the human approves. No self-modification without a filter; every proposal comes with its rationale, and is reversible.

The 7-layer gate

The same funnel applies to synthesized code and to skills imported from outside: no package runs on trust.

Figure 10 — The 7-layer gate, identical for synthesized and imported executors: signature, vocabulary, usage quarantine, sandbox, smoke test, semantic verification, audit. Only at the end of the funnel does a package become a trusted executor.

8. Four tiers, one deterministic routing

Tiers are abstract roles, not pinned models: fast / middle / wise are assignments you bind to whatever endpoint you have, and frontier is the optional cloud role. Several tiers may share one endpoint or use separate endpoints: the planner always sees the role.

Tier	Role	Constraint
`fast`	Short structured extractions: intent, fillers, classifications. Reasoning off.	Configured endpoint; short replies. Mandatory.
`middle`	Procedural work: synthesis stages 1-4, descriptions, judgments.	Endpoint and parameters configured for the role.
`wise`	The planner: proposes the whole plan; writes the stage-5 code.	Mandatory: it never degrades to fast.
`frontier`	An external consult when explicitly requested (e.g. analyzing an issue).	Cloud API, opt-in, with managed fallback if the key is absent.

Tier ≠ model. No GPU or NPU is required by construction: a CPU endpoint, a model you already serve, or the frontier fallback are all first-class paths. A weaker local model means weaker planning, not a broken install.

The three locks of determinism

An LLM at temperature zero is not enough to make routing reproducible: the local server stays non-deterministic because of speculative decoding with a random seed. Metnos closes the door with three locks, one per noise source:

Figure 11 — On the left, tiers as roles bound to configured endpoints; on the right, the locks that make routing reproducible: pinned seed, curated affinity for ties, and a GBNF decode grammar.

No fragile parsers. Tool use is native: the model emits structured tool_calls, and the grammar guarantees the shape upstream. There is no fishing JSON out of prose — the classic weak point of home-grown agents.

9. The memory that speeds things up

Metnos trains no models: no fine-tuning, no RLHF. Everything it learns is inspectable data — plans, traces, shortcuts — and anything learned can be read, corrected, deleted. The practical effect: the more you use it, the less it calls the model.

Figure 12 — The circle of learning without training: successful plans become shortcuts (Autopath, Fastpath ★), co-activations become mnests, nightly recurrences become synthesis proposals. Everything is readable, reversible data.
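The mnest life cycle (born from context, reinforced by use, decayed when unused) can be sketched on SQLite. The table layout, the gain, decay factor and floor, and all function names are assumptions chosen for illustration; the real schema and the ager's rules live in the component contracts.

```python
import sqlite3


def open_mnestome(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mnest ("
        " src TEXT, dst TEXT, weight REAL NOT NULL,"
        " PRIMARY KEY (src, dst))"
    )
    return db


def reinforce(db, src: str, dst: str, gain: float = 1.0) -> None:
    """Create the mnest on first co-activation, strengthen it on later ones."""
    db.execute(
        "INSERT INTO mnest (src, dst, weight) VALUES (?, ?, ?)"
        " ON CONFLICT (src, dst) DO UPDATE SET weight = weight + excluded.weight",
        (src, dst, gain),
    )


def age(db, factor: float = 0.5, floor: float = 0.1) -> None:
    """Nightly pass: decay every mnest and drop those that have faded away."""
    db.execute("UPDATE mnest SET weight = weight * ?", (factor,))
    db.execute("DELETE FROM mnest WHERE weight < ?", (floor,))


def usually_follows(db, src: str) -> list:
    """Executors that tend to follow src, strongest link first."""
    rows = db.execute(
        "SELECT dst FROM mnest WHERE src = ? ORDER BY weight DESC, dst", (src,)
    )
    return [dst for (dst,) in rows]
```

Because the memory is rows in a table rather than model weights, reading, correcting or deleting what was learned is an ordinary query.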

10. The senses: the image pipeline

To search your photos, Metnos ships nothing to anyone: three in-process extractors turn every image into three signals — what is seen, who is there, where and when — fused into one unified index queried through the ordinary vocabulary.

Figure 13 — The image pipeline: SigLIP for the scene, RetinaFace+ArcFace for identities, EXIF for place and time. The three signals converge into a unified index queried by an ordinary executor of the vocabulary.

A search arrives from the channel like any other request: the planner composes find_images_indices with the criteria extracted from the sentence, and the channel shows inline previews. Building the index is a background job, incremental and restartable, started with a sentence («index the photos in…»).

11. The channels: Telegram and web

A channel is an adapter: it converts an external interface into messages and replies, plus one optional capability — rendering buttons for confirmations and choices. Two channels come with the install; adding more does not touch the core.

Figure 14 — The two channels. The browser talks directly to the server on 8770 (streaming chat + dashboards); Telegram works by outbound long-poll, so no open ports and no public IP. Below, the pairing that decides who may speak.

Channel	What it offers
Web :8770	Chat in the browser with streaming replies (SSE), image previews, feedback badges; admin dashboards for proposals, executors, runs, safety and turns. The same API answers JSON or HTML depending on `Accept`. Admin key auto-created on first start, file with 0600 permissions.
Telegram	Your personal bot: messages, photos, inline buttons for vaglio confirmations and multiple-choice inputs. Pairing via the `/pair` command and a signed, expiring code.

The Tutor: explaining without executing

Explicit questions about how to use Metnos are intercepted at the shared HTTP/Telegram boundary, before the planner and without consuming a pending dialog. A conservative detector recognizes only the request class; the relevant card is selected neither by synonyms nor by exact phrases, but by local BGE-M3 embeddings compared with vectors held in the signed SQLite catalog.

After retrieval, the authenticated principal filters the content. Administrative procedures remain deterministic; for informational guides the local model composes the answer using only the retrieved card and the admitted instance inventory. It receives no tools and crosses the central llm slot in the serial class: it can explain, not act. An invalid catalog, a weak match, or insufficient context produces an explicit outcome, never an invented capability.

The essential split. Deterministic code governs detection, identity, audience, integrity, and safety procedures; embeddings and the local LLM are used where language variety would make a phrase table brittle.

12. Safety and reversibility

Safety is not a module: it is a chain of independent guards, and an action must pass all of them. And since even the best guard makes mistakes, the last defense is being able to go back: honest undo, by construction.

Figure 15 — Five guards in series (pairing, policy, vaglio, sandbox, signature+audit) and, below, the safety net: an undo with a closed catalog of reverse patterns, verified copies before any deletion, and honest counts.
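The move-as-COPY-then-DELETE pattern with a verified copy and an honest count can be sketched with the standard library. The result fields other than `ok_count`, the SHA-256 verification and the shape of the `reverse` record are assumptions, not the catalog's actual reverse patterns.

```python
import hashlib
import shutil
from pathlib import Path


def digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def move_files(sources: list, dst_dir: Path) -> dict:
    """Move as COPY-then-DELETE: the source is removed only after the copy is verified."""
    results = []
    for src in map(Path, sources):
        dst = dst_dir / src.name
        try:
            if dst.exists():
                raise FileExistsError(dst)              # never overwrite silently
            shutil.copy2(src, dst)
            if digest(src) != digest(dst):
                dst.unlink()
                raise OSError("copy verification failed")
            src.unlink()                                # delete only after verification
            results.append({"src": str(src), "dst": str(dst), "ok": True,
                            "reverse": {"op": "move", "src": str(dst), "dst": str(src)}})
        except OSError as exc:
            results.append({"src": str(src), "ok": False, "error": str(exc)})
    # Honest counting: only elements that were actually moved are counted.
    return {"results": results, "ok_count": sum(r["ok"] for r in results)}
```

A failure on one element does not abort the list, and it does not inflate the count either: the caller sees exactly which moves happened and how to reverse each of them.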

The power is real, which is why it is bridled. With the corresponding executor set, a Metnos instance can genuinely administer a machine (shell, sudo, packages, mounts). That makes it an operational architecture rather than a chatbot. But every privileged action passes through the vaglio with explicit confirmation, runs in the sandbox, lands in the audit; and the whole system skill can be disabled, locking Metnos out of the operating system.
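The two-stage vaglio (deterministic guard first, then a judge with a confirmation threshold) can be sketched as below. The deny list, the threshold value, the `Action` type and the three outcome strings are assumptions; the judge is passed in as a plain function so the sketch stays independent of any model.

```python
from dataclasses import dataclass

FORBIDDEN_PREFIXES = ("/etc/shadow", "/boot")        # hypothetical deny list
CONFIRM_THRESHOLD = 0.5                              # hypothetical threshold


@dataclass(frozen=True)
class Action:
    executor: str
    path: str
    privileged: bool = False


def deterministic_guard(action: Action) -> bool:
    """Hard rules first: a forbidden path is denied without consulting any judge."""
    return not action.path.startswith(FORBIDDEN_PREFIXES)


def vaglio(action: Action, judge) -> str:
    """Return 'deny', 'confirm' or 'allow'. judge(action) yields a risk in [0, 1]."""
    if not deterministic_guard(action):
        return "deny"
    if action.privileged or judge(action) >= CONFIRM_THRESHOLD:
        return "confirm"                              # ask the user on the channel
    return "allow"
```

The ordering matters: the judge is never asked about an action the deterministic guard has already refused, so no model output can talk its way past a hard rule.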

13. Component atlas

The architecture above says what the system promises. This atlas shows which component owns each promise and where its exact contract lives.

13.1 From overview to component contracts

The Metnos architecture has two levels of zoom. The upper level describes the system as a whole: its layers, organs, laws and ends. It is the path you have just followed from chapter 1.

The lower level is the set of component deep dives linked below: one HTML document per component, with the detail needed to write the code without guessing. Decisions here are not opinions — they are contracts: data schemas, function signatures, sandbox flags, error conditions. When the code and the document drift apart, the document wins and the code is adjusted; or the document is corrected on the spot — never «later».

The rule of life is short: a component is not implemented until its HTML exists, has been approved, and speaks the same language as the code already in place.

Check. Before you read on, two points should be obvious: (a) Level 1 explains what, Level 2 explains how; (b) Level 2 documents are contracts, not drafts. If either is unclear, re-read the previous paragraph before continuing.

13.2 Four nouns, now as implementation contracts

Everything in Metnos revolves around four nouns. Defining them now saves half an hour of confusion thirty lines down.

executor: An executable capability: a small program that does one thing well (read files, send mail, compute a hash, OCR a PDF, discover fresh URLs on a site). Every executor takes lists in and returns lists out; it has a manifest that describes it, an Ed25519 signature that authenticates it, and a sandbox profile that confines it. Product membership, origin, and transport remain separate axes. GitHub executors maintained by Metnos are builtin with handcrafted origin, not imports; the generated catalog is the single source for the source-tree domain breakdown.
mnest: The thread that links two executors when the planner has fired them together. It is not a code pointer, it is a trace: born from context, reinforced by repetition, decayed when unused.
mnestome: The emergent graph of all mnests. It is the system’s associative memory: it lives on SQLite, is curated by a nightly process (the ager), and gives the planner the intuition for «which executor usually follows which». The Italian counterpart of the term is mnestoma.
agent runtime: The engine that orchestrates everything: it receives the user request, breaks it into steps, picks executors, runs them in a sandbox, gathers observations, decides the next step. It implements the ReAct loop with native tool-use (local LLM, wise tier) and leans on the other components for the hard calls (Vaglio for safety, Synt for missing capabilities, Telos for ends).
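The life of a mnest (born, reinforced, decayed) can be sketched in a few lines. This is a minimal illustration, not the mnestome schema: the table layout, the constants `REINFORCE`, `DECAY` and `DROP_BELOW`, and the function names are all assumptions made for the example; the real schema lives in the mnestome component contract.

```python
import sqlite3

# All constants below are hypothetical, chosen only to make the sketch concrete.
REINFORCE = 1.0     # weight added each time two executors fire together
DECAY = 0.9         # multiplicative decay applied by the nightly pass
DROP_BELOW = 0.05   # traces weaker than this are dropped


def open_mnestome() -> sqlite3.Connection:
    """Create an in-memory graph: one row per mnest (src -> dst)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mnest (src TEXT, dst TEXT, weight REAL, "
        "PRIMARY KEY (src, dst))"
    )
    return db


def reinforce(db: sqlite3.Connection, src: str, dst: str) -> None:
    """Strengthen the trace if it exists, otherwise create it."""
    cur = db.execute(
        "UPDATE mnest SET weight = weight + ? WHERE src = ? AND dst = ?",
        (REINFORCE, src, dst),
    )
    if cur.rowcount == 0:
        db.execute(
            "INSERT INTO mnest (src, dst, weight) VALUES (?, ?, ?)",
            (src, dst, REINFORCE),
        )


def age(db: sqlite3.Connection) -> None:
    """Nightly upkeep: decay every trace, then drop the faint ones."""
    db.execute("UPDATE mnest SET weight = weight * ?", (DECAY,))
    db.execute("DELETE FROM mnest WHERE weight < ?", (DROP_BELOW,))


def usually_follows(db: sqlite3.Connection, src: str) -> list:
    """Executors that tend to follow `src`, strongest trace first."""
    rows = db.execute(
        "SELECT dst FROM mnest WHERE src = ? ORDER BY weight DESC", (src,)
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `reinforce(db, "read_messages", "move_files")` after each co-activation and `age(db)` once a night gives the planner a ranked hint through `usually_follows(db, "read_messages")`.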

Check. Try to finish these sentences out loud: «An e-mail is sent by an…», «When two executors often work together, between them a… is born», «All these threads together form the…», «Who decides the order of the steps is the…». If you answer executor, mnest, mnestome, agent runtime, you may proceed.

13.3 The component map

The documented components are organized by role, and the diagram below groups the main nodes the same way: thick black border for the central engine, green shapes for the «services» the engine consults, blue shapes for the «tissues» that hold state, bronze shapes for the peripheral organs facing the user and the environment. Arrows show who calls whom.

Component map. Solid arrow: direct call. Dashed arrow: orientation or read.

Three observations to read the diagram well.

Telos sits above everything: it is not a service one calls, it is an orientation. The runtime weighs alternatives in light of the ends declared in the workspace file of the same name (TELOS.md).
Vaglio always runs before execution, never after. Once an executor has fired, going back is not free: undo exists, but it costs history and backup blobs.
The executor pool is open: synt can compose new executors out of existing ones, or — if that is not enough — generate one from scratch, with signing and automatic install.

13.4 A second worked request

To pin the map down, let’s follow a simple request from inbox to reply: «move to ~/Archive/2026 the invoice PDFs that arrived this week».

Channel. Telegram receives the user message. The daemon checks that the sender is paired with sufficient authorisation; otherwise the message is silently dropped. Pairing means «channel + sender ID recognised»: it is obtained by replying to a signed Ed25519 code with a TTL.
Agent runtime — planning. The runtime extracts the intent (canonical verb: move; object: files; criterion: invoice-attached PDFs in the «this week» window), asks the prefilter to narrow the catalog to relevant executors, and prepares the first step of the ReAct loop.
Vaglio — guard + judge. Before the executor fires, Vaglio checks two things: that the path is not forbidden, that the shell command is not unrecoverable (rm -rf and friends). For grey-zone operations, the rule-based judge assigns a score and, above threshold, asks the user for confirmation via the three-line card (what / where / why).
Sandbox + executor. The runtime invokes read_messages inside bwrap with the flags derived from the manifest. The output comes back as a list of entries; each entry is a dict with the PDF path and metadata.
Pipe. The next step is move_files; it takes the previous step’s list via from_step: N. Ground truth lives in the scratchpad: the planner sees not the whole list, but a synthetic view large enough to decide.
Mnest + mnestome. The pair read_messages → move_files reinforces an existing trace in the graph; if absent, it creates one. The nightly ager will do upkeep: decay, merges, drops.
Reply. The runtime answers the user via Telegram with the number of files moved and the first reason for skipping if any have been left out. The final_answer includes a truncation marker if the input list had been capped.
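The pipe step in the walk-through can be illustrated with a small resolver. This is a sketch under stated assumptions: the notations `from_step` and `{{stepN.field}}` come from the handbook, but the shape of a step result (a dict with an `entries` list plus scalar fields), the `entries` argument name and the function `resolve_args` are hypothetical.

```python
import re

# Matches scalar references such as {{step1.count}}.
PLACEHOLDER = re.compile(r"\{\{step(\d+)\.(\w+)\}\}")


def resolve_args(args: dict, results: dict) -> dict:
    """Replace piping references in a step's arguments with real data.

    `results` maps a step number to that step's output (assumed shape:
    {"entries": [...], <scalar fields>}).
    """
    resolved = {}
    for key, value in args.items():
        if key == "from_step":
            # List piping: hand over the whole list of the earlier step.
            resolved["entries"] = results[value]["entries"]
        elif isinstance(value, str):
            # Scalar piping: substitute each {{stepN.field}} in the string.
            resolved[key] = PLACEHOLDER.sub(
                lambda m: str(results[int(m.group(1))][m.group(2)]), value
            )
        else:
            resolved[key] = value
    return resolved


# Step 1 (read_messages) produced two PDF entries.
results = {
    1: {"entries": [{"path": "/tmp/a.pdf"}, {"path": "/tmp/b.pdf"}], "count": 2}
}
# Step 2 (move_files) refers to them without repeating the data.
step2_args = {
    "from_step": 1,
    "dest": "~/Archive/2026",
    "note": "moving {{step1.count}} files",
}
resolved = resolve_args(step2_args, results)
```

The point of the design is visible in `step2_args`: the planner writes a reference, not the list itself, so a large observation never has to pass through the model's context.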

Check. Without looking back: who talks to the user? who decides the sequence? who fires the executors? who checks the operation is allowed? who remembers that two capabilities went hand in hand? If your answers are channel, agent runtime, agent runtime, vaglio, mnestome, the map is yours.

13.5 Canonical component deep dives

Below, the documents are grouped by role. All have an Italian counterpart at /it/architecture/.

Central engine

Component	Scope
`Cognitive engine`	The four-layer engine that plans and executes (replaces the iterative step-by-step planner): Fastpath serves user-approved shortcuts (hash + BGE-M3 cosine), Autopath recognises and reuses skills learned from feedback (sqlite, semantic + intent match), Validator checks the plan before running it, and the Engine block proposes the whole plan in a single local LLM call (Proposer), executes it deterministically (Executor), recovers by classifying the error into 4 classes (Recovery) and, if there is no way out, honestly explains what is missing (Terminator). One proposal instead of six calls: faster, zero cost, and faster still as it learns.
`agent_runtime`	ReAct loop, mode router, data piping between steps (`from_step: int` for lists, `{{stepN.field}}` for scalars), scratchpad, mnestome hooks. The engine that calls everyone else.
`scratchpad`	Per-turn temporary store: holds large observations without crowding the planner’s context. Builtin `scratchpad_read` with head/tail/range.
`grammar`	Constrained generation via GBNF for the PLANNER `tool_call`. Discriminated union name+args, recursive schema, contextual pool filter, post-decode validator. Solves thinking-loop, args mix-match, arbitrary escape-hatches. Bench convergence 50% → 100%.
`fastpath and autopath`	Two layers of memoization before the planner. L0 — fastpath: the shortcut for the same request, recognised by fingerprint (hash) or semantic proximity (BGE-M3 cosine) and served again with the plan already prepared, concrete arguments included. L1 — autopath: the generalised plan for a cluster of related requests (the skeleton without arguments), promoted from repeated positive feedback and kept in `autopath.sqlite` (tables `autopaths`/`anti_autopaths`). Hybrid argument extractor (rule + memory + optional LLM).
`Tutor`	Local pre-planner guide: compiles admitted manifests, allowlisted public documentation, and curated procedures into a signed BGE-M3 catalog. It answers with no tools, filters audience before the model, separates explanations from actions, and reports gaps. Includes the F2 architecture and F3/F4 roadmap.
`mail accounts`	Configuration of IMAP/SMTP mailboxes, including non-Google providers: encrypted bindings, multiple accounts, env-file compatibility, and the boundary with Gmail-specific features.
`lifecycle`	Unified system-change lifecycle: one `change_intent` object, one state machine (proposed→accepted→applied→observed→finalized, plus staged/rejected/failed/rolled_back), one UI `/admin/changes`. Funnels the proposal sources (telos, introspective, synt, fast-path, feedback) into one queue. Nightly pipeline: materialization (cross-source fingerprint dedup), per-kind application, observation with a grace window and physical rollback.
`model virtualization`	How Metnos picks and swaps its models: three facades (`get_llm`, `get_embedder`, `get_vlm`) that ask for a role, not a model, and translate the role by reading the `{llm,embedding,vlm}_tiers.toml` files. Change a model = edit a TOML, never the code. Segregation (no consumer imports the concrete embedder any more), embedding autonomy (BGE-M3 and SigLIP run ONNX in-process), remote endpoint via `provider="http"`. A lightweight subset of the supranet pattern: Protocol + factory, no registry/DI.
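The `lifecycle` row above names one state machine for every system change. A minimal sketch of such a machine follows. The happy path (proposed, accepted, applied, observed, finalized) and the four side states are taken from the table; the edges into and out of the side states are assumptions made for the example, and `advance` and `IllegalTransition` are hypothetical names.

```python
# Happy path as stated in the handbook.
HAPPY_PATH = ["proposed", "accepted", "applied", "observed", "finalized"]

# Allowed transitions. The happy-path edges follow the handbook; the edges
# involving staged / rejected / failed / rolled_back are ASSUMED here.
TRANSITIONS = {
    "proposed": {"accepted", "rejected", "staged"},
    "staged": {"accepted", "rejected"},
    "accepted": {"applied", "failed"},
    "applied": {"observed", "failed"},
    "observed": {"finalized", "rolled_back"},
    # Terminal states: nothing follows them.
    "finalized": set(),
    "rejected": set(),
    "failed": set(),
    "rolled_back": set(),
}


class IllegalTransition(ValueError):
    """Raised when a change_intent tries to skip or reverse a state."""


def advance(state: str, new_state: str) -> str:
    """Return the new state if the move is allowed, otherwise raise."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise IllegalTransition(f"{state} -> {new_state} is not allowed")
    return new_state


def walk(states: list) -> str:
    """Replay a sequence of states from the first one, validating each hop."""
    current = states[0]
    for nxt in states[1:]:
        current = advance(current, nxt)
    return current
```

Keeping the table as data rather than as scattered `if` statements means the set of legal moves can be compared element for element against the contract.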

Executable capabilities and their birth

Component	Scope
`executor`	Anatomy of an executor: TOML manifest, Ed25519 signature, sandbox profile, lifecycle, and vector contract (list-in, list-out). The current count lives in the generated catalog rather than this prose.
`executor catalog`	Inventory generated from signed manifests: every first-party executor grouped by canonical domain, with purpose, criticality, platforms, and source location.
`remote_executors`	How a controlled part of the executor catalog can run on registered devices: `metnos-client`, device pairing, manifest placement, per-OS sandboxing, UI installation and explicit Windows/Linux limits.
`intelligent executors`	Narrow-mandate agents behind a regular executor contract: bounded adaptive loops, deterministic resolvers before models, unchanged authority, and verifiable postconditions. `login_sites` is the first example.
`synt`	How new executors are born: five-stage pipeline (naming, signature, tests, description, code), reactive cascade (compose → generate) and introspective cascade (dedupe, generalize, specialize).
`skill_importer`	Imports third-party skills from `agentskills.io` and turns them into Metnos executors: five-stage pipeline (fetch, parse, map, wrap, register), verb mapping table `skill_vocab_map.json`, wrapping helpers with verb boundary, and CLI `metnos-skills import|list|uninstall|status|evaluate`.
`skills & backends`	Why skills and backends are two orthogonal axes: the backend says HOW you run a `verb_object` (configuration, `backend_resolver`, invisible to the LLM), the skill says WHETHER/WHICH capabilities are unlocked (activation, dormancy, sandbox). Three tiers (core / first_party / imported), multi-provider architecture transparent to the planner, promotion with a one-off frontier.

Associative memory

Component	Scope
`mnest`	The co-activation trace between two executors: anatomy, lifecycle, decay, persistence, proto-mnest.
`mnestome`	The emergent graph of all mnests: SQLite data schema, atomic operations, nightly ager, snapshot. Italian term: mnestoma.

Safety, rules, sandbox

Component	Scope
`vaglio`	Binary guard (forbidden paths, near-unrecoverable shell commands) and graded rule-based judge with configurable threshold. Probabilistic LLM judge deferred.
`policy`	Closed capability registry, autonomy × capability table (ReadOnly / Supervised / Full), per_target persistent grants, combined `effective_outcome`.
`sandbox`	`bwrap` profile derived from the manifest: read-only mount of the code, network isolation if no capability requires it, graceful fallback if `bwrap` is missing. Landlock deferred.
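The two stages of Vaglio (a binary guard, then a graded rule-based judge with a threshold) can be sketched as follows. Everything concrete here is an assumption for illustration: the forbidden roots, the shell patterns, the rule weights, the threshold value and the names `Action`, `guard` and `vaglio` are hypothetical, not the component's actual rules.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical guard configuration.
FORBIDDEN_ROOTS = [PurePosixPath("/etc"), PurePosixPath("/boot")]
UNRECOVERABLE = [re.compile(r"\brm\s+-(rf|fr)\b"), re.compile(r"\bmkfs\b")]

# Hypothetical judge rules: (weight, predicate on the action).
RULES = [
    (0.5, lambda a: a.executor.startswith(("move_", "delete_", "write_"))),
    (0.3, lambda a: len(a.paths) > 10),
]
THRESHOLD = 0.4  # assumed; the handbook only says it is configurable


@dataclass
class Action:
    executor: str
    paths: list = field(default_factory=list)
    shell: str = ""


def guard(action: Action) -> bool:
    """Binary check: False means the action is refused outright."""
    for raw in action.paths:
        path = PurePosixPath(raw)
        if any(path == root or root in path.parents for root in FORBIDDEN_ROOTS):
            return False
    return not any(rx.search(action.shell) for rx in UNRECOVERABLE)


def vaglio(action: Action) -> str:
    """Return 'deny', 'confirm' (ask the user) or 'allow'."""
    if not guard(action):
        return "deny"
    score = sum(weight for weight, applies in RULES if applies(action))
    return "confirm" if score > THRESHOLD else "allow"
```

With these sample rules, `vaglio(Action("move_files", ["~/Archive/2026"]))` returns `"confirm"`, which is the point where the three-line approval card would be shown.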

User-facing channel

Component	Scope
`channel`	Channel adapter (`Protocol` with `send` / `poll`) and first concrete implementation: `TelegramChannel` with long-poll, `last_update_id` persistence, daemon and systemd user unit. Multi-user: `send_to(chat_id, OutboundMessage)` + `/start <token>` for guest pairing.
`http_api`	Second HTTP server (port 8770): uniform agent channel on `POST /agent/turn` (SSE + JSON), `/admin` dashboard in htmx + Jinja2 + uPlot, user management, introspective proposals, scheduler runs. Auth via admin key (7-day cookie) or device Bearer.
`pairing`	Two paths: `/pair` with TTL-bound signed Ed25519 codes for technical devices, and `/start <token>` short-lived for family/guests (multi-user). Registry `users.db` with host + guests, `user_channels`, `resolve_recipients`. Host bootstrap on first run.
`approval_ux`	Three-line card for confirmation prompts: `render_approval_card`, `ApprovalRequest`, modulation full / medium / short by recurrence, Telegram dispatcher `approve:<tok>` / `reject:<tok>`.

Multilingual

Component	Scope
`multilang`	Three multilingual layers: LLM prompts (`runtime/prompts/<lang>/<role>.j2`), executor descriptions (TOML manifest + companion JSON), user-facing messages (`i18n.sqlite`). Latest-wins source-of-truth: no language is canonical by construction; the latest editor wins. Admin command `metnos-prompts add-language <code>`. Opt-in `frontier` tier for higher quality.

Visibility and ends

Component	Scope
`observability`	Static HTML dashboard aggregating Metnos’s data sources (mnestome, pairings, turns, Vaglio decisions, scheduler). Generated on demand: no live server, no JavaScript.
`telos`	The user’s ultimate ends, the alignment function, the bother budget with scheduler quotas, the non-renunciation telos (`t.coltivazione_strumenti`) and the stop clause. The `TELOS.md` file lives in the workspace.

13.6 Vocabulary and primitives

The closed vocabulary stands at 26 actions (read, write, move, delete, create, find, list, filter, sort, group, classify, get, set, send, describe, render, extract, compress, compute, compare, change, order, share, open, login, act) and 26 objects (files, dirs, packages, messages, events, contacts, places, processes, urls, numbers, images, signatures, texts, proposals, persons, tasks, inputs, approval, credentials, issues, pulls, calendars, entries, lists, skills, sites). Qualifiers come in four families: format/encoding, modality, safety policy, and provider (for specific non-default backends like _google_workspace). The vocabulary is centralised in runtime/vocab.py.
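What "closed" means in practice is that a name either decomposes into a known action and a known object or it is rejected. The sketch below checks this using the two lists quoted above. It is an illustration, not the contents of `runtime/vocab.py`: the function `parse_name` and the assumption that a qualifier is simply whatever follows the object are hypothetical.

```python
from typing import Optional, Tuple

# The two closed lists, copied from the paragraph above.
ACTIONS = frozenset(
    "read write move delete create find list filter sort group classify get "
    "set send describe render extract compress compute compare change order "
    "share open login act".split()
)
OBJECTS = frozenset(
    "files dirs packages messages events contacts places processes urls "
    "numbers images signatures texts proposals persons tasks inputs approval "
    "credentials issues pulls calendars entries lists skills sites".split()
)


def parse_name(name: str) -> Tuple[str, str, Optional[str]]:
    """Split `verb_object[_qualifier]` and reject anything outside the lists.

    Assumption: the qualifier, when present, is the remainder after the object.
    """
    verb, _, rest = name.partition("_")
    if verb not in ACTIONS:
        raise ValueError(f"unknown action: {verb!r}")
    obj, _, qualifier = rest.partition("_")
    if obj not in OBJECTS:
        raise ValueError(f"unknown object: {obj!r}")
    return verb, obj, qualifier or None
```

For example, `parse_name("move_files")` yields `("move", "files", None)`, while a name such as `download_files` raises because `download` is not one of the 26 actions.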

The planner currently exposes nine consumers for lists of entries. The five structural or numeric operators are filter_entries, filter_lists, sort_entries, group_entries, and compute_entries. The four in-process semantic helpers are classify_entries, compare_entries, extract_entries, and describe_entries. The first group, respectively, filters one list, combines two lists, sorts, merges/deduplicates, or computes an aggregate; the second classifies, compares semantically, extracts structured records, or summarizes. The list is checked against the runtime registry and signed manifests. Concrete user example: «Is there an HLT appointment that overlaps with an MNM one in the next 3 months?» → the planner builds in five steps read_events → filter HLT → filter MNM → filter_lists(op=overlap) → final_answer.
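The overlap example can be traced with two tiny list consumers. The names `filter_entries` and `filter_lists` and the `op=overlap` value come from the handbook; the signatures, the entry shape (`title`, `start`, `end`) and the sample events are assumptions made for this sketch.

```python
from datetime import datetime


def filter_entries(entries: list, contains: str) -> list:
    """Keep the entries whose title contains the given tag (list in, list out)."""
    return [e for e in entries if contains in e["title"]]


def filter_lists(left: list, right: list, op: str) -> list:
    """Combine two lists; only the 'overlap' operator is sketched here."""
    if op != "overlap":
        raise ValueError(f"unsupported op: {op!r}")
    # Two intervals overlap when each starts before the other ends.
    return [
        {"left": a, "right": b}
        for a in left
        for b in right
        if a["start"] < b["end"] and b["start"] < a["end"]
    ]


# Hypothetical output of read_events.
events = [
    {"title": "HLT checkup", "start": datetime(2026, 9, 1, 10, 0),
     "end": datetime(2026, 9, 1, 11, 0)},
    {"title": "MNM review", "start": datetime(2026, 9, 1, 10, 30),
     "end": datetime(2026, 9, 1, 11, 30)},
    {"title": "MNM sync", "start": datetime(2026, 9, 2, 9, 0),
     "end": datetime(2026, 9, 2, 10, 0)},
]

hlt = filter_entries(events, "HLT")
mnm = filter_entries(events, "MNM")
clashes = filter_lists(hlt, mnm, op="overlap")
```

Each step stays deterministic: the model only has to choose the operators and their order, while the comparison of dates is plain code.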

The verified live inventory contains 115 standard, signed executors: 82 handcrafted executors in the main tree, 16 builtin GitHub executors with handcrafted origin, and 17 runtime builtins using in-process transport. The Composer sees the same contract for all of them and does not select by transport. Producers use five orthogonal verbs: find for patterns, get for ids/state, read for blobs from a source, list for containers, and filter for reduction.

13.7 Contract-writing conventions

Every component-contract file follows the same template as this handbook's architectural overview:

navy / sage / bronze palette; CSS inline, no external dependencies;
every list presented as exhaustive names its source of truth and is either generated from it or covered by an element-for-element equality test;
title page, numbered table of contents, chapters with stable id;
at least one inline SVG figure (never a PNG: diagrams must remain searchable and PDF-printable);
a Contract section (Protocol + error conditions), an Implementations one, a Conformance tests one;
a sticky breadcrumb navigation at the top;
prescriptive rules for LLMs follow the MUST / MUST NOT / OK / ERROR pattern, four lines max;
option labels as (a)/(b)/(c), never as Greek letters;
no third-party real names: only «Roberto» or generic terms (guest, invited family member).

14. The principles, in eight cards

If you remember only eight sentences from this document, make it these. Everything else — code, prompts, conventions — follows from here.

1. Vectorized by construction. Every executor accepts a list and returns a list, even a degenerate one. The batch version is the executor: *_batch does not exist.

2. A closed, governed vocabulary. Everything that acts has a composable name inside a closed grammar. A new term enters only if necessary, general and understandable.

3. No silent failure. Counts reflect what actually happened; truncation is declared, not hidden; a partial result presented as complete is a bug.

4. Deterministic > LLM. Where an automaton or a table suffices, the model is not used. The LLM enters where an equipotent parser would genuinely be too complex, and it enters constrained.

5. Never an implicit delete. Every move is copy → check → delete; never DELETE without a confirmed COPY.

6. Reversibility with a rationale. Every evolutionary act (synthesis, merge, archive) is reversible and motivated. Saying yes costs less when you can go back.

7. i18n by construction. Every user-facing string and prompt is per-language data: a new language is a translation pack, not a fork of the code.

8. Understandability as a duty. If the user does not understand the system, the system is useless. Simplicity is not aesthetics: it is the criterion that selected everything else.
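Three of these cards (vectorized by construction, no silent failure, never an implicit delete) can be read together in one small function. The sketch below is an illustration only: `safe_move`, its result fields and the use of SHA-256 as the check are assumptions, not the contract of the real `move_files` executor.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path: Path) -> str:
    """Hash a file in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_move(sources: list, dest_dir: Path) -> list:
    """Move files as copy -> check -> delete; list in, list out.

    Every input gets exactly one result entry, so a skipped file is
    reported with its reason instead of silently disappearing.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for src in sources:
        dst = dest_dir / src.name
        if not src.is_file():
            results.append({"src": str(src), "moved": False, "reason": "source missing"})
            continue
        if dst.exists():
            results.append({"src": str(src), "moved": False, "reason": "destination exists"})
            continue
        shutil.copy2(src, dst)                # 1. copy
        if sha256(src) != sha256(dst):        # 2. check
            dst.unlink()
            results.append({"src": str(src), "moved": False, "reason": "checksum mismatch"})
            continue
        src.unlink()                          # 3. delete, only after a confirmed copy
        results.append({"src": str(src), "moved": True, "reason": None})
    return results
```

A caller can then report the number of entries with `moved` set to true and quote the first `reason` among the rest, which matches the shape of the reply described in the worked request.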

15. What Metnos is NOT

Half of the design lives in the no's. Every temptation to add an item from this list must be resisted.

Not a permissionless plug-in framework. The core is domain-independent, but every concrete instance has a bounded, admitted executor set and explicit policies. A new domain is expressed through governed executors and skills, not by letting arbitrary packages inherit the agent's privileges.
It does not run third-party skills as-is. Drop-in formats are execution of someone else's code with your privileges. Here every package passes the 7-layer gate, or it does not run (ch. 7).
It trains no models. No fine-tuning, no RLHF. Growth is inspectable memory + synthesis behind the human filter (ch. 7 and 9).
Not a cloud agent. It runs at home; the frontier is an explicit consult, never the residence. No opening you did not choose.
Not an IDE nor a dev assistant. It does not write code in other projects on your behalf; at most it analyzes with read-only executors.
Not a home-automation replacement. It can ask a home-automation system; it does not duplicate it.
Not multi-channel at all costs. Two channels done well; the others when truly needed.

16. Where to go next

You now have both levels: the system from above and the map of its component contracts. Use the atlas for implementation detail, or continue with the tour, glossary and design dialogues.

reference

Component contracts

Return to the atlas and open the exact implementation contract you need.

operational reference

Domains and examples

What you can ask Metnos, domain by domain, with natural phrases ready to adapt.

introductory guide

The interface

The two channels, the Settings sections, and the map of pages with their navigation paths.

tour · 10 min

Quick Tour

The fast lap with screenshots: what using it feels like, before studying it.

reference

Glossary

Every project term, defined once and linked everywhere.

dialogue · 40 min

Dialogue on ends and limits

The Galilean origins: teleology, the 4 Laws, the vaglio. Why a proactive agent needs a brake.

dialogue · 45 min

Dialogue on executors

The technical foundation: executors, mnest, mnestome, traces, ager, the six original principles.

component contract

Remote executors

How Metnos moves selected executors from the server to a registered PC while keeping policy, audit and explicit limits.

code

The repository

AGPL-3.0, pre-1.0: the public subset of the daily-driven instance, installer included.

Metnos — Architecture Handbook (July 2026).
mētis + noûs: cunning intelligence in the service of the mind — on your own hardware.
Bilingual IT+EN documentation at metnos.com; code at github.com/brunialti/metnos.
