
Metnos

Architecture — Introduction
Version 2.0 — June 2026
Reference: Metnos 0.1.0 (pre-1.0) — a daily-driven system
Self-contained HTML — printable as PDF

Audience: anyone who wants to understand, in 30 minutes, how Metnos is built and why,
without jargon but without naivety. Fifteen chapters, fifteen diagrams.

Contents

  1. What Metnos is
  2. The three bets
  3. The key concepts, in seven cards
  4. The layered architecture
  5. Anatomy of a multitool turn
  6. Executors: vectorized by construction
  7. Synt: the tool factory
  8. Four tiers, one deterministic routing
  9. The memory that speeds things up
  10. The senses: the image pipeline
  11. The channels: Telegram and web
  12. Safety and reversibility
  13. The principles, in eight cards
  14. What Metnos is NOT
  15. Where to go next

1. What Metnos is

Metnos is a self-hosted personal assistant with an unusual idea at its core: instead of shipping with a fixed catalog of tools, it synthesizes its own executors — small signed programs, generated on the fly inside a closed vocabulary — and orchestrates them with a local LLM planner. The cloud is required neither for thinking nor for acting: frontier models are an optional consult, not the engine.

The name comes from mētis (cunning intelligence) + noûs (mind). It lives on a machine under your physical and legal control; you talk to it from the channels you already use — Telegram or the browser (port 8770) — and it touches files, mail, photos, calendar, web and GitHub: only what you switch on, one skill at a time.

[Diagram: Metnos at a glance. On your machine (data, logic and models stay here) runs the Metnos process: channels (Telegram daemon, web server :8770); the mind (ch. 5: intent → plan memory → Mētis engine, a whole plan in ONE LLM call, deterministic routing: same request, same plan); the guards (ch. 12: policy, vaglio (consent), sandbox); the vectorized executors (ch. 6: 79 signed in the repo plus those synthesized on the fly, list in → list out). Executors act on the backends and skills of your choice: files and folders, mail (IMAP/SMTP), photos and indices, calendar, web and search, GitHub. The local LLM is llama-server :8080, one instance behind the fast, middle and wise tiers (reference instance: a quantized ~35B MoE). Persistent memory: mnestome, fast-paths, undo history, audit (SQLite). Outside the machine: the phone (Telegram, anywhere, confirmation buttons, outbound long-poll), the browser (chat and admin dashboards, HTTP :8770) and the frontier tier (cloud, opt-in: a consult when asked for, never the engine). Self-hosted: no mandatory dependency on third-party services; the cloud is a door you open, not a room you live in.]
Figure 1 — Metnos at a glance. The process lives on your machine: channels receive, the mind plans with the local LLM, the guards filter, executors act on the backends you enabled. The frontier tier is the only thing outside the fence — and it is opt-in.

The identity card

Item | Actual state
Shape | Python ≥ 3.11 process, executor-based microarchitecture; ReAct runtime with one-shot planning (the Mētis engine, ch. 5).
Tools | 79 signed executors in the repo, plus those synthesized on the fly by the instance (ch. 7) and those imported behind a gate (ch. 7). All vectorized: list in, list out.
Brain | Local LLM served by llama-server; four abstract tiers fast / middle / wise / frontier (ch. 8). Frontier = cloud opt-in.
Channels | Telegram (outbound long-poll, no open ports) + web on port 8770 (chat and admin dashboards), ch. 11.
Senses | In-process image pipeline: semantics + faces + EXIF in one unified index (ch. 10).
Language | i18n by construction: every string and prompt is per-language data. IT + EN validated; other languages = drop-in translation packs (not yet tested).
License / status | AGPL-3.0; pre-1.0. Public repo: github.com/brunialti/metnos, a deterministic export-subset of the daily-driven instance.
An honest showcase, not a polished product. Metnos is a real system, used every day — but built by one person, for one person, on one machine. It is shared so that homelab and AI-architecture enthusiasts can read it, run it, and build on it. Many capabilities exist but have barely been exercised outside the reference instance.

2. The three bets

The whole project rests on three architectural bets. They are deliberate positions, not optimizations: each one reverses a widespread habit of agent frameworks.

[Diagram: three bets, one system.]
  1. Closed vocabulary (ch. 3, 6, 7): tools are not imported on trust; they are synthesized inside a closed, audited grammar. «don't trust the package: the package has to earn its place»
  2. Local first (ch. 8): the planner is an LLM on your hardware. No cloud round-trip to think or act; the frontier is an optional consult, not the engine.
  3. Determinism (ch. 5, 8): same request → same plan, every run: pinned seed, ties broken by curated data, grammar. Auditable and testable like software, not like a prompt.
Together: a reproducible, verifiable agent of your own.
Figure 2 — The three bets. Each one reverses an agent-framework habit: skills imported on trust, cloud-first design, the LLM as an oracle re-rolled every turn.

The comparison, with no discounts

Aspect | Typical agent framework | Metnos
Tools | Hand-written, imported or generated free-form, then run as-is with the assistant's privileges | Synthesized at runtime too, but from a closed, audited vocabulary: signed, aged, smoke-tested and screened before they can ever run
Safety | Trust the author of the package | Don't trust the package: the package must pass the checks (7-layer gate, ch. 7)
LLM | Often cloud-first | Local first; frontier opt-in
Routing | The model picks a tool each turn (non-reproducible) | Deterministic by construction: seed-pinned local inference, ties broken by curated affinity (ch. 8)
Output | Free-form, different per tool | Uniform: list in / list out, pipeable between steps (ch. 6)
Undo | Rare or best-effort | First-class: a closed catalog of reverse patterns, moves = COPY-then-DELETE, honest ok_count (ch. 12)
Language | English only, strings in the code | i18n by construction: strings and prompts are per-language data
Why determinism pays off. Most agents treat the LLM as an oracle to re-roll: ask twice, get two different plans. Metnos makes the opposite bet: a local planner, constrained to a closed vocabulary, can be made reproducible. Routing can then be measured, put under regression tests, audited — like ordinary software. And it compounds: a request that has been solved once is replayed by a fast path with no LLM call at all (ch. 9).

3. The key concepts, in seven cards

Seven words carry the whole document. Defining them now saves you half an hour of confusion thirty lines from here; each one has its own microdesign page in architecture/.

executor — an executable capability: a small program that does one thing well (read files, send an email, move messages, search photos). It accepts lists as input and produces lists as output, carries a manifest that describes it, an Ed25519 signature that authenticates it and a sandbox profile that confines it. It is the only class of things that act in the system.
closed vocabulary — every executor is named verb_object[_qualifier[_descriptor]], composing 23 canonical actions and 22 canonical objects plus qualifiers in four families. It is not an aesthetic convention: it is the boundary of what the system can name — and therefore synthesize. New terms enter only through explicit governance (necessary · general · understandable).
manifest — the TOML identity card of an executor: a description in prescriptive chapters (SCOPE / PATTERN / NOT / OUT), the argument schema, affinity keywords, the reversibility pattern, the code digest. It is not documentation for humans: it is the tool's prompt, written so that a mid-size LLM uses it well (ch. 6).
synt — the process that brings into existence what the pool cannot do yet: a cascade of strategies ordered by cost that first composes existing executors and only as a documented exception generates new code, in five stages plus a semantic check (ch. 7). It proposes; the human approves.
vaglio — (Italian for «sifting») the filter that always sits before execution: a deterministic guard (forbidden paths, unrecoverable commands) followed by a judge that weighs grey-zone operations and, above threshold, asks the user for explicit confirmation with buttons on the channel (ch. 12).
mnest · mnestome — a mnest is the thread linking two executors that were activated together: it is born from context, reinforced by use, and decays if not reused. The mnestome is the graph of all mnests: the system's associative memory, on SQLite, curated by a nightly process (the ager). It gives the planner the intuition of «which executor usually follows which» (ch. 9).
skill ↔ backend — two orthogonal axes: a skill decides whether a group of capabilities is active, trusted and configured (dormant until its prerequisite appears); a backend decides how an action runs against a concrete service (calendar = local ICS or Google), chosen by configuration — never by the LLM. The planner never sees the provider.

The anatomy of a name

The closed vocabulary is the project's most fertile idea: it makes names composable (the planner can predict what a capability it has never seen is called), filterable (the prefilter reasons over verb and object) and synthesizable (synt cannot name anything outside the grammar).

[Diagram: anatomy of a name, verb_object[_qualifier[_descriptor]]. Example: find_images_indices_dry-run.]
  action: 23 canonical verbs (read, write, move, find, get, list, filter, send, …)
  object: 22 canonical objects (files, messages, events, images, urls, entries, …)
  qualifier (opt.): 4 families (format · mode · safety · provider)
  descriptor (opt.): kebab-case, max 30; a behavioural variant with identical arguments
The 5 producer verbs are orthogonal; the axis is the primary input, never a synonym:
  find = pattern / query
  get = known ids or snapshot
  read = id → content
  list = enumerate the container
  filter = predicate over a list
Example names: read_messages, move_files, get_urls, classify_entries, write_files_doc, find_issues_github.
The vocabulary is CLOSED: a new term enters only if necessary, general and understandable to a mid-size LLM. Synonyms before extension; human escalation for every new word. The grammar decides what is nameable.
Figure 3 — The anatomy of a name. Four positional levels, the last two optional; the five producer verbs are distinguished by their primary input, so the planner never has to choose among synonyms.
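
The positional grammar can be illustrated with a small parser. This is a minimal sketch, not the project's own code: the verb and object sets contain only the terms that appear in this document (the real vocabulary has 23 verbs and 22 objects), the qualifier is not checked against its four families, and "max 30" is read as 30 characters. The names `parse_executor_name` and `ExecutorName` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Partial, illustrative subsets: only terms named in this document.
VERBS = {"read", "write", "move", "find", "get", "list", "filter", "send", "classify"}
OBJECTS = {"files", "messages", "events", "images", "urls", "entries", "issues"}
# Descriptor: kebab-case; "max 30" is assumed to mean 30 characters.
DESCRIPTOR_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


class ExecutorName(NamedTuple):
    verb: str
    obj: str
    qualifier: Optional[str] = None
    descriptor: Optional[str] = None


def parse_executor_name(name: str) -> ExecutorName:
    """Split verb_object[_qualifier[_descriptor]]; reject names outside the grammar."""
    parts = name.split("_")
    if not 2 <= len(parts) <= 4 or any(part == "" for part in parts):
        raise ValueError(f"expected 2 to 4 non-empty parts: {name!r}")
    verb, obj, *rest = parts
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb!r}")
    if obj not in OBJECTS:
        raise ValueError(f"unknown object: {obj!r}")
    qualifier = rest[0] if rest else None
    descriptor = rest[1] if len(rest) > 1 else None
    if descriptor is not None and (
        len(descriptor) > 30 or not DESCRIPTOR_RE.fullmatch(descriptor)
    ):
        raise ValueError(f"descriptor must be kebab-case, max 30: {descriptor!r}")
    return ExecutorName(verb, obj, qualifier, descriptor)
```

Because the underscore separates levels and the descriptor is kebab-case, a plain split is enough; a name that falls outside the sets raises instead of being guessed at.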

4. The layered architecture

Metnos is an onion: the outside talks to the world, the inside executes. Each layer trusts only the one beneath it, and privileges shrink as you move toward the core. A request — whether from a user or from a scheduled task — crosses all of them, in order.

[Diagram: the layers, from the outside in. A request crosses them in order; privileges and trust shrink downwards.]
  1. Channels: adapters to the world: Telegram daemon (pairing, buttons) · web server :8770 (chat, admin, SSE). Modules: runtime/channels/ · metnos_http_server
  2. Turn runtime: normalizes the request · literal shortcuts · intent extraction (verb + object + keywords). Modules: agent_runtime · intent_extractor
  3. Cognitive engine (Mētis): plan memory (fastpath ★ · autopath) → prefilter → proposer (1 call, GBNF) → validator → execution; targeted recovery on errors · honest terminator on dead ends (ch. 5). Modules: runtime/engine/*
  4. Guards: policy (three autonomy levels) · vaglio = guard + judge + consent · bubblewrap sandbox around every invoke. Modules: policy · vaglio · sandbox
  5. Executors: 79 signed in the repo + synthesized on the fly + imported behind the gate; all vectorized, all with their manifest. Locations: executors/ · ~/.local/…/executors/
  6. Backends & skills: the concrete provider (local files, IMAP, Google Workspace, GitHub, web…) chosen by configuration, never by the LLM. Modules: backends/ · skills
  7. Persistent tissues: mnestome · fast-path/autopath archives · undo history + blobs · append-only audit (all SQLite + filesystem). Location: ~/.local/share/metnos/
Figure 4 — The seven real layers, with the modules that implement them. The cognitive engine (layer 3) is the heart of chapter 5; the guards (layer 4) sit always between the plan and the effect.

5. Anatomy of a multitool turn

If you read only one chapter, read this one. We follow a real request — «find the spam mails and move them to the trash» — from entry to answer: four tools chained together, a single call to the model, every step measured and annotated.

5.1 The cascade, step by step

The ground rule: the model is the last resort, not the first. Memory is tried first (zero LLM, milliseconds); if the request is new, the model is asked once, for the whole plan; the execution that follows is pure deterministic mechanics.
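
The "memory first, model last" rule can be sketched as an ordered list of layers, each of which either answers or passes. This is an illustrative sketch under simplifying assumptions: the two memories are plain dictionaries keyed by the exact request text (the real Fastpath and Autopath also match by meaning), `propose` is a stand-in for the single LLM call, and the plan is recorded as soon as it is proposed rather than after a successful execution. All names are hypothetical.

```python
from typing import Callable, Optional

Plan = dict
Lookup = Callable[[str], Optional[Plan]]

stats = {"llm_calls": 0}
fastpath: dict[str, Plan] = {}  # user-approved shortcuts (0 LLM)
autopath: dict[str, Plan] = {}  # plans learned from earlier turns (0 LLM)


def propose(request: str) -> Plan:
    """Stand-in for the one LLM call that proposes the whole plan."""
    stats["llm_calls"] += 1
    return {"steps": [], "final_message": f"handled: {request}"}


# Order matters: memory layers come before the model.
LAYERS: list[tuple[str, Lookup]] = [
    ("fastpath", fastpath.get),
    ("autopath", autopath.get),
    ("proposer", propose),
]


def handle_turn(request: str) -> tuple[str, Plan]:
    """Return the plan and the name of the layer that answered."""
    for name, lookup in LAYERS:
        plan = lookup(request)
        if plan is not None:
            if name == "proposer":
                autopath[request] = plan  # next time: same answer, 0 LLM
            return name, plan
    raise LookupError("no layer produced a plan")
```

Calling `handle_turn` twice with the same request reaches the proposer only the first time; the second call is served from `autopath` and leaves `stats["llm_calls"]` unchanged.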

[Diagram: one turn, from entry to answer, for «find the spam mails and move them to the trash».]
Main path: literal shortcuts (a closed table: «what time is it», «where am I», «undo»…; microseconds; no match) → intent_extractor (fast-tier LLM, reasoning off, ~0.4 s: verb = move, object = messages, kw = spam…; compound requests → an ordered list of clauses) → Fastpath ★ (user-approved shortcuts: exact fingerprint <5 ms · semantic similarity <150 ms; 0 LLM; miss) → Autopath (plans learned from successful turns; searches by meaning of the request, then by exact intent; 0 LLM; miss: new request) → prefilter (0 LLM; pool: find_messages, classify_entries, filter_entries, move_messages; rank: verb+object » qualifier » affinity (cap +3); same query → same pool, same order) → Mētis Proposer (wise tier, ONE call: steps + links + final message; N adaptive candidates, each constrained by the GBNF grammar; verb-filter on the pool · early-stop if the first convinces · teleological ranking) → Validator (0 LLM: do the tools exist? are the args well-formed? do references point to real steps? A trivial error → 1 re-proposal, never run) → deterministic execution (1 find_messages → 42 entries; 2 classify spam / not spam; 3 filter → 12 entries; 4 move_messages, vaglio consent → ok_count=12) → render of the final message ("Moved ${step4.ok_count} mails to the trash." → «Moved 12 mails to the trash.») → Autopath records the successful plan (ch. 9).
For each step: resolve from_step and placeholders → vaglio → invoke in sandbox → observation. Safety caps: max 12 steps per turn · same executor max 3 times in a row.
Error paths, when a step fails: targeted recovery classifies the failure (wrong tool · wrong args · missing input), re-proposes excluding the failed tool and re-runs; the Terminator is the honest dead end («I cannot solve: X. To proceed: Y.»), which records the gap and never invents an answer.
Per-phase telemetry: intent_ms · prefilter_ms · vaglio_ms · exec_ms. The log records which layer answered and the ms of every phase, so every turn is measurable and comparable.
Figure 5 — The anatomy of a multitool turn. Shortcuts (0 LLM) are tried first; if the request is new, the Proposer asks the model for the whole plan in a single constrained call; execution is deterministic, with the vaglio in front of the only state-changing step. On the right, the two error paths: targeted recovery and the honest dead end.
  1. Literal shortcuts. A closed table recognizes the most common phrases («what time is it») in microseconds. Here: no match.
  2. Intent. One call to the fast tier (reasoning off, ~0.4 s) extracts the canonical verb, the object and keywords. Compound requests become an ordered list of clauses, each with its own pool.
  3. Plan memory. Fastpath (shortcuts you approved with the ★ button) and Autopath (plans learned on their own) answer without the model if they recognize the request. Here: miss, it's the first time.
  4. Prefilter. The catalog shrinks to the relevant pool for the clause: verb+object match, qualifier bonus, and — to break ties among siblings — the curated affinity bonus (cap +3). All deterministic: same query, same pool, same order.
  5. Mētis Proposer. ONE call to the wise tier produces the whole plan: steps, links, final message. It generates up to N candidates (adaptive, with early-stop), each physically constrained by the pool's GBNF grammar; a teleological ranking picks the best.
  6. Validator. A typecheck of the plan before running it: existing tools, well-formed args, real references. A trivial error costs one re-proposal, not one wrong execution.
  7. Execution. Pure mechanics: for every step the runtime resolves the placeholders, passes through the vaglio, invokes in the sandbox, accumulates the observation. Caps: 12 steps per turn, same executor max 3 times in a row.
  8. Closing. The final message is a template filled with the real results. If the turn succeeds, Autopath records it: next time we jump straight to point 3.
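
The prefilter ranking in step 4 can be sketched as a sort over a lexicographic key. This is a minimal sketch under stated assumptions: "verb+object » qualifier » affinity" is read as strict lexicographic priority, the affinity bonus follows the formula given in chapter 8 (bonus = min(|query ∩ affinity|, 3)), and the executor name is used as a last tie-break so the order is fully reproducible. The catalog entries and the `prefilter` function are hypothetical.

```python
def affinity_bonus(query_keywords: set[str], affinity: set[str]) -> int:
    """Curated affinity bonus, capped at +3."""
    return min(len(query_keywords & affinity), 3)


def prefilter(intent: dict, catalog: list[dict], limit: int = 8) -> list[str]:
    """Rank manifests for one clause; same intent and catalog give the same order."""
    keywords = set(intent["keywords"])
    scored = []
    for manifest in catalog:
        verb, obj, *rest = manifest["name"].split("_")
        key = (
            int(verb == intent["verb"]) + int(obj == intent["object"]),
            int(bool(rest) and rest[0] == intent.get("qualifier")),
            affinity_bonus(keywords, set(manifest["affinity"])),
        )
        scored.append((key, manifest["name"]))
    # Higher scores first; the name breaks any remaining tie deterministically.
    scored.sort(key=lambda item: ([-part for part in item[0]], item[1]))
    return [name for _, name in scored[:limit]]


catalog = [
    {"name": "find_messages", "affinity": ["find", "search"]},
    {"name": "move_files", "affinity": ["move", "file"]},
    {"name": "move_messages", "affinity": ["move", "trash", "mail"]},
]
intent = {"verb": "move", "object": "messages", "keywords": ["spam", "trash"]}
pool = prefilter(intent, catalog)
```

No randomness and no model call are involved, which is what lets the pool be placed under regression tests.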

5.2 The plan: what the model actually proposes

The Proposer does not produce prose: it produces a structured object — steps, slots to fill (fillers), final message. This is the real plan for our request:

{
  "steps": [
    {"tool": "find_messages",
     "args": {"folder": "INBOX", "query": "is:unread"}},
    {"tool": "classify_entries",
     "args": {"from_step": 1, "dimension": "spam"}},
    {"tool": "filter_entries",
     "args": {"from_step": 2, "where_field": "spam", "where_value": "spam"}},
    {"tool": "move_messages",
     "args": {"from_step": 3, "dst_folder": "${FILLER:trash_folder}"}}
  ],
  "fillers": {
    "trash_folder": {
      "prompt": "What is the trash folder called for this account?",
      "default": "Trash",
      "tier": "fast"
    }
  },
  "final_message": "Moved ${step4.ok_count} mails to the trash."
}

Worth noting: the model does not know the account's trash folder name — and does not make one up. It declares a slot (${FILLER:trash_folder}) that the runtime will fill at the right moment with a cheap micro-call (cached) or with the default.
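
A plan of this shape lends itself to the kind of deterministic check the Validator performs. The sketch below is an assumption about what such a check could look like, not the actual validator: it verifies that each tool is known, that `args` is an object, that `from_step` points to an earlier step (the "earlier" requirement is an assumption), and that every `${FILLER:name}` is declared. The function name `validate_plan` is hypothetical.

```python
import re

FILLER_RE = re.compile(r"\$\{FILLER:(\w+)\}")


def validate_plan(plan: dict, known_tools: set[str]) -> list[str]:
    """Return the problems found; an empty list means the plan may run."""
    problems: list[str] = []
    declared_fillers = plan.get("fillers", {})
    for index, step in enumerate(plan.get("steps", []), start=1):
        tool = step.get("tool")
        if tool not in known_tools:
            problems.append(f"step {index}: unknown tool {tool!r}")
        args = step.get("args")
        if not isinstance(args, dict):
            problems.append(f"step {index}: args must be an object")
            continue
        ref = args.get("from_step")
        # Steps are 1-based; a reference must point to a step that already ran.
        if ref is not None and not (isinstance(ref, int) and 1 <= ref < index):
            problems.append(f"step {index}: from_step {ref!r} is not an earlier step")
        for value in args.values():
            if isinstance(value, str):
                for name in FILLER_RE.findall(value):
                    if name not in declared_fillers:
                        problems.append(f"step {index}: undeclared filler {name!r}")
    return problems
```

Run against the plan above with the four tools in the pool, the function returns an empty list; a plan whose first step references step 3 is rejected before anything executes.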

5.3 Data piping: how the steps talk to each other

Placeholder | What it does
from_step: N | Take the entries produced by step N (1-based) and pass them whole to this step. Lists travel only this way: never pasted back into the prompt.
${stepN.field} | Extract a scalar field from step N's result (nested paths supported). Used mostly in the final message.
${FILLER:name} | A slot filled on the fly by a micro-call to the fast tier (cached) or by the declared default.
${RUNTIME:key} | Turn context, resolved by the runtime: actor (who is speaking), lang, channel.
[Diagram: data piping, lists between steps and scalars in placeholders. Step 1 find_messages (folder="INBOX", query="is:unread") → entries (42 mails); step 2 classify_entries (from_step: 1, dimension="spam") → entries + spam field; step 3 filter_entries (from_step: 2, where spam == "spam") → entries (12 mails); step 4 move_messages (from_step: 3, dst=${FILLER:trash_folder}) → results, ok_count=12. ${FILLER:trash_folder} is a slot declared by the plan and filled by the runtime: fast-tier micro-call (cached) or the default → «Trash». ${RUNTIME:actor · lang · channel} is turn context injected by the runtime: who is speaking, in which language, from which channel. The final_message template "Moved ${step4.ok_count} mails to the trash." is filled after execution with the real values → «Moved 12 mails to the trash.»; ${step4.ok_count} is a scalar, not a list.]
Figure 6 — The plan of Figure 5 seen as a data flow. Lists stream between steps via from_step; scalars, slots and context pass through typed placeholders that the executor resolves deterministically.
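
The scalar placeholders can be resolved with a single regular-expression pass. The following is a minimal sketch, assuming step results are dictionaries held in a list and that fillers have already been resolved to their values (the micro-call and its cache are out of scope); `resolve` and its parameters are hypothetical names. Lists are deliberately not handled here, since they travel through `from_step` rather than through templates.

```python
import re

PLACEHOLDER_RE = re.compile(
    r"\$\{(?:step(?P<step>\d+)\.(?P<path>[\w.]+)"
    r"|FILLER:(?P<filler>\w+)"
    r"|RUNTIME:(?P<runtime>\w+))\}"
)


def resolve(template: str, results: list[dict], fillers: dict, runtime: dict) -> str:
    """Replace ${stepN.field}, ${FILLER:name} and ${RUNTIME:key} with real values."""

    def substitute(match: re.Match) -> str:
        if match["step"] is not None:
            value = results[int(match["step"]) - 1]  # steps are 1-based
            for key in match["path"].split("."):     # nested paths supported
                value = value[key]
            return str(value)
        if match["filler"] is not None:
            return str(fillers[match["filler"]])
        return str(runtime[match["runtime"]])

    return PLACEHOLDER_RE.sub(substitute, template)


results = [{}, {}, {}, {"ok_count": 12}]
message = resolve("Moved ${step4.ok_count} mails to the trash.", results, {}, {})
```

A missing step, field, filler or runtime key raises an exception instead of rendering an empty string, which fits the document's preference for honest failures over invented values.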
When a cap bites, you see it. If a limit truncates a result (entries, bytes, steps), the executor declares it in the fields (truncated: true, used, available_total) and the runtime says so in the reply — offering to widen only if technically possible, and never widening on its own. A partial result presented as complete is considered a bug, not an optimization.

6. Executors: vectorized by construction

Every executor accepts a list and returns a list — even when the list has zero or one element. There is no *_batch anywhere: the batch version is the executor. It is the decision that keeps plans short and results composable.

[Diagram: one contract for N = 0, 1, a thousand. The inputs paths = [], paths = ["/tmp/x.txt"] and paths = [… ×1000] all enter the same move_files: a degenerate or huge list takes the same entrance, with no special case. Iteration, pagination and time windows live INSIDE the executor; branching goes back to the planner. Caps are explicit: max_total, max_results, max_bytes. The output is always a list, plus the truth: results = […], ok_count = 12 (real, not hoped), truncated = true, used = 200, available_total = 312, cap_field = "max_total". move_files_batch does not exist and never will: the vectorized form is the only form.]
Figure 7 — The vectorized contract. Zero, one or a thousand elements cross the same code; caps are explicit arguments and truncation is declared in the fields, never hidden.
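
The contract can be illustrated with a toy executor that reads file sizes. This is a sketch, not one of the 79 signed executors: the entry point is called `invoke` as in the synthesis chapter, but its behaviour, its arguments and the per-item `ok` and `error` fields are assumptions. The top-level result fields are the ones named in the figure.

```python
import os


def invoke(paths: list[str], max_total: int = 200) -> dict:
    """List in, list out: the same code path for zero, one or a thousand paths."""
    selected = paths[:max_total]  # explicit cap, never a hidden one
    results = []
    for path in selected:
        try:
            results.append({"path": path, "ok": True, "size": os.path.getsize(path)})
        except OSError as exc:
            results.append({"path": path, "ok": False, "error": str(exc)})
    truncated = len(paths) > max_total
    return {
        "results": results,
        # Honest count: only items that really succeeded.
        "ok_count": sum(1 for item in results if item["ok"]),
        "truncated": truncated,
        "used": len(selected),
        "available_total": len(paths),
        "cap_field": "max_total" if truncated else None,
    }
```

With an empty list the function returns an empty `results` and `ok_count` 0; with more paths than `max_total` it reports `truncated` together with `used` and `available_total`, so the caller can say so in the reply.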

Three conventions follow from the contract, and you will see them everywhere:

The manifest: the tool's prompt

Every executor carries a TOML manifest. It is not courtesy documentation: it is what the planner reads when it decides whether and how to use the tool — written for a mid-size local LLM, not for a frontier model. Short sentences, literal examples, defaults spelled out; the description follows four prescriptive chapters:

[description]
# A string that spans several lines must be triple-quoted in TOML.
en = """
SCOPE: search files by pattern in directory.
PATTERN: find_files(base_path="/", patterns=["*.jpg"]).
NOT: list_dirs+filter_entries; get_files (ID lookup).
OUT: entries=[{path,name,type,mime,kind,size,mtime}]."""
[Diagram: one manifest feeds four different mechanisms.]
executors/find_files/manifest.toml contains:
  name = "find_files"
  affinity = ["find","search","cerca","file","glob","pattern",…]
  [description]: SCOPE / PATTERN: find_files(…) / NOT / OUT: entries=[{…}]; per language (IT+EN), with state tracking
  [args] (JSON Schema): base_path (req) · patterns · recursive · max_total …; types, defaults, examples
  reverse_pattern + capabilities: e.g. "swap_src_dst" · fs_read · net
  [code]: sha256 digest + signature; files = ["find_files.py"]
Consumers:
  prefilter (ch. 5): curated affinity breaks ties among siblings
  Proposer pool (ch. 5): the model copies the FORM from PATTERN, never invents
  GBNF grammar (ch. 8): the args schema becomes the decode rail
  undo (ch. 12): the reverse pattern comes from a closed catalog
The digest binds manifest to code: if the file changes without re-signing, the loader discards the executor, so no code drifts away.
Figure 8 — One manifest, four consumers: prefilter, planner pool, grammar and undo each read different fields of the same TOML. The digest binds the manifest to the signed code.

7. Synt: the tool factory

When the pool cannot do something, the planner does not improvise code in the middle of the turn: it hands over to synt, the process that brings into existence what is missing. It first tries to compose existing executors; only as a documented exception does it generate a new one — in five stages, each with its own contract.

[Diagram: the assembly line, five stages + verification. Each stage sees only the minimal slice of context; the closed vocabulary enters ONLY at stage 1.]
  1. NAMING (middle tier): a name conforming to the closed vocabulary + revertible, critical
  2. SIGNATURE (middle tier): args schema, required capabilities, reversibility pattern
  3. TESTS (middle tier): 4-6 birth tests: happy case, empty list, invalid args, edge
  4. DESCRIPTION (middle tier): chaptered description (SCOPE/PATTERN/NOT/OUT) + affinity keywords
  5. CODE (wise tier): the Python file with def invoke() (+ reverse if needed)
  6. Semantic verification (fail-safe): a separate LLM compares description and code: do they say the same thing? When in doubt it rejects: better to lose a good synth than admit a bogus one.
Then: Ed25519 signature + digest (manifest and code bound together) → birth tests in the sandbox (the 4-6 tests of stage 3, actually run) → into the pool, next to its siblings (same vectorized contract, same manifest, same sandbox as the hand-written executors).
Synthesis is local: no external provider writes code that will run on your machine. And a bug in a synthesized executor is fixed by iterating the stage's prompt, never by hand-editing the generated file.
Figure 9: the synthesis pipeline, with four procedural stages on the middle tier, the code on the wise tier, then independent semantic verification, signature and birth tests. The multi-stage design converges where the single prompt failed.

Two triggers, one cascade

Mode | Trigger | Timing
Reactive | During a turn: the planner finds no executor that satisfies the request. | Synchronous: the user is waiting; composition of existing executors is tried first.
Introvert | At night: the ager walks the mnestome and finds recurrences, overlapping traces, families with the same shape. | Asynchronous, in homeostasis: it proposes merges, generalizations, specializations.

In both cases the same rule holds: synt proposes, the human approves. No self-modification without a filter; every proposal comes with its rationale, and is reversible.

The 7-layer gate

The same funnel applies to synthesized code and to skills imported from outside: no package runs on trust.

[Diagram: the admission gate, seven layers, no exceptions. A package or new synth enters untrusted and passes, in order:]
  1. signature: Ed25519 + digest
  2. vocabulary: name + affinity
  3. aging: usage quarantine
  4. sandbox: profile from manifest
  5. smoke test: execution proven
  6. LLM verifier: description vs code
  7. audit: append-only
Only then is it a trusted executor in the pool. «don't trust the package: the package has to earn its place»
Drop-in skill formats execute third-party code with the assistant's privileges: for an agent that touches files, mail and shell, that is remote code execution by design. Metnos chooses security by construction. Roadmap: map the public skill ecosystem INTO this model, not run it raw.
Figure 10 — The 7-layer gate, identical for synthesized and imported executors: signature, vocabulary, usage quarantine, sandbox, smoke test, semantic verification, audit. Only at the end of the funnel does a package become a trusted executor.
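
The funnel can be pictured as an ordered list of predicates that short-circuits on the first failure, with the audit log as the final layer. The sketch below is purely illustrative: each check is a stub reading a flag from a dictionary, where the real layers verify signatures, run sandboxed tests and call an LLM; recording rejections in the audit log is also an assumption. All names are hypothetical.

```python
from typing import Callable

Check = Callable[[dict], bool]

# Layers 1 to 6 as stub predicates; layer 7 (audit) is the append in admit().
GATE: list[tuple[str, Check]] = [
    ("signature", lambda p: p.get("signature_ok", False)),
    ("vocabulary", lambda p: p.get("name_ok", False)),
    ("aging", lambda p: p.get("aged", False)),
    ("sandbox", lambda p: p.get("sandbox_profile") is not None),
    ("smoke test", lambda p: p.get("smoke_ok", False)),
    ("LLM verifier", lambda p: p.get("description_matches_code", False)),
]


def admit(package: dict, audit: list[str]) -> bool:
    """Admit a package only if every layer passes; log the outcome either way."""
    for layer, check in GATE:
        if not check(package):
            audit.append(f"{package['name']}: rejected at {layer}")
            return False
    audit.append(f"{package['name']}: admitted")
    return True
```

Missing evidence counts as failure: a package that does not carry a flag is rejected at that layer, which matches the "no package runs on trust" stance.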

8. Four tiers, one deterministic routing

Tiers are abstract roles, not pinned models: fast / middle / wise are assignments you bind to whatever endpoint you have, and frontier is the only cloud opt-in. In the reference instance the three local tiers all point to the same instance of llama-server: only the per-call parameters change.

Tier | Role | In the reference instance
fast | Short structured extractions: intent, fillers, classifications. Reasoning off. | llama-server :8080, a quantized ~35B MoE, think=False, short replies. Mandatory (the safety net).
middle | Procedural work: synthesis stages 1-4, descriptions, judgments. | Same instance, default parameters.
wise | The planner: proposes the whole plan; writes the stage-5 code. | Same instance. Mandatory: it never degrades to fast.
frontier | An external consult when explicitly requested (e.g. analyzing an issue). | Cloud API, opt-in, with managed fallback if the key is absent.
Tier ≠ model. No GPU or NPU is required by construction: a CPU endpoint, a model you already serve, or the frontier fallback are all first-class paths. A weaker local model means weaker planning, not a broken install.
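
The idea that a tier is a role bound to an endpoint, not a model, can be sketched as a small configuration table. Everything below is an assumption for illustration: the class and function names, the URL form of the local endpoint, the `think` settings for middle and wise, and the `num_predict` values, which are placeholders rather than the reference instance's settings.

```python
from dataclasses import dataclass
from typing import Optional

# The port comes from the reference instance; the URL form is assumed.
LOCAL_ENDPOINT = "http://127.0.0.1:8080"


@dataclass(frozen=True)
class TierBinding:
    endpoint: str
    think: bool       # reasoning on or off
    num_predict: int  # reply budget (placeholder values below)


def build_tiers(frontier_endpoint: Optional[str] = None) -> dict[str, TierBinding]:
    """Bind the abstract tiers; the three local ones share a single endpoint."""
    tiers = {
        "fast": TierBinding(LOCAL_ENDPOINT, think=False, num_predict=256),
        "middle": TierBinding(LOCAL_ENDPOINT, think=True, num_predict=2048),
        "wise": TierBinding(LOCAL_ENDPOINT, think=True, num_predict=4096),
    }
    if frontier_endpoint is not None:  # the cloud tier exists only when opted in
        tiers["frontier"] = TierBinding(frontier_endpoint, think=True, num_predict=4096)
    return tiers
```

Swapping the local endpoint for a CPU server or another model changes one constant and no caller, which is the practical meaning of "tier ≠ model".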

The three locks of determinism

An LLM at temperature zero is not enough to make routing reproducible: the local server stays non-deterministic because of speculative decoding with a random seed. Metnos closes the door with three locks, one per noise source:

[Diagram: abstract tiers on the left, determinism on the right.]
Tiers: fast (intent · fillers · classify), middle (synt 1-4 · descriptions) and wise (plans · synth code) all point to llama-server :8080, ONE local instance where only the parameters change (think · num_predict); frontier is a cloud consult, opt-in, only when asked, never the engine.
  Lock 1, pinned seed: at temperature 0 the server stays non-deterministic (speculative decoding with a random seed), so the seed must be pinned via env. METNOS_LLM_SEED=42 (default; -1 = explicit randomness).
  Lock 2, ties broken by curated data: among siblings with the same object, the manifest's distinctive affinity decides (generic verbs excluded), never a coin flip. prefilter: bonus = min(|query ∩ affinity|, 3).
  Lock 3, the grammar as a rail: from the step's pool a GBNF is generated (discriminated union): the model CANNOT emit a malformed tool_call, nor mix one tool's name with another tool's args. Plus the verb-filter: the pool narrows to the verbs compatible with the clause's intent.
Result: same request → same pool → same plan, every run; routing goes under benchmarks and regression tests like ordinary software.
A soft constraint is «please keep to the right lane»; a grammar is the guard-rail. The first can be ignored, the second cannot: every candidate token is filtered against the grammar before being chosen.
Figure 11 — On the left, tiers as roles bound to a single local instance (frontier aside); on the right, the three locks that make routing reproducible: pinned seed, curated affinity for ties, GBNF grammar on the decode.
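
Lock 1 amounts to reading one environment variable and passing it with every request. The sketch below uses the variable name and default stated above; the helper names are hypothetical, and sending the value as a `seed` field next to `temperature` is an assumption about the request format that should be checked against the llama-server version in use.

```python
import os
from typing import Mapping


def llm_seed(env: Mapping[str, str] = os.environ) -> int:
    """Pinned seed for local inference; -1 means explicit randomness."""
    return int(env.get("METNOS_LLM_SEED", "42"))


def sampling_params(env: Mapping[str, str] = os.environ) -> dict:
    """Per-call sampling parameters (field names assumed, see lead-in)."""
    return {"temperature": 0, "seed": llm_seed(env)}
```

Keeping the default pinned and making randomness the explicit choice means a fresh install is reproducible without any configuration.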
No fragile parsers. Tool use is native: the model emits structured tool_calls, and the grammar guarantees the shape upstream. There is no fishing JSON out of prose — the classic weak point of home-grown agents.

9. The memory that speeds things up

Metnos trains no models: no fine-tuning, no RLHF. Everything it learns is inspectable data — plans, traces, shortcuts — and anything learned can be read, corrected, deleted. The practical effect: the more you use it, the less it calls the model.

[Diagram: the circle, use → remember → stop asking.]
  A successful turn: the plan actually worked.
  Autopath records it: the plan is indexed by the meaning of the request (embeddings).
  A similar request, tomorrow: the plan is ready, replayed in milliseconds, 0 LLM.
  ★ explicit promotion: a button under the reply promotes the plan to a guaranteed Fastpath shortcut.
  Mnestome, the graph of mnests (SQLite): two executors activated together → a thread that strengthens with use and decays if unused; gaps remain as aspirations. The clock is usage time, not the calendar: a system that sleeps does not age.
  At night, the ager + synt proposals: heavily used multi-step sequences (>50 runs) become candidates for synthetic executors; overlapping traces propose merges and generalizations, always behind the human filter. Proposals are logged, never auto-applied (ch. 7).
Every turn leaves traces: learning = accumulating verifiable data, never touching weights.
Figure 12 — The circle of learning without training: successful plans become shortcuts (Autopath, Fastpath ★), co-activations become mnests, nightly recurrences become synthesis proposals. Everything is readable, reversible data.
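
A mnest that strengthens with use and decays on usage time can be sketched as a weighted edge map. This is a toy model under loud assumptions: it lives in memory rather than SQLite, it links only consecutive executors in a turn, and the gain, the multiplicative decay and the pruning floor are invented parameters, not the ager's real policy. The class and method names are hypothetical.

```python
from collections import defaultdict


class Mnestome:
    """Toy graph of mnests: (executor, next executor) -> weight."""

    def __init__(self, gain: float = 1.0, decay: float = 0.9, floor: float = 0.05):
        self.gain, self.decay, self.floor = gain, decay, floor
        self.weights: dict[tuple[str, str], float] = defaultdict(float)

    def record_turn(self, executors: list[str]) -> None:
        # The clock is usage: decay happens only when a turn is recorded.
        for edge in list(self.weights):
            self.weights[edge] *= self.decay
            if self.weights[edge] < self.floor:
                del self.weights[edge]
        # Reinforce every pair of executors that ran one after the other.
        for pair in zip(executors, executors[1:]):
            self.weights[pair] += self.gain

    def usual_successors(self, executor: str) -> list[str]:
        """Successors by descending weight, names breaking ties."""
        edges = [(w, b) for (a, b), w in self.weights.items() if a == executor]
        return [b for w, b in sorted(edges, key=lambda e: (-e[0], e[1]))]
```

Because nothing decays between turns, a system that is switched off for a month comes back with the same graph, which is the behaviour the text describes as "a system that sleeps does not age".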

10. The senses: the image pipeline

To search your photos, Metnos ships nothing to anyone: three in-process extractors turn every image into three signals — what is seen, who is there, where and when — fused into one unified index queried through the ordinary vocabulary.

[Diagram: three signals from every photo, one single index. A photo from the archive goes through three extractors. Semantics (SigLIP): the image becomes a vector: «sunset at the sea», «birthday cake», «mountain trail». Faces (RetinaFace + ArcFace): finds the faces and turns them into identity prints; people get registered by name only if you ask. Context (EXIF): GPS coordinates, date and time, camera: the where and the when, with no model at all. The results converge into the unified index: one record per photo (scene + people + place + time), built once, queried forever. A request such as «the mountain photos from last summer» is served by find_images_indices, the same vocabulary as everything else. All in-process, on your machine: the photo archive never leaves home.]
Figure 13 — The image pipeline: SigLIP for the scene, RetinaFace+ArcFace for identities, EXIF for place and time. The three signals converge into a unified index queried by an ordinary executor of the vocabulary.

A search arrives from the channel like any other request: the planner composes find_images_indices with the criteria extracted from the sentence, and the channel shows inline previews. Building the index is a background job, incremental and restartable, started with a sentence («index the photos in…»).

11. The channels: Telegram and web

A channel is an adapter: it converts an external interface into messages and replies, plus one optional capability — rendering buttons for confirmations and choices. Two channels come with the install; adding more does not touch the core.
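The adapter contract can be sketched as a small interface: messages in, replies out, and a flag for the optional button capability. All names below (`Message`, `Reply`, `Channel`, `ConsoleChannel`, `serve_once`) are hypothetical and only illustrate the shape of the idea, not the Metnos API.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


@dataclass
class Message:
    sender: str
    text: str


@dataclass
class Reply:
    text: str
    buttons: list = field(default_factory=list)   # optional choices or confirmations


class Channel(Protocol):
    supports_buttons: bool

    def receive(self) -> list: ...
    def send(self, to: str, reply: Reply) -> None: ...


class ConsoleChannel:
    """Toy channel that cannot render buttons: they degrade to a numbered list."""
    supports_buttons = False

    def __init__(self, inbox):
        self._inbox = list(inbox)
        self.sent = []

    def receive(self) -> list:
        msgs, self._inbox = self._inbox, []
        return msgs

    def send(self, to: str, reply: Reply) -> None:
        text = reply.text
        if reply.buttons and not self.supports_buttons:
            options = "\n".join(f"{i}. {b}" for i, b in enumerate(reply.buttons, 1))
            text = f"{text}\n{options}"
        self.sent.append((to, text))


def serve_once(channel: Channel, handle: Callable[[Message], Reply]) -> None:
    """The core sees only messages and replies, never the transport."""
    for msg in channel.receive():
        channel.send(msg.sender, handle(msg))


console = ConsoleChannel([Message("anna", "delete the temp folder")])
serve_once(console, lambda m: Reply("Confirm?", ["Yes", "No"]))
```

Since `serve_once` depends only on the interface, a third channel would be one more class with the same two methods, which is the sense in which adding channels does not touch the core.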

Two entrances, zero ports open to the Internet
● The Metnos machine runs the web server on port 8770 (chat with SSE streaming + admin dashboards) and the Telegram daemon (OUTBOUND long-poll toward the bot API).
● Browser, on the LAN (or via your own VPN overlay), over HTTP :8770: admin key on first connect; dashboards for proposals · executors · runs · safety · turns · charts.
● Phone, via the Telegram API: the daemon does the asking, so no open ports and no public IP; wherever there's network, you talk to your bot, with inline buttons.
● Who may speak? Only the paired: signed Ed25519 codes with expiry, an authorization level per person. An unknown sender is discarded without echo; vaglio confirmations and choices (get_inputs) arrive as buttons on the channel.
Figure 14 — The two channels. The browser talks directly to the server on 8770 (streaming chat + dashboards); Telegram works by outbound long-poll, so no open ports and no public IP. Below, the pairing that decides who may speak.
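The pairing rule (a signed code, an expiry, an authorization level per person) can be illustrated with the standard library alone. Note the substitution: Metnos signs with Ed25519, which Python's standard library does not provide, so this sketch uses HMAC-SHA256 as a stand-in. The code layout, `issue_code`, `verify_code` and `SECRET` are assumptions for illustration only.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"demo-only-secret"   # assumption: stands in for the Ed25519 signing key


def _tag(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_code(user: str, level: str, ttl_s: int, now: Optional[float] = None) -> str:
    """Build 'user|level|expires|tag'; the tag covers the first three fields."""
    expires = int((time.time() if now is None else now) + ttl_s)
    payload = f"{user}|{level}|{expires}"
    return f"{payload}|{_tag(payload)}"


def verify_code(code: str, now: Optional[float] = None):
    """Return (user, level) for an authentic, unexpired code, else None."""
    try:
        user, level, expires, tag = code.split("|")
        expires_at = int(expires)
    except ValueError:
        return None                                  # malformed: discard
    expected = _tag(f"{user}|{level}|{expires}")
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None                                  # forged or altered: discard
    if (time.time() if now is None else now) >= expires_at:
        return None                                  # expired: discard
    return user, level


code = issue_code("anna", "Supervised", ttl_s=600, now=1000.0)
```

With a real Ed25519 signature the verifier would hold only the public key; with HMAC both sides share the secret, which is the main reason this is a teaching stand-in rather than an equivalent.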
Channel | What it offers
Web :8770 | Chat in the browser with streaming replies (SSE), image previews, feedback badges; admin dashboards for proposals, executors, runs, safety and turns. The same API answers JSON or HTML depending on Accept. Admin key auto-created on first start, file with 0600 permissions.
Telegram | Your personal bot: messages, photos, inline buttons for vaglio confirmations and multiple-choice inputs. Pairing via the /pair command and a signed, expiring code.
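The "admin key auto-created on first start, file with 0600 permissions" behaviour can be sketched as follows. The function name, the key length and the demo path are assumptions; the point is only that the file is created with owner-only permissions from the first byte, rather than created open and restricted afterwards.

```python
import os
import secrets
import stat
import tempfile
from pathlib import Path


def ensure_admin_key(path: Path) -> str:
    """Create the key on first start; later starts read the existing one."""
    try:
        # O_EXCL makes creation atomic; 0o600 means owner read/write only.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return path.read_text().strip()
    key = secrets.token_urlsafe(32)
    with os.fdopen(fd, "w") as fh:
        fh.write(key)
    return key


key_path = Path(tempfile.mkdtemp()) / "admin.key"   # throwaway location for the demo
first = ensure_admin_key(key_path)
second = ensure_admin_key(key_path)                 # second start: same key
mode = stat.S_IMODE(key_path.stat().st_mode)
```

On POSIX systems `mode` comes out as `0o600`; on other platforms the permission bits are not meaningful in the same way.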

12. Safety and reversibility

Safety is not a module: it is a chain of independent guards, and an action must pass all of them. And since even the best guard makes mistakes, the last defense is being able to go back: honest undo, by construction.
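A chain of independent guards, all of which must pass, has a simple shape in code. The sketch below covers three of the guards with toy rules; the `Action` fields, the `PAIRED` table, the executor names and the exact mapping of autonomy levels to consent are assumptions for illustration, not the Metnos policy.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    sender: str
    executor: str
    writes: bool
    confirmed: bool = False


# A guard returns None to let the action pass, or a reason to refuse it.
Guard = Callable[[Action], Optional[str]]

PAIRED = {"anna": "Supervised", "bob": "ReadOnly"}   # hypothetical pairing table


def pairing(action: Action) -> Optional[str]:
    return None if action.sender in PAIRED else "pairing: unknown sender"


def policy(action: Action) -> Optional[str]:
    if action.writes and PAIRED.get(action.sender) == "ReadOnly":
        return "policy: sender is read-only"
    return None


def vaglio(action: Action) -> Optional[str]:
    # Assumption: below the Full level, every write needs explicit consent.
    if action.writes and PAIRED.get(action.sender) != "Full" and not action.confirmed:
        return "vaglio: consent required before running"
    return None


CHAIN = [pairing, policy, vaglio]


def authorize(action: Action, chain=CHAIN) -> Optional[str]:
    """Run every guard in order; the first refusal stops the action."""
    for guard in chain:
        reason = guard(action)
        if reason is not None:
            return reason
    return None
```

Each guard reads only the action and its own table, so one guard being wrong or missing does not silently weaken the others.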

The chain of guards, and undo as the last defense
● Pairing: who are you? Signed code, a role per person; unknown = discarded.
● Policy: three autonomy levels (ReadOnly · Supervised · Full); capability per category.
● Vaglio: a guard for forbidden and unrecoverable actions; judge + consent via buttons, always BEFORE running.
● Sandbox: bubblewrap with a profile from the manifest; network, user, IPC isolated; never bare subprocesses.
● Signature + audit: code bound to manifest via digest; every action in an append-only ledger; drift = silent discard.
The last defense: first-class undo
● a closed catalog of reverse patterns (5): swap src/dst · delete what was created · restore from blob · delete by id
● every move is COPY → check → DELETE: never a deletion without a confirmed copy
● overwritten content goes to sha256-hashed blobs in the turn history: «undo» puts it back where it was
● honest ok_count in undo too: if it says it undid 3 things, it undid 3 things
Skills stay dormant until their prerequisite appears; disabling one removes the whole surface. The system skill (shell, sudo, packages, mounts) exists, which is exactly why every privileged action requires explicit consent, and the whole skill can be switched off with one sentence.
Figure 15 — Five guards in series (pairing, policy, vaglio, sandbox, signature+audit) and, below, the safety net: an undo with a closed catalog of reverse patterns, verified copies before any deletion, and honest counts.
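The COPY → check → DELETE rule and the "swap src/dst" reverse pattern fit in a short sketch. It assumes a SHA-256 comparison as the check and refuses to overwrite an existing destination (blob storage for overwritten content is left out); `safe_move`, `undo` and the record layout are illustrative names, not the Metnos implementation.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_move(src: Path, dst: Path) -> dict:
    """Move a file as COPY -> check -> DELETE and return an undo record."""
    if dst.exists():
        raise FileExistsError(f"{dst} exists; overwriting is out of scope here")
    shutil.copy2(src, dst)                          # COPY
    if sha256_of(src) != sha256_of(dst):            # check
        dst.unlink()
        raise OSError("copy verification failed; source left untouched")
    src.unlink()                                    # DELETE, only after the check
    return {"reverse": "swap src/dst", "src": str(dst), "dst": str(src)}


def undo(record: dict) -> None:
    """Reverse a move by moving back, with the same verified procedure."""
    safe_move(Path(record["src"]), Path(record["dst"]))


work = Path(tempfile.mkdtemp())                     # throwaway directory for the demo
a, b = work / "a.txt", work / "b.txt"
a.write_text("hello")
record = safe_move(a, b)
moved = b.exists() and not a.exists()
undo(record)
restored = a.read_text() == "hello" and not b.exists()
```

The source is deleted on exactly one code path, after the hashes match, so an interrupted or corrupted copy can at worst leave an extra file behind.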
The power is real, which is why it is bridled. Metnos can genuinely administer the machine (shell, sudo, packages, mounts) — that is what makes it a host assistant rather than a chatbot. But every privileged action passes through the vaglio with explicit confirmation, runs in the sandbox, lands in the audit; and the whole system skill can be disabled, locking Metnos out of the operating system.

13. The principles, in eight cards

If you remember only eight sentences from this document, make it these. Everything else — code, prompts, conventions — follows from here.

1. Vectorized by construction. Every executor accepts a list and returns a list, even a degenerate one. The batch version is the executor: *_batch does not exist.
2. A closed, governed vocabulary. Everything that acts has a composable name inside a closed grammar. A new term enters only if it is necessary, general and understandable.
3. No silent failure. Counts reflect what actually happened; truncation is declared, not hidden; a partial result presented as complete is a bug.
4. Deterministic > LLM. Where an automaton or a table suffices, the model is not used. The LLM enters only where a deterministic parser of equal power would genuinely be too complex, and it enters constrained.
5. Never an implicit delete. Every move is copy → check → delete; never DELETE without a confirmed COPY.
6. Reversibility with a rationale. Every evolutionary act (synthesis, merge, archive) is reversible and motivated. Saying yes costs less when you can go back.
7. i18n by construction. Every user-facing string and prompt is per-language data: a new language is a translation pack, not a fork of the code.
8. Understandability as a duty. If the user does not understand the system, the system is useless. Simplicity is not aesthetics: it is the criterion that selected everything else.
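Principles 1 and 3 combine naturally in one small example: an executor that takes a list, returns a list, counts only what succeeded, and declares truncation. `parse_ints`, `Result` and the `limit` parameter are hypothetical names chosen for the sketch, not executors from the Metnos vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    items: list = field(default_factory=list)    # one entry per successful input
    errors: list = field(default_factory=list)   # (input, reason) pairs
    truncated: bool = False                      # declared, never hidden

    @property
    def ok_count(self) -> int:
        """Count of what actually succeeded, not of what was requested."""
        return len(self.items)


def parse_ints(values: list, limit: int = 100) -> Result:
    """List in, list out: a single value is just a list of length one."""
    result = Result(truncated=len(values) > limit)
    for value in values[:limit]:
        try:
            result.items.append(int(value))
        except ValueError:
            result.errors.append((value, "not an integer"))
    return result


mixed = parse_ints(["1", "x", "3"])
single = parse_ints(["7"])
capped = parse_ints(["1", "2", "3"], limit=2)
```

There is no separate single-item function and no `parse_ints_batch`: the caller with one value passes a one-element list, and a partial outcome is visible in `errors` and `truncated` rather than hidden behind a success flag.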

14. What Metnos is NOT

Half of the design lives in the no's. Every temptation to add an item from this list must be resisted.

15. Where to go next

This was Level 1: the system from above. Level 2 — the microdesign — has one page per component, with enough detail to write its code without inventing anything.

● Component index (microdesign). Twenty-one canonical components, one page each, bilingual IT+EN: the level where choices become contracts.
● Quick Tour (tour · 10 min). The fast lap with screenshots: what using it feels like, before studying it.
● Glossary (reference). Every project term, defined once and linked everywhere.
● Dialogue on ends and limits (dialogue · 40 min). The Galilean origins: teleology, the 4 Laws, the vaglio. Why a proactive agent needs a brake.
● Dialogue on executors (dialogue · 45 min). The technical foundation: executors, mnest, mnestome, traces, ager, the six original principles.
● The repository (code). AGPL-3.0, pre-1.0: the public subset of the daily-driven instance, installer included.

Metnos — Architecture: Introduction (v2).
mētis + noûs: cunning intelligence in the service of the mind — on your own hardware.
Bilingual IT+EN documentation at metnos.com; code at github.com/brunialti/metnos.