
Metnos

introvertive fast-path — how the assistant learns to answer instantly

Microarchitecture

Audience: anyone who wants to understand why Metnos sometimes answers
in half a second, and sometimes takes ten.
Reading time: 10 minutes.

The idea, in two lines
Why a fast-path is necessary
The two layers
L0 — auto-produced cache (fastpath.py)
L1 — autopaths promoted by rating (autopath.py)
Aging, death and inheritance
The arguments extractor
Configuration
Promotion to a synthesised executor

1. The idea, in two lines

Before calling on the planner, Metnos tries to recognise the request. If it has seen it before and already knows how to handle it, it reuses the ready-made path — the sequence of executors that carries the request through — and answers in half a second instead of ten. No language model to consult: memory is enough.

Figure 1 — The memory layers ahead of the engine: L0 fastpath (hash + cosine), L1 autopath (user feedback); if neither fires, the request goes through to the engine — the LLM planner.

2. Why a fast-path is necessary

The Metnos planner is a local language model — the wise tier, the most capable one. It reasons well, but it takes about ten seconds to choose the first step of a turn: for many requests that wait is out of all proportion. “What time is it?” doesn't call for a model, it calls for the get_now tool. And “download this page and sum it up in two lines”, when we ask it often, doesn't call for the planner but for a path we already know: get_urls, then describe_entries.

What remains is to recognise a known request — and to judge when one that is merely similar sits close enough to be handled the same way. That is the job of the introvertive fast-path, on two layers: they catch different degrees of similarity, but reach the same result, the right path run without a second thought.

3. The two layers

The two layers differ in how much of the path they recognise. L0 recognises all of it, arguments included: the same request as before, with the same concrete values. L1 recognises the path alone — the skeleton of the solution, stripped of its arguments — and so it holds for a whole family of related requests.

Layer	What it recognises	How	Cost per reuse
L0	Path + arguments. The same request already solved — identical or very close in meaning — with its concrete values (that name, that URL, that date).	Exact hash (0a) + BGE-M3 cosine (0b)	< 5 ms (hash) / < 150 ms (cosine)
L1	The path alone. The generalised skeleton, without arguments, valid for a group of related requests the user has confirmed.	Lookup by intent and group	~30 ms

The order is fixed: L0 first; if it finds nothing, L1. If both miss, the request reaches the planner (the engine), as always.

Note. The planner doesn't go away: the fast-path sits beside it, it doesn't replace it. If the request is new, ambiguous, or clears no threshold in either search, the planner takes over again, as if the fast-path weren't there.
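The fixed order described above can be sketched as a small cascade. This is only an illustration: the function name `resolve` and the three callables are hypothetical stand-ins, not the actual Metnos interfaces.

```python
from typing import Callable, Optional

Plan = list[str]  # a path: the ordered names of the executors to run


def resolve(
    query: str,
    l0_lookup: Callable[[str], Optional[Plan]],
    l1_lookup: Callable[[str], Optional[Plan]],
    planner: Callable[[str], Plan],
) -> tuple[str, Plan]:
    """Try L0 first, then L1, then fall back to the planner.

    Returns the layer that answered together with the plan to run.
    """
    plan = l0_lookup(query)
    if plan is not None:
        return "L0", plan
    plan = l1_lookup(query)
    if plan is not None:
        return "L1", plan
    # Both layers missed: the request reaches the engine, as always.
    return "planner", planner(query)


# Toy stand-ins for the two memory layers (hypothetical data).
L0_TABLE = {"what time is it?": ["get_now"]}
L1_TABLE = {"sum up this page": ["get_urls", "describe_entries"]}
```

Because the planner is only the last branch, removing both tables leaves the behaviour exactly as it would be without a fast-path.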

4. L0 — auto-produced cache (`fastpath.py`)

The first layer lives in runtime/engine/fastpath.py and keeps the plans it has already run in a SQLite database (fastpaths.sqlite). The entries appear on their own: whenever a turn succeeds — a fresh plan from the engine, the reuse of an L1, a promotion from cosine 0b — Metnos notes down the canonical query, its hash, the BGE-M3 embedding (the numeric form of the text, the one used to gauge closeness in meaning), the whole plan (the skeleton, or framework) and the intent: the verb and the object of the request. No approval involved: the chains are built from executors already vetted and tested.

Two ways to find a match

The search runs in two phases:

0a: hash match. The query, reduced to a normal form, becomes a fingerprint (a deterministic hash) and is looked up for an exact match. Under 5 ms, no model and no embedding. It is the only route allowed for plans tied to a single request: those with literal arguments lifted from the question (a proper name used as a search key, say) which, replayed on a similar but different request, would return the wrong data.
0b: semantic cosine. If the fingerprint finds nothing, the query is turned into its BGE-M3 vector and compared by cosine similarity against the stored entries; only a match above the configured threshold is accepted. This is what brings the variants of one request under the same answer (“what time is it?” and “tell me the time”). Cosine serves only the general plans, the ones not tied to a single request, so that one question's arguments don't stray onto another that is close in meaning but distinct.
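A minimal sketch of the two phases, under stated assumptions: the normal form (lowercasing and whitespace collapsing), the SHA-256 fingerprint, the 0.9 threshold, and the entry fields `hash`, `vec` and `query_bound` are all illustrative choices, not taken from fastpath.py. Real BGE-M3 vectors are replaced by tiny hand-made ones.

```python
import hashlib
import math


def normalise(query: str) -> str:
    # Assumed normal form: lowercase and collapse runs of whitespace.
    return " ".join(query.lower().split())


def fingerprint(query: str) -> str:
    # Deterministic hash of the normal form (SHA-256 is an assumption).
    return hashlib.sha256(normalise(query).encode("utf-8")).hexdigest()


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lookup(query: str, query_vec: list[float], entries: list[dict],
           threshold: float = 0.9):
    """Return (phase, entry), or (None, None) on a miss."""
    fp = fingerprint(query)
    for entry in entries:  # phase 0a: exact fingerprint, any kind of plan
        if entry["hash"] == fp:
            return "0a", entry
    best, best_sim = None, threshold
    for entry in entries:  # phase 0b: general plans only
        if entry["query_bound"]:
            continue  # plans tied to a single request never match by cosine
        sim = cosine(query_vec, entry["vec"])
        if sim >= best_sim:
            best, best_sim = entry, sim
    return ("0b", best) if best is not None else (None, None)
```

The `query_bound` flag is what keeps a plan with literal arguments from being replayed on a merely similar question: such an entry can only be reached through phase 0a.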

They rebuild themselves. A fastpath that is deleted — by aging, by death, or by a negative user rating — comes back on the first successful repeat. Pruning is cheap, then, and the defaults can afford to be aggressive: nothing is lost for good.

Safety valves

Negative rating: a user “✗” deletes the fastpath for that query — which comes back if the request returns and succeeds (the last word always wins).
Admin: fastpaths can be deleted by hand from the /admin/praxis console.
Tools that can't be stored: undo_last_turn and get_inputs produce no fastpath, because what they mean depends on the turn under way.
Absolute dates: a plan with explicit ISO dates (say since_iso="2026-06-11") is not recorded: replaying it on another day would open a time window already frozen in the past. Relative dates (time_window="today") are kept, because each replay works them out afresh.
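The last two valves amount to a check made before a plan is recorded. The sketch below assumes a plan is a list of steps with a `tool` name and an `args` mapping, and detects absolute dates with a simple pattern; both the data shape and the tool name `search_mail` used in the usage lines are hypothetical.

```python
import re

# Tools whose meaning depends on the turn under way (named in the text above).
NON_STORABLE_TOOLS = {"undo_last_turn", "get_inputs"}

# Assumed detection rule: a string starting with YYYY-MM-DD is an absolute date.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}")


def is_storable(plan: list[dict]) -> bool:
    """Return True if the plan may be recorded as a fastpath."""
    for step in plan:
        if step["tool"] in NON_STORABLE_TOOLS:
            return False
        for value in step.get("args", {}).values():
            if isinstance(value, str) and ISO_DATE.match(value):
                return False  # frozen time window: not safe to replay later
    return True


# Usage: the relative window is kept, the absolute date is not.
relative = [{"tool": "search_mail", "args": {"time_window": "today"}}]
absolute = [{"tool": "search_mail", "args": {"since_iso": "2026-06-11"}}]
```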

5. L1 — autopaths promoted by rating (`autopath.py`)

The second layer lives in runtime/engine/autopath.py and thinks differently: it doesn't repeat the same query but generalises to a group of related intents — the one the user has approved with a positive rating.

At each user “✓”, Metnos notes the turn's skeleton, its hash and the group of meaning it belongs to. A configurable number of confirmations (one by default) on the same skeleton and group is enough for the plan to become a reusable autopath: next time, an intent from the same group sets it going again without troubling the planner.

Anti-autopaths, champion and challenger

Anti-autopath: after a repeated negative rating (three failures or more), Metnos records an anti-autopath that bans that plan, for that group, for thirty days. The same symmetry holds: a positive rating lifts the ban.
Champion and challenger: when several plans compete for the same group, the one with more confirmations and fewer failures wins, by a composite score.
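How promotion and the champion choice might fit together can be sketched as follows. The text does not give the composite score, so the formula here (confirmations minus failures) is a placeholder assumption, as are the class and function names.

```python
from dataclasses import dataclass

MIN_OBS = 1  # default of METNOS_AUTOPATH_MIN_OBS


@dataclass
class Candidate:
    """One plan skeleton competing for a group of related intents."""
    skeleton_hash: str
    confirmations: int = 0
    failures: int = 0

    def is_autopath(self) -> bool:
        # Promoted once it has enough positive observations.
        return self.confirmations >= MIN_OBS

    def score(self) -> float:
        # Placeholder composite score: more confirmations and fewer
        # failures rank higher. The real formula is not documented here.
        return self.confirmations - self.failures


def champion(candidates: list[Candidate]):
    """Return the best promoted candidate for the group, or None."""
    promoted = [c for c in candidates if c.is_autopath()]
    return max(promoted, key=Candidate.score, default=None)
```

Any score that grows with confirmations and shrinks with failures would satisfy the rule as stated; the subtraction is simply the shortest one to write.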

The L0/L1 boundary. L0 repeats the same query and also admits plans tied to a single request, recognised by exact fingerprint. L1 generalises to a group of related intents, with the user's explicit consent. In the cascade L0 comes first and takes precedence: if it finds an exact match it skips everything else — even an L1 autopath fit for that intent.

6. Aging, death and inheritance

L0 fastpaths age and die by fixed rules, with no model in the loop. Each night the task_state_reaper process applies three aging rules and four death conditions.

Aging

Rule	Criterion	Default	Env
Never reused	Created more than N days ago but never served a second time	14 days	`METNOS_FASTPATH_GRACE_DAYS`
Stale	Last use more than N days ago	30 days	`METNOS_FASTPATH_STALE_DAYS`
LRU cap	Total entries above the cap; least recently used are pruned	500	`METNOS_FASTPATH_MAX`
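The three aging rules can be read as one nightly pass over the entries. The sketch below is an in-memory approximation with assumed field names (`created`, `last_used`, `reuses`); the real reaper works on the SQLite store and may apply the rules differently.

```python
from datetime import datetime, timedelta

GRACE_DAYS, STALE_DAYS, MAX_ENTRIES = 14, 30, 500  # documented defaults


def prune(entries: list[dict], now: datetime,
          grace_days: int = GRACE_DAYS,
          stale_days: int = STALE_DAYS,
          max_entries: int = MAX_ENTRIES) -> list[dict]:
    """Return the entries that survive the three aging rules."""
    kept = []
    for e in entries:
        # Rule 1: created long ago and never served a second time.
        never_reused = (e["reuses"] == 0
                        and now - e["created"] > timedelta(days=grace_days))
        # Rule 2: last use too far in the past.
        stale = now - e["last_used"] > timedelta(days=stale_days)
        if not (never_reused or stale):
            kept.append(e)
    # Rule 3: LRU cap, keeping the most recently used entries.
    kept.sort(key=lambda e: e["last_used"], reverse=True)
    return kept[:max_entries]
```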

Death (only with complete catalogue)

Code	Cause	Inheritance
C1	A tool in the plan no longer exists in the catalogue (retired, renamed, archived). Replay would fail.	No
C2 provenance	The fastpath was promoted to a synthetic executor (see §9) and that executor is now in the catalogue.	Yes
C2 name	An executor named `{verb}_{object}` matching the intent exists, but no tool in the plan belongs to that family. The fastpath would shadow the executor.	Yes
C2 pre-filter	For multi-step plans: the deterministic routing pre-filter on the canonical query shows that a single executor now covers the intent (even under a different name).	Yes
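Two of the conditions in the table are simple enough to sketch. The code below covers C1 and a simplified reading of "C2 name", in which "belongs to that family" is reduced to "has exactly that name"; that reduction, the function name and the argument shapes are assumptions. The provenance and pre-filter conditions need data not described here and are left out.

```python
from typing import Optional


def death_code(plan_tools: list[str], intent: tuple[str, str],
               catalogue: set[str], catalogue_complete: bool) -> Optional[str]:
    """Return the death condition that applies to a fastpath, if any."""
    if not catalogue_complete:
        # Death is only decided against a complete catalogue.
        return None
    if any(tool not in catalogue for tool in plan_tools):
        return "C1"  # a tool has been retired or renamed: replay would fail
    verb, obj = intent
    executor_name = f"{verb}_{obj}"
    if executor_name in catalogue and executor_name not in plan_tools:
        return "C2 name"  # the fastpath would shadow a dedicated executor
    return None
```

The early return on an incomplete catalogue matters: with a partial catalogue every missing tool would look retired, and healthy fastpaths would be killed by C1.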

Point inheritance

When a fastpath dies because an executor has superseded it (the C2 conditions), its usage counts (n_uses) pass to the heir executor and feed the same aging machinery that executors use. The demand already gathered isn't thrown away.

Pruning is nothing to fear. Even a fastpath removed by mistake comes back on the first successful repeat. That is why it can be pruned freely: nothing is lost for good.

7. The arguments extractor

Recognising the request is half the work. The other half is drawing out its concrete values: which paths, which URLs, which date, which threshold. A rule-based extractor (args_extractor.py) handles this, again with no model:

Closed rules. Regular expressions for the known types: URL (https://...), path (~/... or /..., with the “home” shortcut becoming ~/), email, numbers, file extensions (“PDF file” → *.pdf), dates (today, yesterday, tomorrow, the day after — in Italian and English — turned into ISO format), and time windows (“this week”, “last 24 hours”, “last 7 days”).
Arguments that depend on the request. Some values — the mail account, the time window — change with the question: at execution (and at recording) they are worked out afresh on the current query, not inherited from the stored plan.
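A small rule-based extractor in the spirit of the closed rules above might look like this. It covers only URLs, emails and three English relative dates; the patterns and the output keys are illustrative assumptions, not the ones in args_extractor.py.

```python
import re
from datetime import date, timedelta

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# English only here; the text says the real rules also cover Italian.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def extract_args(query: str, today: date) -> dict:
    """Draw concrete values out of the query with fixed rules, no model."""
    args: dict = {}
    urls = URL_RE.findall(query)
    if urls:
        args["urls"] = urls
    emails = EMAIL_RE.findall(query)
    if emails:
        args["emails"] = emails
    for word, offset in RELATIVE_DAYS.items():
        if re.search(rf"\b{word}\b", query, re.IGNORECASE):
            # Relative dates are resolved against the current day on each run.
            args["date_iso"] = (today + timedelta(days=offset)).isoformat()
            break
    return args
```

Passing `today` in explicitly, rather than reading the clock inside the function, mirrors the point made above: the value is worked out afresh on the current query each time.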

8. Configuration

The fast-path is tuned with METNOS_* environment variables. For values meant to last there is a TOML file (~/.config/metnos/runtime.toml); the value written in the module is the last fallback, used when nothing else sets it.
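One plausible way to resolve a setting across the three sources is sketched below. The precedence between the environment and the TOML file (environment first) is an assumption, as is the flat key layout of the file; `file_values` stands for the already parsed contents of runtime.toml.

```python
import os
from typing import Mapping, Optional

# Defaults as documented in the tables of this section.
MODULE_DEFAULTS = {
    "METNOS_FASTPATH_STALE_DAYS": 30,
    "METNOS_FASTPATH_GRACE_DAYS": 14,
    "METNOS_FASTPATH_MAX": 500,
    "METNOS_AUTOPATH_MIN_OBS": 1,
}


def setting(name: str,
            env: Optional[Mapping[str, str]] = None,
            file_values: Optional[Mapping[str, object]] = None) -> int:
    """Resolve an integer setting: environment, then file, then module default."""
    env = os.environ if env is None else env
    if name in env:
        return int(env[name])
    if file_values and name in file_values:
        return int(file_values[name])
    return MODULE_DEFAULTS[name]
```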

Layer 0 (fastpath)

Variable	Default	Meaning
`METNOS_FASTPATH_STALE_DAYS`	30	Calendar days after which an unused entry is pruned
`METNOS_FASTPATH_GRACE_DAYS`	14	Grace days for never-reused entries
`METNOS_FASTPATH_MAX`	500	Maximum rows (LRU cap)

Layer 1 (autopath)

Variable	Default	Meaning
`METNOS_AUTOPATH_MIN_OBS`	1	Minimum positive observations to promote an autopath
`METNOS_AUTOPATH_TTL_ANTI`	2592000 (30 d)	Anti-autopath duration in seconds
`METNOS_AUTOPATH_TTL_REPEAT`	3600 (1 h)	Short window for repeated ratings

Promotion to executor

Variable	Default	Meaning
`METNOS_FP_PROMOTE_MIN_CLUSTER`	3	Minimum distinct fastpaths in the group
`METNOS_FP_PROMOTE_MIN_USES`	15	Minimum cumulative usage
`METNOS_FP_PROMOTE_MIN_AGE_DAYS`	30	Minimum group age
`METNOS_FP_PROMOTE_MAX_PER_NIGHT`	3	Maximum new proposals per night
`METNOS_FASTPATH_AUTOPROMOTE`	off	Enables Mode 2 auto-promotion (no human approval)

9. Promotion to a synthesised executor

When several recurring L0 fastpaths share the same plan structure (the skeleton hash) and the same intent, each night the task_fastpath_promotion process weighs them as candidates to become a synthetic executor in their own right. What it weighs is the group, never the single instance: at least three distinct fastpaths, fifteen uses in all, and thirty days of age. And only multi-step shapes are promoted: single-step ones already have an executor, and there the fastpath only saves the LLM call, not the plan.
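The group test described above can be written as a single predicate. The sketch assumes each fastpath is a dictionary with `hash`, `plan`, `n_uses` and `created` fields, and that the group's age is measured from its oldest member; both are assumptions made for illustration.

```python
from datetime import datetime, timedelta

MIN_CLUSTER, MIN_USES, MIN_AGE_DAYS = 3, 15, 30  # documented defaults


def is_candidate(group: list[dict], now: datetime) -> bool:
    """Decide whether a group sharing one skeleton and intent may be proposed."""
    if not group:
        return False
    # Single-step shapes already have an executor: never promoted.
    if any(len(fp["plan"]) < 2 for fp in group):
        return False
    distinct = len({fp["hash"] for fp in group})
    total_uses = sum(fp["n_uses"] for fp in group)
    age = now - min(fp["created"] for fp in group)  # assumed: oldest member
    return (distinct >= MIN_CLUSTER
            and total_uses >= MIN_USES
            and age >= timedelta(days=MIN_AGE_DAYS))
```

All three thresholds are checked on the group as a whole, so a single heavily used fastpath does not qualify on its own.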

Why from L0 and not from L1. The analysis looks at the L0 fastpaths, not the L1 autopaths, because the evidence of real demand lives in L0: how many concrete, distinct requests recur, how often, for how long (the three numbers above). L1 is already general, but it springs from a different signal — the user's “✓” — and carries no such count of instances. The generalisation one would credit to L1 happens all the same, but here: by gathering the L0 fastpaths that share a plan shape and an intent.

Two modes of promotion

Mode 1 (default, active). The candidate becomes a proposal awaiting a person's approval: it shows up in /admin/changes, the unified view of changes (ADR 0158). On a yes, a synt_pending/ marker starts the full synthesis — the five stages, the tests, the signing, the installation.
Mode 2 (optional, off). Past a much higher threshold, Metnos synthesises the executor on its own, with no one approving. Safety still holds: the synthesis goes through vetting, testing and signing all the same, like every executor. One attempt a night, no more.

Where an executor comes from

At promotion time Metnos notes, in a table (promotions), which fastpaths the new executor was born from: it records their identifiers and fingerprints. This is what closes the loop. Once the executor is in the catalogue, the fastpaths that produced it are no longer needed and get retired (the “C2 provenance” death condition from §6). Thanks to that note the retirement is exact: Metnos knows for certain which fastpaths to remove, because the link between each fastpath and its heir is written down, not guessed from the name.
