
Metnos

synt — how the executor pool is born and matures
Aligned with the canonical triad
executor, mnest, mnestome.

Audience: those who implement the reactive loop, the nightly homeostasis,
the approval gates and the cost controls.
Reading time: 18 minutes.
Quality filters and short-circuit. The introspective cascade applies deterministic filters before proposing: names outside the closed vocabulary are rejected, pipeline args are skipped, and a minimum use threshold rules out noise. The explicit synthesis request (handle_synth_request) first checks whether the intent is already covered by the catalog or redirectable to a canonical executor (e.g. list_processes → get_processes), which saves the whole pipeline. A garbage collector moves colliding synthesised executors into a temporary area without deleting them.

Contents

  1. Scope and boundaries
  2. What synt is
  3. The cascade: overview
  4. Reactive strategies: compose, generate
  5. Introspective strategies: merge, generalise, specialise
  6. Weighting: the extended R score
  7. The non-retreat telos
  8. Human approval, budget, abandonment
  9. Python contract
  10. Audit and observability
  11. Alternatives considered
  12. Conformance tests
  13. Open questions

1. Scope and boundaries

This document defines what synt does: the process that, faced with a user request or a pattern emerging in the mnestome, grows and matures the pool of executors in Metnos. It is not a single loop: it is a cascade of strategies ordered by increasing cost, which honours the «cultivate the tools» telos without wasting frontier-LLM calls and without polluting the library.

Boundaries

The doc covers:

It does not cover, and defers elsewhere:

2. What synt is

synt is the process (not an agent, not an object) with a single task: to bring into existence what the pool cannot yet do. It does not write from scratch every time, it does not write only from scratch, and sometimes it does not write at all. Its intelligence lies in knowing when to apply which strategy.

synt ≠ frontier LLM. A common mistake is to think that «synthesis» means «call an expensive model and have it write code». In most cases, synthesising means orchestrating what already exists: a chain of active executors that closes a proto-mnest. True generation is the documented exception, not the default.

Triggers

synt comes into action in two distinct ways:

| Mode | Trigger | Time |
|---|---|---|
| Reactive | The gateway, during a user turn, records a proto-mnest pointing to a non-existent executor; or the planner cannot find any executor that satisfies the request. | Synchronous to the turn (the user is waiting for a response). |
| Introspective | The ager sweeps the mnestome and signed pool at night; it finds recurring proto-mnests, executors with overlapping traces, families of specialised executors with the same shape. | Asynchronous, in homeostasis (night, pauses, low load). |

The two modes share the same cascade of strategies but differ in timing (synchronous vs asynchronous) and in tone (answer vs propose).

A third source of proto-mnests, from the fast-path. There is a third trigger channel, complementary rather than substitutive: the nightly job multi_tool_promote reads the pipelines memoised in the L2 fast-path and, when a sequence exceeds 50 uses, creates a proto-mnest in the mnestome with the desired signature (executor chain + placeholders + canonical query). From that moment on, the pipeline becomes a candidate for actual synthesis: a new unified executor that wraps the whole sequence, typically named <last_verb>_<first_object>. The bridge does not duplicate work: L2 captures low volume (3–50 uses, deterministic replay), and L3 takes over when the pattern recurs stably.
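The promotion rule can be sketched as follows. This is a minimal illustration, not the actual multi_tool_promote job: the MemoisedPipeline record, the dictionary shape of the proto-mnest and the helper names are hypothetical, placeholders are left out, and the only facts taken from the text are the 50-use threshold and the <last_verb>_<first_object> naming.

```python
from dataclasses import dataclass

PROMOTE_THRESHOLD = 50  # from the text: L2 covers 3-50 uses, L3 takes over beyond


@dataclass(frozen=True)
class MemoisedPipeline:
    """Hypothetical view of one L2 fast-path entry."""
    chain: tuple[str, ...]   # executor names, in call order
    uses: int
    canonical_query: str


def unified_name(chain: tuple[str, ...]) -> str:
    """Build <last_verb>_<first_object> from {action}_{object}[_qualifier] names."""
    last_verb = chain[-1].split("_", 1)[0]
    first_object = chain[0].split("_")[1]
    return f"{last_verb}_{first_object}"


def promote(pipelines: list[MemoisedPipeline]) -> list[dict]:
    """Return one proto-mnest record per pipeline that exceeded the threshold."""
    return [
        {
            "desired_name": unified_name(p.chain),
            "chain": list(p.chain),
            "canonical_query": p.canonical_query,
        }
        for p in pipelines
        if p.uses > PROMOTE_THRESHOLD  # "exceeds 50 uses": strictly greater
    ]
```

A pipeline at exactly 50 uses stays in L2 under this reading; only the 51st use makes it a synthesis candidate.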

3. The cascade: overview

[Figure 1: the synt cascade, strategies ordered by increasing cost. Reactive row (user turn; proto-mnest from an unmet request): 1. Compose (chain in pool, zero frontier LLM); 2. Generate (5-stage multistage, local, €0); Abandon (motivated and traced). Edges: chain found → orchestrated; compose empty → new executor; max retry or budget → abandon. Introspective row (nightly homeostasis; periodic sweep of mnestome + pool): 3. Merge (A + B with overlapping traces); 4. Generalise (N specialised → one parametric); 5. Specialise (hot case with measured benefit); batch proposals → human gate (approval_ux), where Roberto sees the dossier with the R score and can approve, amend or discard. Principle: every strategy is a proposal; nothing enters the pool without a signature and without a human gate.]
Figure 1 — The synt cascade. Top row: two reactive strategies that answer a proto-mnest within the turn. Bottom row: three introspective strategies that curate the pool at night and reach Roberto in batch.

The five strategies at a glance

| # | Strategy | Mode | What it does, in one line | Frontier cost |
|---|---|---|---|---|
| 1 | Compose | reactive | Search for a chain of active executors that closes the proto-mnest. | 0 |
| 2 | Generate | reactive | Five-stage multistage LLM pipeline (naming, signature, tests, description, code) that produces a new signed executor (see §4.2.3). | €0 (local) |
| 3 | Merge | introspective | Joins two executors with overlapping traces and compatible profiles. | ~€0.5 |
| 4 | Generalise | introspective | From N specialised with the same shape, derive one parametric executor. | ~€1 |
| 5 | Specialise | introspective | From a general one, derive a focused version for a hot case. | ~€0.5 |

Two axes, not a linear list. The strategies differ on two independent dimensions: when (reactive / introspective) and how (reuse / creation / transformation). Compose is pure reuse; generate is creation; merge/generalise/specialise are transformations of the pool. The numbering here is an expository order, not an absolute priority.

4. Reactive strategies: compose, generate

4.1 Compose

The basic strategy. When a proto-mnest signals an operational gap, synt first looks at the signed pool. The question is: does there exist a chain A → B → C that, executed in sequence, covers the need? If yes, synt does not write new code: it proposes the orchestration.

How the chain is searched

The mnestome is a directed graph between executors. Finding a chain that connects the requested input to the desired output is a guided walk on the graph. Three criteria order the candidates:

What a successful composition leaves behind

When synt resolves via composition, it records in the mnestome a proto-mnest pointing at the composition itself: a virtual node that says «I have just used A→B→C as if it were a single executor named X». If this proto-mnest recurs, it becomes a candidate for generalisation (ch. 5): a single executor that incorporates the chain. Composition is therefore a generalisation gym: every repeated chain is a hint for the future.

v1 DECISION: the maximum length of a composition chain is 5 hops. Beyond that, the orchestration becomes unreadable and the risk of cascading errors outweighs the reuse benefit, so synt jumps to generate. The threshold is tunable.
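A bounded chain search of this kind can be sketched as a plain breadth-first walk. The sketch rests on several assumptions: the mnestome is reduced to an adjacency dict, the start and goal executors are given as sets, each executor in the chain counts as one hop, and the ordering criteria for candidates are ignored.

```python
from collections import deque

MAX_HOPS = 5  # v1 decision from the text; tunable


def find_chain(graph, starts, goals, max_hops=MAX_HOPS):
    """Shortest chain from any start executor to any goal executor, or None.

    graph maps an executor name to the executors reachable from it.
    """
    queue = deque([s] for s in sorted(starts))
    while queue:
        path = queue.popleft()
        if path[-1] in goals:
            return path
        if len(path) == max_hops:
            continue  # extending this path would exceed the hop limit
        for nxt in graph.get(path[-1], ()):
            if nxt not in path:  # never revisit an executor within one chain
                queue.append(path + [nxt])
    return None
```

When the function returns None the caller would fall through to generate; because the walk is breadth-first, the first chain found is also a shortest one.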

4.2 Generate

When composing is not enough — because there is no chain, because it is too long, because a missing link is critical — synt enters the multistage five-stage pipeline described in §4.2.3.

Generating costs: typically 1–2 calls at the wise tier plus human approval time. The fact that it comes after compose is not aesthetic: it is real saving.

Wise tier quality floor. The code stage (stage 5 of the multistage pipeline in §4.2.3) requires an LLM at the level of Qwen 3.6 35B-A3B or above. Smaller models produce fragile code with a high probability of violating the convention. The earlier stages (1-4: naming, signature, tests, description) use the middle tier, which is enough for closed-vocabulary lookup and fixed formats. The rule is encoded in the tier resolver (runtime/llm_router.py) as a quality floor: if the user does not have a local wise-tier model of that calibre, an external provider (Anthropic, OpenAI, etc.) must be configured. The failure mode is an explicit boot error, never a silent downgrade to fast.
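The "explicit boot error, never a silent downgrade" rule might look like the sketch below. It does not reproduce runtime/llm_router.py: the Provider record, the meets_wise_floor flag and TierFloorError are hypothetical names, and how a model's calibre is judged against the floor is left outside the sketch.

```python
from dataclasses import dataclass


class TierFloorError(RuntimeError):
    """Raised at boot when no provider satisfies the wise-tier floor."""


@dataclass(frozen=True)
class Provider:
    name: str
    model: str
    local: bool
    meets_wise_floor: bool  # hypothetical flag; the calibre check itself is not specified


def resolve_wise(providers: list[Provider]) -> Provider:
    """Prefer a local provider at or above the floor, then an external one."""
    eligible = [p for p in providers if p.meets_wise_floor]
    if not eligible:
        # No fallback to a weaker tier: fail loudly at boot.
        raise TierFloorError(
            "no wise-tier provider at the quality floor; configure an external provider"
        )
    # sorted() is stable: local providers come first, original order otherwise.
    return sorted(eligible, key=lambda p: not p.local)[0]
```

Preferring the local provider when both qualify follows the document's rule of not going online when local can deliver.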

4.2.1 Real measurements on the POC

End-to-end synthesis via wise tier = local Qwen 3.6 35B-A3B (Q4_K_M on llama-server). Case: format_json (formats a JSON string with readable indentation).

| Stage | Time | Notes |
|---|---|---|
| 1 Pattern detect | ~0 (caller) | caller of react already holds the proto-mnest. |
| 2+3 Spec + Skeleton | ~36s | single LLM call at tier=wise via tool-use propose_executor; ~640 in tokens, ~1700 out. |
| 4 Profile | ~ms | derived from AST + import whitelist; "pure" profile when no external I/O. |
| 5 Level-2 birth tests | ~28s + ~1s/test | second LLM call yields 3-5 declarative tests; runner executes each via subprocess. |
| 6 Approval | human | today CLI (synt approve <id> or synt reject <id>); HTML UX deferred. |
| 7 Sign + install | ~ms | Ed25519 via runtime/sign.py, executor copied into executors/<name>/. |

Wall-clock total of LLM stages: ~64s for the first non-trivial executor. This is time the dialog manager surfaces to the user as "I am building a new component to extend my capabilities": once signed, every subsequent reuse is pure code execution (~ms).

Simplification. Stages 2 (Spec) and 3 (Skeleton) are unified into a single LLM call with tool-use propose_executor. The canonical doc keeps them separate because a later version will introduce an amend UX between Specification and Skeleton: Roberto will be able to review and correct the spec before code generation. The unified tool of the POC is an explicit simplification, not a renunciation of the design.

4.2.2 Provider-specific prompt repertoire

Each model family has idiosyncrasies that show up only with use. Synt keeps a repertoire of prompt addenda automatically applied to stages 2/3 when the caller signals for_code=True. Three style rules, ratified after the first real comparisons:

The repertoire lives in runtime/prompts.toml (bundled default) and is overridable via ~/.config/metnos/prompts.toml (user override). TOML schema:

[[hint]]
provider = "anthropic"
model_pattern = "claude-*" # glob on prov.model
use_case = "code_gen"
# EN: Constraints: code faithful to the spec. Simple regex. No lookbehind/lookahead.
text = "\\n\\nVincoli: codice fedele alla spec. Regex semplice. Niente lookbehind/lookahead."

[[hint]]
provider = "llamacpp"
model_pattern = "qwen*"
use_case = "code_gen"
# EN: Constraints: raw string r'...' with ONE backslash. No triple-quoted docstrings.
text = "\\n\\nVincoli: raw string r'...' con UN backslash. Niente triple-quote docstring."

[[hint]]
provider = "openai"
model_pattern = "gpt-*"
use_case = "code_gen"
# EN: Constraints: fill in python_code in full (def invoke + def main). Never empty.
text = "\\n\\nVincoli: compila python_code per intero (def invoke + def main). Mai vuoto."

The first match on (provider, model_pattern, use_case) wins. When a new model arrives (Gemma 5, Claude Opus 4.7, GPT-5, …) a new entry is added based on what is observed in the first syntheses. The base system prompt and the executors do not change: the only lever touched is the repertoire.
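The first-match rule can be sketched as below, assuming the hints have already been parsed into a list of dicts in file order (for instance from the [[hint]] tables of the TOML file). The function name pick_hint and the empty-string default are assumptions; the glob match uses the standard library's fnmatchcase.

```python
from fnmatch import fnmatchcase


def pick_hint(hints: list[dict], provider: str, model: str, use_case: str) -> str:
    """Return the text of the first hint matching the triple, or '' if none does."""
    for hint in hints:
        if (
            hint["provider"] == provider
            and fnmatchcase(model, hint["model_pattern"])  # glob on the model name
            and hint["use_case"] == use_case
        ):
            return hint["text"]
    return ""


# Illustrative entries only; the texts are placeholders, not the real addenda.
HINTS = [
    {"provider": "llamacpp", "model_pattern": "qwen*",
     "use_case": "code_gen", "text": "addendum for qwen"},
    {"provider": "llamacpp", "model_pattern": "*",
     "use_case": "code_gen", "text": "generic llamacpp addendum"},
]
```

Because the first match wins, entry order matters: a specific pattern such as qwen* has to precede a catch-all pattern for the same provider, otherwise it is never reached.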

4.2.3 Multistage pipeline

The architectural answer is multistage: five small stages, each with a prompt focused on its own task, ordered from the most procedural (closed-vocabulary lookup) to the most creative (code). The manifest skeleton is filled in progressively, and each stage sees ONLY the slice it needs — no cumulative blob.

| Stage | Type | LLM tier | Output |
|---|---|---|---|
| 1. Naming + classification | procedural | middle (Qwen 3.6 35B-A3B think=true) | name {action}_{object}[_qualifier] from the closed vocabulary (23 actions, 22 objects) + revertible/critical/target_kind. If no combination fits: explicit rejection with reason. |
| 2. Signature | procedural | middle | args_schema (JSON Schema), capabilities (closed set), reverse_pattern from the deterministic catalogue (runtime/reverse_patterns.py). |
| 3. Birth tests | procedural | middle | 4-6 tests in fixed format (setup/input/expect/teardown), at minimum: happy path, empty list, invalid args, domain edge. |
| 4. Description + affinity | creative | middle | LLM-readable description structured in four chapters (SCOPO: / PATTERN: / NON: / OUT:). The proposer shows the planner only the head (up to OUT:, excluded). + 6-10 affinity keywords for the composer. |
| 5. Code | creative + procedural | wise (Qwen 3.6 35B-A3B think=true) | <name>.py Python file with def invoke, runtime conventions (runtime/messages.py, runtime/platform_policy.py, helpers). |

The closed vocabulary appears ONLY in the stage 1 prompt; subsequent stages receive the already-decided name and work within their own boundaries. The manifest description is not written by the developer but by the dedicated stage 4 LLM, in a four-chapter format: SCOPO: (what it does), PATTERN: (canonical call form), NON: (anti-patterns and disambiguation), OUT: (output shape). The proposer truncates the description to the head (up to OUT:) to save context budget.
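The truncation to the head can be sketched in a few lines. This assumes each chapter marker starts its own line, which the text does not state explicitly; the function name and the sample description are illustrative.

```python
def description_head(description: str) -> str:
    """Return the description up to, and excluding, the OUT: chapter."""
    kept = []
    for line in description.splitlines():
        if line.startswith("OUT:"):
            break  # the OUT: chapter and everything after it is dropped
        kept.append(line)
    return "\n".join(kept).rstrip()


# Hypothetical description in the four-chapter format.
SAMPLE = (
    "SCOPO: sort files by date.\n"
    "PATTERN: order_file(file_kind=...)\n"
    "NON: does not move or rename files.\n"
    "OUT: list of paths, oldest first."
)
```

Matching on the start of a line rather than on the bare substring avoids cutting at an incidental "OUT:" inside another chapter, for example in a word such as "TIMEOUT:".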

Measured results on the 35-query stress dataset: stage 1 (naming + classification) reaches 88.6 % (31/35) under the bilingual prompt. The remaining 4 escalations (resize, query SQL, validate YAML, crack) are explicit rejections for verbs semantically outside the closed vocabulary, not synthesis errors: the vocabulary is only worth extending if those patterns recur in real cases («synonyms before vocabulary»).

Synthesis time. An end-to-end multistage synthesis costs between 140 and 160 seconds wall-clock with the local Qwen 3.6 35B-A3B provider (zero network cost). Average per-stage breakdown observed: stage 1 ~13 s, stage 2 ~25 s, stage 3 ~35 s, stage 4 ~25 s, stage 5 ~45 s (about 143 s in total). The 5 stages run sequentially because each one depends on the skeleton enriched by the previous one; parallelising them would break the «minimum context per stage» rule. A one-shot frontier reconstruction remains the last resort: never use an online wise-tier model when the local one can deliver.
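The sequential, slice-per-stage shape of the pipeline can be sketched as follows. The STAGE_READS mapping is an illustrative guess at which manifest keys each stage reads, not the real slices, and the stages are passed in as plain callables standing in for the LLM calls.

```python
from typing import Callable

# Which manifest keys each stage may read: illustrative guesses, not the real slices.
STAGE_READS: dict[str, tuple[str, ...]] = {
    "naming": ("intent",),
    "signature": ("intent", "name"),
    "tests": ("name", "args_schema"),
    "description": ("name", "args_schema"),
    "code": ("name", "args_schema", "tests"),
}

Stage = Callable[[dict], dict]


def run_pipeline(intent: str, stages: dict[str, Stage]) -> dict:
    """Run the stages in order; each receives only its slice and returns new keys."""
    manifest: dict = {"intent": intent}
    for stage_name, reads in STAGE_READS.items():
        view = {key: manifest[key] for key in reads}  # the slice, never the whole blob
        manifest.update(stages[stage_name](view))     # enrich the skeleton
    return manifest
```

The loop makes the dependency visible: a stage can only be built once the keys it reads exist, which is why the stages cannot simply be run in parallel.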

5. Introspective strategies: merge, generalise, specialise

Introspective strategies do not answer a one-off request: they answer the structural need to keep the pool small, coherent and reusable. They run at night as part of the ager's homeostasis (see Architecture ch. 10), with low CPU and economical LLMs (local-fast tier).

5.1 Merge

When two executors have overlapping traces (weight of the mnests connecting them > threshold) and compatible sandbox profiles, synt proposes a merge: a new executor that covers both. The fused state in the executor lifecycle (ch. 6) is foreseen for exactly this case: the two originals stay loaded as long as residual mnests cite them, then they get archived.

Typical trigger: two executors that do similar things under different names (archive_pdf and store_pdf), perhaps born at different times without synt having correlated them at birth.

5.2 Generalise

When N specialised executors share the same shape (same I/O schema except for one dimension, same sandbox profile except for one parameter), and when the proto-mnests point at their family on new dimensions, synt proposes a parametric executor. The new version takes as an explicit argument what was re-encoded N times before.

Canonical example: order_image_file, order_audio_file, order_doc_file — all sort files by date — get proposed as order_file with argument file_kind. Once the generalised version is signed, the three specialised executors move to superseded and progressively to archived.

Generalisation is not opportunistic refactoring. The threshold to propose a generalisation requires: (a) at least three specialised executors with coherent shape; (b) at least one uncovered proto-mnest on the same family; (c) no divergence in sandbox profiles that would make the parametric more permissive than the sum. If (c) fails, the proposal is suspended: better three tight executors than one loose one.
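The three conditions can be sketched as a single guard. The Specialised record is hypothetical: shape stands for the I/O schema with the varying dimension factored out, and profile for the sandbox profile with the varying parameter factored out. Condition (c) is read strictly here as "all remaining profiles are equal", which is an assumption and tighter than the text requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specialised:
    name: str
    shape: str                # hypothetical key: I/O schema minus the varying dimension
    profile: frozenset        # hypothetical: sandbox permissions minus the varying parameter


def may_propose_generalisation(
    family: list[Specialised], uncovered_proto_mnests: int
) -> tuple[bool, str]:
    """Return (allowed, reason) for a generalisation proposal on this family."""
    if len(family) < 3 or len({e.shape for e in family}) != 1:
        return False, "(a) fewer than three executors with a coherent shape"
    if uncovered_proto_mnests < 1:
        return False, "(b) no uncovered proto-mnest on the family"
    if len({e.profile for e in family}) != 1:
        return False, "(c) sandbox profiles diverge; proposal suspended"
    return True, "ok"
```

With the order_image_file, order_audio_file and order_doc_file family from the example above, the guard passes only when all three share a shape and a profile and at least one proto-mnest on the family is still uncovered.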

5.3 Specialise

The reverse strategy, and the rarest. From a general executor, when its invocation on a specific case recurs with very high frequency and has a costly profile or latency, synt proposes a specialised version. The specialisation lives next to the general one, not replacing it: routing becomes «if argument X ≡ hot case, use the specialised; otherwise the general one».

It applies only with measured benefit: observed latency reduction, cost reduction, or simplification of the sandbox profile. We do not specialise for preemptive optimisation.

5.4 Comparison

| Strategy | Direction | Effect on the pool | Trigger |
|---|---|---|---|
| Merge | N → 1 | reduces the executor count | overlapping traces, compatible profiles |
| Generalise | N → 1 (parametric) | reduces and covers new dimensions | specialised executors with coherent shape + proto-mnests on the family |
| Specialise | 1 → 2 (gen + spec) | adds an executor | hot case with measured benefit |

6. Weighting: the extended R score

Every synthesis proposal — any strategy — produces a score R ∈ [0, 1] that synt computes on observed data and that Roberto sees in the approval dossier. The formula includes a strategy_cost component that rewards the cheaper strategies:

R = 0.35 · det_pass_rate # fraction of deterministic tests passing
 + 0.20 · judge_score # LLM-as-judge (local-fast) on constitutional rubric
 + 0.15 · cost_ratio # clip(1 - actual_cost / estimate, 0, 1)
 − 0.10 · similarity_penalty # cosine-sim of embedding vs existing pool > 0.85
 + 0.10 · coverage_bonus # bonus if it covers uncovered proto-mnests
 + 0.10 · strategy_cost_bonus # 1 for compose, 0.6 for merge/specialise,
 # 0.4 for generalise, 0 for generate

gate_threshold = 0.65 # v1 DECISION

The strategy_cost_bonus is the new piece: it pushes synt towards cheaper strategies at parity of quality. A composition that closes a proto-mnest with a score similar to a generation wins because of the strategy bonus, as it should.
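The formula translates directly into a small function. The weights, the bonus table and the gate threshold are taken from the formula above; the function name and the choice to pass similarity_penalty as an already-computed value in [0, 1] are assumptions, since the text does not say how the cosine similarity is mapped to a penalty.

```python
STRATEGY_COST_BONUS = {
    "compose": 1.0,
    "merge": 0.6,
    "specialise": 0.6,
    "generalise": 0.4,
    "generate": 0.0,
}
GATE_THRESHOLD = 0.65  # v1 decision


def r_score(
    strategy: str,
    det_pass_rate: float,
    judge_score: float,
    cost_ratio: float,
    similarity_penalty: float,
    coverage_bonus: float,
) -> float:
    """Aggregate R for one proposal; every component is expected in [0, 1]."""
    return (
        0.35 * det_pass_rate
        + 0.20 * judge_score
        + 0.15 * cost_ratio
        - 0.10 * similarity_penalty
        + 0.10 * coverage_bonus
        + 0.10 * STRATEGY_COST_BONUS[strategy]
    )


def passes_gate(r: float) -> bool:
    """True when the proposal reaches the gate threshold."""
    return r >= GATE_THRESHOLD
```

With identical quality components, a composition scores exactly 0.10 above a generation, the full weight of the strategy bonus. Note also that the positive weights sum to 0.90, so that is the highest R any proposal can reach under these weights.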

v1 DECISION (calibration). The weights above are a first approximation, to be revisited after 30 real syntheses on the system. The calibration rule: no strategy should have a bonus large enough to make it win when it is structurally wrong (e.g. a 7-hop composition that wins over a correct generation only because of the bonus). If this happens, lower the bonus.

7. The non-retreat telos

The cascade is not just cost discipline: it is the mechanism that honours the «cultivate the tools» telos (Architecture ch. 11): as long as a user request is within the constitution and within the budget, synt exhausts the available strategies before answering «I cannot».

The telos lives in TELOS.md in the workspace, alongside the others:

5. Cultivate the tools: if I ask for something your pool cannot yet do,
 exhaust the synthesis strategies (compose, generate, ask) before
 answering "I cannot". Failure is allowed; silent retreat is not.

The difference between failure and retreat is exactly here. synt may conclude that the request cannot be satisfied within the budget, but it must do so after attempting the cascade, and it must explain it: «I tried to compose with these N executors, I tried to generate with this specification, the birth test failed on case X». A motivated failure is data for the future; a silent retreat is a hole in the mnestome.

Do not confuse non-retreat with stubbornness. The abandonment thresholds still hold: max 3 retries per stage, 24h suspension for a pattern-rejected proto-mnest, 30-day lock for a rejected internal direction. Non-retreat is exhausting the cascade, not insisting forever. Courage, not stubbornness.
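The abandonment thresholds can be sketched as constants plus one lock check. The function name and the idea of keying the lock on the terminal state and its closing timestamp are assumptions; the durations are the ones stated above.

```python
from datetime import datetime, timedelta, timezone

MAX_RETRIES_PER_STAGE = 3

# Lock duration per terminal outcome, from the thresholds in the text.
LOCKS = {
    "abandoned": timedelta(hours=24),  # same target
    "rejected": timedelta(days=30),    # same internal direction
}


def is_locked(terminal_state: str, closed_at: datetime, now: datetime) -> bool:
    """True while a new request on the same target or direction must be refused."""
    lock = LOCKS.get(terminal_state)
    return lock is not None and now < closed_at + lock


def may_retry(attempts_so_far: int) -> bool:
    """True while a stage has retries left."""
    return attempts_so_far < MAX_RETRIES_PER_STAGE
```

A usage example: an abandonment closed at midnight UTC is still locked at 23:00 the same day and free again at 01:00 the next day.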

8. Human approval, budget, abandonment

8.1 Human gate: always, everywhere

Regardless of strategy, every synt proposal that produces or modifies an executor passes through the human gate (see approval_ux.html, to be rewritten). The principle is the third of the six principles of the Architecture: no synthesis without a human filter, at any autonomy level.

A composition proposal is a separate case: synt does not create any new signed artefact, it only orchestrates the execution. The human gate in this case applies to the first use (Roberto sees «I am about to call A → B → C, OK?») and is then relaxed in step with the autonomy level: in Supervised every chain invocation asks for confirmation; in Full, after the first 5 clean executions, confirmation is skipped.
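The relaxation rule for compositions can be sketched as a single predicate. Only the Supervised and Full behaviours are described in the text; the lowercase level names, and the choice to keep asking at any level the text does not describe, are assumptions.

```python
CLEAN_RUNS_BEFORE_SKIP = 5  # from the text: first 5 clean executions in Full


def needs_confirmation(autonomy: str, clean_runs: int) -> bool:
    """True when a chain invocation must ask the user before running."""
    if autonomy == "full":
        return clean_runs < CLEAN_RUNS_BEFORE_SKIP
    # Supervised always asks; unknown or intermediate levels default to asking.
    return True
```

Defaulting to confirmation for anything other than Full keeps the gate on the safe side when a new autonomy level is introduced.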

8.2 Budget

| Cap | Default | Behaviour when exceeded |
|---|---|---|
| Soft per request | €2 | synt warns Roberto and asks whether to continue. |
| Hard per request | €5 | synt stops, records abandonment for budget. |
| Hard per day | €20 | constitutional cap (not overridable via config). |
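The three caps can be sketched as a check made before each paid call. Checking before the call rather than after it is an assumption, as are the function name and the three verdict strings; amounts are in cents to match budget_cents in the Python contract.

```python
SOFT_PER_REQUEST_CENTS = 200   # EUR 2: warn and ask
HARD_PER_REQUEST_CENTS = 500   # EUR 5: stop, abandon for budget
HARD_PER_DAY_CENTS = 2000      # EUR 20: constitutional cap


def budget_verdict(request_cents: int, day_cents: int, next_call_cents: int) -> str:
    """Return 'stop', 'warn' or 'ok' for a call that would cost next_call_cents."""
    if day_cents + next_call_cents > HARD_PER_DAY_CENTS:
        return "stop"
    if request_cents + next_call_cents > HARD_PER_REQUEST_CENTS:
        return "stop"
    if request_cents + next_call_cents > SOFT_PER_REQUEST_CENTS:
        return "warn"
    return "ok"
```

Evaluating the verdict before spending means the effective cost never overshoots a hard cap, which is what a conformance test on the hard budget would need.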

8.3 States of a synthesis request

| State | Meaning | Transitions |
|---|---|---|
| composing | Searching a chain in the pool. | composed (chain found) or generating. |
| composed | Chain proposed, awaiting user gate. | delivered or generating if Roberto rejects. |
| generating | 5-stage multistage pipeline in progress. | born (signed executor) or abandoned. |
| born | Executor signed, in pool. | terminal. |
| abandoned | All strategies exhausted, motivated failure. | terminal; 24h lock on the same target. |
| proposed | Introspective strategy, awaiting user batch. | born/fused/generalised/specialised or rejected. |
| rejected | Roberto rejected the introspective proposal. | terminal; 30-day lock on the same direction. |
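The table reads naturally as a transition map. The sketch below encodes it as a dict and checks a recorded sequence of states against it. Treating delivered, fused, generalised and specialised as terminal is an assumption, since the table names them only as targets.

```python
# Allowed transitions, transcribed from the state table.
TRANSITIONS: dict[str, set[str]] = {
    "composing": {"composed", "generating"},
    "composed": {"delivered", "generating"},
    "generating": {"born", "abandoned"},
    "proposed": {"born", "fused", "generalised", "specialised", "rejected"},
    "born": set(),
    "abandoned": set(),
    "rejected": set(),
    # Assumed terminal: the table lists these only as transition targets.
    "delivered": set(),
    "fused": set(),
    "generalised": set(),
    "specialised": set(),
}


def check_path(states: list[str]) -> bool:
    """True when every consecutive pair of states is an allowed transition."""
    return all(
        nxt in TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )
```

A path such as composing, generating, born is accepted, while composing followed directly by born is not, because a signed executor can only come out of the generate pipeline or an approved introspective proposal.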

9. Python contract

from typing import Protocol, Literal
from dataclasses import dataclass

Strategy = Literal[
    "compose", "generate",
    "merge", "generalise", "specialise",
]

ProposalState = Literal[
    "composing", "composed",
    "generating", "born", "abandoned",
    "proposed", "rejected",
    "fused", "generalised", "specialised",
]

@dataclass(frozen=True)
class SynthRequest:
    """A synthesis request, reactive or introspective."""
    request_id: str
    mode: Literal["reactive", "introspective"]
    proto_mnest: str | None       # id of the triggering proto-mnest (reactive)
    target_intent: str            # NL: what needs to be done
    budget_cents: int             # frontier cap to spend
    capability_hint: list[str]    # capabilities already assumed

@dataclass(frozen=True)
class RewardBreakdown:
    det_pass_rate: float          # [0,1]
    judge_score: float            # [0,1]
    judge_reasoning: str
    cost_ratio: float             # [0,1]
    similarity_penalty: float     # [0,1] (negative sign in the formula)
    coverage_bonus: float         # [0,1]
    strategy_cost_bonus: float    # [0,1] depends on Strategy
    total: float                  # aggregated R

@dataclass
class SynthProposal:
    """A concrete proposal that synt presents to the user."""
    request_id: str
    strategy: Strategy
    state: ProposalState
    artefact: dict                # chain (compose) or candidate executor (generate/…)
    reward: RewardBreakdown
    cost_cents: int               # consumption so far
    rationale: str                # 2-3 lines NL: why this strategy, here

class Synt(Protocol):
    async def react(self, req: SynthRequest) -> SynthProposal:
        """Reactive cascade: compose first, then generate, then abandon."""
        ...

    async def homeostasis(self, lookback_days: int = 30) -> list[SynthProposal]:
        """Sweeps mnestome and pool; returns 0..N introspective proposals in batch."""
        ...

    async def revise(
        self, request_id: str, feedback: str,
        target_strategy: Strategy | None = None,
    ) -> SynthProposal:
        """Resumes a proposal after human feedback."""
        ...

# Errors (logged and turned into state, not propagated)
class StrategyExhaustedError(Exception): ...
class BudgetExceededError(Exception): ...
class PolicyVetoError(Exception): ...
class ConstitutionViolationError(Exception): ...

10. Audit and observability

Every step of the cascade produces a JSONL line in workspace/.audit/synt/YYYY-MM-DD.jsonl. Example for a successful composition:

{
 "ts": "2026-04-25T22:14:33Z",
 "request_id": "01HX...",
 "mode": "reactive",
 "proto_mnest": "mnest_01HW...",
 "strategy": "compose",
 "state": "composed",
 "chain": ["read_files", "read_files_pdf", "classify_entries", "move_files"],
 "reward": {"total": 0.78, "det_pass_rate": 0.95, "…": "…"},
 "cost_cents": 0,
 "duration_ms": 142
}

Three invariants of the synt audit, guaranteed by loader and runtime:

  1. Complete traceability: for every request_id the entire sequence of states visited is on record, up to a terminal state.
  2. Motivated decisions: every transition between strategies includes a NL rationale readable by Roberto after the fact.
  3. Cost recorded: the sum of cost_cents for a request never exceeds the declared budget_cents; the final line certifies it.
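Invariants 1 and 3 can be checked mechanically over the audit lines of one request, as in the sketch below. It assumes each line carries a state field and an incremental cost_cents (so that the sum is meaningful), and it takes the terminal states from the state table in §8.3; invariant 2 is left out because the example line shows no rationale field to check.

```python
import json

# Terminal states per the state table in 8.3; other terminals may exist.
TERMINAL = {"born", "abandoned", "rejected"}


def check_audit(lines: list[str], budget_cents: int) -> list[str]:
    """Return the invariants violated by one request's audit lines, oldest first."""
    records = [json.loads(line) for line in lines if line.strip()]
    problems: list[str] = []
    # Invariant 1: the recorded sequence must end in a terminal state.
    if not records or records[-1]["state"] not in TERMINAL:
        problems.append("traceability: sequence does not end in a terminal state")
    # Invariant 3: total spend must stay within the declared budget.
    spent = sum(r.get("cost_cents", 0) for r in records)
    if spent > budget_cents:
        problems.append(f"cost: {spent} cents spent, budget {budget_cents}")
    return problems
```

An empty list means the request's audit trail is consistent with both invariants; each string otherwise names the invariant that failed.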

11. Alternatives considered

| Alternative | Why rejected (or postponed) |
|---|---|
| Generation only, no cascade | Original default of the first design. Costly and pollutes the pool: every proto-mnest yields a new executor even when an existing chain would have sufficed. Replaced by the cascade. |
| Extended cascade (introspective in the user turn too) | Merging/generalising/specialising during the user turn multiplies risk: three concurrent proposals in parallel, human gate under pressure. They stay introspective. |
| Generation with N parallel models | 3× the cost for small variance reduction. Possible in v2 if the pipeline hit-rate falls below 60%. |
| Reward learning (RLHF) on R weights | Requires a signed human-feedback dataset; over-engineered for domestic use. Manual calibration + semi-annual review is more transparent. |
| Composition via LLM (planner) | Replaces the graph walk with a frontier call that invents the chain. Costly and fragile: the walk is deterministic and inspectable, the LLM is not. |
| Auto-merge without gate | A merge that changes sandbox profiles is a security act. Stays gated. |

12. Conformance tests

| Invariant | Test |
|---|---|
| The reactive cascade always tries compose before generate | Inject a proto-mnest solvable by a known chain → strategy = compose in the proposal, cost_cents = 0 in frontier calls. |
| Generate fires only if compose fails | Proto-mnest with no possible chain → composing → generating transition in audit; compose not emitted as final strategy. |
| Chain max 5 hops | Force 6 hops → the walk rejects the chain, falls back to generate. |
| Human gate not bypassable | Mock approval_ux that is never called → no born/fused/… state reached. |
| Hard budget respected | Request with budget €0.10 → abandon before the second frontier call; effective cost ≤ €0.10. |
| R reproducible | Replay with same seed → same R ± 0.01 (judge_score with tolerance). |
| Strategy cost bonus monotonic | For the same quality (det_pass_rate, judge), R(compose) > R(merge) = R(specialise) > R(generalise) > R(generate), since merge and specialise share the same bonus (0.6). |
| 24h lock post-abandonment | Same target_intent within 24h → SynthRequest rejected with immediate abandoned. |
| 30-day lock post-rejected internal | Introspective direction rejected by Roberto → no new proposal on the same direction for 30 days. |
| Complete audit | For every request_id with a terminal state, the audit contains the initial line, the terminal line, and all intermediate states. |
| Profile coherence in merge | Attempt to merge two executors with incompatible sandbox profiles (e.g. one writes on ~/Pictures/, the other does not) → proposal suspended with motivation. |

13. Open questions

  1. Home of the three introspective steps. Stay in synt.html, or migrate to a new consolidator.html? The split could simplify the contracts (synt = reactive, consolidator = introspective) but would duplicate parts of the weighting. For now: all here. Open.
  2. Homeostasis timing. Fixed nightly or load-adaptive? For Metnos on metnos-server with a single user, nightly suffices; for multi-sender scenarios it should be revisited. Open.
  3. Reinforced compositions → promotion. How many recurrences of the same chain trigger a generalisation proposal? Tentative default: 5 in 30 days. To be calibrated.
  4. Graph walk for compose. Algorithm: breadth-first over the mnestome with pruning by minimum weight? A* with I/O coverage heuristic? Open — for now plain BFS.
  5. Signature revocation. When Roberto rejects an already born proposal after n invocations, is it an archive or a quarantine? The executor lifecycle (ch. 6) admits both; criterion not yet fixed.
  6. Remote synt. In topologies with remote executors (Architecture ch. 4), does the cascade run on metnos-server only, or can it spawn compositions that span server and laptop? Open; conservative default: all on metnos-server, remote executors called like any other.

canonical · executor: The unit of code that synt brings to life or orchestrates. 5-stage multistage pipeline at ch. 7 / §4.2.3.
canonical · mnest: The trace of co-activation that synt reads to decide. Proto-mnest at ch. 8.
level 1 · Architecture, ch. 9: The cascade in the big-picture view.
level 1 · Architecture, ch. 11: Telos and Vaglio: the non-retreat telos.
index · Microdesign: All docs.

Metnos — synt, canonical microdesign.