TESTED Microdesign v1.1 — new, tested on the evening of April 27, 2026. Cluster policy 10/10 green. Reference: runtime/policy.py.
Status sequence: under approval → approved → tested → implemented. Replaces the obsolete v1.0 written before implementation.

Metnos

policy — the legality filter for every action
Microdesign v1.1 — status TESTED (April 27, 2026 evening)
Audience: those who want to understand how Metnos decides whether an action is permitted, requires approval, or is forbidden.

Reading time: 10 minutes.

Contents

  1. What policy is
  2. Capability Registry v1.1
  3. Autonomy × capability table
  4. Persistent per_target grants
  5. Combined outcome (effective_outcome)
  6. Runtime integration
  7. CLI
  8. Tests
  9. Limits and what is deferred to v1.2+

1. What policy is

Policy is the legality filter of Metnos: layer 2 of the architecture (ch. 6 of the Architecture). For every combination of autonomy level, capability and target it decides one outcome among allowed, denied, approval_required. It is the module that embodies the rules shared between Roberto and the agent: what may be done without asking, what must be asked, what is never done.

Policy lives alongside two other layers, vaglio and the sandbox, each of which decides something different. Making that separation explicit is the clarity point that v1.0 lacked.

The three filters are in series: policy runs first (it is the cheapest, a table lookup plus an optional SQLite query); if it passes, vaglio applies contextual LLM judgement; if vaglio also passes, the sandbox wraps the invocation. The runtime/policy.py module is about 360 lines ahead of its CLI entry point (runtime/policy.py:360-423), with no daemon and no global state apart from the table cache and the grants SQLite file.

No policy decision inside executor code. An executor knows nothing about the current autonomy level or the active grants. It only declares its capabilities in the manifest. Policy is the module where all rules live, readable in one place, modifiable without touching a single executor.

2. Capability Registry v1.1

The registry is the closed dictionary of actions Metnos recognises: thirteen canonical entries, defined in runtime/policy.py:CAPABILITY_REGISTRY (runtime/policy.py:46-121). Each entry is a CapabilitySpec identified by its name and carrying four attributes (critical, default_approval, target_kind, description):

| name | critical | default_approval | target_kind | description |
|---|---|---|---|---|
| fs:read | no | per_target | path_glob | read files from the local filesystem within declared path_globs |
| fs:write | yes | per_target | path_glob | write/modify files within path_globs (critical) |
| code:exec | yes | always | exact | execute a shell command from a whitelist (e.g. package manager) |
| network:http | no | per_target | host | HTTP/HTTPS GET/POST to authorised hosts |
| llm:local | no | none | none | local LLM call (Ollama, llama.cpp), zero cost |
| llm:online | no | per_target | none | online LLM call (Anthropic, OpenAI, ...), cost > 0 |
| mail:read | no | per_target | exact | read IMAP messages from an authorised mailbox |
| mail:send | yes | always | exact | SMTP send to recipients (irreversible, high stakes) |
| channel:in | no | none | exact | receive messages from a channel (Telegram, CLI, voice) |
| channel:out | no | per_target | exact | send messages to a specific channel |
| time:read | no | none | none | read current time and timezones |
| parse:local | no | none | none | local parsing of known formats (PDF, HTML, JSON, CSV) |
| calendar:read | no | per_target | exact | read events from an authorised calendar |

The registry is closed: record_grant rejects a capability that is not in it (runtime/policy.py:243-244). Adding a capability means modifying the registry in code and running the tests — no runtime registration. This is a deliberate choice: the action vocabulary is a security asset, not an extension surface freely open.
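To make the idea of a closed vocabulary concrete, here is a minimal sketch of what a CapabilitySpec and a registry lookup might look like. It is not the code in runtime/policy.py: the field layout, the three sample entries and the helper name require_known are assumptions made for illustration; only the attribute names and the ValueError behaviour come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilitySpec:
    """One registry entry; frozen so it cannot be mutated at runtime."""
    critical: bool
    default_approval: str  # "none" | "per_target" | "always"
    target_kind: str       # "path_glob" | "host" | "exact" | "none"
    description: str


# Three of the thirteen entries, copied from the registry table (illustrative subset).
CAPABILITY_REGISTRY = {
    "fs:read": CapabilitySpec(False, "per_target", "path_glob",
                              "read files within declared path_globs"),
    "code:exec": CapabilitySpec(True, "always", "exact",
                                "execute a shell command from a whitelist"),
    "time:read": CapabilitySpec(False, "none", "none",
                                "read current time and timezones"),
}


def require_known(capability: str) -> CapabilitySpec:
    """Hypothetical helper: return the spec or raise ValueError for unknown names."""
    try:
        return CAPABILITY_REGISTRY[capability]
    except KeyError:
        raise ValueError(f"unknown capability: {capability!r}") from None
```

With this shape, a name such as "fs:delete" is rejected before it can reach the grants database, which is the property the closed registry is meant to provide.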

2.1 Reading the registry

Three capabilities are critical: fs:write, code:exec, mail:send. These are actions that change the world irreversibly (a written file, an already-run shell command, a sent email). Two of them have default_approval = always: code:exec and mail:send. Even at the highest autonomy level, these two always pass through an explicit confirmation.

Four capabilities have default_approval = none: llm:local, channel:in, time:read, parse:local. These are zero-cost, reversible actions, with no side effects on the outside world (listening to a channel, reading a clock, parsing a PDF inside the sandbox). They stay outside the approval flow even at the most conservative level.

3. Autonomy × capability table

The table is the cartesian product of the three autonomy levels (ReadOnly, Supervised, Full) and the 13 capabilities. Each cell holds one outcome: allowed, approval_required or denied. It is stored in runtime/policy.py:_TABLE and generated by _init_table (runtime/policy.py:140-171) from canonical rules, not written by hand. In the table below, approval stands for approval_required.

| capability | ReadOnly | Supervised | Full |
|---|---|---|---|
| fs:read | approval | approval | allowed |
| fs:write | denied | approval | allowed |
| code:exec | denied | approval | approval |
| network:http | denied | approval | allowed |
| llm:local | allowed | allowed | allowed |
| llm:online | denied | approval | allowed |
| mail:read | approval | approval | allowed |
| mail:send | denied | approval | approval |
| channel:in | allowed | allowed | allowed |
| channel:out | denied | approval | allowed |
| time:read | allowed | allowed | allowed |
| parse:local | allowed | allowed | allowed |
| calendar:read | approval | approval | allowed |

3.1 The three rules that generate the table

The table is not arbitrary: it derives from three rules, one per level, that _init_table applies while iterating over the registry.

ReadOnly. Read-only capabilities only, no side effects visible to the outside world. Never write, never send, never exec, never online LLM (which carries an outgoing monetary cost). Read capabilities with default_approval = per_target (fs:read, mail:read, calendar:read) stay approval_required: the most conservative level does not give them up but asks confirmation for each new target. Critical ones and others with default per_target or always become denied.

Supervised. Everything ReadOnly can do, plus it raises denied to approval_required: every capability with default_approval ≠ none requires approval. This is the level at which the system is fully operational but every action with effects on the world goes through Roberto.

Full. All allowed except capabilities with default_approval = always: code:exec and mail:send stay approval_required. These are the two where a mistake cannot be undone, and for that reason they are never granted without explicit confirmation, regardless of trust level.

Full does not mean carte blanche. It is the level where friction is given up on the reversible, not on actions that burn the bridge. A sent email cannot be recalled; a shell command that has deleted files cannot be undone. For these two capabilities the approval request stays in any profile in which the system can run.
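The three rules can be checked mechanically against the table. The sketch below is one possible reading of them, not the actual _init_table: in particular, recognising a "read capability" by the ":read" suffix of its name is an assumption made here, since the text does not say how the real code identifies them. The names LEVELS, outcome_for and build_table are likewise hypothetical.

```python
LEVELS = ("ReadOnly", "Supervised", "Full")

# (name, critical, default_approval) for the 13 registry entries.
REGISTRY = [
    ("fs:read", False, "per_target"),
    ("fs:write", True, "per_target"),
    ("code:exec", True, "always"),
    ("network:http", False, "per_target"),
    ("llm:local", False, "none"),
    ("llm:online", False, "per_target"),
    ("mail:read", False, "per_target"),
    ("mail:send", True, "always"),
    ("channel:in", False, "none"),
    ("channel:out", False, "per_target"),
    ("time:read", False, "none"),
    ("parse:local", False, "none"),
    ("calendar:read", False, "per_target"),
]


def outcome_for(level, name, critical, default_approval):
    """Apply the per-level rule to a single capability."""
    if default_approval == "none":
        return "allowed"  # no side effects: outside the approval flow at every level
    if level == "ReadOnly":
        # Assumption: a read capability is a non-critical one named "<family>:read".
        is_read = name.endswith(":read") and not critical
        if is_read and default_approval == "per_target":
            return "approval_required"
        return "denied"
    if level == "Supervised":
        return "approval_required"
    if level == "Full":
        return "approval_required" if default_approval == "always" else "allowed"
    raise ValueError(f"unknown autonomy level: {level!r}")


def build_table():
    """Return {(level, capability): outcome} for all 3 x 13 cells."""
    return {
        (level, name): outcome_for(level, name, critical, approval)
        for level in LEVELS
        for name, critical, approval in REGISTRY
    }
```

Generating the cells from rules rather than listing them keeps the invariant visible: the only way for code:exec or mail:send to become allowed would be to change the Full rule itself.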

4. Persistent per_target grants

The table alone is not enough. When Roberto approves a request — "yes, go ahead and write to ~/Documents/invoices-2026/* for the next two months" — we want the system to remember that concession and not ask again for every saved file. The memory of these concessions lives in a single-file SQLite table: the grants.

4.1 SQLite schema

Defined in runtime/policy.py:SCHEMA (runtime/policy.py:186-202):

CREATE TABLE IF NOT EXISTS grants (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    channel         TEXT NOT NULL,
    sender_id       TEXT NOT NULL,
    capability      TEXT NOT NULL,
    target          TEXT NOT NULL,
    granted_at      TEXT NOT NULL,
    expires_at      TEXT,
    granted_by      TEXT,
    revoked_at      TEXT
);

A grant is identified by the four-tuple (channel, sender_id, capability, target): who approved (e.g. Telegram + user Roberto), for which action, on which target. Dates granted_at/expires_at/revoked_at are ISO 8601 UTC. Two indexes accelerate the two typical queries: exact lookup and active-grants scan.

File path: ~/.local/state/metnos/grants.db, override available via the METNOS_GRANTS_DB environment variable (runtime/policy.py:27, 223-229). The parent folder is created on first access.
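The path resolution described above can be sketched in a few lines. The environment variable name and the default location come from the text; the function name grants_db_path and the constant DEFAULT_GRANTS_DB are hypothetical, and the real module may resolve the path differently.

```python
import os
from pathlib import Path

DEFAULT_GRANTS_DB = Path.home() / ".local" / "state" / "metnos" / "grants.db"


def grants_db_path() -> Path:
    """Return the grants DB path, honouring METNOS_GRANTS_DB when it is set."""
    override = os.environ.get("METNOS_GRANTS_DB")
    path = Path(override).expanduser() if override else DEFAULT_GRANTS_DB
    # Create the parent folder on first access, as the text describes.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

The override is mostly useful in tests, where each case can point the variable at a throwaway file.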

4.2 API

| function | what it does | citation |
|---|---|---|
| record_grant(channel, sender_id, capability, target, expires_at=None, granted_by=None) | Records a concession. Raises ValueError if capability is not in the registry. Returns the Grant object with assigned id. | runtime/policy.py:232-259 |
| has_grant(channel, sender_id, capability, target) | True if an active grant (not revoked, not expired) exists for the four-tuple. The query compares expires_at with the current time. | runtime/policy.py:262-284 |
| list_grants(channel=None, sender_id=None, include_revoked=False) | Lists grants, filterable by channel/sender, optionally including revoked. Sorted by granted_at descending. | runtime/policy.py:287-311 |
| revoke_grant(grant_id) | Sets revoked_at to the current time. Returns True if something was modified, False if the grant was already revoked or did not exist. | runtime/policy.py:314-325 |

All four functions open and close the connection per call: no in-memory state. Throughput is not the point — we are in the few-queries-per-second regime — and per-call isolation simplifies test reasoning.
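A minimal sketch of the round-trip, under several stated simplifications: the database path is passed explicitly as db_path (the real functions do not take it), record_grant returns the row id instead of a Grant object, the registry check is omitted, and the target is compared by exact string equality. How a path_glob grant is matched against a concrete path is not covered here. Timestamps use a fixed-width ISO 8601 UTC format so that string comparison in SQL orders them correctly.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS grants (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    channel TEXT NOT NULL, sender_id TEXT NOT NULL,
    capability TEXT NOT NULL, target TEXT NOT NULL,
    granted_at TEXT NOT NULL, expires_at TEXT,
    granted_by TEXT, revoked_at TEXT
)
"""


def _now() -> str:
    # Fixed-width UTC timestamp: lexicographic order equals chronological order.
    return datetime.now(timezone.utc).isoformat(timespec="microseconds")


def _connect(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def record_grant(db_path, channel, sender_id, capability, target,
                 expires_at=None, granted_by=None) -> int:
    conn = _connect(db_path)  # one connection per call, no shared state
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO grants (channel, sender_id, capability, target,"
                " granted_at, expires_at, granted_by) VALUES (?, ?, ?, ?, ?, ?, ?)",
                (channel, sender_id, capability, target, _now(), expires_at, granted_by),
            )
            return cur.lastrowid
    finally:
        conn.close()


def has_grant(db_path, channel, sender_id, capability, target) -> bool:
    conn = _connect(db_path)
    try:
        row = conn.execute(
            "SELECT 1 FROM grants WHERE channel = ? AND sender_id = ?"
            " AND capability = ? AND target = ? AND revoked_at IS NULL"
            " AND (expires_at IS NULL OR expires_at > ?) LIMIT 1",
            (channel, sender_id, capability, target, _now()),
        ).fetchone()
        return row is not None
    finally:
        conn.close()


def revoke_grant(db_path, grant_id) -> bool:
    conn = _connect(db_path)
    try:
        with conn:
            cur = conn.execute(
                "UPDATE grants SET revoked_at = ? WHERE id = ? AND revoked_at IS NULL",
                (_now(), grant_id),
            )
            return cur.rowcount > 0  # False if already revoked or missing
    finally:
        conn.close()
```

Because every call opens its own connection, a test can record, check and revoke against a temporary file without any setup or teardown beyond deleting the file.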

4.3 When grants are written

In v1.1 the policy module is read-only from the planner's side: the planner reads the table and the grants but never creates them. Creation happens in the approval dispatcher (ch. 5 phase 5, see approval_ux): when a pending request resolves as approved with scope "this and similar", the dispatcher calls record_grant and from that moment the concession is persistent.

5. Combined outcome (effective_outcome)

The effective_outcome function (runtime/policy.py:330-355) is the single entry point for the planner. It combines table and grants into one outcome, according to four cases:

| table says | active grant for (channel, sender, target)? | outcome |
|---|---|---|
| allowed | indifferent, the DB is not queried | allowed |
| denied | indifferent, the DB is not queried | denied |
| approval_required | yes | allowed |
| approval_required | no (or scope parameters missing) | approval_required |

The logic is linear: if the table already decides cleanly (allowed or denied), the grant is not even consulted; if it decides approval_required, an active grant turns it into allowed, otherwise it stays approval_required.

Security guarantee: a grant can never elevate a denied to allowed. This is the constraint that separates a tactical concession (single target, level that allows it) from a structural level upgrade. If a level says "this action is forbidden", no past approval can unblock it: the level itself must change, a decision that lives in pairing and its signing flow. The effective_outcome_denied_non_e_alzato_da_grant test verifies exactly this invariant (see section 8).
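The four cases reduce to a short function. The sketch below mirrors the documented call shape (level, capability, then channel, sender_id and target as keywords) but injects the table and the has_grant lookup as parameters so that it stays self-contained; that injection is an assumption of this example, not how runtime/policy.py is wired.

```python
def effective_outcome(level, capability, *, table, has_grant,
                      channel=None, sender_id=None, target=None):
    """Combine the static table with persistent grants into one outcome."""
    outcome = table[(level, capability)]
    if outcome != "approval_required":
        # allowed or denied: the table decides alone, grants are never consulted.
        return outcome
    if channel is None or sender_id is None or target is None:
        # Scope parameters missing: fall back to the table-only answer.
        return "approval_required"
    if has_grant(channel, sender_id, capability, target):
        return "allowed"
    return "approval_required"
```

The early return is what enforces the security guarantee: for a denied cell the grant lookup is unreachable, so no row in the database can change the answer.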

5.1 Concrete examples

Example 1: Supervised, fs:write on an already-granted file

Two weeks ago Roberto approved fs:write on ~/Documents/invoices-2026/*. The agent is about to save ~/Documents/invoices-2026/04-Acme.pdf.

Table for (Supervised, fs:write) → approval_required. Active grant for (telegram, roberto, fs:write, ~/Documents/invoices-2026/*) → outcome allowed. No approval card, direct save.

Example 2: ReadOnly, fs:write

Same scenario but the active level is ReadOnly (e.g. a more restrictive delegated session).

Table for (ReadOnly, fs:write) → denied. Even if the grant existed, the DB is not queried. Outcome denied: the agent reports that the active level does not permit disk writes and suggests raising the level to Supervised.

Example 3: Full, mail:send

Level Full, capability mail:send, addressed to a recipient to whom Roberto has sent ten emails before.

Table for (Full, mail:send) → approval_required (the "always" rule). Active grant? For mail:send the default mode is always, and in v1.1 no grant is ever created for capabilities marked always. Outcome approval_required: the email goes to the queue, Roberto sees the card, approves or denies that single email.

6. Runtime integration

In v1.1 runtime/policy.py is a separate module, complete and tested, but not yet stitched into the planner. Integration into agent_runtime is the last line of phase 5 and will arrive in the next iteration. The planned shape:

def execute_step(step, ctx):
    cap = step.capability        # e.g. "fs:write"
    target = step.target         # e.g. "/home/roberto/Documents/invoices-2026/04.pdf"
    outcome = policy.effective_outcome(
        ctx.autonomy_level,      # "ReadOnly" | "Supervised" | "Full"
        cap,
        channel=ctx.channel,     # e.g. "telegram"
        sender_id=ctx.sender_id, # e.g. "roberto"
        target=target,
    )
    if outcome == "denied":
        return Refused(reason="insufficient level for " + cap)
    if outcome == "approval_required":
        pending = approval_registry.create_pending(step, ctx)
        channels.approval.render_approval_card(pending, ctx.channel)
        return Awaiting(pending_id=pending.id)
    # outcome == "allowed"
    return invoke_executor(step.executor, step.args, autonomy=ctx.autonomy_level)

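The sketch calls into three existing pieces of the runtime: approval_registry.create_pending, channels.approval.render_approval_card and invoke_executor.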

The planner never reads the table or the grants directly: it makes a single call to effective_outcome and branches on the outcome. This keeps policy as the single source of truth for the rules; the day the rules change (v1.2: cost tier, rate limit), the planner is not touched.

7. CLI

The module is runnable as a script (runtime/policy.py:360-423): useful for manual inspection and to build dashboards in a few lines. Five subcommands.

| command | what it does |
|---|---|
| python3 -m policy registry | prints one JSON line for each of the 13 capabilities with all attributes. |
| python3 -m policy table | prints three JSON lines, one per level (ReadOnly/Supervised/Full), with the outcome for each of the 13 capabilities. |
| python3 -m policy check <level> <capability> [--channel C --sender S --target T] | prints the effective_outcome result. Without --channel/--sender/--target it returns the table-only outcome. |
| python3 -m policy grants [--channel C] [--sender S] [--all] | lists grants, active by default, all with --all. |
| python3 -m policy revoke <grant_id> | revokes a grant by id; prints revoked or no-op. |

JSON-line output makes piping into jq easy: for instance python3 -m policy table | jq . pretty-prints the matrix.
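For readers who want to see the command surface as code, the subcommands and flags listed above map onto an argparse layout such as the following. This is an illustrative reconstruction from the table, not the parser in runtime/policy.py; the function name build_parser and the include_revoked destination are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the five documented subcommands."""
    parser = argparse.ArgumentParser(prog="policy")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("registry", help="one JSON line per capability")
    sub.add_parser("table", help="one JSON line per autonomy level")

    check = sub.add_parser("check", help="print the effective outcome")
    check.add_argument("level", choices=["ReadOnly", "Supervised", "Full"])
    check.add_argument("capability")
    for flag in ("--channel", "--sender", "--target"):
        check.add_argument(flag)  # optional: omitted means table-only outcome

    grants = sub.add_parser("grants", help="list grants")
    grants.add_argument("--channel")
    grants.add_argument("--sender")
    grants.add_argument("--all", action="store_true", dest="include_revoked")

    revoke = sub.add_parser("revoke", help="revoke a grant by id")
    revoke.add_argument("grant_id", type=int)
    return parser
```

Parsing ["check", "Full", "mail:send"] with this layout leaves channel, sender and target as None, which corresponds to the table-only check described in the table.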

8. Tests

Cluster policy in the runtime test framework: 10/10 green as of April 27, 2026. Cases cover the registry, the three table rules, the grants round-trip, and the security invariant separating denied from grant-elevatable cases.

| # | case | what it verifies |
|---|---|---|
| 1 | registry_contiene_13_capability_canoniche | CAPABILITY_REGISTRY exposes exactly the 13 expected entries, each with the four mandatory attributes. |
| 2 | is_allowed_readonly_blocca_write_e_exec [security] | at ReadOnly level: fs:write, code:exec, mail:send are denied. |
| 3 | is_allowed_supervised_richiede_approval_per_critical | at Supervised: fs:write is approval_required; llm:local is allowed; mail:send and code:exec are approval_required. |
| 4 | is_allowed_full_mantiene_always_per_critical_irreversibili [security] | at Full level: mail:send and code:exec stay approval_required; fs:write becomes allowed. |
| 5 | record_grant_e_has_grant_round_trip | after record_grant on the four-tuple (channel, sender, capability, target), has_grant on the same four-tuple returns True. |
| 6 | record_grant_capability_sconosciuta_solleva | record_grant with a capability outside the registry raises ValueError. |
| 7 | revoke_grant_disattiva_has_grant | after revoke_grant(id), has_grant on the same four-tuple returns False. |
| 8 | effective_outcome_grant_alza_a_allowed | if the table says approval_required and an active grant exists, effective_outcome returns allowed. |
| 9 | effective_outcome_denied_non_e_alzato_da_grant [security] | if the table says denied, the existence of any grant does not change the outcome: it stays denied. |
| 10 | list_grants_filtra_per_canale_e_revoked | list_grants respects channel/sender filters and excludes revoked entries by default. |

The three [security]-marked cases are the invariants that bind future evolution of the module: any change that breaks them is a dismantling of the security posture, not a refactor.

9. Limits and what is deferred to v1.2+

| v1.1 limit | when it is removed |
|---|---|
| No rate limit in code. The autonomy × capability table does not yet distinguish "one call vs ten calls per minute". A granted capability is granted without ceiling; accidental abuse (runaway loops, an executor calling HTTP a hundred times) is not throttled at policy level. | v1.2, with a per-capability token bucket stored in the same DB as the grants. The token bucket requires choosing parameters (capacity, refill rate) for each capability; real-usage telemetry is needed before fixing them. |
| No cost tier. llm:online gets a generic approval_required; it does not differentiate "Sonnet at $0.02 per call" from "GPT-5 at $0.17". Roberto sees all online calls as equal, even when the cost varies by almost an order of magnitude. | v1.2, when the LLM judge measures the cost factor inside vaglio's reward. Policy will then expose explicit thresholds (e.g. "up to $0.05 allowed in Full, above approval_required"). |
| No custom policy per user profile. The table is compiled in code. A per-user override (e.g. Roberto vs a family member with different levels on the same capabilities) requires a config layer that does not yet exist. | When pairing supports multiple profiles with distinct sender_ids mapped to different tables. A boot-time TOML layer (something like policy_overrides.toml) will be introduced, but with integrity constraints against the registry. |
| No automatic revoke. Grants expire only via explicit expires_at. An "inactivity expiry" (e.g. 90 days unused) does not exist; grants stay in the DB even for years. | When the active-grants pool grows large enough to make pruning useful. Simple implementation: an internal cron that, at boot, looks for grants without recent accesses (a last_used column would be needed) or with a very old granted_at and revokes them with reason "stale". |
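As a point of reference for the rate limit deferred to v1.2, this is what a token bucket looks like in its simplest in-memory form. Everything here is hypothetical: the class name, the parameter values and the injected clock are chosen for illustration, and the planned version would persist its state in the grants DB rather than in a Python object.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second` tokens/s."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self._clock = clock          # injectable so tests can control time
        self._tokens = self.capacity
        self._last = clock()

    def try_consume(self, cost=1.0):
        """Take `cost` tokens if available; return False when throttled."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The two parameters are exactly the ones the table says need telemetry before being fixed: capacity bounds the burst, refill_per_second bounds the sustained rate.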

Final notes

Policy is a small module by design: all the complexity of the legality filter lives in two structures readable at a glance (a 13-entry dictionary, a 3×13 table) plus a SQLite round-trip for grants. No DSL, no rules expressed in natural language, no runtime configuration that can break by loading a malformed file.

The constraint that a grant can never elevate a denied to allowed is the heart of the separation between the tactical plane (targeted concession, inside a level that allows it) and the strategic plane (the level itself, chosen deliberately during pairing). Keeping these two planes distinct is what allows Metnos to scale in autonomy without sliding toward more permissions than intended.