TESTED Microdesign v1.1 — new, tested on the evening of April 27, 2026. Cluster policy 10/10 green. Reference: runtime/policy.py.

Status sequence: under approval → approved → tested → implemented. Replaces the obsolete v1.0 written before implementation.


Metnos

policy — the legality filter for every action

Microdesign v1.1 — status TESTED (April 27, 2026 evening)
Audience: those who want to understand how Metnos decides whether an action is permitted, requires approval, or is forbidden.

Reading time: 10 minutes.

What policy is
Capability Registry v1.1
Autonomy × capability table
Persistent per_target grants
Combined outcome (effective_outcome)
Runtime integration
CLI
Tests
Limits and what is deferred to v1.2+

1. What policy is

Policy is the legality filter of Metnos: layer 2 of the architecture (ch. 6 of the Architecture). For every combination of autonomy level, capability and target it decides one outcome among allowed, denied, approval_required. It is the module that embodies the rules shared between Roberto and the agent: what may be done without asking, what must be asked, what is never done.

Policy lives alongside two other layers that decide different things, and this is the distinction that v1.0 failed to make clear:

policy decides the right — does this level have title to do this thing on this target?
the vaglio decides sense — does it make sense to do it, given the request context?
the sandbox confines execution — once we decide to act, we act inside a kernel-level shell that bounds damage.

The three filters are in series: policy runs first (it is the cheapest, a table lookup plus an optional SQLite query); if it passes, vaglio applies contextual LLM judgement; if vaglio also passes, the sandbox wraps the invocation. The runtime/policy.py module is ~360 lines, no daemon, no global state apart from the table cache and the grants SQLite file.

No policy decision inside executor code. An executor knows nothing about the current autonomy level or the active grants. It only declares its capabilities in the manifest. Policy is the module where all rules live, readable in one place, modifiable without touching a single executor.

2. Capability Registry v1.1

The registry is the closed dictionary of actions Metnos recognises. Thirteen canonical entries, defined in runtime/policy.py:CAPABILITY_REGISTRY (runtime/policy.py:46-121). Each entry is a CapabilitySpec with four attributes:

name — the canonical name (format family:mode, e.g. fs:read);
critical — True if the action is irreversible or high-stakes (write, send, exec);
default_approval — default approval mode: none (never), per_target (once per new target), always (every time, even for previously seen targets);
target_kind — target type: path_glob, host, exact, none.
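The four attributes above map naturally onto an immutable record. The following is a minimal sketch, not the actual definition in runtime/policy.py: the field names come from the list above, while the use of a frozen dataclass, the `Literal` type aliases and the two module-level constants are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Literal

# Type aliases are hypothetical; the allowed values are the ones listed above.
ApprovalMode = Literal["none", "per_target", "always"]
TargetKind = Literal["path_glob", "host", "exact", "none"]


@dataclass(frozen=True)
class CapabilitySpec:
    """One registry entry; frozen so an entry cannot be altered at runtime."""
    name: str                       # canonical name, format family:mode
    critical: bool                  # irreversible or high-stakes action
    default_approval: ApprovalMode  # none | per_target | always
    target_kind: TargetKind         # path_glob | host | exact | none


# Two entries with the values given in the registry table.
FS_READ = CapabilitySpec("fs:read", False, "per_target", "path_glob")
MAIL_SEND = CapabilitySpec("mail:send", True, "always", "exact")
```

Freezing the record fits the idea of a closed registry: an entry is read, never mutated in place.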

name	critical	default_approval	target_kind	description
`fs:read`	no	`per_target`	`path_glob`	read files from the local filesystem within declared path_globs
`fs:write`	yes	`per_target`	`path_glob`	write/modify files within path_globs (critical)
`code:exec`	yes	`always`	`exact`	execute a shell command from a whitelist (e.g. package manager)
`network:http`	no	`per_target`	`host`	HTTP/HTTPS GET/POST to authorised hosts
`llm:local`	no	`none`	`none`	local LLM call (Ollama, llama.cpp), zero cost
`llm:online`	no	`per_target`	`none`	online LLM call (Anthropic, OpenAI, ...), cost > 0
`mail:read`	no	`per_target`	`exact`	read IMAP messages from an authorised mailbox
`mail:send`	yes	`always`	`exact`	SMTP send to recipients (irreversible, high stakes)
`channel:in`	no	`none`	`exact`	receive messages from a channel (Telegram, CLI, voice)
`channel:out`	no	`per_target`	`exact`	send messages to a specific channel
`time:read`	no	`none`	`none`	read current time and timezones
`parse:local`	no	`none`	`none`	local parsing of known formats (PDF, HTML, JSON, CSV)
`calendar:read`	no	`per_target`	`exact`	read events from an authorised calendar

The registry is closed: record_grant rejects a capability that is not in it (runtime/policy.py:243-244). Adding a capability means modifying the registry in code and running the tests; there is no runtime registration. This is a deliberate choice: the action vocabulary is a security asset, not a freely open extension surface.

2.1 Reading the registry

Three capabilities are critical: fs:write, code:exec, mail:send. These are actions that change the world irreversibly (a written file, an already-run shell command, a sent email). Two of them have default_approval = always: code:exec and mail:send. Even at the highest autonomy level, these two always pass through an explicit confirmation.

Four capabilities have default_approval = none: llm:local, channel:in, time:read, parse:local. These are zero-cost, reversible actions, with no side effects on the outside world (listening to a channel, reading a clock, parsing a PDF inside the sandbox). They stay outside the approval flow even at the most conservative level.
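The groupings described in this section can be checked mechanically. The sketch below copies the thirteen rows of the registry table into a plain dictionary (the tuple layout and the name `REGISTRY` are assumptions, not the structure used in runtime/policy.py) and derives the three groups.

```python
# name -> (critical, default_approval, target_kind), copied from the registry table.
REGISTRY = {
    "fs:read":       (False, "per_target", "path_glob"),
    "fs:write":      (True,  "per_target", "path_glob"),
    "code:exec":     (True,  "always",     "exact"),
    "network:http":  (False, "per_target", "host"),
    "llm:local":     (False, "none",       "none"),
    "llm:online":    (False, "per_target", "none"),
    "mail:read":     (False, "per_target", "exact"),
    "mail:send":     (True,  "always",     "exact"),
    "channel:in":    (False, "none",       "exact"),
    "channel:out":   (False, "per_target", "exact"),
    "time:read":     (False, "none",       "none"),
    "parse:local":   (False, "none",       "none"),
    "calendar:read": (False, "per_target", "exact"),
}

# Capabilities flagged as critical.
critical = sorted(n for n, (c, _, _) in REGISTRY.items() if c)
# Capabilities that ask for confirmation every time.
always = sorted(n for n, (_, a, _) in REGISTRY.items() if a == "always")
# Capabilities that stay outside the approval flow.
no_approval = sorted(n for n, (_, a, _) in REGISTRY.items() if a == "none")
```

In this data every capability with default_approval = always is also critical, but not the other way round: fs:write is critical yet only per_target.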

3. Autonomy × capability table

The table is the cartesian product of the three autonomy levels (ReadOnly, Supervised, Full) by the 13 capabilities. For each cell an outcome: allowed, approval_required, denied. Generated in runtime/policy.py:_TABLE and _init_table (runtime/policy.py:140-171) according to canonical rules, not hand-written.

capability	ReadOnly	Supervised	Full
`fs:read`	approval	approval	allowed
`fs:write`	denied	approval	allowed
`code:exec`	denied	approval	approval
`network:http`	denied	approval	allowed
`llm:local`	allowed	allowed	allowed
`llm:online`	denied	approval	allowed
`mail:read`	approval	approval	allowed
`mail:send`	denied	approval	approval
`channel:in`	allowed	allowed	allowed
`channel:out`	denied	approval	allowed
`time:read`	allowed	allowed	allowed
`parse:local`	allowed	allowed	allowed
`calendar:read`	approval	approval	allowed

3.1 The three rules that generate the table

The table is not arbitrary: it derives from three rules, one per level, that _init_table applies while iterating over the registry.

ReadOnly. Read-only capabilities only, no side effects visible to the outside world. Never write, never send, never exec, never online LLM (which carries an outgoing monetary cost). Read capabilities with default_approval = per_target (fs:read, mail:read, calendar:read) stay approval_required: the most conservative level does not give them up but asks confirmation for each new target. Critical capabilities, and the remaining non-read capabilities whose default is per_target or always, become denied.

Supervised. Everything ReadOnly can do, plus it raises denied to approval_required: every capability with default_approval ≠ none requires approval. This is the level at which the system is fully operational but every action with effects on the world goes through Roberto.

Full. All allowed except capabilities with default_approval = always: code:exec and mail:send stay approval_required. These are the two where a mistake cannot be undone, and for that reason they are never granted without explicit confirmation, regardless of trust level.

Full does not mean carte blanche. It is the level where friction is given up on reversible actions, not on actions that burn the bridge. A sent email cannot be recalled; a shell command that has deleted files cannot be undone. These two capabilities are therefore never allowed without explicit confirmation, at any level at which the system can run.
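The three rules can be sketched as a small function applied to every registry entry. This is an illustration under stated assumptions, not the body of _init_table: in particular, recognising a "read capability" by the `:read` suffix of its name is a guess that happens to reproduce the documented table, and the names `REGISTRY`, `LEVELS`, `outcome` and `TABLE` are hypothetical.

```python
# name -> (critical, default_approval), copied from the registry table.
REGISTRY = {
    "fs:read": (False, "per_target"), "fs:write": (True, "per_target"),
    "code:exec": (True, "always"), "network:http": (False, "per_target"),
    "llm:local": (False, "none"), "llm:online": (False, "per_target"),
    "mail:read": (False, "per_target"), "mail:send": (True, "always"),
    "channel:in": (False, "none"), "channel:out": (False, "per_target"),
    "time:read": (False, "none"), "parse:local": (False, "none"),
    "calendar:read": (False, "per_target"),
}
LEVELS = ("ReadOnly", "Supervised", "Full")


def outcome(level: str, name: str) -> str:
    """Apply the rule of the given level to one capability."""
    critical, approval = REGISTRY[name]
    if approval == "none":
        return "allowed"  # outside the approval flow at every level
    if level == "ReadOnly":
        # Assumption: a read capability is one whose name ends in ":read".
        is_read = name.endswith(":read") and not critical
        if is_read and approval == "per_target":
            return "approval_required"
        return "denied"
    if level == "Supervised":
        return "approval_required"  # everything with default_approval != none
    if level == "Full":
        return "approval_required" if approval == "always" else "allowed"
    raise ValueError(f"unknown autonomy level: {level}")


# Cartesian product of levels and capabilities: 3 x 13 = 39 cells.
TABLE = {(lvl, name): outcome(lvl, name) for lvl in LEVELS for name in REGISTRY}
```

Generating the cells from rules rather than writing them by hand means a new registry entry gets its three outcomes automatically, and the tests only need to pin the rules.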

4. Persistent per_target grants

The table alone is not enough. When Roberto approves a request — "yes, go ahead and write to ~/Documents/invoices-2026/* for the next two months" — we want the system to remember that concession and not ask again for every saved file. The memory of these concessions lives in a single-file SQLite table: the grants.

4.1 SQLite schema

Defined in runtime/policy.py:SCHEMA (runtime/policy.py:186-202):

CREATE TABLE IF NOT EXISTS grants (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    channel         TEXT NOT NULL,
    sender_id       TEXT NOT NULL,
    capability      TEXT NOT NULL,
    target          TEXT NOT NULL,
    granted_at      TEXT NOT NULL,
    expires_at      TEXT,
    granted_by      TEXT,
    revoked_at      TEXT
);

A grant is identified by the four-tuple (channel, sender_id, capability, target): who approved (e.g. Telegram + user Roberto), for which action, on which target. Dates granted_at/expires_at/revoked_at are ISO 8601 UTC. Two indexes accelerate the two typical queries: exact lookup and active-grants scan.

File path: ~/.local/state/metnos/grants.db, override available via the METNOS_GRANTS_DB environment variable (runtime/policy.py:27, 223-229). The parent folder is created on first access.
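The path resolution just described can be sketched as follows. The function name `grants_db_path`, the injectable `env` mapping and the fallback on an empty variable are assumptions made for illustration; only the default path and the METNOS_GRANTS_DB variable come from the text above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_DB = "~/.local/state/metnos/grants.db"


def grants_db_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the grants DB path and make sure its parent folder exists."""
    env = os.environ if env is None else env
    # Assumption: an unset or empty variable falls back to the default path.
    raw = env.get("METNOS_GRANTS_DB") or DEFAULT_DB
    path = Path(raw).expanduser()
    # The parent folder is created on first access; the file itself is not.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Passing the environment as a parameter keeps the function easy to exercise in tests without touching the real state directory.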

4.2 API

function	what it does	citation
`record_grant(channel, sender_id, capability, target, expires_at=None, granted_by=None)`	Records a concession. Raises `ValueError` if `capability` is not in the registry. Returns the `Grant` object with assigned `id`.	`runtime/policy.py:232-259`
`has_grant(channel, sender_id, capability, target)`	True if an active grant (not revoked, not expired) exists for the four-tuple. The query compares `expires_at` with the current time.	`runtime/policy.py:262-284`
`list_grants(channel=None, sender_id=None, include_revoked=False)`	Lists grants, filterable by channel/sender, optionally including revoked. Sorted by `granted_at` descending.	`runtime/policy.py:287-311`
`revoke_grant(grant_id)`	Sets `revoked_at` to the current time. Returns True if something was modified, False if the grant was already revoked or did not exist.	`runtime/policy.py:314-325`

All four functions open and close the connection per call: no in-memory state. Throughput is not the point — we are in the few-queries-per-second regime — and per-call isolation simplifies test reasoning.
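The per-call connection pattern can be illustrated with a reduced sketch built on the standard sqlite3 module. Several things here are assumptions rather than the real API: the explicit `db_path` parameter, returning the row id instead of a `Grant` object, the three-entry `KNOWN_CAPABILITIES` set standing in for the registry, and exact string matching on `target`.

```python
import sqlite3
from contextlib import closing
from datetime import datetime, timezone

SCHEMA = """CREATE TABLE IF NOT EXISTS grants (
    id INTEGER PRIMARY KEY AUTOINCREMENT, channel TEXT NOT NULL,
    sender_id TEXT NOT NULL, capability TEXT NOT NULL, target TEXT NOT NULL,
    granted_at TEXT NOT NULL, expires_at TEXT, granted_by TEXT, revoked_at TEXT)"""
KNOWN_CAPABILITIES = {"fs:read", "fs:write", "network:http"}  # stand-in for the registry


def _now() -> str:
    # Fixed-width ISO 8601 UTC, so string comparison matches time order.
    return datetime.now(timezone.utc).isoformat(timespec="microseconds")


def _connect(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def record_grant(db_path, channel, sender_id, capability, target, expires_at=None):
    if capability not in KNOWN_CAPABILITIES:
        raise ValueError(f"unknown capability: {capability}")
    # closing() closes the connection; the inner `conn` context commits.
    with closing(_connect(db_path)) as conn, conn:
        cur = conn.execute(
            "INSERT INTO grants (channel, sender_id, capability, target,"
            " granted_at, expires_at) VALUES (?, ?, ?, ?, ?, ?)",
            (channel, sender_id, capability, target, _now(), expires_at))
        return cur.lastrowid


def has_grant(db_path, channel, sender_id, capability, target) -> bool:
    with closing(_connect(db_path)) as conn:
        row = conn.execute(
            "SELECT 1 FROM grants WHERE channel=? AND sender_id=? AND capability=?"
            " AND target=? AND revoked_at IS NULL"
            " AND (expires_at IS NULL OR expires_at > ?) LIMIT 1",
            (channel, sender_id, capability, target, _now())).fetchone()
    return row is not None


def revoke_grant(db_path, grant_id) -> bool:
    with closing(_connect(db_path)) as conn, conn:
        cur = conn.execute(
            "UPDATE grants SET revoked_at=? WHERE id=? AND revoked_at IS NULL",
            (_now(), grant_id))
        return cur.rowcount > 0
```

Because every call opens its own connection, a test can point `db_path` at a temporary file and needs no teardown beyond deleting it.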

4.3 When grants are written

In v1.1 the policy module is read-only from the planner's point of view: the planner reads the table and the grants but never creates them. Creation happens in the approval dispatcher (ch. 5 phase 5, see approval_ux): when a pending request resolves as approved with scope "this and similar", the dispatcher calls record_grant, and from that moment the concession is persistent.

5. Combined outcome (`effective_outcome`)

The effective_outcome function (runtime/policy.py:330-355) is the single entry point for the planner. It combines table and grants into one outcome, according to four cases:

table says	active grant for (channel, sender, target)?	outcome
allowed	indifferent, the DB is not queried	allowed
denied	indifferent, the DB is not queried	denied
approval_required	yes	allowed
approval_required	no (or scope parameters missing)	approval_required

The logic is linear: if the table already decides cleanly (allowed or denied), the grant is not even consulted; if it decides approval_required, an active grant turns it into allowed, otherwise it stays approval_required.

Security guarantee: a grant can never elevate a denied to allowed. This is the constraint that separates a tactical concession (single target, level that allows it) from a structural level upgrade. If a level says "this action is forbidden", no past approval can unblock it: the level itself must change, a decision that lives in pairing and its signing flow. The effective_outcome_denied_non_e_alzato_da_grant test verifies exactly this invariant (ch. 8).
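The four cases and the security guarantee fit in a few lines. The sketch below is not the code at runtime/policy.py:330-355: to stay self-contained it receives the table and the grant lookup as arguments, which is an assumption about shape, not a description of the real signature.

```python
from typing import Callable, Mapping, Optional, Tuple

Table = Mapping[Tuple[str, str], str]
GrantLookup = Callable[[str, str, str, str], bool]


def effective_outcome(
    table: Table,
    has_grant: GrantLookup,
    level: str,
    capability: str,
    channel: Optional[str] = None,
    sender_id: Optional[str] = None,
    target: Optional[str] = None,
) -> str:
    """Combine the table outcome with the grants into a single outcome."""
    base = table[(level, capability)]
    # allowed and denied are final: the grants are not even consulted,
    # so a grant can never turn denied into allowed.
    if base != "approval_required":
        return base
    # Without the full scope there is nothing to look up.
    if channel is None or sender_id is None or target is None:
        return "approval_required"
    if has_grant(channel, sender_id, capability, target):
        return "allowed"
    return "approval_required"
```

The early return is what carries the invariant: the denied branch never reaches the lookup, so no state in the grants DB can influence it.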

5.1 Concrete examples

Example 1 — Supervised, fs:write on an already-granted file

Two weeks ago Roberto approved fs:write on ~/Documents/invoices-2026/*. The agent is about to save ~/Documents/invoices-2026/04-Acme.pdf.

Table for (Supervised, fs:write) → approval_required. Active grant for (telegram, roberto, fs:write, ~/Documents/invoices-2026/*) → outcome allowed. No approval card, direct save.

Example 2 — ReadOnly, fs:write

Same scenario but the active level is ReadOnly (e.g. a more restrictive delegated session).

Table for (ReadOnly, fs:write) → denied. Even if the grant existed, the DB is not queried. Outcome denied: the agent reports that the active level does not permit disk writes and suggests raising the level to Supervised.

Example 3 — Full, mail:send

Level Full, capability mail:send, to a recipient to whom Roberto has already sent ten emails.

Table for (Full, mail:send) → approval_required (the "always" rule). Active grant? For mail:send the default mode is always, and in v1.1 no grant is ever created for capabilities marked always. Outcome approval_required: the email goes to the queue, Roberto sees the card, approves or denies that single email.

6. Runtime integration

In v1.1 runtime/policy.py is a separate module, complete and tested, but not yet stitched into the planner. Integration into agent_runtime is the last line of phase 5 and will arrive in the next iteration. The planned shape:

def execute_step(step, ctx):
    cap = step.capability        # e.g. "fs:write"
    target = step.target         # e.g. "/home/roberto/Documents/invoices-2026/04.pdf"
    outcome = policy.effective_outcome(
        ctx.autonomy_level,      # "ReadOnly" | "Supervised" | "Full"
        cap,
        channel=ctx.channel,     # e.g. "telegram"
        sender_id=ctx.sender_id, # e.g. "roberto"
        target=target,
    )
    if outcome == "denied":
        return Refused(reason="insufficient level for " + cap)
    if outcome == "approval_required":
        pending = approval_registry.create_pending(step, ctx)
        channels.approval.render_approval_card(pending, ctx.channel)
        return Awaiting(pending_id=pending.id)
    # outcome == "allowed"
    return invoke_executor(step.executor, step.args, autonomy=ctx.autonomy_level)

Three references to existing modules:

approval_registry.create_pending queues the request and returns an id;
channels.approval.render_approval_card composes the three-line card and sends it on the originating channel (see approval_ux);
invoke_executor is the same point at which the sandbox wraps the command; the autonomy parameter is already pass-through (ch. 6 of sandbox.html).

The planner never reads the table or the grants directly: it makes a single call to effective_outcome and branches on the outcome. This keeps policy as the single source of truth for the rules; the day the rules change (v1.2: cost tier, rate limit), the planner is not touched.

7. CLI

The module is runnable as a script (runtime/policy.py:360-423): useful for manual inspection and for building dashboards in a few lines. Five subcommands.

command	what it does
`python3 -m policy registry`	prints one JSON line per each of the 13 capabilities with all attributes.
`python3 -m policy table`	prints three JSON lines, one per level (ReadOnly/Supervised/Full), with the outcome for each of the 13 capabilities.
`python3 -m policy check <level> <capability> [--channel C --sender S --target T]`	prints the `effective_outcome` result. Without `--channel/--sender/--target` it returns the table-only outcome.
`python3 -m policy grants [--channel C] [--sender S] [--all]`	lists grants, active by default, all with `--all`.
`python3 -m policy revoke <grant_id>`	revokes a grant by id; prints `revoked` or `no-op`.

JSON-line output makes piping into jq easy: for instance python3 -m policy table | jq shows the matrix in readable form.

8. Tests

Cluster policy in the runtime test framework: 10/10 green as of April 27, 2026. Cases cover the registry, the three table rules, the grants round-trip, and the security invariant separating denied from grant-elevatable cases.

#	case	what it verifies
1	`registry_contiene_13_capability_canoniche`	`CAPABILITY_REGISTRY` exposes exactly the 13 expected entries, each with the four mandatory attributes.
2	`is_allowed_readonly_blocca_write_e_exec` [security]	at ReadOnly level: `fs:write`, `code:exec`, `mail:send` are `denied`.
3	`is_allowed_supervised_richiede_approval_per_critical`	at Supervised: `fs:write` is `approval_required`; `llm:local` is `allowed`; `mail:send` and `code:exec` are `approval_required`.
4	`is_allowed_full_mantiene_always_per_critical_irreversibili` [security]	at Full level: `mail:send` and `code:exec` stay `approval_required`; `fs:write` becomes `allowed`.
5	`record_grant_e_has_grant_round_trip`	after `record_grant` on the four-tuple (channel, sender, capability, target), `has_grant` on the same four-tuple returns True.
6	`record_grant_capability_sconosciuta_solleva`	`record_grant` with a capability outside the registry raises `ValueError`.
7	`revoke_grant_disattiva_has_grant`	after `revoke_grant(id)`, `has_grant` on the same four-tuple returns False.
8	`effective_outcome_grant_alza_a_allowed`	if the table says `approval_required` and an active grant exists, `effective_outcome` returns `allowed`.
9	`effective_outcome_denied_non_e_alzato_da_grant` [security]	if the table says `denied`, the existence of any grant does not change the outcome: it stays `denied`.
10	`list_grants_filtra_per_canale_e_revoked`	`list_grants` respects channel/sender filters and excludes revoked entries by default.

The three [security]-marked cases are the invariants that bind future evolution of the module: any change that breaks them is a dismantling of the security posture, not a refactor.

9. Limits and what is deferred to v1.2+

v1.1 limit	when it is removed
No rate-limit in code. The autonomy × capability table does not yet distinguish "one call vs ten calls per minute". A granted capability is granted without ceiling; accidental abuse (runaway loops, an executor calling HTTP a hundred times) is not throttled at policy level.	v1.2, with a per_capability token bucket stored in the same DB as the grants. The token bucket requires choosing parameters (capacity, refill rate) for each capability; real-usage telemetry is needed before fixing them.
No cost tier. `llm:online` gets a generic `approval_required`; it does not differentiate "Sonnet at $0.02 per call" from "GPT-5 at $0.17". Roberto sees all online calls as equal, even when the cost varies by an order of magnitude.	v1.2, when the LLM judge measures the cost factor inside vaglio's reward. Policy will then expose explicit thresholds (e.g. "up to $0.05 allowed in Full, above approval_required").
No custom policy per user profile. The table is compiled in code. A per-user override (e.g. Roberto vs a family member with different levels on the same capabilities) requires a config layer that does not yet exist.	When pairing supports multiple profiles with distinct sender_ids mapped to different tables. A boot-time TOML layer (something like `policy_overrides.toml`) will be introduced, but with integrity constraints against the registry.
No automatic revoke. Grants expire only via explicit `expires_at`. An "inactivity expiry" (e.g. 90 days unused) does not exist; grants stay in the DB even for years.	When the active-grants pool grows large enough to make pruning useful. Simple implementation: an internal cron that, at boot, looks for grants without recent accesses (a `last_used` column would be needed) or with a very old `granted_at` and revokes them with reason "stale".

Final notes

Policy is a small module by design: all the complexity of the legality filter lives in two structures readable at a glance (a 13-entry dictionary, a 3×13 table) plus a SQLite round-trip for grants. No DSL, no rules expressed in natural language, no runtime configuration that can break by loading a malformed file.

The constraint that a grant can never elevate a denied to allowed is the heart of the separation between the tactical plane (targeted concession, inside a level that allows it) and the strategic plane (the level itself, chosen deliberately during pairing). Keeping these two planes distinct is what allows Metnos to scale in autonomy without sliding toward more permissions than intended.
