A skill is the structured documentation of how to use a service (Google
Calendar, a mailbox, a weather feed). It lives in a text file called
SKILL.md, accompanied by a few small helper scripts. The
public standard is called agentskills.io.
The file has a header (name, author, version, dependencies) followed by sections: «what it does», «first-time setup», «invocation examples». Nothing exotic: the format is meant to be read by a human first and, almost as a side effect, by an agent.
Metnos can import a skill as-is: it reads the file, derives a set of Metnos-format executors, and registers them in the catalog as if they had been written by hand. Once imported, the agent invokes them exactly like the others, and the user sees no difference.
Importing is a chain of five fully automatic stages. The only step that needs the user is the initial go-ahead: «import skill X».
1. The parser reads SKILL.md and extracts the structure: name, author, declared capabilities, list of sub-commands with their arguments.
2. Each sub-command is mapped onto Metnos's closed vocabulary (verb + canonical object): calendar list becomes read_events, gmail send becomes send_messages.
3. For each executor the generator emits the Python file and the TOML manifest using fixed templates (Jinja). Only the manifest description is polished by a language model for stylistic care.
4. Five cumulative checks run: name in the vocabulary, no overlap with existing executors, description consistent with the code, declared capabilities reasonable, input and output contract valid.
5. The manifest is signed with Ed25519. The executor enters a folder kept under separate watch from the hand-written ones. The loader picks it up at restart.
Among the five, only step 3 (Generate) leans on a language model, and only for two specific things: the description in Italian and English, and the eight-to-fifteen affinity keywords that help the planner pick the right tool. Everything else is deterministic: same input, same output.
A skill that speaks of append rows does not produce append_rows: it becomes change_files_xlsx.
One that speaks of label messages does not produce
label_messages: it is rejected if no consistent translation
exists. The rigidity of the vocabulary shields the agent from invented
names and synonyms: every task request is routed to the same tool as
always.
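The mapping rule above can be sketched in a few lines. This is only an illustration: the table `CANONICAL_NAMES` and the helpers `map_subcommand` and `map_skill` are hypothetical names, and the table holds just the two translations quoted in this document, not the real Metnos vocabulary.

```python
from typing import Dict, List, Optional, Tuple

SubCommand = Tuple[str, str]  # (service, action), e.g. ("calendar", "list")

# Hypothetical excerpt of the closed vocabulary (assumption: the real table
# is larger and lives inside Metnos; only these two pairs come from the text).
CANONICAL_NAMES: Dict[SubCommand, str] = {
    ("calendar", "list"): "read_events",
    ("gmail", "send"): "send_messages",
}


def map_subcommand(sub: SubCommand) -> Optional[str]:
    """Return the canonical executor name, or None when no translation exists."""
    return CANONICAL_NAMES.get(sub)


def map_skill(subs: List[SubCommand]) -> Tuple[Dict[SubCommand, str], List[SubCommand]]:
    """Split sub-commands into accepted (with canonical name) and rejected."""
    accepted: Dict[SubCommand, str] = {}
    rejected: List[SubCommand] = []
    for sub in subs:
        name = map_subcommand(sub)
        if name is None:
            # No invented names: an unmapped sub-command is rejected outright.
            rejected.append(sub)
        else:
            accepted[sub] = name
    return accepted, rejected


if __name__ == "__main__":
    ok, ko = map_skill([("calendar", "list"), ("gmail", "send"), ("gmail", "labels")])
    print(ok)
    print(ko)
```

The point of the lookup-only design is that the importer never derives a name from the skill's own wording: either the closed table has an entry, or the sub-command is dropped.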
A skill can come from three places:

- A SKILL.md file already on disk (typically because you are developing it).
- An identifier of the form agentskills.io/<author>/<skill>; the importer resolves it to the matching public repository and downloads it.
- A direct link to a repository or to a SKILL.md file; in this case the importer clones or downloads directly.

Downloaded skills are placed in the local cache for seven days. Importing the same skill twice within that window generates no traffic: the cached copy is reused.
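The seven-day reuse window can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the function `fetch_skill`, the in-memory dictionary used as a cache, and the injected `download` callable are hypothetical, and the real importer presumably keeps its cache on disk.

```python
import time
from typing import Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # seven days, as stated in the text

# Hypothetical cache layout: source identifier -> (fetch time, skill text).
Cache = Dict[str, Tuple[float, str]]


def fetch_skill(
    source: str,
    cache: Cache,
    download: Callable[[str], str],
    now: Optional[float] = None,
) -> str:
    """Return the skill text, downloading only when the cached copy is stale."""
    now = time.time() if now is None else now
    entry = cache.get(source)
    if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]  # still inside the window: no network traffic
    text = download(source)
    cache[source] = (now, text)
    return text
```

Passing `now` explicitly keeps the function easy to test; in normal use it falls back to the system clock.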
After import, one or more folders appear on disk, one per sub-command of the skill. Each contains the same files as a hand-written executor: the Python file with the invoke function, the TOML manifest, the signature, the fence profile.

    ~/.local/share/metnos/executors/skills/google-workspace/
        read_events/
            read_events.py
            manifest.toml
            manifest.toml.sig
            manifest.lang_state.json
        set_events/
            …
        delete_events/
            …
        … (another twenty or so, for Gmail, Drive, Sheets, Docs, Contacts)

The destination is skills/ (ADR 0160; the prior name _imports/ is still read in back-compat mode), under the folder of synthesized executors. This visually separates what comes from outside from what Metnos has written on its own, and from what was hand-written. All three kinds end up in the same agent catalog, however, and all pass through the same checks at the moment of use.
The manifest of an imported executor carries one extra section, which declares its origin:

    [provenance]
    synthesized = true
    imported_from = "agentskills.io/googleworkspace/calendar"
    source_version = "1.1.0"
    imported_at = "2026-05-10T22:00:00Z"
    source_sha256 = "<fingerprint of the source file>"

This is the only thing that distinguishes it from a hand-written executor. Everything else (signature, fence, vocabulary, contract) is identical.
An imported skill is not automatically deemed trustworthy. Five cumulative checks decide whether it is admitted to the catalog:
| Check | What it verifies | What it rejects |
|---|---|---|
| Canonical name | The executor name has the shape verb_object with verb and object from the vocabulary. | Fanciful or invented names, unrecognized synonyms. |
| Affinity overlap | The imported executor does not share too many keywords with one already in the catalog. | Masked duplicates, executors that would steal work from an existing one. |
| Description-code consistency | A language model checks that what the manifest promises is what the code actually does. | Misleading descriptions, semantic drift. |
| Credential uniqueness | If the skill requires credentials (API key, token), the binding name is not already taken by another imported skill. | Invisible collisions that would mount one service on top of another. |
| Routing assertions | For each accepted executor, a small battery of prototypical questions is added («list tomorrow's appointments» must pick read_events). | Tools that creep into other tools' territory (routing regressions). |
Rejections are recorded in an audit log with their reasons. When one sub-command of a skill is rejected, the others in the same skill carry on: imports are partial, never all-or-nothing.
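The "partial, never all-or-nothing" behavior can be sketched as a loop that runs every check on every candidate and records the reasons for each rejection. The names `admit`, `check_canonical_name`, and `check_not_in_catalog` are hypothetical, the vocabulary sets are a tiny excerpt built from names quoted in this document, and the duplicate-name check is a simplistic stand-in for the real affinity-overlap check.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Candidate = Dict[str, str]
Check = Callable[[Candidate], Optional[str]]  # returns a rejection reason or None

# Hypothetical excerpt of the vocabulary (assumption, for illustration only).
VERBS: Set[str] = {"read", "send", "set", "delete", "find", "change"}
OBJECTS: Set[str] = {"events", "messages", "credentials", "files"}


def check_canonical_name(candidate: Candidate) -> Optional[str]:
    """Reject names that are not verb_object with both parts in the vocabulary."""
    verb, _, obj = candidate["name"].partition("_")
    if verb not in VERBS or obj not in OBJECTS:
        return "name is not verb_object from the vocabulary"
    return None


def check_not_in_catalog(catalog: Set[str]) -> Check:
    """Build a check that rejects a name already present in the catalog."""
    def check(candidate: Candidate) -> Optional[str]:
        return "name already in catalog" if candidate["name"] in catalog else None
    return check


def admit(candidates: List[Candidate], checks: List[Check]) -> Tuple[List[str], List[dict]]:
    """Run all checks per candidate; a rejection never stops the other candidates."""
    accepted: List[str] = []
    audit: List[dict] = []
    for candidate in candidates:
        reasons = []
        for check in checks:
            reason = check(candidate)
            if reason is not None:
                reasons.append(reason)
        if reasons:
            audit.append({"executor": candidate["name"], "reasons": reasons})
        else:
            accepted.append(candidate["name"])
    return accepted, audit
```

Collecting every reason, rather than stopping at the first failed check, gives the audit log a complete picture of why a sub-command was turned away.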
Most useful skills lean on services that require an access key: an API key, an OAuth token, a session file. The importer never asks for these things up front. The first time the agent tries to use an executor that requires credentials and does not find them, this happens:

1. The executor returns decision = "needs_inputs" with a small request form.
2. The user fills in the form once, and the values are stored in the local credential store (~/.local/share/metnos/credentials/).

Subsequent invocations ask for nothing. If the token expires, the executor signals error_class = "auth_required" and the flow starts over. No tokens in configuration files, no passwords in the code.
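The credential flow described here can be sketched as an invoke function that checks for a binding before doing any work. Only decision = "needs_inputs" and error_class = "auth_required" come from this document; the function signature, the binding name "google_workspace", the form fields, and the shape of the successful result are assumptions made for illustration.

```python
from typing import Dict

BINDING = "google_workspace"  # hypothetical binding name


def invoke(args: Dict[str, str], credentials: Dict[str, dict]) -> dict:
    """Sketch of an executor entry point that defers to the credential flow."""
    binding = credentials.get(BINDING)
    if binding is None:
        # First use: ask for what is missing instead of failing.
        return {
            "decision": "needs_inputs",
            "form": ["client_secret", "redirect_url"],  # hypothetical field names
        }
    if binding.get("expired", False):
        # Expired token: signal it so the request flow starts over.
        return {"error_class": "auth_required"}
    # Credentials present and valid: the real call to the service would go here.
    return {"decision": "ok", "events": []}  # hypothetical success shape
```

The executor itself never prompts the user; it only reports what it lacks and leaves the form to the caller.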
To inspect what is already configured without seeing the values, three
canonical executors are available: find_credentials (lists
the configured bindings, metadata only), set_credentials
(registers a new credential), delete_credentials (removes
a binding). The plaintext values never leave those executors: the
planner and its models see only the binding name, the fingerprint, the
date, and the state («configured», «expired»,
«missing»).
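A metadata-only view of a credential might look like the sketch below. The function `describe_binding` and the way the fingerprint is computed (a truncated SHA-256 of the value) are assumptions; the document only says that the planner sees the binding name, the fingerprint, the date, and the state.

```python
import hashlib
from typing import Dict


def describe_binding(name: str, secret: str, saved_on: str, state: str) -> Dict[str, str]:
    """Return what a planner may see about a credential: never the value itself.

    Assumption: the fingerprint is the first 12 hex digits of the SHA-256
    of the secret; the real scheme is not described in this document.
    """
    fingerprint = hashlib.sha256(secret.encode("utf-8")).hexdigest()[:12]
    return {
        "binding": name,
        "fingerprint": fingerprint,
        "date": saved_on,
        "state": state,  # «configured», «expired», or «missing»
    }
```

The fingerprint lets two records be compared for equality without either side ever holding the plaintext.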
Roberto asks: «import the Google Workspace skill».

1. The import is started with metnos-skills import agentskills.io/googleworkspace/google-workspace.
2. Three sub-commands are rejected (gmail reply collides with send_messages; gmail labels has no canonical qualifier; sheets append collides with its own update).
3. The accepted executors are installed under skills/google-workspace/ (ADR 0160).
4. The first request reaches read_events (one of the 21), which replies credentials needed and opens the request form. Roberto provides the client secret and the redirect URL once; subsequent invocations work silently.

Total time, excluding the one-time OAuth setup: under three minutes for twenty-one new tools.
Importing third-party code requires layers of control that handcrafted code does not need. An imported skill could have been tampered with, declare one thing and do another, impersonate an existing tool to hijack the chat, or exfiltrate credentials without trace. Each of the seven layers reduces a different risk. ADR 0159 documents every layer.
| Layer | When | What it catches |
|---|---|---|
| L1 sign verify | every boot | file tampered post-import (sha256 digest + Ed25519) |
| L2 affinity overlap | at-import | skill squatting on an existing tool (Jaccard ≥0.5 → reject) |
| L3 efficacy ager | daily cron | skill silently failing or never selected (deprecate 30d, archive 14d) |
| L5 smoke battery | at-import | skill returning broken schema or crashing |
| L6 LLM semantic verifier | at-import | mismatch between manifest description and code behavior |
| Runtime vaglio | every invocation | current action violates policy (guard + safe-verb shortcut + LLM judge) |
| Skill audit JSONL | every invocation | exfiltration / credentials abuse (append-only trace for forensics) |
Handcrafted executors (written by the Metnos team) do not have L2, L6
and audit: the code is trusted by construction, manifest and behavior
are aligned by definition, and audit of who-did-what lives in
git blame.
Layers are independent: if L6 fails (LLM down) L1/L2/L3/L5 remain
active; if L2 is too strict (Jaccard 0.5 can reject legitimately
similar skills) the provider qualifier (_google_workspace)
disambiguates (ADR 0136).
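The L2 rule (Jaccard ≥0.5 → reject) is easy to illustrate on sets of affinity keywords. The function names `jaccard` and `squatted_tools` and the sample keywords are hypothetical; only the similarity measure and the 0.5 threshold come from the table above.

```python
from typing import Dict, List, Set

REJECT_THRESHOLD = 0.5  # from the L2 row: Jaccard >= 0.5 means reject


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    if not union:
        return 0.0  # two empty keyword sets share nothing
    return len(a & b) / len(union)


def squatted_tools(new_keywords: Set[str], catalog: Dict[str, Set[str]]) -> List[str]:
    """Names of catalog executors whose keywords overlap too much with the new one."""
    return [
        name
        for name, keywords in catalog.items()
        if jaccard(new_keywords, keywords) >= REJECT_THRESHOLD
    ]
```

For example, two sets of four keywords that share three of them have a similarity of 3/5 = 0.6, which is above the threshold and would be rejected.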
| To understand… | Read |
|---|---|
| what an executor is and how it is shaped | executor |
| how Synt writes an executor from scratch (alternative to import) | synt |
| the fence imported executors run inside | sandbox |
| the checks before any risky action | vaglio |
| the decision archive (why import instead of adopting the standard) | ADR 0123 |
Metnos — skill importer, didactic introduction