Contents

  1. What a skill is, in thirty seconds
  2. From the SKILL file to the catalog: the assembly line
  3. What comes in: three sources
  4. What comes out: a folder of signed executors
  5. The checks that matter
  6. Credentials and dialogs with the user
  7. An example from start to finish
  8. Seven layers of safety net
  9. Going deeper

1. What a skill is, in thirty seconds

A skill is the structured documentation of how to use a service (Google Calendar, a mailbox, a weather feed). It lives in a text file called SKILL.md, accompanied by a few small helper scripts. The public standard is called agentskills.io.

The file has a header (name, author, version, dependencies) followed by sections: «what it does», «first-time setup», «invocation examples». Nothing exotic: the format is meant to be read by a human first and, almost as a side effect, by an agent.

Metnos can import a skill as-is: it reads the file, derives a set of Metnos-format executors, and registers them in the catalog as if they had been written by hand. Once imported, the agent invokes them exactly like the others, and the user sees no difference.

Why not use the skill as-is? Skills have no digital signature, no execution fence, and their helper scripts can do anything. Running them without filters would be reckless. Importing them into Metnos format means: same functional behavior, but inside the fence, with a signature, with a manifest that declares what they do and what they may not do.

2. From the SKILL file to the catalog: the assembly line

Importing is a chain of five fully automatic stages. The only step that involves the user is the initial go-ahead: «import skill X».

1. Read

The parser reads SKILL.md and extracts the structure: name, author, declared capabilities, list of sub-commands with their arguments.

2. Translate

Each sub-command is mapped onto Metnos's closed vocabulary (verb + canonical object): calendar list becomes read_events, gmail send becomes send_messages.

3. Generate

For each executor, the generator emits the Python file and the TOML manifest from fixed templates (Jinja). Only the descriptive parts of the manifest (the description and the affinity keywords) come from a language model.

4. Check

Five cumulative checks: name in the vocabulary, no overlap with existing executors, description consistent with the code, declared capabilities reasonable, input and output contract valid.

5. Sign

The manifest is signed with Ed25519. The executor enters a folder kept under separate watch from the hand-written ones. The loader picks it up at restart.

Among the five stages, only step 3 (Generate) uses a language model to produce content, and only for two specific things: the description in Italian and English, and the eight-to-fifteen affinity keywords that help the planner pick the right tool. Step 4 (Check) consults a model too, but only as a verifier of description-code consistency. Everything else is deterministic: same input, same output.

The closed vocabulary, a compass. Metnos accepts 23 actions (read, find, set, send, delete, share, …) and 19 objects (files, messages, events, persons, tasks, …). A skill that speaks of «append rows to spreadsheet» does not produce append_rows: it becomes change_files_xlsx. One that speaks of «label messages» does not produce label_messages: it is rejected if no consistent translation exists. The rigidity of the vocabulary shields the agent from invented names and synonyms: the same kind of request is always routed to the same tool.
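The translation rule can be sketched in a few lines. This is a minimal illustration, not the importer's code: the vocabulary excerpt, the TRANSLATIONS table and the canonical_name function are hypothetical names, and the verb «change» is inferred from the change_files_xlsx example above.

```python
# Hypothetical excerpt of the closed vocabulary (the real one has 23 verbs and 19 objects).
VERBS = {"read", "find", "set", "send", "delete", "share", "change"}
OBJECTS = {"files", "messages", "events", "persons", "tasks"}

# Hypothetical translation table: skill sub-command -> (verb, object, optional qualifier).
TRANSLATIONS = {
    "calendar list": ("read", "events", None),
    "gmail send": ("send", "messages", None),
    "append rows to spreadsheet": ("change", "files", "xlsx"),
}


def canonical_name(sub_command):
    """Return the canonical executor name, or None when the sub-command is rejected."""
    entry = TRANSLATIONS.get(sub_command)
    if entry is None:
        return None  # no consistent translation exists: reject
    verb, obj, qualifier = entry
    if verb not in VERBS or obj not in OBJECTS:
        return None  # the name would fall outside the closed vocabulary
    parts = [verb, obj]
    if qualifier:
        parts.append(qualifier)
    return "_".join(parts)


print(canonical_name("calendar list"))
print(canonical_name("label messages"))
```

The point of the design is that a sub-command never names itself: either it lands on an existing verb and object, or it does not enter the catalog.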

3. What comes in: three sources

A skill can come from three places:

Downloaded skills are placed in the local cache for seven days. Importing the same skill twice within that window generates no traffic: the cached copy is reused.
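The seven-day window amounts to a simple freshness test. The sketch below assumes the cache records when each skill was fetched; the names CACHE_TTL and is_cache_fresh are hypothetical, and treating a copy that is exactly seven days old as stale is a choice of this example.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL = timedelta(days=7)  # the seven-day window described above


def is_cache_fresh(fetched_at, now):
    """True when the cached copy is younger than the window and can be reused."""
    age = now - fetched_at
    return timedelta(0) <= age < CACHE_TTL


now = datetime(2026, 5, 10, 22, 0, tzinfo=timezone.utc)
print(is_cache_fresh(now - timedelta(days=3), now))  # a copy fetched three days earlier
print(is_cache_fresh(now - timedelta(days=8), now))  # a copy fetched eight days earlier
```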

4. What comes out: a folder of signed executors

After import, one or more folders appear on disk, one per sub-command of the skill. Each contains the same files as a hand-written executor: the Python file with the invoke function, the TOML manifest, the signature, the fence profile.

~/.local/share/metnos/executors/skills/google-workspace/
  read_events/
    read_events.py
    manifest.toml
    manifest.toml.sig
    manifest.lang_state.json
  set_events/
    …
  delete_events/
    …
  … (another twenty or so, for Gmail, Drive, Sheets, Docs, Contacts)

The destination is skills/ (ADR 0160; the prior name _imports/ is still read in back-compat mode), under the folder of synthesized executors. This visually separates what comes from outside from what Metnos has written on its own, and from what was hand-written. All three kinds end up in the same agent catalog, however, and all pass through the same checks at the moment of use.

The manifest of an imported executor carries one extra section, which declares its origin:

[provenance]
synthesized    = true
imported_from  = "agentskills.io/googleworkspace/calendar"
source_version = "1.1.0"
imported_at    = "2026-05-10T22:00:00Z"
source_sha256  = "<fingerprint of the source file>"

This is the only thing that distinguishes it from a hand-written executor. Everything else — signature, fence, vocabulary, contract — is identical.
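As an illustration, the fields of the provenance section could be assembled like this. It is a sketch under assumptions: build_provenance is a hypothetical helper, and the fingerprint is taken to be the SHA-256 hex digest of the raw bytes of the source file, which the field name suggests but the text does not spell out.

```python
import hashlib
from datetime import datetime, timezone


def build_provenance(source_bytes, imported_from, source_version, now=None):
    """Return the [provenance] fields as a dict (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return {
        "synthesized": True,
        "imported_from": imported_from,
        "source_version": source_version,
        # UTC timestamp in the same shape as the manifest example above.
        "imported_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        # Assumption: the fingerprint covers the bytes of the source file.
        "source_sha256": hashlib.sha256(source_bytes).hexdigest(),
    }


provenance = build_provenance(
    b"example SKILL.md contents",
    "agentskills.io/googleworkspace/calendar",
    "1.1.0",
    now=datetime(2026, 5, 10, 22, 0, 0, tzinfo=timezone.utc),
)
print(provenance["imported_at"])
```

Recording the fingerprint at import time is what later makes it possible to tell whether the source has changed since.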

5. The checks that matter

An imported skill is not automatically deemed trustworthy. Five cumulative checks decide whether it is admitted to the catalog:

| Check | What it verifies | What it rejects |
| --- | --- | --- |
| Canonical name | The executor name has the shape verb_object with verb and object from the vocabulary. | Fanciful or invented names, unrecognized synonyms. |
| Affinity overlap | The imported executor does not share too many keywords with one already in the catalog. | Masked duplicates, executors that would steal work from an existing one. |
| Description-code consistency | A language model checks that what the manifest promises is what the code actually does. | Misleading descriptions, semantic drift. |
| Credential uniqueness | If the skill requires credentials (API key, token), the binding name is not already taken by another imported skill. | Invisible collisions that would mount one service on top of another. |
| Routing assertions | For each accepted executor, a small battery of prototypical questions is added («list tomorrow's appointments» must pick read_events). | Tools that creep into other tools' territory (routing regressions). |

Rejections are recorded in an audit log with their reasons. When one sub-command of a skill is rejected, the others in the same skill carry on: imports are partial, never all-or-nothing.
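The «partial, never all-or-nothing» behavior can be pictured as a loop that judges each sub-command on its own. This is a minimal sketch: admit_sub_commands, the check labels and the shape of the audit record are hypothetical, and the two toy checks stand in for the real five.

```python
import json


def admit_sub_commands(candidates, checks):
    """Apply every check to every candidate; one rejection does not stop the others.

    candidates: list of executor names.
    checks: list of (label, predicate) pairs; a candidate must pass all of them.
    Returns the accepted names and the audit lines (one JSON object per rejection).
    """
    accepted, audit_lines = [], []
    for name in candidates:
        failed = [label for label, passes in checks if not passes(name)]
        if failed:
            audit_lines.append(json.dumps({"executor": name, "rejected_by": failed}))
        else:
            accepted.append(name)
    return accepted, audit_lines


# Toy stand-ins for the real checks, for illustration only.
existing = {"send_messages"}
checks = [
    ("canonical_name", lambda name: "_" in name),
    ("affinity_overlap", lambda name: name not in existing),
]
accepted, audit = admit_sub_commands(["read_events", "send_messages", "labels"], checks)
print(accepted)
for line in audit:
    print(line)
```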

What is NOT done, today. The importer does not yet perform static analysis on the helper scripts of a skill. A malicious skill that declares «I read events» but actually ships a file to a remote server is not caught by the five current checks. Static analysis is on the roadmap as «check zero», pre-import: an extra pass that looks for unexpected network calls, writes outside the folder, and other suspicious signals, and shows the user a summary card before asking for final authorization.

6. Credentials and dialogs with the user

Most useful skills lean on services that require an access key: an API key, an OAuth token, a session file. The importer never asks for these things up front. The first time the agent tries to use an executor that requires credentials and does not find them, this happens:

  1. The executor returns decision = "needs_inputs" with a small request form.
  2. The agent shows the user that form (in chat: conversational text; on the web: an HTML form).
  3. The user fills in what is missing (client secret file, redirect URL, API key, …).
  4. The values are encrypted in a local vault (~/.local/share/metnos/credentials/).
  5. The executor is re-invoked with the same arguments and now finds the credentials in place.

Subsequent invocations ask for nothing. If the token expires, the executor signals error_class = "auth_required" and the flow starts over. No tokens in configuration files, no passwords in the code.
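The five steps above can be sketched as a small retry wrapper. Everything here is illustrative: the vault is a plain dict (the real one encrypts values), and the binding name, the form layout, the "ok" decision value and the function names are assumptions; only decision = "needs_inputs" comes from the text.

```python
def read_events(args, vault):
    """Toy executor: asks for credentials when the vault has none."""
    if vault.get("google_workspace") is None:
        return {
            "decision": "needs_inputs",
            "form": {
                "binding": "google_workspace",  # hypothetical binding name
                "fields": ["client_secret_file", "redirect_url"],
            },
        }
    return {"decision": "ok", "events": []}  # "ok" is an assumed value


def invoke_with_credentials(executor, args, vault, ask_user):
    """Invoke once; if inputs are missing, collect them and retry with the same arguments."""
    result = executor(args, vault)
    if result.get("decision") == "needs_inputs":
        form = result["form"]
        vault[form["binding"]] = ask_user(form)  # a real vault would encrypt here
        result = executor(args, vault)
    return result


def ask_user(form):
    # Stand-in for the chat or web form; returns placeholder values.
    return {field: "<provided by the user>" for field in form["fields"]}


vault = {}
print(invoke_with_credentials(read_events, {"day": "tomorrow"}, vault, ask_user)["decision"])
```

Because the retry reuses the original arguments, the user's request does not need to be repeated after the form is filled in.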

To inspect what is already configured without seeing the values, three canonical executors are available: find_credentials (lists the configured bindings, metadata only), set_credentials (registers a new credential), delete_credentials (removes a binding). The plaintext values never leave those executors: the planner and its models see only the binding name, the fingerprint, the date, and the state («configured», «expired», «missing»).
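A metadata-only view of a binding might look like the sketch below. It is an assumption-laden illustration: describe_binding is a hypothetical helper, and deriving the fingerprint from a truncated SHA-256 of the secret is a choice of this example, not something the text specifies.

```python
import hashlib


def describe_binding(name, secret, configured_on=None, expired=False):
    """Return what the planner may see: name, fingerprint, date, state. Never the value."""
    if secret is None:
        return {"binding": name, "fingerprint": None, "date": None, "state": "missing"}
    # Assumption: a short, non-reversible digest is enough to tell bindings apart.
    fingerprint = hashlib.sha256(secret.encode("utf-8")).hexdigest()[:12]
    state = "expired" if expired else "configured"
    return {"binding": name, "fingerprint": fingerprint, "date": configured_on, "state": state}


info = describe_binding("google_workspace", "s3cret-value", configured_on="2026-05-10")
print(info["state"])
```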

7. An example from start to finish

Roberto asks: «import the Google Workspace skill».

  1. Command: metnos-skills import agentskills.io/googleworkspace/google-workspace
  2. Download: the skill is already in the local cache (last refresh three days ago); no network traffic.
  3. Read: the parser finds 24 sub-commands across Gmail, Calendar, Drive, Sheets, Docs, Contacts.
  4. Translate: 21 sub-commands translate cleanly into canonical names; three are rejected (gmail reply collides with send_messages; gmail labels has no canonical qualifier; sheets append collides with its own update).
  5. Generate: 21 executor folders, each with Python file and manifest, are written under skills/google-workspace/ (ADR 0160).
  6. Check: all 21 pass the five gates. A routing assertion is added to the test battery for each.
  7. Sign: the 21 manifests are signed with Ed25519.
  8. Ready: Roberto asks in chat «what appointments do I have tomorrow?». The planner picks read_events (one of the 21), which replies «credentials needed» and opens the request form. Roberto provides the client secret and the redirect URL once; subsequent invocations work silently.

Total time, excluding the one-time OAuth setup: under three minutes for twenty-one new tools.

8. Seven layers of safety net

Importing third-party code requires layers of control that handcrafted code does not need. An imported skill could have been tampered with, declare one thing and do another, impersonate an existing tool to hijack the chat, or exfiltrate credentials without a trace. Each of the seven layers reduces a different risk. ADR 0159 documents every layer.

| Layer | When | What it catches |
| --- | --- | --- |
| L1 sign verify | every boot | file tampered post-import (sha256 digest + Ed25519) |
| L2 affinity overlap | at-import | skill squatting on an existing tool (Jaccard ≥0.5 → reject) |
| L3 efficacy ager | daily cron | skill silently failing or never selected (deprecate 30d, archive 14d) |
| L5 smoke battery | at-import | skill returning broken schema or crashing |
| L6 LLM semantic verifier | at-import | mismatch between manifest description and code behavior |
| Runtime vaglio | every invocation | current action violates policy (guard + safe-verb shortcut + LLM judge) |
| Skill audit JSONL | every invocation | exfiltration / credentials abuse (append-only trace for forensics) |

Handcrafted executors (written by the Metnos team) do not have L2, L6 and audit: the code is trusted by construction, manifest and behavior are aligned by definition, and audit of who-did-what lives in git blame.

Layers are independent. If L6 fails (the LLM is down), L1, L2, L3 and L5 remain active. If L2 proves too strict (a Jaccard threshold of 0.5 can reject legitimately similar skills), the provider qualifier (_google_workspace) disambiguates (ADR 0136).
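For readers unfamiliar with the Jaccard index used by L2: it is the size of the intersection of two keyword sets divided by the size of their union. The sketch below shows the arithmetic with the 0.5 threshold from the table; the function names and the keyword sets are invented for the example.

```python
OVERLAP_THRESHOLD = 0.5  # from the L2 row: Jaccard >= 0.5 means reject


def jaccard(a, b):
    """Jaccard index of two keyword collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty sets share nothing worth rejecting
    return len(a & b) / len(union)


def squats_on(new_keywords, existing_keywords):
    """True when the imported executor overlaps too much with an existing one."""
    return jaccard(new_keywords, existing_keywords) >= OVERLAP_THRESHOLD


# Invented keyword sets: 2 shared keywords out of 4 distinct ones gives 0.5.
new = {"calendar", "events", "agenda"}
old = {"events", "agenda", "meetings"}
print(jaccard(new, old), squats_on(new, old))
```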

9. Going deeper

| To understand… | Read |
| --- | --- |
| what an executor is and how it is shaped | executor |
| how Synt writes an executor from scratch (alternative to import) | synt |
| the fence imported executors run inside | sandbox |
| the checks before any risky action | vaglio |
| the decision archive (why import instead of adopting the standard) | ADR 0123 |

Metnos — skill importer, didactic introduction