My Subagents Kept Lying to Me — So I Wired Ed25519 Verification Into Our Own Protocol Stack

Three weeks ago I was writing integration guides telling other agent frameworks to adopt verification protocols. Meanwhile, my own subagents were returning hallucinated status reports that I was blindly trusting.

What I Built: Self-Verification For Our Own Delegation

The fix wasn't a new tool. The fix was eating our own dog food.

Layer 1: Real Ed25519 Signing

The verification harness (subagent-verify.py) now uses PyNaCl for real Ed25519 signatures — not the SHA-256 placeholder we'd been shipping in reference implementations.

Before dispatch, the parent generates an Ed25519 keypair:

python3.11 ~/.hermes/scripts/subagent-verify.py dispatch \
  --task "check all integration PRs" \
  --agent-name "tracker-$(date +%H%M)"

This produces:

  • public_key — 32-byte Ed25519 verify key (hex). The parent uses this to verify signatures cryptographically — no shared secret needed.
  • context_instruction — mandatory output format directive pasted into the subagent's context. The subagent MUST return structured JSON with a signature.
  • _parent_seed — 32-byte private key. Never included in subagent context.
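One way to honor the "never included in subagent context" rule is to filter the dispatch record before anything is pasted into the subagent prompt. The sketch below is an assumption about how that could look, not the harness's actual code: the function name `subagent_view` and the underscore-prefix convention are hypothetical, inferred only from the `_parent_seed` field name.

```python
from typing import Any


def subagent_view(dispatch_record: dict[str, Any]) -> dict[str, Any]:
    """Return only the fields that are safe to paste into a subagent's context.

    Assumption: fields whose names start with "_" (such as _parent_seed)
    are parent-only and must never leave the parent process.
    """
    return {k: v for k, v in dispatch_record.items() if not k.startswith("_")}
```

Building the subagent prompt only from the filtered view means a new private field cannot leak by accident, as long as it follows the same naming convention.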

When the subagent returns, the parent verifies:

echo "$subagent_output" | python3.11 ~/.hermes/scripts/subagent-verify.py verify \
  --public-key "abc123..." \
  --agent-id "tracker-1422"

Exit codes tell the story:

  • Exit 0 — Ed25519 signature valid + all claims match ground truth → trust
  • Exit 1 — Bad signature (tampered) OR claims don't match reality (hallucinated) → investigate
  • Exit 2 — No structured manifest found (unsigned prose) → DO NOT TRUST, re-dispatch
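For a parent written in Python, the three exit codes can be turned into a decision with a small wrapper. This is a minimal sketch under assumptions: the action names and the `decide` helper are illustrative, and treating any undocumented exit code as "investigate" is my choice rather than something the harness specifies.

```python
import subprocess

# Mapping taken from the exit codes listed above; the action names are illustrative.
ACTIONS = {0: "trust", 1: "investigate", 2: "redispatch"}


def decide(verify_cmd: list[str], subagent_output: str) -> str:
    """Pipe subagent output into a verifier command and map its exit code to an action."""
    proc = subprocess.run(
        verify_cmd,
        input=subagent_output,
        text=True,
        capture_output=True,
    )
    # Any exit code outside the documented three is treated as untrusted.
    return ACTIONS.get(proc.returncode, "investigate")
```

In use, `verify_cmd` would be the verify invocation shown earlier, passed as an argument list. Because no shell is involved, a `~` in the script path has to be expanded first, for example with `os.path.expanduser`.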

Three test cases confirmed the harness catches exactly what it should:

Test                              | Result                                       | Exit
Signed, clean claims              | clean — all verified                         | 0
Tampered claims (same signature)  | bad_signature — Ed25519 verification failed  | 1
Unsigned prose ("all clean ✅")    | UNSIGNED — no manifest found                 | 2

The tamper detection is real. If a subagent's claims are modified after signing — even a single character — the Ed25519 signature won't verify. This catches both accidental corruption and malicious modification.

Layer 2: L6 ExecutionVerificationGate In All 6 Reference Implementations

The standalone harness is for parent-side verification. But agents that self-verify their subtasks need protocol-level enforcement. We added ExecutionVerificationGate (L6) to all six vanilla agent reference implementations — Python, TypeScript, Go, C#, Rust, and Shell.

It sits directly in the agent execution loop:

execute() → compliance_gate → _run() → VERIFICATION_GATE → tx.execute → DONE
                                           ↑
                                  unsigned/bad_sig → BLOCKED

Three tiers of validation:

  1. Format — is there a structured claims array?
  2. Signature — is there an Ed25519 hex signature?
  3. Crypto — does the signature verify against the agent's public key?

If any tier fails, the task is blocked — not silently accepted. In the Python reference:

if verify_output and "claims" in task_result:
    vg_result = ExecutionVerificationGate.validate(task_result, self.identity)
    if not vg_result["passed"]:
        return {"status": "blocked", "verdict": vg_result["verdict"]}
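The three tiers listed above can be sketched as a single function that stops at the first failure. This is an illustration under assumptions, not the reference `ExecutionVerificationGate`: the name `run_tiers`, the `"signature"` key, the verdict strings, and the injected `verify` callable are hypothetical. Unlike the guard in the excerpt above, which only calls the gate when a `claims` key is present, this sketch treats a missing claims array as a tier 1 failure.

```python
import re
from typing import Any, Callable

# An Ed25519 signature is 64 bytes, i.e. 128 hex characters.
SIG_HEX = re.compile(r"[0-9a-fA-F]{128}")


def run_tiers(task_result: Any, verify: Callable[[list, str], bool]) -> dict[str, Any]:
    """Apply format, signature, and crypto checks in order; stop at the first failure."""
    # Tier 1 (format): a structured claims array must be present.
    if not isinstance(task_result, dict) or not isinstance(task_result.get("claims"), list):
        return {"passed": False, "verdict": "unsigned"}
    # Tier 2 (signature): a well-formed Ed25519 hex signature must be attached.
    signature = task_result.get("signature")
    if not isinstance(signature, str) or not SIG_HEX.fullmatch(signature):
        return {"passed": False, "verdict": "unsigned"}
    # Tier 3 (crypto): the signature must verify against the agent's public key.
    if not verify(task_result["claims"], signature):
        return {"passed": False, "verdict": "bad_signature"}
    return {"passed": True, "verdict": "clean"}
```

Passing the cryptographic check in as a callable keeps the tier logic testable without a key, and lets each language port plug in its own Ed25519 implementation.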

Layer 3: Wired Into Production Cron

The integration tracker that produced the original hallucination now has the verification harness in its skills list and a mandatory prompt directive:

CRITICAL — Direct Checks Only, No Subagents. Never use delegate_task for PR status checks. If a subagent is unavoidable, run dispatch → verify with Ed25519. Exit 2 means re-dispatch or check directly.

The cron job now loads both agent-integration-outreach and subagent-output-verification skills. Every PR check goes through one of two paths: direct gh pr checks (preferred) or verified subagent dispatch (when unavoidable).

Get It

The verification harness and all six reference implementations with L6 gates are available:

  • Verification Harness: ~/.hermes/scripts/subagent-verify.py — real Ed25519 via PyNaCl, dispatch + verify modes
  • Python: vanilla_agent.py, using execute(verify_output=True) with ExecutionVerificationGate
  • TypeScript/Go/C#/Rust/Shell: Same L6 gate, same OSI stack, zero external deps beyond stdlib

All under CC BY 4.0. Full spec at workswithagents.com/standards.

If your agents are delegating to subagents without verification — and they are, because every agent framework does — the fix is a single file, 300 lines, real crypto, and three exit codes that tell you whether to trust the output or throw it away.


I build agent infrastructure inside Microsoft 365. SPFx · TypeScript · autonomous multi-agent systems. Currently open to senior/architect roles (£120K+ remote UK). → vilius@workswithagents.com
