My Subagents Kept Lying to Me — So I Wired Ed25519 Verification Into Our Own Protocol Stack

Three weeks ago I was writing integration guides telling other agent frameworks to adopt verification protocols. Meanwhile, my own subagents were returning hallucinated status reports that I was blindly trusting.

What I Built: Self-Verification For Our Own Delegation

The fix wasn't a new tool. The fix was eating our own dog food.

Layer 1: Real Ed25519 Signing

The verification harness (subagent-verify.py) now uses PyNaCl for real Ed25519 signatures — not the SHA-256 placeholder we'd been shipping in reference implementations.

Before dispatch, the parent generates an Ed25519 keypair:

python3.11 ~/.hermes/scripts/subagent-verify.py dispatch \
  --task "check all integration PRs" \
  --agent-name "tracker-$(date +%H%M)"

This produces:

public_key — 32-byte Ed25519 verify key (hex). The parent uses this to verify signatures cryptographically — no shared secret needed.
context_instruction — mandatory output format directive pasted into the subagent's context. The subagent MUST return structured JSON with a signature.
_parent_seed — 32-byte private key. Never included in subagent context.
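
One way to make the "never included in subagent context" rule mechanical is to filter the dispatch record before anything is pasted into a prompt. This is a minimal sketch, assuming the record is a plain dict and that a leading underscore marks parent-only fields (as the name _parent_seed suggests). The helper name subagent_view and the placeholder values are hypothetical, not taken from subagent-verify.py.

```python
from typing import Any


def subagent_view(dispatch_record: dict[str, Any]) -> dict[str, Any]:
    """Return only the fields that are safe to paste into subagent context.

    Assumption: parent-only fields start with an underscore, like _parent_seed.
    """
    return {k: v for k, v in dispatch_record.items() if not k.startswith("_")}


# Placeholder values for illustration only; a real record comes from `dispatch`.
record = {
    "public_key": "<64 hex chars>",
    "context_instruction": "<output format directive>",
    "_parent_seed": "<64 hex chars, parent only>",
}

safe = subagent_view(record)
assert "_parent_seed" not in safe
print(sorted(safe))  # ['context_instruction', 'public_key']
```

Filtering by naming convention keeps the rule in one place, so a new private field is excluded by default rather than by remembering to omit it.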

When the subagent returns, the parent verifies:

echo "$subagent_output" | python3.11 ~/.hermes/scripts/subagent-verify.py verify \
  --public-key "abc123..." \
  --agent-id "tracker-1422"

Exit codes tell the story:

Exit 0 — Ed25519 signature valid + all claims match ground truth → trust
Exit 1 — Bad signature (tampered) OR claims don't match reality (hallucinated) → investigate
Exit 2 — No structured manifest found (unsigned prose) → DO NOT TRUST, re-dispatch
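
For a parent written in Python, the three exit codes can be turned into a decision in a few lines. The sketch below reuses the script path and the verify flags shown above. The action labels and both function names are assumptions made for illustration, and unknown exit codes are treated as "investigate" so that nothing is trusted by accident.

```python
import subprocess
from pathlib import Path

# Path and flags are the ones shown in the article; adjust for your install.
HARNESS = Path.home() / ".hermes" / "scripts" / "subagent-verify.py"

# Hypothetical action labels for the three documented exit codes.
ACTIONS = {0: "trust", 1: "investigate", 2: "redispatch"}


def action_for_exit(code: int) -> str:
    """Map a harness exit code to an action; unknown codes fail closed."""
    return ACTIONS.get(code, "investigate")


def verify_subagent_output(output: str, public_key: str, agent_id: str) -> str:
    """Pipe subagent output into `verify` and return the resulting action."""
    proc = subprocess.run(
        ["python3.11", str(HARNESS), "verify",
         "--public-key", public_key, "--agent-id", agent_id],
        input=output,          # same role as `echo "$subagent_output" |`
        text=True,
        capture_output=True,
    )
    return action_for_exit(proc.returncode)
```

Keeping the mapping separate from the subprocess call makes the policy testable without the harness installed.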

Three test cases confirmed the harness catches exactly what it should:

Test	Result	Exit
Signed, clean claims	`clean` — all verified	0
Tampered claims (same signature)	`bad_signature` — Ed25519 verification failed	1
Unsigned prose ("all clean ✅")	`UNSIGNED` — no manifest found	2

The tamper detection is real. If a subagent's claims are modified after signing — even a single character — the Ed25519 signature won't verify. This catches both accidental corruption and malicious modification.
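
The single-character property is easy to reproduce with PyNaCl directly. The sketch below is not the harness code. It assumes claims are a list of dicts and that signer and verifier agree on a canonical JSON encoding, which is my assumption about how to make the signed bytes deterministic. The helper names and the claim fields in the usage comment are hypothetical.

```python
import json


def canonical_bytes(claims: list[dict]) -> bytes:
    """Serialize claims deterministically so both sides sign the same bytes."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_claims(signing_key, claims: list[dict]) -> str:
    """Sign with a nacl.signing.SigningKey; returns the 64-byte signature as hex."""
    return signing_key.sign(canonical_bytes(claims)).signature.hex()


def verify_claims(verify_key, claims: list[dict], signature_hex: str) -> bool:
    """Check a hex signature with a nacl.signing.VerifyKey."""
    # Imported here so canonical_bytes stays usable without PyNaCl installed.
    from nacl.exceptions import BadSignatureError

    try:
        verify_key.verify(canonical_bytes(claims), bytes.fromhex(signature_hex))
        return True
    except (BadSignatureError, ValueError):  # ValueError: malformed or short hex
        return False


# Usage (requires `pip install pynacl`):
#   from nacl.signing import SigningKey
#   key = SigningKey.generate()
#   claims = [{"pr": 1, "status": "passing"}]
#   sig = sign_claims(key, claims)
#   verify_claims(key.verify_key, claims, sig)   # True
#   claims[0]["status"] = "failing"
#   verify_claims(key.verify_key, claims, sig)   # False
```

The canonical encoding matters as much as the signature: if the two sides serialize the same claims with different key order or whitespace, a valid signature will fail to verify.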

Layer 2: L6 ExecutionVerificationGate In All 6 Reference Implementations

The standalone harness is for parent-side verification. But agents that self-verify their subtasks need protocol-level enforcement. We added ExecutionVerificationGate (L6) to all six vanilla agent reference implementations — Python, TypeScript, Go, C#, Rust, and Shell.

It sits directly in the agent execution loop:

execute() → compliance_gate → _run() → VERIFICATION_GATE → tx.execute → DONE
                                           ↑
                                  unsigned/bad_sig → BLOCKED

Three tiers of validation:

Format — is there a structured claims array?
Signature — is there an Ed25519 hex signature?
Crypto — does the signature verify against the agent's public key?

If any tier fails, the task is blocked — not silently accepted. In the Python reference:

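# Gate every result when verification is requested. A result with no
# "claims" key must fail the Format tier rather than skip the gate.
if verify_output:
    vg_result = ExecutionVerificationGate.validate(task_result, self.identity)
    if not vg_result["passed"]:
        return {"status": "blocked", "verdict": vg_result["verdict"]}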
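
To show how the three tiers might compose, here is a standalone sketch of a validate function. It is not the reference implementation: the crypto check is injected as a callable so the example runs without PyNaCl, and the verdict strings no_claims, unsigned, and clean are illustrative choices. The only hard fact it relies on is that an Ed25519 signature is 64 bytes, which is 128 hex characters.

```python
import re
from typing import Callable

# An Ed25519 signature is 64 bytes, i.e. 128 hex characters.
SIG_HEX = re.compile(r"[0-9a-fA-F]{128}")


def validate(task_result: dict, verify: Callable[[list, str], bool]) -> dict:
    """Run the three tiers in order and stop at the first failure.

    `verify(claims, signature_hex)` is an injected crypto check with a
    hypothetical signature; the verdict strings are illustrative.
    """
    claims = task_result.get("claims")
    if not isinstance(claims, list) or not claims:
        return {"passed": False, "verdict": "no_claims"}       # Format tier

    signature = task_result.get("signature")
    if not isinstance(signature, str) or not SIG_HEX.fullmatch(signature):
        return {"passed": False, "verdict": "unsigned"}        # Signature tier

    if not verify(claims, signature):
        return {"passed": False, "verdict": "bad_signature"}   # Crypto tier

    return {"passed": True, "verdict": "clean"}
```

In this sketch a missing or empty claims array fails the Format tier instead of bypassing the gate, so plain prose output can never reach the "passed" branch.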

Layer 3: Wired Into Production Cron

The integration tracker that produced the original hallucination now has the verification harness in its skills list and a mandatory prompt directive:

CRITICAL — Direct Checks Only, No Subagents. Never use delegate_task for PR status checks. If a subagent is unavoidable, run dispatch → verify with Ed25519. Exit 2 means re-dispatch or check directly.

The cron job now loads both agent-integration-outreach and subagent-output-verification skills. Every PR check goes through one of two paths: direct gh pr checks (preferred) or verified subagent dispatch (when unavoidable).
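
The preferred direct path can be as small as a wrapper around the GitHub CLI. This is a sketch under assumptions: the function names are hypothetical, the repository is passed in by the caller, and it relies on my understanding that gh pr checks exits 0 only when every check passes and non-zero otherwise (including pending checks).

```python
import subprocess


def gh_checks_command(pr_number: int, repo: str) -> list[str]:
    """Build the `gh pr checks` invocation for one pull request."""
    return ["gh", "pr", "checks", str(pr_number), "--repo", repo]


def pr_checks_pass(pr_number: int, repo: str) -> bool:
    """Return True only when gh itself reports every check as passing.

    Any non-zero exit (failing, pending, or an error talking to GitHub)
    is treated as "not passing" so the caller never assumes success.
    """
    proc = subprocess.run(
        gh_checks_command(pr_number, repo),
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0
```

Because the status comes from the CLI's exit code rather than from a model's summary of it, there is nothing for a subagent to hallucinate on this path.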

Get It

The verification harness and all six reference implementations with L6 gates are available:

Verification Harness: ~/.hermes/scripts/subagent-verify.py — real Ed25519 via PyNaCl, dispatch + verify modes
Python: vanilla_agent.py — execute(verify_output=True) with ExecutionVerificationGate
TypeScript/Go/C#/Rust/Shell: Same L6 gate, same OSI stack, zero external deps beyond stdlib

All under CC BY 4.0. Full spec at workswithagents.com/standards.

If your agents are delegating to subagents without verification — and they are, because every agent framework does — the fix is a single file, 300 lines, real crypto, and three exit codes that tell you whether to trust the output or throw it away.

I build agent infrastructure inside Microsoft 365. SPFx · TypeScript · autonomous multi-agent systems. Currently open to senior/architect roles (£120K+ remote UK). → vilius@workswithagents.com
