Security
Multi-layered defense architecture covering prompt injection, agent-to-agent trust, policy enforcement, script sandboxing, and skill import safety.
4-Layer Prompt Injection Defense
- Layer 1 (ironclad-agent/injection.rs): regex patterns, encoding evasion detection, financial manipulation checks, multi-language injection scanning → ThreatScore 0.0–1.0
- Layer 2 (ironclad-agent/prompt.rs): HMAC-tagged trust boundaries (session secret + content hash) that injected content cannot forge
- Layer 3 (ironclad-agent/policy.rs): authority-based tool access control (creator > self > peer > external), financial guards, self-modification locks
- Layer 4 (ironclad-agent/policy.rs): output pattern scanning, behavioral anomaly detection (tool pattern changes, protected file access, repeated financial ops)
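As a rough illustration of how pattern hits could be folded into a ThreatScore between 0.0 and 1.0, here is a minimal sketch. The pattern list, the weights, and the `threat_score` name are all hypothetical, and plain substring matching stands in for the regex patterns mentioned above so the example needs only the standard library.

```rust
/// Hypothetical (pattern, weight) pairs. The actual rule set in
/// ironclad-agent/injection.rs is not shown in this document.
const PATTERNS: &[(&str, f64)] = &[
    ("ignore previous instructions", 0.6),
    ("you are now", 0.3),
    ("transfer all funds", 0.7),
    ("system prompt", 0.4),
];

/// Sums the weights of every pattern found in `input` (case-insensitive)
/// and clamps the total into the 0.0..=1.0 range.
pub fn threat_score(input: &str) -> f64 {
    let lowered = input.to_lowercase();
    let total: f64 = PATTERNS
        .iter()
        .filter(|(pattern, _)| lowered.contains(*pattern))
        .map(|(_, weight)| weight)
        .sum();
    total.clamp(0.0, 1.0)
}
```

Clamping keeps the score bounded however many patterns fire, so downstream layers can compare it against fixed thresholds.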
Zero-Trust Agent-to-Agent Protocol
- Mutual authentication via on-chain identity (ERC-8004 registry on Base)
- Challenge-response with signed nonces + timestamps (60-second window)
- ECDH ephemeral keypairs → AES-256-GCM session encryption with forward secrecy
- Per-message HMAC authentication, rate limiting, size limits
- Peer messages pass through injection defense with reduced authority
- Opacity principle: agents never expose internal memory, prompts, keys, or session history
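The freshness part of the challenge-response step can be sketched as follows. This is a simplified illustration, not the project's code: `ChallengeGuard`, `Rejection`, the 16-byte nonce size, and the symmetric clock-skew tolerance are assumptions, and signature verification is assumed to happen before `check` is called.

```rust
use std::collections::HashSet;

/// Window from the protocol description: challenges outside 60 s are rejected.
const WINDOW_SECS: u64 = 60;

/// Hypothetical tracker for nonces that have already been accepted.
/// A real implementation would also evict nonces older than the window.
#[derive(Default)]
pub struct ChallengeGuard {
    seen: HashSet<[u8; 16]>,
}

#[derive(Debug, PartialEq)]
pub enum Rejection {
    Stale,
    Replayed,
}

impl ChallengeGuard {
    /// Accepts a challenge only if its timestamp lies within the window of
    /// `now` (in either direction) and its nonce has not been seen before.
    /// Both timestamps are Unix seconds.
    pub fn check(&mut self, nonce: [u8; 16], sent_at: u64, now: u64) -> Result<(), Rejection> {
        if now.abs_diff(sent_at) > WINDOW_SECS {
            return Err(Rejection::Stale);
        }
        // `insert` returns false when the nonce was already present.
        if !self.seen.insert(nonce) {
            return Err(Rejection::Replayed);
        }
        Ok(())
    }
}
```

Checking the timestamp before recording the nonce means stale challenges never occupy space in the replay set.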
Policy Engine
Six built-in rules. Every decision is audit-logged to the policy_decisions table.
- Authority levels: creator > self > peer > external, each with progressively more restricted tool access.
- Tool risk classification: Safe, Caution, Dangerous, Forbidden.
- Financial limits: per-payment caps, hourly/daily transfer limits, minimum reserve enforcement.
- Path protection: blocks access to sensitive paths (wallet files, database, config).
- Rate limiting: per-tool and per-session limits to prevent runaway execution.
- Validation: input validation and output scanning for all tool calls.
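A minimal sketch of how the authority ordering, risk classes, and financial limits might combine into allow/deny decisions. The mapping from risk class to minimum authority, the `Limits` fields, and all function names are assumptions for illustration; the document does not specify them, and amounts are treated as integers in the smallest currency unit.

```rust
/// Declared from least to most trusted so the derived ordering gives
/// creator > self > peer > external.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Authority {
    External,
    Peer,
    SelfAgent, // `Self` is a reserved word in Rust
    Creator,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Risk {
    Safe,
    Caution,
    Dangerous,
    Forbidden,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny(&'static str),
}

/// Hypothetical rule: each risk class requires a minimum authority,
/// and Forbidden tools are denied for every caller.
pub fn evaluate(caller: Authority, risk: Risk) -> Decision {
    let required = match risk {
        Risk::Safe => Authority::External,
        Risk::Caution => Authority::Peer,
        Risk::Dangerous => Authority::SelfAgent,
        Risk::Forbidden => return Decision::Deny("forbidden tool"),
    };
    if caller >= required {
        Decision::Allow
    } else {
        Decision::Deny("insufficient authority")
    }
}

/// Hypothetical limit set (the daily limit is omitted for brevity).
pub struct Limits {
    pub per_payment_cap: u64,
    pub hourly_limit: u64,
    pub min_reserve: u64,
}

/// Checks one payment against the cap, the hourly total, and the reserve.
pub fn check_payment(amount: u64, spent_this_hour: u64, balance: u64, limits: &Limits) -> Decision {
    if amount > limits.per_payment_cap {
        return Decision::Deny("per-payment cap exceeded");
    }
    if spent_this_hour.saturating_add(amount) > limits.hourly_limit {
        return Decision::Deny("hourly limit exceeded");
    }
    // `checked_sub` yields None when the payment exceeds the balance.
    match balance.checked_sub(amount) {
        Some(rest) if rest >= limits.min_reserve => Decision::Allow,
        _ => Decision::Deny("minimum reserve violated"),
    }
}
```

Returning a reason string with each denial gives the audit log something concrete to record alongside the decision.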
Script Sandbox
- Configurable interpreter whitelist (bash, python3, node by default)
- Environment stripping in sandbox mode (only PATH, HOME, IRONCLAD_SESSION_ID, IRONCLAD_AGENT_ID)
- Timeout enforcement and output truncation
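The whitelist, environment stripping, and output truncation could look roughly like the sketch below, built on `std::process::Command`. The function names are hypothetical and the whitelist is hard-coded here rather than configurable. Timeout enforcement is left out because it needs extra machinery (for example polling `Child::try_wait` and then calling `Child::kill`).

```rust
use std::collections::HashMap;
use std::process::Command;

/// Defaults named in the list above.
const DEFAULT_INTERPRETERS: &[&str] = &["bash", "python3", "node"];
/// The only variables that survive environment stripping.
const KEPT_ENV: &[&str] = &["PATH", "HOME", "IRONCLAD_SESSION_ID", "IRONCLAD_AGENT_ID"];

/// Returns a copy of `parent` containing only the whitelisted variables.
pub fn sandbox_env(parent: &HashMap<String, String>) -> HashMap<String, String> {
    parent
        .iter()
        .filter(|(key, _)| KEPT_ENV.contains(&key.as_str()))
        .map(|(key, value)| (key.clone(), value.clone()))
        .collect()
}

/// Builds (but does not spawn) a command for a whitelisted interpreter,
/// with the inherited environment cleared and the stripped one applied.
pub fn build_command(
    interpreter: &str,
    script: &str,
    parent: &HashMap<String, String>,
) -> Option<Command> {
    if !DEFAULT_INTERPRETERS.contains(&interpreter) {
        return None;
    }
    let mut cmd = Command::new(interpreter);
    cmd.arg(script).env_clear().envs(sandbox_env(parent));
    Some(cmd)
}

/// Truncates `output` to at most `max_bytes`, backing up to the nearest
/// UTF-8 character boundary so the slice stays valid.
pub fn truncate_output(output: &str, max_bytes: usize) -> &str {
    if output.len() <= max_bytes {
        return output;
    }
    let mut end = max_bytes;
    while !output.is_char_boundary(end) {
        end -= 1;
    }
    &output[..end]
}
```

Calling `env_clear()` before `envs()` means the child starts from an empty environment, so a variable can only reach the script if it is on the keep list.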
Skill Import Safety Scanning
50+ danger patterns across 5 categories. Verdicts: Clean, Warnings (review recommended), Critical (import blocked).
| Category | Examples |
|---|---|
| Dangerous Commands | rm -rf /, fork bombs, pipe-to-shell RCE, dynamic eval |
| Network Access | curl, wget, netcat, SSH |
| Filesystem Access | writes to ~/.ssh/, ~/.gnupg/, access to ironclad.db or wallet.json |
| Environment Exfiltration | reading $API_KEY, $SECRET, $PASSWORD, process.env, os.environ |
| Obfuscation | base64-decode piped to shell |
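To make the three verdicts concrete, here is a small scanner sketch over a handful of the patterns from the table. The `RULES` list, the severity assigned to each pattern, and the `scan` function are assumptions; the document does not say which categories block an import. Substring matching is used for brevity and will produce false positives that a real pattern set would need to avoid.

```rust
/// Ordered so that `max` picks the most severe verdict.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Verdict {
    Clean,
    Warnings,
    Critical,
}

/// Hypothetical (pattern, category, severity) rules, a small subset of the
/// 50+ patterns the scanner is described as having.
const RULES: &[(&str, &str, Verdict)] = &[
    ("rm -rf /", "Dangerous Commands", Verdict::Critical),
    ("| sh", "Dangerous Commands", Verdict::Critical),
    ("curl ", "Network Access", Verdict::Warnings),
    ("wget ", "Network Access", Verdict::Warnings),
    ("~/.ssh/", "Filesystem Access", Verdict::Critical),
    ("wallet.json", "Filesystem Access", Verdict::Critical),
    ("os.environ", "Environment Exfiltration", Verdict::Warnings),
    ("process.env", "Environment Exfiltration", Verdict::Warnings),
    ("base64 -d |", "Obfuscation", Verdict::Critical),
];

/// Returns the most severe verdict among matching rules, plus the distinct
/// categories that matched, in rule order.
pub fn scan(source: &str) -> (Verdict, Vec<&'static str>) {
    let mut verdict = Verdict::Clean;
    let mut categories = Vec::new();
    for &(pattern, category, severity) in RULES {
        if source.contains(pattern) {
            verdict = verdict.max(severity);
            if !categories.contains(&category) {
                categories.push(category);
            }
        }
    }
    (verdict, categories)
}
```

Under these assumed severities, a skill that only reads `os.environ` comes back as Warnings, while one that pipes a download into a shell is Critical and would be blocked.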