Security
Multi-layered defense architecture covering prompt injection, agent-to-agent trust, policy enforcement, script sandboxing, and skill import safety.
4-Layer Prompt Injection Defense
- internal/pipeline/guards.go: Regex patterns, encoding evasion detection, financial manipulation checks, and multi-language injection scanning, producing a ThreatScore from 0.0 to 1.0
- internal/agent/prompt.go: HMAC-tagged trust boundaries (session secret + content hash) that injected content cannot forge
- internal/agent/policy.go: Authority-based tool access control (creator > self > peer > external), financial guards, self-modification locks
- internal/agent/policy.go: Output pattern scanning, behavioral anomaly detection (tool pattern changes, protected file access, repeated financial ops)
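The HMAC-tagged trust boundary in the second layer can be illustrated with a minimal sketch. It assumes the tag is an HMAC-SHA256 over the SHA-256 hash of the content, keyed by the session secret. The function names and the marker format are hypothetical and are not taken from internal/agent/prompt.go.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// boundaryTag derives a tag from the session secret and a hash of the content.
// Text injected into the content cannot produce a valid tag without the secret.
func boundaryTag(sessionSecret, content []byte) string {
	contentHash := sha256.Sum256(content)
	mac := hmac.New(sha256.New, sessionSecret)
	mac.Write(contentHash[:])
	return hex.EncodeToString(mac.Sum(nil))
}

// wrapUntrusted surrounds untrusted content with tagged boundary markers.
// The marker syntax here is an assumption made for illustration only.
func wrapUntrusted(sessionSecret, content []byte) string {
	tag := boundaryTag(sessionSecret, content)
	return "<untrusted tag=" + tag + ">\n" + string(content) + "\n</untrusted tag=" + tag + ">"
}

// verifyBoundary recomputes the tag and compares it in constant time.
func verifyBoundary(sessionSecret, content []byte, tag string) bool {
	want := boundaryTag(sessionSecret, content)
	return hmac.Equal([]byte(tag), []byte(want))
}
```

Because the tag depends on both the secret and the content hash, a forged closing marker inside the untrusted text fails verification, and so does any modification of the wrapped content.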
Zero-Trust Agent-to-Agent Protocol
- Mutual authentication via on-chain identity (ERC-8004 registry on Base)
- Challenge-response with signed nonces + timestamps (60s window)
- ECDH ephemeral keypairs → AES-256-GCM session encryption with forward secrecy
- Per-message HMAC authentication, rate limiting, size limits
- Peer messages pass through injection defense with reduced authority
- Opacity principle: agents never expose internal memory, prompts, keys, or session history
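The key-agreement and encryption step can be sketched with the Go standard library. This sketch makes several assumptions: it uses X25519 as the curve, it derives the AES-256 key with a plain SHA-256 over the shared secret where a production design would more likely use a proper KDF such as HKDF, and all function names are hypothetical. Identity verification, nonce signing, and the 60s timestamp window are omitted.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/ecdh"
	"crypto/rand"
	"crypto/sha256"
	"errors"
)

// newEphemeralKey generates a fresh X25519 keypair intended for one session.
// Discarding it after the session is what provides forward secrecy.
func newEphemeralKey() (*ecdh.PrivateKey, error) {
	return ecdh.X25519().GenerateKey(rand.Reader)
}

// sessionKey derives a 32-byte AES-256 key from the ECDH shared secret.
// Assumption: SHA-256 stands in for a real KDF to keep the sketch short.
func sessionKey(priv *ecdh.PrivateKey, peerPub *ecdh.PublicKey) ([]byte, error) {
	shared, err := priv.ECDH(peerPub)
	if err != nil {
		return nil, err
	}
	key := sha256.Sum256(shared)
	return key[:], nil
}

// newGCM builds an AES-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and prepends the random nonce to the ciphertext.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce, then decrypts and authenticates the message.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}
```

Each side calls `sessionKey` with its own private key and the peer's public key, and both arrive at the same key. A message altered in transit fails in `open` because GCM authenticates the ciphertext.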
Policy Engine
6 built-in rules. All decisions audit-logged to the policy_decisions table.
- Authority levels: creator > self > peer > external, each with progressively more restricted tool access.
- Tool risk classification: Safe, Caution, Dangerous, Forbidden.
- Financial guards: per-payment caps, hourly/daily transfer limits, minimum reserve enforcement.
- Blocks access to sensitive paths (wallet files, database, config).
- Per-tool and per-session rate limits to prevent runaway execution.
- Input validation and output scanning on every tool call.
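A minimal sketch of how authority levels, tool risk classes, and financial guards might combine into a decision. The level and risk names come from the text above. The mapping from authority to permitted risk, the `Decision` type, and both function names are assumptions made for illustration. Audit logging and the hourly/daily limits are omitted.

```go
package main

// Authority levels, ordered from least to most trusted.
type Authority int

const (
	External Authority = iota
	Peer
	Self
	Creator
)

// Risk classes for tools, ordered from least to most dangerous.
type Risk int

const (
	Safe Risk = iota
	Caution
	Dangerous
	Forbidden
)

// maxRisk is a hypothetical ceiling per authority level. External callers
// have no entry, so they are denied every tool in this sketch.
var maxRisk = map[Authority]Risk{
	Creator: Dangerous,
	Self:    Caution,
	Peer:    Safe,
}

// Decision records the outcome and the reason, as an audit log row might.
type Decision struct {
	Allowed bool
	Reason  string
}

// evaluate checks a tool's risk class against the caller's authority.
func evaluate(auth Authority, risk Risk) Decision {
	if risk == Forbidden {
		return Decision{false, "tool is forbidden for all callers"}
	}
	ceiling, ok := maxRisk[auth]
	if !ok {
		return Decision{false, "caller has no tool access"}
	}
	if risk > ceiling {
		return Decision{false, "tool risk exceeds caller authority"}
	}
	return Decision{true, "allowed"}
}

// checkPayment applies a per-payment cap and a minimum reserve.
// Amounts are in minor units (for example cents) to avoid float rounding.
func checkPayment(amount, perPaymentCap, balance, minReserve int64) Decision {
	if amount <= 0 {
		return Decision{false, "amount must be positive"}
	}
	if amount > perPaymentCap {
		return Decision{false, "exceeds per-payment cap"}
	}
	if balance-amount < minReserve {
		return Decision{false, "would breach minimum reserve"}
	}
	return Decision{true, "allowed"}
}
```

Under these assumed ceilings, `evaluate(Self, Dangerous)` is denied while `evaluate(Creator, Dangerous)` is allowed, and a Forbidden tool is denied regardless of caller.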
Script Sandbox
- Configurable interpreter whitelist (bash, python3, node by default)
- Plugin scripts: unconditional env_clear with minimal allowlist (PATH, HOME, USER, LANG, TERM, TMPDIR + ROBOTICUS_* vars)
- Skill scripts: environment stripping when sandbox_env = true (PATH, HOME, ROBOTICUS_SESSION_ID, ROBOTICUS_AGENT_ID)
- Tool name validation -- strict [a-zA-Z0-9_-] character set; path traversal patterns rejected at manifest parse time
- Script path confinement -- resolved paths are canonicalized and must remain within the plugin directory
- Timeout enforcement and output truncation
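Three of the checks above (tool name validation, script path confinement, and the plugin environment allowlist) can be sketched as follows. The function names are hypothetical. The confinement check here is lexical only: a fuller implementation would also resolve symlinks, for example with `filepath.EvalSymlinks`, before comparing paths.

```go
package main

import (
	"errors"
	"path/filepath"
	"regexp"
	"strings"
)

// toolNameRe accepts only the strict [a-zA-Z0-9_-] character set.
var toolNameRe = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)

// validToolName rejects empty names and anything containing '/', '.', etc.
func validToolName(name string) bool {
	return toolNameRe.MatchString(name)
}

// confine joins script onto pluginDir and rejects paths that escape it.
// Assumption: lexical cleaning only, with no symlink resolution.
func confine(pluginDir, script string) (string, error) {
	if filepath.IsAbs(script) {
		return "", errors.New("script path must be relative")
	}
	root, err := filepath.Abs(pluginDir)
	if err != nil {
		return "", err
	}
	full := filepath.Join(root, script) // Join also cleans ".." segments
	rel, err := filepath.Rel(root, full)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errors.New("script path escapes plugin directory")
	}
	return full, nil
}

// allowedEnv mirrors the plugin allowlist named in the text.
var allowedEnv = map[string]bool{
	"PATH": true, "HOME": true, "USER": true,
	"LANG": true, "TERM": true, "TMPDIR": true,
}

// minimalEnv keeps allowlisted variables plus any ROBOTICUS_* variable.
func minimalEnv(env []string) []string {
	var out []string
	for _, kv := range env {
		name, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue
		}
		if allowedEnv[name] || strings.HasPrefix(name, "ROBOTICUS_") {
			out = append(out, kv)
		}
	}
	return out
}
```

The result of `minimalEnv(os.Environ())` could be assigned to the `Env` field of an `exec.Cmd`, so the child process starts with only the allowlisted variables.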
Skill Import Safety Scanning
50+ danger patterns across 5 categories. Verdicts: Clean, Warnings (review recommended), Critical (import blocked).
| Category | Examples |
|---|---|
| Dangerous Commands | rm -rf /, fork bombs, pipe-to-shell RCE, dynamic eval |
| Network Access | curl, wget, netcat, SSH |
| Filesystem Access | writes to ~/.ssh/, ~/.gnupg/, access to roboticus.db or wallet.json |
| Environment Exfiltration | reading $API_KEY, $SECRET, $PASSWORD, process.env, os.environ |
| Obfuscation | base64-decode piped to shell |
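A pattern-based scanner of this kind might look like the sketch below. It covers only a handful of illustrative patterns, not the full set. The regular expressions, the choice of which patterns count as Critical, and the `scan` function are all assumptions. Only the category names and the three verdicts come from the text above.

```go
package main

import "regexp"

// Verdict is the overall scan result.
type Verdict int

const (
	Clean Verdict = iota
	Warnings
	Critical
)

// pattern pairs a category with a regex and a severity flag.
type pattern struct {
	category string
	re       *regexp.Regexp
	critical bool // assumption: which patterns block import is illustrative
}

var patterns = []pattern{
	{"Dangerous Commands", regexp.MustCompile(`(?m)rm\s+-rf\s+/(\s|$)`), true},
	{"Dangerous Commands", regexp.MustCompile(`(curl|wget)[^|\n]*\|\s*(ba)?sh`), true},
	{"Obfuscation", regexp.MustCompile(`base64\s+(-d|--decode)[^|\n]*\|\s*(ba)?sh`), true},
	{"Network Access", regexp.MustCompile(`\b(curl|wget|nc|ssh)\b`), false},
	{"Filesystem Access", regexp.MustCompile(`~/\.ssh/|~/\.gnupg/|roboticus\.db|wallet\.json`), false},
	{"Environment Exfiltration", regexp.MustCompile(`\$(API_KEY|SECRET|PASSWORD)\b|process\.env|os\.environ`), false},
}

// scan returns the highest-severity verdict and the categories that matched.
func scan(script string) (Verdict, []string) {
	verdict := Clean
	var hits []string
	for _, p := range patterns {
		if !p.re.MatchString(script) {
			continue
		}
		hits = append(hits, p.category)
		if p.critical {
			verdict = Critical
		} else if verdict == Clean {
			verdict = Warnings
		}
	}
	return verdict, hits
}
```

With these patterns, a plain `curl` download yields Warnings, while the same command piped into `sh` yields Critical.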