Features
Deep-dive into every subsystem of the Ironclad autonomous agent runtime.
LLM Client Pipeline
- ▸Model-agnostic proxy -- provider config fully externalized in TOML
- ▸Format translation -- typed Rust enums with From<T> for 4 API formats (12+ translation pairs)
- ▸Circuit breaker per provider (Closed/Open/HalfOpen, exponential backoff) -- sketched after this list
- ▸In-flight deduplication -- SHA-256 fingerprinting prevents duplicate concurrent requests
- ▸Tier-based prompt adaptation -- T1 (condensed), T2 (preamble + reorder), T3/T4 (passthrough + cache_control)
- ▸Heuristic model router -- complexity classification + rule-based fallback chain
- ▸3-level semantic cache -- L1 exact hash, L2 embedding cosine similarity, L3 deterministic tool TTL
- ▸Persistent connection pool -- single reqwest::Client with HTTP/2 multiplexing per provider
- ▸x402 payment protocol -- automatic payment-gated inference (402 -> sign EIP-3009 -> retry)
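For illustration, a minimal sketch of the per-provider circuit breaker named in the list above: the Closed/Open/HalfOpen states and exponential backoff come from the bullet, while the struct shape, field names, and default thresholds are assumptions, not the actual ironclad-llm types.

```rust
use std::time::{Duration, Instant};

/// Illustrative circuit-breaker states; names mirror the feature list above.
enum BreakerState {
    Closed { consecutive_failures: u32 },
    Open { until: Instant },
    HalfOpen,
}

struct CircuitBreaker {
    state: BreakerState,
    failure_threshold: u32, // hypothetical default: trip after 5 consecutive failures
    base_backoff: Duration, // hypothetical default: 1s, doubled on every trip
    trips: u32,
}

impl CircuitBreaker {
    fn new() -> Self {
        Self {
            state: BreakerState::Closed { consecutive_failures: 0 },
            failure_threshold: 5,
            base_backoff: Duration::from_secs(1),
            trips: 0,
        }
    }

    fn allow_request(&mut self) -> bool {
        match self.state {
            BreakerState::Closed { .. } => true,
            BreakerState::Open { until } if Instant::now() >= until => {
                // Backoff elapsed: let a single probe request through.
                self.state = BreakerState::HalfOpen;
                true
            }
            BreakerState::Open { .. } => false,
            BreakerState::HalfOpen => true,
        }
    }

    fn on_success(&mut self) {
        self.trips = 0;
        self.state = BreakerState::Closed { consecutive_failures: 0 };
    }

    fn on_failure(&mut self) {
        let failures = match self.state {
            BreakerState::Closed { consecutive_failures } => consecutive_failures + 1,
            _ => self.failure_threshold, // a failed probe re-opens immediately
        };
        if failures >= self.failure_threshold {
            // Exponential backoff: base * 2^trips.
            let backoff = self.base_backoff * 2u32.saturating_pow(self.trips);
            self.trips += 1;
            self.state = BreakerState::Open { until: Instant::now() + backoff };
        } else {
            self.state = BreakerState::Closed { consecutive_failures: failures };
        }
    }
}
```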
Agent Core
- ▸ReAct state machine -- Think -> Act -> Observe -> Persist cycle with idle/loop detection (sketched after this list)
- ▸Tool system -- trait-based plugin architecture with 10 tool categories
- ▸Policy engine -- 6 built-in rules (authority, command safety, financial, path protection, rate limit, validation)
- ▸4-layer prompt injection defense (regex + HMAC boundaries + output validation + behavioral anomaly detection)
- ▸Progressive context loading -- 4 complexity levels (L0 ~2K, L1 ~4K, L2 ~8K, L3 ~16K tokens)
- ▸Subagent framework -- spawn child agents with isolated tool registries and policy overrides
- ▸Human-in-the-loop approvals -- configurable approval gates for high-risk tool calls
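A minimal sketch of the Think -> Act -> Observe -> Persist cycle from the first bullet above. The phase and field names are illustrative stand-ins, the stubbed tool call is hypothetical, and the iteration cap is only a crude stand-in for the real idle/loop detection.

```rust
/// Illustrative ReAct phases; names mirror the feature bullet, not the real types.
enum Phase {
    Think,
    Act { tool: String, args: String },
    Observe { output: String },
    Persist,
    Done,
}

struct Turn {
    phase: Phase,
    iterations: u32,
    max_iterations: u32, // loop-detection stand-in: hard cap on cycles
}

impl Turn {
    fn step(&mut self) {
        self.phase = match std::mem::replace(&mut self.phase, Phase::Done) {
            Phase::Think => {
                // Ask the LLM for the next action; here a hypothetical stub.
                Phase::Act { tool: "shell".into(), args: "echo hello".into() }
            }
            Phase::Act { tool, args } => {
                // Dispatch to the tool registry and capture its output.
                let output = format!("ran {tool} with {args}");
                Phase::Observe { output }
            }
            Phase::Observe { output } => {
                // Feed the observation back and decide whether another cycle is needed.
                let _ = output;
                self.iterations += 1;
                if self.iterations >= self.max_iterations {
                    Phase::Persist // loop guard: stop and persist what we have
                } else {
                    Phase::Think
                }
            }
            Phase::Persist => {
                // Write the turn into the memory tiers, then finish.
                Phase::Done
            }
            Phase::Done => Phase::Done,
        };
    }
}
```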
Memory System
- ▸5-tier unified memory: Working, Episodic, Semantic, Procedural, Relationship
- ▸Full-text search via SQLite FTS5
- ▸Memory budget manager -- configurable per-tier token allocation with unused rollover
- ▸Background pruning via heartbeat task
Scheduling
- ▸Durable scheduler -- cron expressions, interval, one-time timestamps; all state in SQLite
- ▸Lease-based execution -- prevents double-execution across instances (sketched after this list)
- ▸Heartbeat daemon -- configurable tick interval, builds TickContext (balance, survival tier) per tick
- ▸7 built-in tasks: SurvivalCheck, UsdcMonitor, YieldTask, MemoryPrune, CacheEvict, MetricSnapshot, AgentCardRefresh
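A sketch of how lease-based claiming can be done atomically in SQLite, assuming the rusqlite and chrono crates; the table and column names are hypothetical, not the actual Ironclad schema.

```rust
use rusqlite::{params, Connection, Result};

/// Try to claim a due task with a short lease. Schema names here are illustrative.
fn try_claim_task(
    conn: &Connection,
    task_id: i64,
    instance_id: &str,
    lease_secs: i64,
) -> Result<bool> {
    let now = chrono::Utc::now().timestamp();
    // The UPDATE only succeeds if the task is due and no other instance holds a
    // live lease, so two schedulers cannot both claim the same task.
    let claimed = conn.execute(
        "UPDATE scheduled_tasks
            SET lease_owner = ?1, lease_expires_at = ?2
          WHERE id = ?3
            AND next_run_at <= ?4
            AND (lease_owner IS NULL OR lease_expires_at < ?4)",
        params![instance_id, now + lease_secs, task_id, now],
    )?;
    Ok(claimed == 1)
}
```

The single conditional UPDATE is what prevents double-execution: SQLite serializes writers, so only one instance's claim can match, and a crashed instance's lease simply expires.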
Financial
- ▸Ethereum wallet -- keypair generation/loading via alloy-rs, EIP-191 signing
- ▸x402 payment protocol -- EIP-3009 TransferWithAuthorization for automated LLM payments
- ▸Treasury policy -- per-payment, hourly, daily, and minimum reserve limits
- ▸Yield engine -- deposits idle USDC into Aave/Compound on Base, auto-withdraws below threshold
- ▸Survival tier system -- high/normal/low_compute/critical/dead states drive model downgrading
Channels
- ▸Telegram -- long-poll + webhook, Markdown V2 formatting, 4096-char chunking
- ▸WhatsApp -- Cloud API webhook, signature verification
- ▸Discord -- gateway WebSocket, slash commands, embed formatting
- ▸WebSocket -- direct browser/client connections with ping/pong keepalive
- ▸A2A (Agent-to-Agent) -- zero-trust protocol with ECDSA mutual auth, ECDH key exchange, AES-256-GCM encryption
- ▸Delivery queue -- persistent message delivery with retry logic
Plugin SDK
- ▸Plugin trait -- name(), version(), tools(), execute() (sketched after this list)
- ▸6 script languages: .gosh, .go, .sh, .py, .rb, .js
- ▸Sandboxed execution -- configurable timeout, output size cap, interpreter whitelist
- ▸Plugin manifest (plugin.toml) -- declarative tool registration with risk levels
- ▸Auto-discovery -- scans plugin directories, registers tools at boot
- ▸Hot-reload -- detects content hash changes and re-registers
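A sketch of the Plugin trait shape: the four method names come from the list above, while the signatures, the ToolSpec type, and the example implementation are assumptions (the real trait is likely async and more richly typed).

```rust
/// Illustrative tool descriptor, loosely matching what plugin.toml declares.
pub struct ToolSpec {
    pub name: String,
    pub description: String,
    pub risk_level: String, // e.g. "low" | "medium" | "high"
}

pub trait Plugin: Send + Sync {
    fn name(&self) -> &str;
    fn version(&self) -> &str;
    /// Tools this plugin contributes to the registry at boot / hot-reload.
    fn tools(&self) -> Vec<ToolSpec>;
    /// Execute one of the declared tools with serialized arguments.
    fn execute(&self, tool: &str, args: &str) -> Result<String, String>;
}

/// Hypothetical example implementation.
pub struct EchoPlugin;

impl Plugin for EchoPlugin {
    fn name(&self) -> &str { "echo" }
    fn version(&self) -> &str { "0.1.0" }
    fn tools(&self) -> Vec<ToolSpec> {
        vec![ToolSpec {
            name: "echo".into(),
            description: "Echo the provided text back".into(),
            risk_level: "low".into(),
        }]
    }
    fn execute(&self, _tool: &str, args: &str) -> Result<String, String> {
        Ok(args.to_string())
    }
}
```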
Browser Automation
- ▸Chrome DevTools Protocol via WebSocket
- ▸Action types: navigate, click, type, screenshot, evaluate, wait, scroll, extract
- ▸Session management -- start/stop headless Chrome instances
- ▸REST API integration -- /api/browser/* endpoints for remote control
Skill System
- ▸Structured skills (.toml) -- programmatic tool chains with parameter templates, script paths, and policy overrides
- ▸Instruction skills (.md) -- YAML frontmatter (triggers, priority) + markdown body injected into system prompt
- ▸Trigger matching -- keyword, tool name, and regex patterns
- ▸Safety scanning on import -- 50+ danger patterns across 5 categories
- ▸SHA-256 change detection, hot-reload support
Dashboard
- ▸SPA embedded in the binary (zero external dependencies)
- ▸9 pages: Overview, Sessions, Memory, Skills, Scheduler, Metrics, Wallet, Settings, Workspace
- ▸4 themes: AI Black & Purple, CRT Orange, CRT Green, Psychedelic Freakout
- ▸Live sparkline charts and stacked area charts for cost breakdown
- ▸Retro CRT aesthetic with scanline effects and monospace typography
RAG & Embeddings
Ironclad implements a multi-layer retrieval-augmented generation pipeline spread across three crates. Memories are ingested, indexed for both keyword and vector search, and retrieved into the context window at query time.
1. Five-Tier Memory System
All conversational data is routed into five specialized memory tiers, each backed by its own SQLite table.
ironclad-db/src/memory.rs
| Tier | Purpose | Key Fields |
|---|---|---|
| Working | Active session context (goals, recent summaries) | session-scoped, importance-ranked |
| Episodic | Significant events (tool use, financial ops) | classified, timestamped |
| Semantic | Factual knowledge (key-value with confidence) | upsert on (category, key) |
| Procedural | Tool success/failure tracking | success/failure counters |
| Relationship | Entity trust scores, interaction history | per-entity trust + count |
The MemoryBudgetManager in ironclad-agent/src/memory.rs allocates a configurable percentage of the total token budget to each tier (default: 30/25/20/15/10).
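A sketch of how a per-tier allocation with unused rollover can be computed from the default 30/25/20/15/10 split; the function shape and names are illustrative, not the real MemoryBudgetManager API.

```rust
/// Allocate a total token budget across tiers; a tier's unused tokens roll over
/// to the next tier in order. `demand` pairs tier names with how many tokens
/// that tier's retrieved memories actually need this turn.
fn allocate(total_tokens: usize, demand: &[(&str, usize)]) -> Vec<(String, usize)> {
    // Default split from the text: working/episodic/semantic/procedural/relationship.
    let shares = [0.30_f32, 0.25, 0.20, 0.15, 0.10];
    let mut carry = 0usize;
    demand
        .iter()
        .zip(shares)
        .map(|(&(tier, wanted), share)| {
            let budget = (total_tokens as f32 * share) as usize + carry;
            let granted = wanted.min(budget);
            carry = budget - granted; // unused allocation rolls forward
            (tier.to_string(), granted)
        })
        .collect()
}

// Example with an 8K budget (L2 complexity): a quiet episodic tier leaves headroom
// that the semantic tier can absorb.
// allocate(8_000, &[("working", 3_000), ("episodic", 500), ("semantic", 2_500),
//                   ("procedural", 400), ("relationship", 100)]);
```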
2. Full-Text Search
Working, episodic, and semantic tiers all feed into an FTS5 virtual table (memory_fts). The fts_search() function queries across all three tiers with a sanitized MATCH query, plus a LIKE fallback for procedural and relationship tables. This is the keyword-based leg of the retrieval pipeline.
ironclad-db/src/memory.rs
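A sketch of the keyword leg, assuming rusqlite: sanitize the user query, run a MATCH against memory_fts, and order by FTS5 rank. The content column name and the exact sanitization rules are assumptions.

```rust
use rusqlite::{params, Connection, Result};

/// Keyword search over the FTS5 virtual table described above.
fn fts_search(conn: &Connection, query: &str, limit: usize) -> Result<Vec<(String, f64)>> {
    // Strip FTS5 operators so user input cannot change the query semantics.
    let sanitized: String = query
        .chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .collect();

    let mut stmt = conn.prepare(
        "SELECT content, rank FROM memory_fts
          WHERE memory_fts MATCH ?1
          ORDER BY rank
          LIMIT ?2",
    )?;
    let rows = stmt.query_map(params![sanitized, limit as i64], |row| {
        Ok((row.get::<_, String>(0)?, row.get::<_, f64>(1)?))
    })?;
    rows.collect()
}
```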
3. Embedding Store & Vector Search
Embeddings are stored as JSON-serialized Vec<f32> in an embeddings table. The search_similar() function does a brute-force scan computing cosine similarity against every stored embedding, filtering by a min_similarity threshold and returning the top-k results.
ironclad-db/src/embeddings.rs
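A sketch of the brute-force vector leg: cosine similarity against every stored embedding, filtered by min_similarity and truncated to the top-k. The signatures are illustrative; the real search_similar reads its vectors from the embeddings table.

```rust
/// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Scan every stored embedding, keep those above min_similarity, return the top-k.
fn search_similar(
    stored: &[(i64, Vec<f32>)],
    query: &[f32],
    min_similarity: f32,
    top_k: usize,
) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = stored
        .iter()
        .map(|(id, emb)| (*id, cosine_similarity(query, emb)))
        .filter(|(_, score)| *score >= min_similarity)
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(top_k);
    scored
}
```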
4. Hybrid Search — The RAG Retrieval Path
hybrid_search() combines both legs:
- ▸FTS5 keyword match -- scores are positional (rank-decayed) and weighted by (1 - hybrid_weight)
- ▸Vector cosine similarity -- scores are weighted by hybrid_weight
Results from both are merged, re-sorted by combined score, and truncated to the limit. The hybrid_weight parameter (default 0.5, configurable in MemoryConfig) controls the balance.
ironclad-db/src/embeddings.rs
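A sketch of the score merge, with hypothetical input shapes: keyword hits contribute a rank-decayed score weighted by (1 - hybrid_weight), vector hits contribute cosine similarity weighted by hybrid_weight, and the union is re-sorted and truncated.

```rust
use std::collections::HashMap;

/// Merge the FTS and vector legs into one ranked list.
fn merge_hybrid(
    fts_hits: &[i64],           // memory ids in FTS rank order (best first)
    vector_hits: &[(i64, f32)], // (memory id, cosine similarity)
    hybrid_weight: f32,         // default 0.5 per MemoryConfig
    limit: usize,
) -> Vec<(i64, f32)> {
    let mut combined: HashMap<i64, f32> = HashMap::new();

    // Keyword leg: positional score decays with rank.
    for (rank, id) in fts_hits.iter().enumerate() {
        let positional = 1.0 / (1.0 + rank as f32);
        *combined.entry(*id).or_insert(0.0) += (1.0 - hybrid_weight) * positional;
    }

    // Vector leg: cosine similarity scaled by hybrid_weight.
    for (id, sim) in vector_hits {
        *combined.entry(*id).or_insert(0.0) += hybrid_weight * sim;
    }

    let mut merged: Vec<(i64, f32)> = combined.into_iter().collect();
    merged.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    merged.truncate(limit);
    merged
}
```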
5. Semantic Cache
The SemanticCache operates at the LLM request layer with three lookup levels:
- ▸L1 Exact hash -- SHA-256 of the prompt text, instant match
- ▸L2 Semantic similarity -- character n-gram embeddings + cosine similarity (threshold 0.85)
- ▸L3 Tool-aware TTL -- shorter TTL for tool-involving responses (1/4 of normal)
This avoids redundant LLM calls for semantically equivalent prompts.
ironclad-llm/src/cache.rs
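A sketch of the L1 exact-hash level combined with the tool-aware TTL rule, using the sha2 crate; the L2 n-gram similarity level is omitted, and the type names are assumptions rather than the real SemanticCache.

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct CachedResponse {
    body: String,
    stored_at: Instant,
    involved_tools: bool,
}

/// L1 of the cache: exact prompt match keyed by SHA-256.
struct ExactCache {
    entries: HashMap<Vec<u8>, CachedResponse>,
    ttl: Duration,
}

impl ExactCache {
    fn key(prompt: &str) -> Vec<u8> {
        Sha256::digest(prompt.as_bytes()).to_vec()
    }

    fn get(&self, prompt: &str) -> Option<&str> {
        let entry = self.entries.get(&Self::key(prompt))?;
        // Tool-involving responses expire at 1/4 of the normal TTL (the L3 rule).
        let ttl = if entry.involved_tools { self.ttl / 4 } else { self.ttl };
        (entry.stored_at.elapsed() < ttl).then(|| entry.body.as_str())
    }
}
```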
6. Context Assembly
The build_context() function packs the final prompt within a token budget determined by query complexity (L0=2k, L1=4k, L2=8k, L3=16k tokens). It fills the context window in priority order: system prompt, then retrieved memories (the RAG output), then conversation history (newest first, truncated when budget exhausts). When context exceeds 80% capacity, soft_trim evicts oldest non-system messages and build_compaction_prompt can generate a summary for insertion.
ironclad-agent/src/context.rs
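A sketch of budget-driven packing in that priority order, with a crude characters-per-token heuristic standing in for real token counting; the types and names are illustrative, not the actual build_context signature, and soft_trim/compaction are omitted.

```rust
struct Msg {
    role: &'static str,
    text: String,
}

/// Rough token estimate: ~4 characters per token.
fn approx_tokens(text: &str) -> usize {
    text.len() / 4 + 1
}

/// Pack system prompt, then retrieved memories, then history (newest first).
fn build_context(system: Msg, memories: Vec<Msg>, mut history: Vec<Msg>, budget: usize) -> Vec<Msg> {
    let mut used = approx_tokens(&system.text);
    let mut out = vec![system];

    // Retrieved memories (the RAG output) go in next, while the budget allows.
    for m in memories {
        let cost = approx_tokens(&m.text);
        if used + cost > budget {
            break;
        }
        used += cost;
        out.push(m);
    }

    // Conversation history, newest first, truncated when the budget runs out.
    history.reverse();
    let mut kept = Vec::new();
    for m in history {
        let cost = approx_tokens(&m.text);
        if used + cost > budget {
            break;
        }
        used += cost;
        kept.push(m);
    }
    kept.reverse(); // restore chronological order for the final prompt
    out.extend(kept);
    out
}
```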
7. Post-Turn Ingestion
After each turn, ingest_turn() classifies the exchange (tool use, financial, social, creative, reasoning) and routes content into the appropriate memory tiers automatically, so future RAG queries have fresh material to retrieve.
ironclad-agent/src/memory.rs
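A sketch of the classification step that drives the routing: the five categories come from the text, but the keyword heuristics and routing notes below are illustrative assumptions.

```rust
#[derive(Debug, PartialEq)]
enum TurnKind {
    ToolUse,
    Financial,
    Social,
    Creative,
    Reasoning,
}

/// Classify a finished exchange so it can be routed to the right memory tiers.
fn classify_turn(used_tools: bool, text: &str) -> TurnKind {
    let lower = text.to_lowercase();
    if used_tools {
        TurnKind::ToolUse
    } else if ["usdc", "payment", "wallet", "yield"].iter().any(|k| lower.contains(k)) {
        TurnKind::Financial
    } else if ["thanks", "hello", "how are you"].iter().any(|k| lower.contains(k)) {
        TurnKind::Social
    } else if ["poem", "story", "draft"].iter().any(|k| lower.contains(k)) {
        TurnKind::Creative
    } else {
        TurnKind::Reasoning
    }
}

// Routing (simplified): ToolUse updates the procedural tier's success/failure
// counters, Financial and ToolUse turns also land in episodic memory, and the
// rest feed working memory, with extracted facts upserted into semantic memory.
```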
Current Limitations
Embedding generation itself is still a placeholder: the system stores and searches vectors, but there is no active embedding model integration yet (embedding_provider and embedding_model in the config are Option<String> and default to None). The semantic cache uses a lightweight character n-gram embedding as a stopgap. A real deployment would need to wire up an embedding provider (local, like nomic-embed-text on Ollama, or remote, like OpenAI text-embedding-3-small) to generate real vectors for the store_embedding / hybrid_search pipeline.
The brute-force scan in search_similar is also fine for small-to-medium memory stores but would need an index (HNSW or similar) if the embedding count grew into the tens of thousands.
Full Comparison: Ironclad vs OpenClaw
| Dimension | OpenClaw | Ironclad |
|---|---|---|
| Architecture | 3 separate processes (Node.js, Python, TypeScript) | Single Rust binary |
| Languages | Node.js + Python + TypeScript + Go | Rust (one language, one toolchain) |
| Memory usage | ~500 MB (3 processes) | ~50 MB (1 process) |
| Proxy latency | ~50ms (Python aiohttp) | ~2ms (in-process, persistent pool) |
| Cold start | ~3s (Node.js) + ~2s (Python) | ~50ms |
| Binary size | ~200 MB (node_modules + pip) | ~15 MB static binary |
| Supply chain | 500+ npm + pip packages | ~50 auditable crates |
| Database | 5 storage layers (JSONL, PostgreSQL, SQLite, JSON, MD) | 1 unified SQLite (28 tables, WAL) |
| Model routing | Rule-based fallback only | Heuristic complexity routing + rule-based fallback |
| Semantic cache | None | 3-level (exact, embedding, tool TTL) |
| Injection defense | 8 regex checks | 4-layer defense (regex + HMAC + output + behavioral) |
| Agent-to-agent | No mutual auth | Zero-trust (ECDSA, ECDH, AES-256-GCM) |
| Financial | x402 top-up only; USDC sits idle | x402 + yield engine (4-8% APY) |
| Dashboard | Next.js (separate process, read-only) | Embedded SPA (read + write, 41 routes) |
| Plugin system | Markdown skills only | Dual-format skills + plugin SDK (6 languages) |