Features

Deep-dive into every subsystem of the Ironclad autonomous agent runtime.

LLM Client Pipeline

  • Model-agnostic proxy -- provider config fully externalized in TOML
  • Format translation -- typed Rust enums with From<T> for 4 API formats (12+ translation pairs)
  • Circuit breaker per provider (Closed/Open/HalfOpen, exponential backoff)
  • In-flight deduplication -- SHA-256 fingerprinting prevents duplicate concurrent requests
  • Tier-based prompt adaptation -- T1 (condensed), T2 (preamble + reorder), T3/T4 (passthrough + cache_control)
  • Heuristic model router -- complexity classification + rule-based fallback chain
  • 3-level semantic cache -- L1 exact hash, L2 embedding cosine similarity, L3 deterministic tool TTL
  • Persistent connection pool -- single reqwest::Client with HTTP/2 multiplexing per provider
  • x402 payment protocol -- automatic payment-gated inference (402 -> sign EIP-3009 -> retry)
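The per-provider circuit breaker above can be sketched as a small state machine. This is an illustrative model only -- the type and method names do not mirror the actual ironclad-llm internals, and the threshold/backoff values are made up for the example:

```rust
use std::time::{Duration, Instant};

// Illustrative circuit breaker: Closed -> Open (after repeated failures,
// with exponential backoff) -> HalfOpen (one probe) -> Closed on success.
#[derive(Debug, PartialEq)]
enum BreakerState {
    Closed,
    Open { until: Instant },
    HalfOpen,
}

struct CircuitBreaker {
    state: BreakerState,
    consecutive_failures: u32,
    failure_threshold: u32,
    base_backoff: Duration,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, base_backoff: Duration) -> Self {
        Self {
            state: BreakerState::Closed,
            consecutive_failures: 0,
            failure_threshold,
            base_backoff,
        }
    }

    /// May a request be sent to this provider right now?
    fn allow_request(&mut self) -> bool {
        match self.state {
            BreakerState::Closed | BreakerState::HalfOpen => true,
            BreakerState::Open { until } => {
                if Instant::now() >= until {
                    // Cool-down elapsed: let one probe request through.
                    self.state = BreakerState::HalfOpen;
                    true
                } else {
                    false
                }
            }
        }
    }

    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = BreakerState::Closed;
    }

    fn record_failure(&mut self) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            // Exponential backoff: base * 2^(extra failures), capped.
            let exp = (self.consecutive_failures - self.failure_threshold).min(6);
            let backoff = self.base_backoff * 2u32.pow(exp);
            self.state = BreakerState::Open { until: Instant::now() + backoff };
        }
    }
}
```

Keeping one breaker per provider lets a flapping upstream be isolated without pausing traffic to healthy providers.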

Agent Core

  • ReAct state machine -- Think -> Act -> Observe -> Persist cycle with idle/loop detection
  • Tool system -- trait-based plugin architecture with 10 tool categories
  • Policy engine -- 6 built-in rules (authority, command safety, financial, path protection, rate limit, validation)
  • 4-layer prompt injection defense (regex + HMAC boundaries + output validation + behavioral anomaly detection)
  • Progressive context loading -- 4 complexity levels (L0 ~2K, L1 ~4K, L2 ~8K, L3 ~16K tokens)
  • Subagent framework -- spawn child agents with isolated tool registries and policy overrides
  • Human-in-the-loop approvals -- configurable approval gates for high-risk tool calls
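The Think -> Act -> Observe -> Persist cycle and its loop detection can be sketched roughly as follows. These are hypothetical types, not the actual ironclad-agent definitions, and the loop heuristic here (same tool repeated N times) is a simplified stand-in:

```rust
// Illustrative ReAct step types; not the real ironclad-agent definitions.
#[derive(Debug, Clone, PartialEq)]
enum Step {
    Think(String),        // model reasoning
    Act { tool: String }, // tool invocation
    Observe(String),      // tool result fed back into the next Think
    Persist,              // write the turn into the memory tiers
}

/// Trivial loop detector: the last `window` tool calls were all the same
/// tool -- a simplified stand-in for the runtime's idle/loop detection.
fn is_looping(history: &[Step], window: usize) -> bool {
    let acts: Vec<&String> = history
        .iter()
        .filter_map(|s| match s {
            Step::Act { tool } => Some(tool),
            _ => None,
        })
        .collect();
    acts.len() >= window
        && acts[acts.len() - window..].windows(2).all(|w| w[0] == w[1])
}
```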

Memory System

  • 5-tier unified memory: Working, Episodic, Semantic, Procedural, Relationship
  • Full-text search via SQLite FTS5
  • Memory budget manager -- configurable per-tier token allocation with unused rollover
  • Background pruning via heartbeat task

Scheduling

  • Durable scheduler -- cron expressions, interval, one-time timestamps; all state in SQLite
  • Lease-based execution -- prevents double-execution across instances
  • Heartbeat daemon -- configurable tick interval, builds TickContext (balance, survival tier) per tick
  • 7 built-in tasks: SurvivalCheck, UsdcMonitor, YieldTask, MemoryPrune, CacheEvict, MetricSnapshot, AgentCardRefresh
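The lease mechanism that prevents double-execution can be modeled in a few lines. The real scheduler persists leases in SQLite so they survive restarts and span instances; this in-memory sketch (with invented names) only shows the claim semantics:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// In-memory model of lease acquisition. The real implementation keeps
// this state in SQLite so leases hold across processes and restarts.
struct LeaseTable {
    leases: HashMap<String, (String, Instant)>, // task -> (holder, expires_at)
}

impl LeaseTable {
    fn new() -> Self {
        Self { leases: HashMap::new() }
    }

    /// Claim `task` for `holder`. Fails if another instance holds an
    /// unexpired lease -- this is what prevents double-execution.
    fn try_acquire(&mut self, task: &str, holder: &str, ttl: Duration) -> bool {
        let now = Instant::now();
        match self.leases.get(task) {
            Some((owner, expires)) if *expires > now && owner != holder => false,
            _ => {
                self.leases
                    .insert(task.to_string(), (holder.to_string(), now + ttl));
                true
            }
        }
    }
}
```

A holder can renew its own lease before expiry; an expired lease is claimable by anyone, so a crashed instance does not block the task forever.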

Financial

  • Ethereum wallet -- keypair generation/loading via alloy-rs, EIP-191 signing
  • x402 payment protocol -- EIP-3009 TransferWithAuthorization for automated LLM payments
  • Treasury policy -- per-payment, hourly, daily, and minimum reserve limits
  • Yield engine -- deposits idle USDC into Aave/Compound on Base, auto-withdraws below threshold
  • Survival tier system -- high/normal/low_compute/critical/dead states drive model downgrading
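The survival-tier mapping might look something like the sketch below. The tier names come from the docs, but the USDC balance thresholds here are invented for illustration -- the real cutoffs live in configuration:

```rust
// Survival tiers as documented; the balance thresholds are hypothetical.
#[derive(Debug, PartialEq)]
enum SurvivalTier {
    High,
    Normal,
    LowCompute,
    Critical,
    Dead,
}

/// Map a treasury balance to a survival tier. Lower tiers drive model
/// downgrading (cheaper models, smaller context) in the router.
fn tier_for_balance(usdc: f64) -> SurvivalTier {
    match usdc {
        b if b >= 100.0 => SurvivalTier::High,
        b if b >= 20.0 => SurvivalTier::Normal,
        b if b >= 5.0 => SurvivalTier::LowCompute,
        b if b > 0.0 => SurvivalTier::Critical,
        _ => SurvivalTier::Dead,
    }
}
```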

Channels

  • Telegram -- long-poll + webhook, Markdown V2 formatting, 4096-char chunking
  • WhatsApp -- Cloud API webhook, signature verification
  • Discord -- gateway WebSocket, slash commands, embed formatting
  • WebSocket -- direct browser/client connections with ping/pong keepalive
  • A2A (Agent-to-Agent) -- zero-trust protocol with ECDSA mutual auth, ECDH key exchange, AES-256-GCM encryption
  • Delivery queue -- persistent message delivery with retry logic

Plugin SDK

  • Plugin trait -- name(), version(), tools(), execute()
  • 6 script languages: .gosh, .go, .sh, .py, .rb, .js
  • Sandboxed execution -- configurable timeout, output size cap, interpreter whitelist
  • Plugin manifest (plugin.toml) -- declarative tool registration with risk levels
  • Auto-discovery -- scans plugin directories, registers tools at boot
  • Hot-reload -- detects content hash changes and re-registers
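A minimal implementation of the plugin contract listed above might look like this. The method names match the docs, but the signatures are simplified -- the real SDK presumably uses richer argument and result types than plain strings:

```rust
// Simplified sketch of the Plugin trait; real signatures likely differ.
trait Plugin {
    fn name(&self) -> &str;
    fn version(&self) -> &str;
    fn tools(&self) -> Vec<String>;
    fn execute(&self, tool: &str, args: &str) -> Result<String, String>;
}

// A toy plugin exposing a single tool that echoes its arguments back.
struct EchoPlugin;

impl Plugin for EchoPlugin {
    fn name(&self) -> &str { "echo" }
    fn version(&self) -> &str { "0.1.0" }
    fn tools(&self) -> Vec<String> { vec!["echo.say".to_string()] }
    fn execute(&self, tool: &str, args: &str) -> Result<String, String> {
        match tool {
            "echo.say" => Ok(args.to_string()),
            other => Err(format!("unknown tool: {other}")),
        }
    }
}
```

At boot, auto-discovery would call `tools()` on each discovered plugin and register the returned names with the agent's tool registry.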

Browser Automation

  • Chrome DevTools Protocol via WebSocket
  • Action types: navigate, click, type, screenshot, evaluate, wait, scroll, extract
  • Session management -- start/stop headless Chrome instances
  • REST API integration -- /api/browser/* endpoints for remote control

Skill System

  • Structured skills (.toml) -- programmatic tool chains with parameter templates, script paths, and policy overrides
  • Instruction skills (.md) -- YAML frontmatter (triggers, priority) + markdown body injected into system prompt
  • Trigger matching -- keyword, tool name, and regex patterns
  • Safety scanning on import -- 50+ danger patterns across 5 categories
  • SHA-256 change detection, hot-reload support

Dashboard

  • SPA embedded in the binary (zero external dependencies)
  • 9 pages: Overview, Sessions, Memory, Skills, Scheduler, Metrics, Wallet, Settings, Workspace
  • 4 themes: AI Black & Purple, CRT Orange, CRT Green, Psychedelic Freakout
  • Live sparkline charts and stacked area charts for cost breakdown
  • Retro CRT aesthetic with scanline effects and monospace typography

RAG & Embeddings

Ironclad implements a multi-layer retrieval-augmented generation pipeline spread across three crates. Memories are ingested, indexed for both keyword and vector search, and retrieved into the context window at query time.

1. Five-Tier Memory System

All conversational data is routed into five specialized memory tiers, each backed by its own SQLite table.

ironclad-db/src/memory.rs

  Tier         | Purpose                                          | Key Fields
  Working      | Active session context (goals, recent summaries) | session-scoped, importance-ranked
  Episodic     | Significant events (tool use, financial ops)     | classified, timestamped
  Semantic     | Factual knowledge (key-value with confidence)    | upsert on (category, key)
  Procedural   | Tool success/failure tracking                    | success/failure counters
  Relationship | Entity trust scores, interaction history         | per-entity trust + count

The MemoryBudgetManager in ironclad-agent/src/memory.rs allocates a configurable percentage of the total token budget to each tier (default: 30/25/20/15/10).
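The allocation-with-rollover idea can be sketched as follows. This is not the MemoryBudgetManager's actual code, just a minimal model of the documented behavior (percentage split, with a tier's unused tokens rolling over to the next tier in order):

```rust
/// Allocate `total` tokens across tiers by percentage; any tokens a tier
/// leaves unused (per `needed`) roll over to the next tier in order.
fn allocate_with_rollover(total: usize, pct: &[usize], needed: &[usize]) -> Vec<usize> {
    let mut granted = Vec::with_capacity(pct.len());
    let mut carry = 0usize;
    for (p, n) in pct.iter().zip(needed) {
        let base = total * p / 100 + carry; // this tier's share plus rollover
        let take = base.min(*n);            // never grant more than needed
        carry = base - take;                // pass the remainder along
        granted.push(take);
    }
    granted
}
```

With a 1000-token budget and the default 30/25/20/15/10 split, a Working tier that needs only 100 tokens frees 200 tokens for the Episodic tier, and so on down the chain.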

2. Full-Text Search

Working, episodic, and semantic tiers all feed into an FTS5 virtual table (memory_fts). The fts_search() function queries across all three tiers with a sanitized MATCH query, plus a LIKE fallback for procedural and relationship tables. This is the keyword-based leg of the retrieval pipeline.

ironclad-db/src/memory.rs

3. Embedding Store & Vector Search

Embeddings are stored as JSON-serialized Vec<f32> in an embeddings table. The search_similar() function does a brute-force scan computing cosine similarity against every stored embedding, filtering by a min_similarity threshold and returning the top-k results.

ironclad-db/src/embeddings.rs
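The scoring logic of that brute-force scan can be shown in pure Rust. The real function reads JSON-serialized `Vec<f32>` rows out of SQLite; this sketch skips the storage layer and just mirrors the score -> filter -> top-k shape:

```rust
/// Cosine similarity between two vectors; 0.0 if either is all-zero.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force scan in the shape of search_similar(): score every stored
/// vector, drop those below min_sim, return the best k.
fn search_similar(
    query: &[f32],
    store: &[(u64, Vec<f32>)],
    min_sim: f32,
    k: usize,
) -> Vec<(u64, f32)> {
    let mut hits: Vec<(u64, f32)> = store
        .iter()
        .map(|(id, v)| (*id, cosine(query, v)))
        .filter(|(_, s)| *s >= min_sim)
        .collect();
    hits.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    hits.truncate(k);
    hits
}
```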

4. Hybrid Search — The RAG Retrieval Path

hybrid_search() combines both legs:

  • FTS5 keyword match — scores are positional (rank-decayed) and weighted by (1 - hybrid_weight)
  • Vector cosine similarity — scores are weighted by hybrid_weight

Results from both are merged, re-sorted by combined score, and truncated to the limit. The hybrid_weight parameter (default 0.5, configurable in MemoryConfig) controls the balance.

ironclad-db/src/embeddings.rs
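The merge step of the two legs can be sketched like this, assuming both legs yield (id, score) pairs already normalized to comparable ranges -- an assumption for this example, not a statement about the real code:

```rust
use std::collections::HashMap;

/// Merge keyword and vector hits as described for hybrid_search():
/// FTS scores weighted by (1 - w), cosine scores by w, summed per id,
/// then re-sorted and truncated to the limit.
fn hybrid_merge(
    fts: &[(u64, f32)],
    vector: &[(u64, f32)],
    w: f32,
    limit: usize,
) -> Vec<(u64, f32)> {
    let mut combined: HashMap<u64, f32> = HashMap::new();
    for (id, s) in fts {
        *combined.entry(*id).or_insert(0.0) += s * (1.0 - w);
    }
    for (id, s) in vector {
        *combined.entry(*id).or_insert(0.0) += s * w;
    }
    let mut out: Vec<(u64, f32)> = combined.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out.truncate(limit);
    out
}
```

With w = 0.5 a memory that matches on both legs naturally outranks one that matches on only one, which is the point of the hybrid approach.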

5. Semantic Cache

The SemanticCache operates at the LLM request layer with three lookup levels:

  • L1 exact hash -- SHA-256 of the prompt text, instant match
  • L2 semantic similarity -- character n-gram embeddings + cosine similarity (threshold 0.85)
  • L3 tool-aware TTL -- shorter TTL for tool-involving responses (1/4 of normal)

This avoids redundant LLM calls for semantically equivalent prompts.

ironclad-llm/src/cache.rs
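The stopgap character n-gram embedding used by L2 can be approximated as a hashed trigram vector. The bucket count, n-gram width, and hash function here are guesses for illustration, not what ironclad-llm actually uses:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashed character-trigram embedding: bucket each trigram into a
/// fixed-size vector, then L2-normalize so cosine similarity works.
/// (Parameters are illustrative, not the real cache's.)
fn ngram_embed(text: &str, dims: usize) -> Vec<f32> {
    let mut v = vec![0.0f32; dims];
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    for w in chars.windows(3) {
        let mut h = DefaultHasher::new();
        w.hash(&mut h);
        v[(h.finish() as usize) % dims] += 1.0;
    }
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}
```

Prompts sharing most of their trigrams land close in this space, which is enough for the 0.85-similarity cache check without calling a real embedding model.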

6. Context Assembly

The build_context() function packs the final prompt within a token budget determined by query complexity (L0=2k, L1=4k, L2=8k, L3=16k tokens). It fills the context window in priority order: system prompt, then retrieved memories (the RAG output), then conversation history (newest first, truncated when budget exhausts). When context exceeds 80% capacity, soft_trim evicts oldest non-system messages and build_compaction_prompt can generate a summary for insertion.

ironclad-agent/src/context.rs
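The newest-first history packing can be sketched as a simple budget walk. This is a model of the documented behavior, not build_context() itself, and it assumes per-message token counts are precomputed:

```rust
/// Pack messages into a token budget, newest first, as described above:
/// walk newest -> oldest, keep whatever still fits, then restore
/// chronological order for the prompt.
fn pack_history(history: &[(String, usize)], budget: usize) -> Vec<String> {
    let mut remaining = budget;
    let mut kept = Vec::new();
    for (msg, tokens) in history.iter().rev() {
        if *tokens > remaining {
            break; // budget exhausted: older messages are dropped
        }
        remaining -= tokens;
        kept.push(msg.clone());
    }
    kept.reverse();
    kept
}
```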

7. Post-Turn Ingestion

After each turn, ingest_turn() classifies the exchange (tool use, financial, social, creative, reasoning) and routes content into the appropriate memory tiers automatically, so future RAG queries have fresh material to retrieve.

ironclad-agent/src/memory.rs

Current Limitations

Embedding generation itself is currently a placeholder -- the system stores and searches vectors, but there is no active embedding model integration yet (embedding_provider and embedding_model in the config are Option<String> and default to None). The semantic cache uses a lightweight character n-gram embedding as a stopgap. A real deployment would need to wire up an embedding provider (local, like nomic-embed-text on Ollama, or remote, like OpenAI text-embedding-3-small) to generate real vectors for the store_embedding / hybrid_search pipeline.

The brute-force scan in search_similar is also fine for small-to-medium memory stores but would need an index (HNSW or similar) if the embedding count grew into the tens of thousands.

Full Comparison: Ironclad vs OpenClaw

  Dimension         | OpenClaw                                               | Ironclad
  Architecture      | 3 separate processes (Node.js, Python, TypeScript)     | Single Rust binary
  Languages         | Node.js + Python + TypeScript + Go                     | Rust (one language, one toolchain)
  Memory usage      | ~500 MB (3 processes)                                  | ~50 MB (1 process)
  Proxy latency     | ~50ms (Python aiohttp)                                 | ~2ms (in-process, persistent pool)
  Cold start        | ~3s (Node.js) + ~2s (Python)                           | ~50ms
  Binary size       | ~200 MB (node_modules + pip)                           | ~15 MB static binary
  Supply chain      | 500+ npm + pip packages                                | ~50 auditable crates
  Database          | 5 storage layers (JSONL, PostgreSQL, SQLite, JSON, MD) | 1 unified SQLite (28 tables, WAL)
  Model routing     | Rule-based fallback only                               | Heuristic complexity routing + rule-based fallback
  Semantic cache    | None                                                   | 3-level (exact, embedding, tool TTL)
  Injection defense | 8 regex checks                                         | 4-layer defense (regex + HMAC + output + behavioral)
  Agent-to-agent    | No mutual auth                                         | Zero-trust (ECDSA, ECDH, AES-256-GCM)
  Financial         | x402 topup only; USDC idle                             | x402 + yield engine (4-8% APY)
  Dashboard         | Next.js (separate process, read-only)                  | Embedded SPA (read + write, 41 routes)
  Plugin system     | Markdown skills only                                   | Dual-format skills + plugin SDK (6 languages)