Changelog
Release history for the Roboticus autonomous agent runtime. Follows conventional commits.
v0.11.0
New Features & More
2026-03-25
Added: 5 changes. Changed: 4 changes. Fixed: 4 changes.
Highlights
- ▸Agent efficacy push: Memory introspection, selective forgetting, relationship-memory automation, and task operating state are now first-class parts of the shared pipeline.
- ▸Introspection-driven execution: Task-oriented turns now begin from introspected runtime truth, can compose specialists from a clean slate, and preserve executed state across normalization retries.
- ▸MCP release-grade management: Shared `/api/mcp/servers` management surface aligned across dashboard and CLI.
- ▸Skill and subagent utilization telemetry: Usage count and last-used signals exposed for skills and subagents.
- ▸Release vetting automation: Added `scripts/run-v0110-vetting.sh` and `docs/testing/v0110-vetting-matrix.md` to lock in the v0.11.0 regression contract.
- ▸Delegation and composition flow: Empty-roster specialist requests now progress into composition and delegation instead of collapsing into narrated intent or centralization.
- ▸Task-path truthfulness: Filesystem/runtime blockers now surface as real blockers instead of canned fallback prose.
- ▸Prompt and planner behavior: Introspection is now treated as the first operational step for task work and feeds a shared task operating state/action planner.
v0.10.0
Correctness, Safety & Operational Maturity (14 changes)
2026-03-23
Added: 8 changes. Fixed: 4 changes. Changed: 2 changes.
Key changes: Model categorization with 29-model capability profiles, skill authoring API, Landlock/Job Object script confinement, typestate session lifecycle, delegation scoring engine, and critical signal adapter fixes.
Highlights
- ▸Model categorization (Phase 1): 10 task categories, 29 model profiles across 8 providers, category-aware routing with `category_fit` metascore dimension.
- ▸Skill authoring API: Create, validate, and publish Markdown instruction skills via `POST /api/skills/author`.
- ▸Landlock & Job Object confinement: Linux filesystem sandboxing and Windows process isolation for script execution.
- ▸Typestate sessions: Compile-time session lifecycle — `Session<Created>` → `Session<Active>` → `Session<Closed>`.
- ▸Delegation scoring engine: `score_agent_fit()`, `composite_fit_ratio()`, and `utility_margin_for_delegation()` for decomposition decisions.
- ▸Signal adapter critical fixes: Replaced `std::sync::Mutex` in async context, added rate limiting, bounded buffer growth.
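The typestate session lifecycle listed above can be illustrated with a minimal sketch. Only the `Session<Created>`, `Session<Active>`, and `Session<Closed>` names come from the changelog; the fields and methods (`activate`, `record_turn`, `close`, `summary`) are hypothetical and chosen for illustration, not taken from the Roboticus source.

```rust
use std::marker::PhantomData;

// Zero-sized marker types, one per lifecycle state.
pub struct Created;
pub struct Active;
pub struct Closed;

// Hypothetical session shape; the real struct likely carries more data.
pub struct Session<State> {
    id: String,
    turns: u32,
    _state: PhantomData<State>,
}

impl Session<Created> {
    pub fn new(id: &str) -> Self {
        Session { id: id.to_string(), turns: 0, _state: PhantomData }
    }

    // Consumes the Created value, so it cannot be used again afterwards.
    pub fn activate(self) -> Session<Active> {
        Session { id: self.id, turns: self.turns, _state: PhantomData }
    }
}

impl Session<Active> {
    // Only an active session can record turns.
    pub fn record_turn(&mut self) {
        self.turns += 1;
    }

    pub fn close(self) -> Session<Closed> {
        Session { id: self.id, turns: self.turns, _state: PhantomData }
    }
}

impl Session<Closed> {
    pub fn summary(&self) -> String {
        format!("{}: {} turns", self.id, self.turns)
    }
}
```

Because each transition takes `self` by value, calling `record_turn` on a closed session or `activate` twice fails at compile time rather than at runtime.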
v0.9.9
Terminal UX & Release Hardening (8 changes)
2026-03-18
Added: 6 changes. Changed: 1 change. Fixed: 1 change.
Key changes: New `roboticus tui` terminal application, configurable context budget tiers, integrations management endpoints/CLI, tool output noise filtering, and dashboard configuration/routing UX improvements.
Highlights
- ▸Terminal UI: Added `roboticus tui` (`roboticus-tui` crate) with chat, logs, status bar, streaming responses, and session resume.
- ▸Context budget tuning: Added configurable L0-L3 token budgets and per-channel minimum complexity level controls.
- ▸Integrations management: Added `POST /api/channels/{platform}/test`, dashboard per-channel probes, and `roboticus integrations` CLI commands.
- ▸Tool output filter chain: Added ANSI/progress/duplicate/whitespace filtering before LLM observation to reduce token noise.
- ▸Dashboard and routing polish: Exposed unconfigured sections with enable actions and improved routing profile validation/toasts/defaults.
- ▸Default bind address: Changed defaults from `127.0.0.1` to `localhost` for safer local consistency.
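As a rough illustration of the tool output filter chain listed above, the sketch below strips ANSI CSI sequences, keeps only the final state of carriage-return progress lines, drops whitespace-only lines, and collapses consecutive duplicates. The function names and the exact filtering rules are assumptions; the changelog names only the four filter categories.

```rust
/// Removes ANSI CSI escape sequences such as "\x1b[31m".
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes up to and including the final byte (0x40..=0x7e).
            for t in chars.by_ref() {
                if ('\u{40}'..='\u{7e}').contains(&t) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Keeps only the text after the last carriage return (progress-bar redraws).
fn collapse_progress(line: &str) -> &str {
    line.rsplit('\r').next().unwrap_or(line)
}

/// Hypothetical entry point applying all four filters in order.
pub fn filter_tool_output(raw: &str) -> String {
    let cleaned = strip_ansi(raw);
    let mut kept: Vec<&str> = Vec::new();
    for line in cleaned.lines() {
        let line = collapse_progress(line).trim_end();
        if line.is_empty() {
            continue; // whitespace-only line
        }
        if kept.last() == Some(&line) {
            continue; // consecutive duplicate
        }
        kept.push(line);
    }
    kept.join("\n")
}
```

Running the filters before the model observes the output means token savings apply to every tool, without each tool needing its own cleanup logic.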
v0.9.8
Platform Refactor & Reliability (22 changes)
2026-03-16
Added: 7 changes. Fixed: 10 changes. Changed: 5 changes.
Key changes: server crate split (`roboticus-cli`/`roboticus-api`/slim server), unified error model, channel adapter helper extraction, model categorization/router spec, and broad hardening around SQL safety, session integrity, config generation, and silent error triage.
Highlights
- ▸Server crate split: Decomposed the large server crate into `roboticus-cli`, `roboticus-api`, and a slim `roboticus` runtime bootstrap.
- ▸Error type unification: Consolidated into a nested `RoboticusError` hierarchy with clean `From` conversions.
- ▸Channel adapter helper extraction: Added shared formatter/chunking/allowlist helpers across channel adapters.
- ▸Model categorization spec: Added 10-category task taxonomy with routing/orchestrator integration points.
- ▸SQL safety + session integrity: Hardened `drop_column` identifier handling and fixed session `find_or_create` error swallowing.
- ▸Config and platform reliability: Fixed Windows TOML path escaping and converted silent failures into explicit logging/warnings.
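The nested error hierarchy with `From` conversions mentioned above might look roughly like the sketch below. Only the `RoboticusError` name comes from the changelog; `ConfigError`, its `BadPort` variant, and the `parse_port`/`load_port` helpers are hypothetical stand-ins for whatever subsystems the real hierarchy covers.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical subsystem-level error.
#[derive(Debug)]
pub enum ConfigError {
    BadPort(ParseIntError),
}

// Top-level error that nests subsystem errors as variants.
#[derive(Debug)]
pub enum RoboticusError {
    Config(ConfigError),
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadPort(e)
    }
}

impl From<ConfigError> for RoboticusError {
    fn from(e: ConfigError) -> Self {
        RoboticusError::Config(e)
    }
}

impl fmt::Display for RoboticusError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RoboticusError::Config(ConfigError::BadPort(e)) => {
                write!(f, "config: bad port: {e}")
            }
        }
    }
}

// Subsystem code returns its own error type...
fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    Ok(raw.trim().parse::<u16>()?)
}

// ...and `?` lifts it into the top-level type through the `From` impl.
pub fn load_port(raw: &str) -> Result<u16, RoboticusError> {
    Ok(parse_port(raw)?)
}
```

With this layout, callers at the edge match on one type while each crate keeps its own precise variants.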
v0.9.7
Bug Fixes & Stability (26 changes)
2026-03-14
Added: 8 changes. Fixed: 18 changes.
Key changes: Circuit breaker window reset: `record_failure()` now tracks `window_start` for rolling-window accumulation, so failures spaced ~60s apart accumulate correctly instead of resetting. Embedding auth for local providers: `EmbeddingConfig.is_local` skips API key resolution and auth headers for Ollama/llama.cpp.
Highlights
- ▸DB fitness hardening (DF-1–DF-18): 18-item SQLite performance audit resolved — retention pruning for 5 high-growth tables, orphan cleanup sweeps (working memory + embeddings), `auto_vacuum=INCREMENTAL`, 6 missing indexes, episodic dead-entry pruning, cache NULL-expiry fix, `PRAGMA synchronous=NORMAL` under WAL, CHECK constraints on 11 columns, and dead `proxy_stats` table removal.
- ▸Memory hygiene mechanic: `roboticus mechanic` detects and (with `--repair`) purges contaminated memory entries using 7 deterministic LIKE-prefix patterns across 3 tiers, with JSON-structured findings.
- ▸Sandbox boundary management: Filesystem confinement for skill scripts (skills_dir + `$ROBOTICUS_WORKSPACE`, no traversal/symlink escape), configurable network isolation (`unshare(CLONE_NEWNET)` on Linux), memory ceiling via `RLIMIT_AS`, interpreter allowlist via absolute-path resolution, and mechanic sandbox health reporting.
- ▸Filesystem security overhaul: `FilesystemSecurityConfig` with `workspace_only` mode, ~25 default protected path patterns, `tool_allowed_paths` whitelist (auto-populated from Obsidian vault path), macOS `sandbox-exec` write-denial confinement, and dashboard UI toggles.
- ▸Unified pipeline architecture: `IntentRegistry` (22-variant `Intent` enum), `GuardChain` (12 guards with `full()`/`cached()`/`streaming()` presets), `ShortcutDispatcher` (15 handlers replacing 983-line god function), `PipelineConfig` (4 presets: `api`/`streaming`/`channel`/`cron`), and `DedupGuard` RAII replacing 11 manual release patterns. Net ~653 lines removed.
- ▸ChannelFormatter trait: Per-platform output formatting with static dispatch registry — `TelegramFormatter` (Markdown→MarkdownV2), `DiscordFormatter`, `WhatsAppFormatter`, `SignalFormatter`, `WebFormatter`, `EmailFormatter` — wired into `channel_message.rs` delivery path. 31 unit tests.
- ▸Configurable inference timeouts: Per-provider `timeout_seconds` setting (`[providers.*.timeout_seconds]`) with 300-second default, surfaced in dashboard provider configuration.
- ▸Dashboard session ID copy button: One-click copy-to-clipboard for session IDs in the Sessions panel.
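The `DedupGuard` RAII pattern mentioned in the pipeline item above can be sketched as follows. The `InFlight` registry, its `try_acquire` method, and the string keys are assumptions made for illustration; only the `DedupGuard` name and the idea of replacing manual release calls come from the changelog.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical registry of keys currently being processed.
#[derive(Clone, Default)]
pub struct InFlight(Arc<Mutex<HashSet<String>>>);

/// Holding this value marks a key as in flight; dropping it releases the key.
pub struct DedupGuard {
    key: String,
    registry: InFlight,
}

impl InFlight {
    /// Returns a guard if the key was free, or None if it is already in flight.
    pub fn try_acquire(&self, key: &str) -> Option<DedupGuard> {
        let mut set = self.0.lock().unwrap_or_else(|e| e.into_inner());
        if set.insert(key.to_string()) {
            Some(DedupGuard { key: key.to_string(), registry: self.clone() })
        } else {
            None
        }
    }
}

impl Drop for DedupGuard {
    // Runs on every exit path, including early returns and panics,
    // so no manual release call is needed.
    fn drop(&mut self) {
        let mut set = self.registry.0.lock().unwrap_or_else(|e| e.into_inner());
        set.remove(&self.key);
    }
}
```

The benefit over manual release is that forgetting to release on an error path becomes impossible: the compiler inserts the drop for every path out of scope.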
v0.9.6
New Features (14 changes)
2026-03-12
Added: 14 changes.
Highlights
- ▸Compliance-first self-funding control plane: Complete revenue opportunity lifecycle (intake → qualify → score → plan → fulfill → settle) with DB-backed restart safety, strategy-level scoring (confidence/effort/risk/priority/recommendation), feedback persistence per opportunity and summary by strategy, configurable post-settlement asset routing (default `PALM_USD`), EVM swap submission with tx-hash tracking and on-chain receipt reconciliation, tax payout lifecycle mirroring swap tasks, and operator-visible accounting (net profit, attributable costs, retained earnings, tax allocation) across API, CLI, and mechanic surfaces.
- ▸Revenue mechanic integration: Mechanic can probe, reconcile, and repair orphaned or stale revenue jobs and swap/tax reconciliation mismatches via `run_gateway_provider_and_revenue_checks` and `run_gateway_integrated_repair_sweep`.
- ▸Skills catalog: `PluginCatalog` with CLI flows (`roboticus skills catalog list/install/activate`) and API endpoints (`GET/POST /api/skills/catalog`, `/install`, `/activate`). Registry manifest fetch from remote URL.
- ▸Skill registry protocol: Migration 022 adds `version`, `author`, `registry_source` columns to skills table. Multi-registry support via `RegistrySource { name, url, priority, enabled }` with backward-compatible fallback from legacy single-URL `registry_url`.
- ▸Multi-registry fetch: Registry sync iterates all configured sources, namespaces skills as `{registry_name}/{skill_name}` for non-local sources, applies semver comparison to skip redundant downloads, and resolves conflicts by registry priority.
- ▸Learning loop closure: Agent now detects repeating multi-step tool sequences on session close and synthesizes reusable SKILL.md procedure files. `learned_skills` table (migration 021) tracks reinforcement history (success/failure counts, priority). `LearningConfig` exposes tuneable thresholds for minimum sequence length, success ratio, priority boost/decay, and skill cap. Inspired by recent work on autonomous tool-use learning in LLM agents ([arXiv:2603.05344](https://arxiv.org/abs/2603.05344)).
- ▸Procedural failure recording: `record_procedural_failure()` (previously dead code in the DB layer) is now called from `ingest_turn()` when tool results indicate failure, closing the procedural memory feedback loop.
- ▸Skill priority adjustment: Governor `tick()` now runs `adjust_learned_skill_priorities()` after episodic decay — learned skills with high success ratios get priority boosts; those with poor ratios get decayed.
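The multi-registry fetch rules above (namespacing and semver-based download skipping) can be sketched in a few lines. This is a simplified illustration: the helper names are hypothetical, the literal registry name `local` is an assumption for what counts as a local source, and only plain `major.minor.patch` versions are handled (no pre-release or build metadata).

```rust
/// Parses "1.2.3" (optionally prefixed with 'v') into a comparable tuple.
fn parse_semver(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.trim_start_matches('v').split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// True when the remote version is strictly newer than the installed one.
pub fn needs_download(installed: Option<&str>, remote: &str) -> bool {
    match (installed.and_then(parse_semver), parse_semver(remote)) {
        (Some(have), Some(offered)) => offered > have,
        (None, Some(_)) => true, // nothing usable installed yet
        _ => false,              // unparseable remote version: skip it
    }
}

/// Non-local skills are namespaced as `{registry_name}/{skill_name}`.
pub fn namespaced(registry: &str, skill: &str) -> String {
    if registry == "local" {
        skill.to_string()
    } else {
        format!("{registry}/{skill}")
    }
}
```

Comparing parsed tuples rather than raw strings matters because `"1.10.0"` sorts before `"1.2.9"` lexicographically even though it is the newer release.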
v0.9.5
Improvements & More
2026-03-06
Changed: 8 changes. Fixed: 5 changes. Note: 1 change.
Key changes: Internal protocol fallback leakage: response sanitization no longer surfaces protocol-placeholder fallback text; empty/degraded sanitized content now resolves through deterministic user-facing quality fallback. Markdown count execution reliability: execution shortcut path now handles recursive markdown-file count prompts deterministically, including strict numeric-only responses when requested (`count only` / `only the number` style prompts).
Highlights
- ▸Terminology normalization: `soul_text` → `os_text`, `soul_history` → `os_personality_history` (migration 020) for firmware/OS terminology coherency.
- ▸Behavior soak hardening: `scripts/run-agent-behavior-soak.py` now includes regression checks for filesystem capability truthfulness, subagent capability response quality, and affirmative continuation quality, with rubric updates to score substantive outcomes over brittle phrase matching.
- ▸Roadmap/release traceability: `docs/releases/v0.9.5.md` and `docs/ROADMAP.md` updated with current v0.9.5 prep status for speculative execution, browser runtime support, CLI skill roadmap slice, and behavior continuity validation.
- ▸Architecture documentation: Added explicit v0.9.5-prep control/dataflow coverage for deterministic execution shortcuts and guarded response sanitization in `docs/architecture/roboticus-dataflow.md` and `docs/architecture/roboticus-sequences.md`.
- ▸Browser runtime continuity: Browser action execution now attempts a single stop/start session recovery when CDP disconnect/closed-socket errors are detected, limited to idempotent actions to avoid duplicate side effects on replay.
- ▸Autonomy turn-budget controls: Added configurable agent-level ReAct budget controls (`autonomy_max_react_turns`, `autonomy_max_turn_duration_seconds`) and wired enforcement into the runtime loop.
- ▸CLI adapter response contract: `run_script` now emits stable typed metadata (`adapter`, `schema_version`, `status`, `error_class`) and normalized script error classes for downstream handling.
- ▸Speculative policy invariants: Added explicit test coverage enforcing Safe-only speculative eligibility (Caution/Dangerous/Forbidden remain excluded from speculative execution).
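The autonomy turn-budget controls above can be pictured as a loop that checks both limits before each step. The field names match the configuration keys in the changelog; the `TurnBudget` struct, the `BudgetStop` enum, and the `run_react_loop` function are hypothetical, and the real runtime loop presumably does far more per turn.

```rust
use std::time::{Duration, Instant};

/// Budget fields named after the config keys in the changelog.
pub struct TurnBudget {
    pub autonomy_max_react_turns: u32,
    pub autonomy_max_turn_duration_seconds: u64,
}

#[derive(Debug, PartialEq)]
pub enum BudgetStop {
    TurnLimit,
    TimeLimit,
}

/// Runs `step` until it reports completion (returns true) or a budget is hit.
/// On success, returns the number of turns that were used.
pub fn run_react_loop<F>(budget: &TurnBudget, mut step: F) -> Result<u32, BudgetStop>
where
    F: FnMut(u32) -> bool,
{
    let started = Instant::now();
    let deadline = Duration::from_secs(budget.autonomy_max_turn_duration_seconds);
    for turn in 0..budget.autonomy_max_react_turns {
        // The time budget is checked between steps, not during one.
        if started.elapsed() >= deadline {
            return Err(BudgetStop::TimeLimit);
        }
        if step(turn) {
            return Ok(turn + 1);
        }
    }
    Err(BudgetStop::TurnLimit)
}
```

Returning a distinct stop reason lets the caller report whether the agent ran out of turns or out of time.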
v0.9.4+hotfix.1
Improvements & More
2026-03-05
Added: 3 changes. Changed: 7 changes. Fixed: 2 changes. Security: 1 change.
Highlights
- ▸Routing observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching.
- ▸Model shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path).
- ▸Routing profile roadmap spec: Added `docs/roadmap/0.9.4/features/user-routing-profile-spider-graph.md` and linked roadmap entry.
- ▸Agent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics.
- ▸Routing dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.
- ▸Routing eval validation: `POST /api/models/routing-eval` now validates `cost_weight`, `accuracy_floor`, and `accuracy_min_obs` bounds.
- ▸Config defaults/tests: routing defaults now use `metascore`; legacy `heuristic` input is accepted and normalized to `metascore` during validation.
- ▸Cache integrity mode for live agent path: semantic near-match cache reuse is now disabled in the inference pipeline (`lookup_strict`: exact + tool-TTL only) to prevent instruction-mismatched cached responses.
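The config-defaults item above, where legacy `heuristic` input is accepted and normalized to `metascore`, could be implemented along these lines. The `RoutingMode` enum and `normalize_routing_mode` function are hypothetical, as is the choice to trim whitespace and ignore case; the real validator may accept other modes as well.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum RoutingMode {
    Metascore,
}

/// Maps user-supplied routing mode strings onto the supported modes.
pub fn normalize_routing_mode(raw: &str) -> Result<RoutingMode, String> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "metascore" => Ok(RoutingMode::Metascore),
        // Legacy value: still accepted, but rewritten to the current mode.
        "heuristic" => Ok(RoutingMode::Metascore),
        other => Err(format!("unknown routing mode: {other}")),
    }
}
```

Normalizing at validation time keeps old config files loadable while ensuring the rest of the code only ever sees the current value.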
v0.9.2
New Features & More
2026-03-02
Added: 15 changes. Changed: 7 changes. Removed: 4 changes.
Key changes: `post_turn_ingest` Tool Results: All call sites now pass actual tool call name + result from the ReAct loop instead of `&[]`. Episodic memory captures tool-use context, improving digest quality. Gate System Note: `build_gate_system_note` now wired in both API and channel paths (previously channel-only).
Highlights
- ▸Wiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared — all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit.
- ▸Unified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points.
- ▸Multi-Tool Parsing: `parse_tool_calls` (plural) correctly parses multiple tool invocations from a single LLM response across all four provider formats.
- ▸OpenAI Responses + Google Tool Wiring: Bidirectional tool support for OpenAI Responses API and Google Generative AI — tool definitions translated into requests, structured tool calls parsed from responses with `{"tool_call": ...}` shim.
- ▸Quality Warm Start: `QualityTracker` is seeded from `inference_costs` on startup, eliminating cold-start assumptions for metascore routing.
- ▸Escalation Read Feedback: `EscalationTracker` acceptance history now feeds routing weight adjustments via `escalation_bias`, closing the feedback loop.
- ▸Approval Resume: Blocked tool calls are re-executed asynchronously after approval via `execute_tool_call_after_approval`.
- ▸Hippocampus (2.13): Self-describing schema map with auto-discovery of all system tables. Agent-created tables (`ag_<id>_*`) with access levels, row counts, and guardrails. Compact summary injected into system prompt (~200 tokens) for ambient storage awareness.
v0.9.1
New Features
2026-03-02
Added: 6 changes. Changed: 2 changes.
Highlights
- ▸Model Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`.
- ▸Tiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry.
- ▸Throttle Event Observability (1.17): New `GET /api/stats/throttle` endpoint exposes live rate-limit counters including global/per-IP/per-actor request counts, throttle tallies, and top-10 offenders. `ThrottleSnapshot` struct provides admin visibility into abuse patterns.
- ▸Quality Tracking: `QualityTracker` now records observations on every inference success with a heuristic quality signal (response structure, finish reason, latency). Exponential moving average feeds into metascore efficacy dimension.
- ▸Audit Trail Extensions: `ModelSelectionAudit` now includes `metascore_breakdown` (full per-dimension scores) and `complexity_score` for routing decisions. `ModelCandidateAudit` includes per-candidate metascores.
- ▸Profile module (`roboticus-llm::profile`): `ModelProfile`, `MetascoreBreakdown`, `build_model_profiles()`, `select_by_metascore()` — 9 unit tests covering local/cloud task routing, cold-start penalties, cost-aware selection, blocked model filtering, and deterministic tie-breaking.
- ▸Routing hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore — replacing the previous first-usable-model strategy.
- ▸Rate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.
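To make the 5-dimension metascore described above more concrete, here is a minimal sketch of a weighted breakdown and a deterministic selection step. The dimension names come from the changelog; the weighted-average formula, the `Weights` struct, and the `select_best` function (a stand-in for `select_by_metascore()`, whose real signature is not shown in this changelog) are assumptions.

```rust
#[derive(Debug, Clone, Copy)]
pub struct MetascoreBreakdown {
    pub efficacy: f64,
    pub cost: f64,
    pub availability: f64,
    pub locality: f64,
    pub confidence: f64,
}

/// Hypothetical per-dimension weights; a cost-aware mode would raise `cost`.
pub struct Weights {
    pub efficacy: f64,
    pub cost: f64,
    pub availability: f64,
    pub locality: f64,
    pub confidence: f64,
}

impl MetascoreBreakdown {
    /// Weighted average of the five dimensions (assumed formula).
    pub fn total(&self, w: &Weights) -> f64 {
        let weight_sum = w.efficacy + w.cost + w.availability + w.locality + w.confidence;
        (self.efficacy * w.efficacy
            + self.cost * w.cost
            + self.availability * w.availability
            + self.locality * w.locality
            + self.confidence * w.confidence)
            / weight_sum
    }
}

/// Picks the highest-scoring candidate; ties go to the smaller name
/// so that the result does not depend on input order.
pub fn select_best<'a>(
    candidates: &[(&'a str, MetascoreBreakdown)],
    w: &Weights,
) -> Option<&'a str> {
    candidates
        .iter()
        .map(|(name, b)| (*name, b.total(w)))
        .max_by(|a, b| a.1.total_cmp(&b.1).then_with(|| b.0.cmp(a.0)))
        .map(|(name, _)| name)
}
```

Keeping the per-dimension values separate from the total is what makes the breakdown auditable: the audit trail can record why a model won, not just that it did.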
v0.8.9
Bug Fixes & Stability (17 changes)
2026-03-01
Security: 3 changes. Fixed: 14 changes.
Highlights
- ▸HIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call.
- ▸HIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit.
- ▸HIGH: Relaxed atomic ordering: Cross-task flags and counters using `Ordering::Relaxed` upgraded to `Acquire`/`Release`/`AcqRel` to ensure correct visibility guarantees across async task boundaries.
- ▸HIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms.
- ▸HIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.
- ▸HIGH: Dead-letter replay race: Two locks acquired non-atomically during message replay could interleave with concurrent deliveries. Now holds both locks in a single scope.
- ▸HIGH: ReAct tool errors bypass scan_output: Error messages from tool execution were returned directly to the model without content scanning. Now calls `scan_output()` on tool error strings.
- ▸HIGH: derive_nickname Unicode panic: `&text[prefix.len()..]` applied a byte offset from a lowercased string to the original, panicking on multi-byte characters. Now uses `char_indices().nth()` for safe boundary detection.
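The `derive_nickname` panic above is an instance of a general pitfall: a byte offset measured on a lowercased copy is not necessarily a valid offset into the original string. The sketch below shows one safe way to strip a case-insensitive prefix using `char_indices().nth()`. The function name and matching rules are hypothetical and not the actual `derive_nickname` code.

```rust
/// Returns the remainder of `text` after a case-insensitive `prefix`,
/// or None if `text` does not start with it.
pub fn strip_prefix_ignore_case<'a>(text: &'a str, prefix: &str) -> Option<&'a str> {
    let prefix_chars = prefix.chars().count();

    // Compare the leading characters after lowercasing both sides.
    let head_matches = text
        .chars()
        .take(prefix_chars)
        .flat_map(char::to_lowercase)
        .eq(prefix.chars().flat_map(char::to_lowercase));
    if !head_matches {
        return None;
    }

    // Find the byte index of the first character after the prefix in the
    // ORIGINAL string, which is always a valid char boundary.
    match text.char_indices().nth(prefix_chars) {
        Some((byte_idx, _)) => Some(&text[byte_idx..]),
        None => Some(""), // the prefix consumed the whole string
    }
}
```

Counting characters instead of bytes avoids slicing in the middle of a multi-byte sequence, which is what triggers the panic.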
v0.8.8
Security Hardening (39 changes)
2026-03-01
Security: 13 changes. Fixed: 26 changes.
Key changes: HIGH: Float policy bypass: Policy enforcement on `amount` fields now falls back to `as_f64()` conversion, closing a bypass where float amounts evaded integer-only checks. HIGH: Tool call parsing failures: `parse_tool_call` now uses `rfind` with a candidate loop, correctly parsing tool calls that contain the delimiter character in arguments.
Highlights
- ▸HIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history.
- ▸HIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content.
- ▸HIGH: Provider error info leak: `classify_provider_error` in `run_llm_analysis` now strips internal details from error responses before returning to callers.
- ▸MED: XSS in sanitize_html: `sanitize_html` now escapes all 5 OWASP-recommended HTML entities (`& < > " '`), closing a reflected XSS vector.
- ▸MED: Input validation on identifiers: `peer_id`, `group_id`, and `channel` fields now enforce length and character-set constraints, preventing injection of oversized or malformed identifiers.
- ▸MED: Webhook body size limit: Public webhook router now applies `DefaultBodyLimit` to prevent memory exhaustion from oversized payloads.
- ▸MED: Analysis route DoS protection: Analysis routes now apply `ConcurrencyLimitLayer(3)` to prevent resource exhaustion from concurrent expensive LLM calls.
- ▸MED: Config schema leak: `update_config` error responses now return a generic message instead of leaking internal schema details.
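For the `sanitize_html` fix above, escaping the five characters `& < > " '` can be done in a single pass. This is a generic sketch named `escape_html` to avoid implying it is the project's actual `sanitize_html`; the `&#x27;` form for the single quote follows common OWASP guidance.

```rust
/// Escapes the five HTML-significant characters so the input can be
/// embedded safely in element content or quoted attribute values.
pub fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#x27;"),
            _ => out.push(c),
        }
    }
    out
}
```

Processing character by character means the ampersands introduced by earlier replacements are never escaped a second time.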
v0.8.7
Bug Fixes & Stability (21 changes)
2026-02-28
Fixed: 19 changes. Added: 2 changes.
Key changes: Release notes for v0.8.5 and v0.8.6 (missing from previous releases, blocking release doc gate). Roadmap section 1.24: Built-in CLI Agent Skills (Claude Code + Codex CLI).
Highlights
- ▸CRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail — all cron-scheduled jobs were silently skipped.
- ▸HIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter.
- ▸HIGH: Keystore redact_key_name UTF-8 panic: Byte-level `&key[..3]` slicing panicked on multi-byte key names. Now uses `key.chars().take(3)`.
- ▸HIGH: LLM forward_stream missing query: auth mode: Streaming requests to providers using query-string authentication (e.g., Google Generative AI) failed because the `query:` prefix was not handled, sending it as a literal HTTP header instead.
- ▸HIGH: yield_engine U256-to-u64 panic: `real_a_token_balance` panicked via `U256::to::<u64>()` if an aToken balance exceeded `u64::MAX`. Now uses safe `try_into::<u128>()`.
- ▸HIGH: yield_engine amount_to_raw saturation: `amount_to_raw` silently saturated USDC amounts above ~$18.4T (`u64::MAX` raw units at 6 decimals) via unchecked `f64 -> u64` cast. Now explicitly clamps.
- ▸MED: Email adapter SMTP relay panic: `EmailAdapter::new` panicked via `.expect()` on invalid SMTP hostname. Now returns `Result`.
- ▸MED: Email adapter mutex panics: `push_message`/`recv` used `.expect("mutex poisoned")`. Now uses `.unwrap_or_else(|e| e.into_inner())` for poison recovery, matching other adapters.
v0.8.6
Security Hardening & More
2026-02-28
Security: 9 changes. Fixed: 14 changes. Added: 2 changes.
Key changes: Windows daemon error propagation: `schtasks /Create` errors now propagate instead of being silently dropped; post-spawn verification added; `schtasks /Delete` errors during uninstall handled correctly. CLI API key headers: Added `--api-key`/`ROBOTICUS_API_KEY` global CLI argument. All 22 bare `reqwest` calls replaced with `http_client()` helper that injects API key as default header.
Highlights
- ▸CRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity — it was unauthenticated and trivially spoofable.
- ▸CRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses.
- ▸HIGH: Rate-limit IP fallback: IP extraction now uses `ConnectInfo<SocketAddr>` (real TCP peer address) instead of a hardcoded `127.0.0.1` fallback.
- ▸HIGH: ASCII-only identifiers: `validate_identifier` now restricts to ASCII alphanumeric characters, closing Unicode homoglyph and normalization attacks.
- ▸HIGH: Memory search query cap: `/api/memory/search` query parameter capped at 512 characters to prevent regex-based DoS.
- ▸HIGH: Error message sanitization: Added SQLite schema-leaking prefixes (`no such table`, `no such column`, etc.) to the error sanitization blocklist.
- ▸MED: Rate-limit counter ordering: Global rate-limit counter now incremented after per-IP/per-actor checks pass, preventing global exhaustion from blocked IPs.
- ▸MED: Symlink-safe directory traversal: `collect_findings_recursive` now uses `entry.file_type()` and skips symlinks, preventing symlink-following attacks.
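As an illustration of the ASCII-only identifier rule above, here is a minimal sketch. The signature and error type are assumptions, and the real `validate_identifier` may enforce additional limits (length, allowed separators) that the changelog does not describe.

```rust
/// Hypothetical signature; only the ASCII-alphanumeric restriction comes
/// from the changelog entry.
pub fn validate_identifier(id: &str) -> Result<(), String> {
    if id.is_empty() {
        return Err("identifier must not be empty".to_string());
    }
    // `is_ascii_alphanumeric` rejects look-alike letters such as the
    // Cyrillic 'а' (U+0430), which closes the homoglyph avenue.
    match id.chars().find(|c| !c.is_ascii_alphanumeric()) {
        Some(bad) => Err(format!("invalid character {:?} in identifier", bad)),
        None => Ok(()),
    }
}
```

Because the check works per character on the raw input, no Unicode normalization step is needed before validation.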
v0.8.5
Bug Fixes & Stability (28 changes)
2026-02-28
Security: 6 changes. Fixed: 22 changes. Key changes: WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely. Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation. reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails. Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
Highlights
- ▸WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely.
- ▸Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation.
- ▸Rate limiter memory bounds (BUG-103): Per-IP and per-actor rate limit maps are now capped at 10,000 and 5,000 entries respectively, preventing unbounded memory growth during distributed floods. Throttle tracking maps are also cleared on window reset.
- ▸Knowledge/Obsidian bounded reads (BUG-104, BUG-110): `DirectorySource::query()` and `parse_note()` now enforce 10 MB and 5 MB file size limits respectively, preventing OOM on oversized files.
- ▸Config secret allowlist (BUG-106): Admin config endpoint now uses an allowlist (`ALLOWED_FIELDS`) instead of a blocklist for field filtering, ensuring new secret fields are safe by default.
- ▸Interview turn cap (BUG-107): Interview sessions now enforce a 200-turn maximum to prevent unbounded memory growth within the 3600s TTL.
- ▸reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails.
- ▸Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
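The dedicated-thread timeout described for BUG-101 can be sketched with `std::sync::mpsc`. The function name `run_with_timeout` is hypothetical, and this sketch only stops waiting for the worker; how the runtime halts the WASM module itself is not described in the changelog.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a dedicated thread and wait at most `timeout` for a result.
/// Returns `None` if the deadline passes or the worker panics. The worker
/// thread is detached here, not killed.
pub fn run_with_timeout<T, F>(work: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the caller has already given up.
        let _ = tx.send(work());
    });
    // Unlike a post-hoc elapsed-time check, the caller regains control at
    // the deadline even if `work` never returns.
    rx.recv_timeout(timeout).ok()
}
```

The difference from an elapsed-time check is where the wait happens: the caller blocks on the channel rather than on the work itself.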
v0.8.4
Bug Fixes & Stability & More
2026-02-28
Security: 3 changes. Fixed: 16 changes. Changed: 1 change. Key changes: WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector. Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses. Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`. Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.
Highlights
- ▸WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector.
- ▸Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses.
- ▸Script runner bounded reads: Shebang detection now uses `BufReader::take(512)` instead of `read_to_string`, preventing OOM on oversized script files.
- ▸Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`.
- ▸Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.
- ▸Money::from_dollars NaN panic (BUG-2): `from_dollars` now returns `Result`, rejecting NaN and Infinity inputs instead of panicking via `assert!`.
- ▸Delivery queue recovery (SF-7): `recover_from_store` is now async with proper `.lock().await`, replacing a `try_lock()` that silently dropped recovered messages.
- ▸Agent loop detection enforcement (BUG-3): `is_looping()` is now called inside `transition()` and forces `Done` state, preventing callers from bypassing loop detection.
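The `from_dollars` change (BUG-2) might look roughly like the following. The `cents: i64` representation and the `MoneyError` type are assumptions made for illustration; only the move from `assert!` to a `Result` that rejects NaN and Infinity comes from the entry above.

```rust
#[derive(Debug, PartialEq)]
pub struct Money {
    cents: i64, // hypothetical internal representation
}

#[derive(Debug, PartialEq)]
pub enum MoneyError {
    NotFinite,
}

impl Money {
    /// Reject NaN and infinite inputs with an error instead of panicking.
    pub fn from_dollars(dollars: f64) -> Result<Money, MoneyError> {
        if !dollars.is_finite() {
            return Err(MoneyError::NotFinite);
        }
        // `as i64` saturates for out-of-range values rather than wrapping.
        Ok(Money { cents: (dollars * 100.0).round() as i64 })
    }
}
```

Callers that previously relied on the panic now have to handle the error branch explicitly, for example with `?` or a `match`.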
v0.8.3
Security Hardening & More
2026-02-27
Security: 4 changes. Fixed: 4 changes. Added: 1 change. Key changes: Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured; only loopback connections are allowed. Previously, missing API key config silently allowed all traffic. A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window. UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point. Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.
Highlights
- ▸Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured — only loopback connections are allowed. Previously, missing API key config silently allowed all traffic.
- ▸A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window.
- ▸Plugin permission enforcement: New `strict_permissions` and `allowed_permissions` config fields for plugin policy. In strict mode, undeclared permissions are blocked; in permissive mode (default), they produce a warning.
- ▸Ethereum signature recovery ID: EIP-191 signatures now include the recovery byte (v = 27 or 28), producing correct 65-byte signatures instead of 64-byte truncated ones.
- ▸UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point.
- ▸Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.
- ▸Script plugin unbounded output: stdout/stderr from plugin scripts are now capped at 10 MB via `AsyncReadExt::take()`.
- ▸Keystore lock ordering: Consolidated two separate mutexes into a single `KeystoreState` mutex, eliminating potential deadlock scenarios.
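To show why boundary-aware truncation avoids the UTF-8 panic, here is a small sketch. It uses a hand-rolled helper built on `str::is_char_boundary` with the same intent as `floor_char_boundary()`; the names `floor_boundary` and `truncate_utf8` are hypothetical.

```rust
/// Largest index <= `max_bytes` that falls on a UTF-8 character boundary.
fn floor_boundary(s: &str, max_bytes: usize) -> usize {
    if max_bytes >= s.len() {
        return s.len();
    }
    let mut idx = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}

/// Truncate to at most `max_bytes` bytes without splitting a character.
/// A plain `&s[..max_bytes]` panics when the cut lands inside one.
pub fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    &s[..floor_boundary(s, max_bytes)]
}
```

For input such as `"ab🙂cd"` with a limit of 4 bytes, the cut would land inside the four-byte emoji, so the helper backs up and returns `"ab"`.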
v0.8.2
New Features
2026-02-27
Added: 3 changes. Fixed: 5 changes. Key changes: 100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316. Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`. 29 stabilization bug fixes: Resolved input validation gaps, API error format inconsistencies, query parameter hardening, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML path issues discovered during exhaustive hands-on testing of v0.8.1. HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.
Highlights
- ▸100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316.
- ▸Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`.
- ▸Winget package distribution: Windows users can install via Winget package manager.
- ▸29 stabilization bug fixes: Resolved input validation gaps, API error format inconsistencies, query parameter hardening, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML path issues discovered during exhaustive hands-on testing of v0.8.1.
- ▸HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.
- ▸Dashboard SPA cleanup: Removed duplicate trailing content after `</html>` close tag.
- ▸Model change persistence: Fixed model selection not persisting across server restarts.
- ▸Config serialization: Fixed TOML config serialization on Windows paths.
v0.8.1
Bug Fixes & Stability (14 changes)
2026-02-27
Fixed: 12 changes. Changed: 2 changes. Key changes: 40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages. Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints. CI scripts use POSIX grep: Replaced all `rg` (ripgrep) invocations with standard `grep -E`/`grep -qE` in CI scripts for broader runner compatibility. Windows compilation: Added conditional `allow(unused_mut)` for platform-gated mutation in security audit command.
Highlights
- ▸40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages.
- ▸Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints.
- ▸JSON error responses: All API error paths now return structured `{"error": "..."}` JSON instead of plain text.
- ▸Memory search deduplication: FTS memory search no longer returns duplicate entries; results are now structured with category/timestamp metadata.
- ▸Cron scheduler accuracy: `next_run_at` is now persisted after computation; heartbeat no longer floods logs with virtual job IDs; jobs use actual agent IDs.
- ▸Cost display precision: Floating-point noise eliminated from cost/efficiency metrics (rounded to 6 decimal places with division-by-zero guard).
- ▸Skills metadata: `risk_level` is now parameterized (not hardcoded "Caution"); skills track `last_loaded_at` timestamp.
- ▸CLI resilience: `roboticus check` no longer crashes with raw Rust IO errors; shows friendly messages with config path suggestions.
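The cost-precision fix above combines two small steps, a zero guard and rounding to 6 decimal places. The helper name and parameters below are hypothetical; the sketch only illustrates that arithmetic.

```rust
/// Hypothetical helper: cost per token, rounded to 6 decimal places.
pub fn cost_per_token(total_cost: f64, tokens: u64) -> f64 {
    if tokens == 0 {
        // Division-by-zero guard: report zero instead of NaN or infinity.
        return 0.0;
    }
    let raw = total_cost / tokens as f64;
    // Scale, round, and unscale to drop floating-point noise such as
    // 0.30000000000000004.
    (raw * 1_000_000.0).round() / 1_000_000.0
}
```

Rounding at the point of display or serialization, as here, leaves any stored raw values untouched.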
v0.8.0
Security Hardening & More
2026-02-26
Security: 17 changes. Fixed: 22 changes. Added: 16 changes. Changed: 4 changes. Key changes: CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin. Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop. Telegram invalid-token resilience: Telegram `404/401` poll failures are now classified as likely invalid/revoked bot-token errors with explicit repair guidance and adaptive backoff to reduce noisy tight-loop logging. Subagent runtime activation sync: Taskable subagents are now auto-started at boot and kept in sync with create/update/toggle/delete operations, fixing the `enabled > 0, running = 0` stall where configured subagents stayed idle.
Highlights
- ▸CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin.
- ▸Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop.
- ▸WalletFile Debug redaction: `WalletFile` no longer derives `Debug`; a manual impl redacts `private_key_hex` to prevent accidental key leakage in logs or panics.
- ▸Plaintext wallet detection: Loading an unencrypted wallet file now emits a `SECURITY` warning at `warn!` level instead of silently succeeding.
- ▸Webhook signature enforcement: WhatsApp webhook verification now rejects requests with an error when `app_secret` is unconfigured, instead of silently skipping verification.
- ▸OAuth token persistence errors surfaced: `OAuthManager::persist()` now returns `Result<()>` and callers log failures at `error!` level instead of silently swallowing write errors.
- ▸Skill catalog path traversal prevention: Skill download filenames from remote registries are now validated and canonicalized to prevent `../` path traversal.
- ▸API key URL encoding: The `query:` auth mode now percent-encodes API keys before appending to URLs, preventing malformed requests and log leakage.
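The `WalletFile` Debug redaction can be illustrated with a manual `fmt::Debug` impl. The `address` field and the `[REDACTED]` placeholder text are assumptions; only the `private_key_hex` field name and the switch from a derived to a manual impl come from the entry above.

```rust
use std::fmt;

pub struct WalletFile {
    pub address: String,         // hypothetical non-secret field
    pub private_key_hex: String, // secret: must never reach logs
}

// Manual impl instead of `#[derive(Debug)]`, so `{:?}` in logs or panic
// messages never prints the key material.
impl fmt::Debug for WalletFile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("WalletFile")
            .field("address", &self.address)
            .field("private_key_hex", &"[REDACTED]")
            .finish()
    }
}
```

Keeping the field name in the output while hiding its value preserves the shape of the struct for debugging.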
v0.7.1
Hotfixes & Reliability
2026-02-25
Fixed: 6 changes. Key changes: Windows daemon startup and binary update reliability fixes, dashboard render boundary hardening, and loopback-proxy migration safeguards with explicit deprecation guidance for v0.8.0 removal.
Highlights
- ▸Windows daemon startup reliability: Replaced the broken `sc.exe` service launch path with a detached user-process daemon flow.
- ▸Windows binary update guardrail: `roboticus update binary` now blocks in-process self-update on Windows and prints safe manual upgrade steps.
- ▸Dashboard JS bleed-through fix: Dashboard rendering is clipped to the canonical HTML document boundary.
- ▸In-process provider routing metadata: `/api/models/available` reports in-process proxy mode and provider diagnostics for clearer operator visibility.
- ▸Loopback proxy deprecation guidance: `0.7.x` warns that `127.0.0.1:8788/<provider>` is deprecated and will be removed in `v0.8.0`.
v0.7.0
New Features
2026-02-25
Added: 4 changes. Changed: 3 changes. Key changes: Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents. Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details. Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology. Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.
Highlights
- ▸Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents.
- ▸Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details.
- ▸Streaming turn traceability: `POST /api/agent/message/stream` now emits stable `turn_id` values from stream start through completion and records per-turn model-selection audits for streamed responses.
- ▸Subagent ubiquitous-language architecture doc: Added `docs/architecture/subagent-ubiquitous-language.md` with canonical terminology, gap audit, and dataflow diagrams.
- ▸Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology.
- ▸Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.
- ▸Context forensics UX: Context Explorer now supports live stream-turn handoff and direct forensic drill-down using active `turn_id` metadata.
v0.6.1
Bug Fixes & Stability
2026-02-24
Fixed: 3 changes. Key changes: Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization. Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.
Highlights
- ▸Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization.
- ▸Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.
- ▸Routing test alignment: Updated router integration expectations to reflect current fallback behavior when primary providers are breaker-blocked.
v0.6.0
New Features
2026-02-24
Added: 4 changes. Changed: 5 changes. Key changes: Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility. Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips. Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior. Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.
Highlights
- ▸Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility.
- ▸Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips.
- ▸Session scope backfill migration: Added `012_session_scope_backfill_unique.sql` to normalize legacy sessions to explicit scope and enforce unique active scoped sessions.
- ▸Safe markdown rendering in dashboard sessions: Session chat and Context Explorer now render markdown with strict URL sanitization and no raw HTML execution.
- ▸Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior.
- ▸Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.
- ▸Session scoping defaults to explicit agent scope: `find_or_create()` now uses `agent` scope by default and channel/web paths pass scoped keys for peer/group isolation.
- ▸Channel session affinity: Channel dedup and session selection now use resolved chat/channel identity instead of platform-only sender affinity.
v0.5.0
New Features (25 changes)
2026-02-23
Added: 18 changes. Changed: 7 changes. Key changes: Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support. Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach. All 10 crate READMEs updated to v0.5.0 with expanded descriptions and key types. All 10 `lib.rs` files now have `//!` crate-level doc comments.
Highlights
- ▸Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support.
- ▸Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach.
- ▸Flexible Network Binding: Interface-based binding (`bind_interface`), optional TLS via `axum-server` with rustls, and `advertise_url` for agent card generation.
- ▸Approval Workflow Loop Integration: Agent pauses on gated tool calls, publishes `pending_approval` events via WebSocket, and resumes after admin approve/deny. Dashboard "Approvals" panel with real-time updates.
- ▸Browser as Agent Tool: `BrowserTool` adapter wrapping the 12-action `roboticus-browser` crate, registered in `ToolRegistry`. Tool schemas injected into system prompt so the LLM can request browser actions.
- ▸Context Observatory: Full turn inspector and analytics suite:
  - ▸Turn recording with `context_snapshots` table capturing token allocation, memory tier breakdown, complexity level, and model for every LLM call
  - ▸Turn & Context API: `GET /api/sessions/{id}/turns`, `GET /api/turns/{id}`, `GET /api/turns/{id}/context`, `GET /api/turns/{id}/tools`
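As a rough illustration of the first stage of the Response Transform Pipeline, the sketch below separates `<think>` blocks from the visible response text. The function name, return shape, and handling of unclosed tags are all assumptions; the actual `ReasoningExtractor` may behave differently.

```rust
/// Split a response into (visible text, captured reasoning blocks).
pub fn extract_reasoning(response: &str) -> (String, Vec<String>) {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";
    let mut visible = String::new();
    let mut reasoning = Vec::new();
    let mut rest = response;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Keep the text before the block, capture the block body.
                visible.push_str(&rest[..start]);
                reasoning.push(after_open[..end].trim().to_string());
                rest = &after_open[end + CLOSE.len()..];
            }
            // Unclosed tag: leave the remainder untouched (an assumption).
            None => break,
        }
    }
    visible.push_str(rest);
    (visible, reasoning)
}
```

Later stages such as `FormatNormalizer` and `ContentGuard` would then operate on the visible text only.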
v0.4.3
New Features & More
2026-02-23
Added: 6 changes. Fixed: 3 changes. Changed: 2 changes. Key changes: Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control. Runtime model override via `/model set <model>`: temporarily forces a specific model, bypassing routing. Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen); providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`. Dashboard "Save to keystore" button now sends `Content-Type: application/json` header; previously failed with "Expected request with 'Content-Type: application/json'".
Highlights
- ▸Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control
- ▸Runtime model override via `/model set <model>` — temporarily forces a specific model, bypassing routing
- ▸Circuit breaker status and reset via `/breaker` and `/breaker reset [provider]` slash commands
- ▸Breaker-aware model routing — `select_for_complexity` and `select_cheapest_qualified` now skip providers with tripped circuit breakers
- ▸Pre-flight API key check in `infer_with_fallback` — cloud providers with no configured key are skipped before sending a doomed request
- ▸Dashboard settings inputs show a dimmed "none" placeholder instead of literal "null" for empty fields
- ▸Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen) — providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`
- ▸Dashboard "Save to keystore" button now sends `Content-Type: application/json` header — previously failed with "Expected request with 'Content-Type: application/json'"
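The permanent-trip behavior for credit/billing errors can be modeled as a small state machine. Every type and method name below is hypothetical, and the sketch trips on the first failure for brevity; the real breaker presumably uses thresholds and timers that the changelog does not describe.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BreakerState {
    Closed,
    Open { permanent: bool },
    HalfOpen,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Failure {
    Transient,
    Credit,
}

pub struct Breaker {
    state: BreakerState,
}

impl Breaker {
    pub fn new() -> Self {
        Breaker { state: BreakerState::Closed }
    }

    pub fn state(&self) -> BreakerState {
        self.state
    }

    /// Simplification: a single failure trips the breaker.
    pub fn record_failure(&mut self, failure: Failure) {
        // A permanent trip is never downgraded by later failures.
        if matches!(self.state, BreakerState::Open { permanent: true }) {
            return;
        }
        self.state = BreakerState::Open { permanent: failure == Failure::Credit };
    }

    /// Called when the cooldown elapses: only non-permanent trips recover.
    pub fn on_cooldown_elapsed(&mut self) {
        if matches!(self.state, BreakerState::Open { permanent: false }) {
            self.state = BreakerState::HalfOpen;
        }
    }

    /// Explicit operator reset, as with `/breaker reset`.
    pub fn reset(&mut self) {
        self.state = BreakerState::Closed;
    }
}
```

The key property is that `on_cooldown_elapsed` leaves a credit-tripped breaker open, so only `reset` brings such a provider back into rotation.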
v0.4.2
Bug Fixes & Stability
2026-02-23
Fixed: 3 changes. Key changes: `roboticus daemon start` now verifies the service is actually running after `launchctl load`; previously it reported "Daemon started" even when the service crashed immediately. `roboticus daemon install` resolves the config path to absolute before embedding in the plist; previously it used the relative path, which launchd couldn't resolve.
Highlights
- ▸`roboticus daemon start` now verifies the service is actually running after `launchctl load` — previously reported "Daemon started" even when the service crashed immediately
- ▸`roboticus daemon install` resolves the config path to absolute before embedding in the plist — previously used the relative path which launchd couldn't resolve
- ▸Captures launchctl stderr and checks `LastExitStatus` / PID to give actionable error messages on daemon start failure
v0.4.1
Security Hardening & More
2026-02-23
Added: 5 changes. Fixed: 7 changes. Security: 4 changes. Key changes: `roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management. Interactive prompt after `roboticus daemon install` asking whether to start immediately. Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs. Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete.
Highlights
- ▸`roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management
- ▸Interactive prompt after `roboticus daemon install` asking whether to start immediately
- ▸`--start` flag on `roboticus daemon install` for non-interactive use
- ▸Dashboard keystore management: save/remove provider API keys from the settings page
- ▸Session nicknames in dashboard sessions table with click-to-copy session ID
- ▸Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs
- ▸Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete
- ▸`roboticus daemon install` now actually offers to load the service (previously only wrote the plist/unit file)
v0.4.0
New Features & More
2026-02-23
Added: 10 changes. Changed: 5 changes. Fixed: 2 changes. Key changes: Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`). Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal). `thinking_threshold_seconds` moved from per-channel (`TelegramConfig`) to `ChannelsConfig` level. Channel message processing is now platform-agnostic via `send_typing_indicator` / `send_thinking_indicator` helpers.
Highlights
- ▸Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`)
- ▸Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal)
- ▸Configurable `thinking_threshold_seconds` on `[channels]` — estimated latency gate for thinking indicator (default: 30s)
- ▸`send_typing` and `send_ephemeral` on WhatsApp and Discord adapters
- ▸Latency estimator based on model tier, input length, and circuit-breaker state
- ▸LLM fallback chain: `infer_with_fallback` helper retries across configured providers on transient errors
- ▸Permanent error detection in delivery queue — 403/401/400 and "bot blocked" errors dead-letter immediately
- ▸Config auto-discovery: `roboticus start` checks `~/.roboticus/roboticus.toml` when no `--config` flag is given
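The permanent-error rule for the delivery queue can be sketched as a small classifier. The function name, the `Disposition` enum, and the substring match on "bot blocked" are assumptions; the changelog only states that 403/401/400 and "bot blocked" errors dead-letter immediately.

```rust
#[derive(Debug, PartialEq)]
pub enum Disposition {
    Retry,
    DeadLetter,
}

/// Decide whether a failed delivery should be retried or dead-lettered.
pub fn classify_delivery_error(status: Option<u16>, message: &str) -> Disposition {
    // These statuses will not succeed on retry, so retrying wastes attempts.
    let permanent_status = matches!(status, Some(400) | Some(401) | Some(403));
    // Assumed matching rule: case-insensitive substring check.
    let bot_blocked = message.to_ascii_lowercase().contains("bot blocked");
    if permanent_status || bot_blocked {
        Disposition::DeadLetter
    } else {
        Disposition::Retry
    }
}
```

Everything else, including timeouts and 5xx responses, stays on the retry path in this sketch.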
v0.3.0
Security Hardening & More
2026-02-23
Security: 8 changes. Fixed: 11 changes. Changed: 10 changes. Added: 1 change. Key changes: Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown. Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags. Telegram adapter now processes all updates in a batch, not just the first. Cron worker dispatches jobs instead of unconditionally marking success.
Highlights
- ▸Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown
- ▸Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags
- ▸Session role validation: reject messages with roles outside `{user, assistant, system, tool}`
- ▸Channel message authority: trusted sender IDs config for elevated `ChannelAuthority`
- ▸WhatsApp webhook signature verification via HMAC-SHA256
- ▸Docker: run as non-root `roboticus` user
- ▸Wallet: encrypt private keys with machine-derived passphrase; never store plaintext
- ▸API key `#[serde(skip_serializing)]` prevents accidental serialization leakage
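The browser scheme restriction above amounts to a check before navigation. The function name and the normalization steps (trimming leading whitespace, lowercasing) are assumptions; only the three blocked schemes come from the entry.

```rust
/// Return false for URLs whose scheme is blocked for CDP navigation.
pub fn is_navigation_allowed(url: &str) -> bool {
    const BLOCKED: [&str; 3] = ["file:", "javascript:", "data:"];
    // Schemes are case-insensitive, so normalize before comparing.
    let lowered = url.trim_start().to_ascii_lowercase();
    !BLOCKED.iter().any(|scheme| lowered.starts_with(scheme))
}
```

Matching on `file:` rather than `file://` also catches variants that omit the slashes.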
v0.2.0
Alpha Release
2026-02-23
Full roadmap implementation: 35 items across 7 phases. ReAct agent loop, RAG retrieval pipeline, embedding provider integration, ANN index, persistent semantic cache, sub-agent framework, and comprehensive bug fixes from code review.
Highlights
- ▸ReAct agent loop with idle/loop detection
- ▸5-tier hybrid RAG retrieval (FTS5 + vector cosine)
- ▸Embedding provider integration (OpenAI, Ollama, Google)
- ▸HNSW approximate nearest neighbor index
- ▸Persistent semantic cache (SQLite-backed, auto-eviction)
- ▸Sub-agent framework with isolated tool registries
- ▸22 code review issues resolved (6 critical, 12 high, 4 medium)
- ▸RwLock deadlock fix in circuit breaker path
- ▸UTF-8 safety, atomic OAuth persistence, poison recovery
v0.1.0
New Features & More
2026-02-22
Added: 5 changes. Changed: 1 change. Fixed: 1 change. Key changes: Initial baseline for Project Roboticus. Multi-crate Rust workspace foundation (runtime crates + integration test crate). Prepared packaging/publish metadata for early release workflows. Early release stabilization fixes for binary packaging, startup wiring, and quality gates.
Highlights
- ▸Initial baseline for Project Roboticus.
- ▸Multi-crate Rust workspace foundation (runtime crates + integration test crate).
- ▸Core SQLite persistence layer with schema/migrations and operational defaults.
- ▸Early HTTP API, CLI surface, and embedded dashboard scaffolding.
- ▸Initial architecture and reference documentation set.
- ▸Prepared packaging/publish metadata for early release workflows.
- ▸Early release stabilization fixes for binary packaging, startup wiring, and quality gates.