
Changelog

Release history for the Roboticus autonomous agent runtime. Follows conventional commits.

v0.11.0

New Features & More · 2026-03-25

Added: 5 changes. Changed: 4 changes. Fixed: 4 changes.

FEAT Agent efficacy push: Memory introspection, selective forgetting, relationship-memory automation, and task operating state are now first-class parts of the shared pipeline.
FEAT Introspection-driven execution: Task-oriented turns now begin from introspected runtime truth, can compose specialists from a clean slate, and preserve executed state across normalization retries.
FEAT MCP release-grade management: Shared `/api/mcp/servers` management surface aligned across dashboard and CLI.
FEAT Skill and subagent utilization telemetry: Usage count and last-used signals exposed for skills and subagents.
FEAT Release vetting automation: Added `scripts/run-v0110-vetting.sh` and `docs/testing/v0110-vetting-matrix.md` to lock in the v0.11.0 regression contract.
CHORE Delegation and composition flow: Empty-roster specialist requests now progress into composition and delegation instead of collapsing into narrated intent or centralization.
CHORE Task-path truthfulness: Filesystem/runtime blockers now surface as real blockers instead of canned fallback prose.
CHORE Prompt and planner behavior: Introspection is now treated as the first operational step for task work and feeds a shared task operating state/action planner.
CHORE Release documentation: Active docs, architecture notes, roadmap entries, and release gates now reflect shipped v0.11.0 behavior.
FIX Pipeline trace drift: Legacy databases missing `pipeline_traces.session_id` and related fields are repaired on boot.
FIX Prompt Performance persistence: Routing weights and context budget now persist and rehydrate correctly.
FIX Dashboard regressions: Repaired session archive, raw TOML editing, Observability nav, semantic memory navigation, roster skill drill-down, workspace symmetry, and related operator-surface gaps.
FIX Analysis/recommendation soak failures: Live deep-analysis surfaces validated against a stronger provider path during soak.
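
The pipeline trace drift fix can be pictured as a diff between the columns a legacy table has and the columns the runtime expects. The following is a minimal sketch under stated assumptions, not the project's migration code: the function name `missing_column_repairs`, the column types, and every column name other than `session_id` are hypothetical.

```rust
/// Build the `ALTER TABLE` statements needed to add columns a legacy table lacks.
/// `existing` would typically be read from SQLite's `PRAGMA table_info(<table>)`.
fn missing_column_repairs(
    table: &str,
    existing: &[&str],
    required: &[(&str, &str)], // (column name, SQL type); hypothetical schema
) -> Vec<String> {
    required
        .iter()
        // Keep only the required columns that the legacy table does not have yet.
        .filter(|(name, _)| !existing.iter().any(|have| have.eq_ignore_ascii_case(name)))
        .map(|(name, sql_type)| format!("ALTER TABLE {table} ADD COLUMN {name} {sql_type}"))
        .collect()
}
```

Running the statements only for missing columns keeps the repair idempotent: a database that already has every column yields an empty list and boot proceeds unchanged.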

v0.10.0

Correctness, Safety & Operational Maturity (14 changes) · 2026-03-23

Added: 8 changes. Fixed: 4 changes. Changed: 2 changes. Key changes: Model categorization with 29-model capability profiles, skill authoring API, Landlock/Job Object script confinement, typestate session lifecycle, delegation scoring engine, and critical signal adapter fixes.

Highlights

  • Model categorization (Phase 1): 10 task categories, 29 model profiles across 8 providers, category-aware routing with `category_fit` metascore dimension.
  • Skill authoring API: Create, validate, and publish Markdown instruction skills via `POST /api/skills/author`.
  • Landlock & Job Object confinement: Linux filesystem sandboxing and Windows process isolation for script execution.
  • Typestate sessions: Compile-time session lifecycle — `Session<Created>` → `Session<Active>` → `Session<Closed>`.
  • Delegation scoring engine: `score_agent_fit()`, `composite_fit_ratio()`, and `utility_margin_for_delegation()` for decomposition decisions.
  • Signal adapter critical fixes: Replaced `std::sync::Mutex` in async context, added rate limiting, bounded buffer growth.
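
The typestate lifecycle named above can be illustrated with zero-sized marker types. This is a minimal sketch of the general pattern, not the `roboticus-db` implementation: the `id` field and the method names `new`, `activate`, and `close` are assumptions.

```rust
use std::marker::PhantomData;

// Zero-sized marker types, one per lifecycle state.
struct Created;
struct Active;
struct Closed;

/// A session whose lifecycle state is encoded in its type parameter.
struct Session<State> {
    id: u64, // hypothetical field
    _state: PhantomData<State>,
}

impl Session<Created> {
    fn new(id: u64) -> Self {
        Session { id, _state: PhantomData }
    }

    /// Consumes the created session, so it cannot be activated twice.
    fn activate(self) -> Session<Active> {
        Session { id: self.id, _state: PhantomData }
    }
}

impl Session<Active> {
    /// Only an active session can be closed.
    fn close(self) -> Session<Closed> {
        Session { id: self.id, _state: PhantomData }
    }
}

impl<State> Session<State> {
    /// Available in every state.
    fn id(&self) -> u64 {
        self.id
    }
}
```

With this shape, `Session::new(7).activate().close()` compiles, while calling `close()` directly on a `Session<Created>` is rejected by the compiler because that method does not exist for that state.
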
FEAT Model categorization (Phase 1): `TaskCategory` enum (10 types), `classify_task()`, benchmark profiles for 29 models, `CategoryQualityTracker`, and `category_fit` metascore dimension (0.15 weight).
FEAT Skill authoring API: `POST /api/skills/author` for creating, validating, and publishing Markdown instruction skills with safety scanning.
FEAT Landlock confinement: Linux filesystem sandboxing via `landlock` crate for script execution. Windows Job Object isolation.
FEAT Typestate session lifecycle: `Session<Created>`, `Session<Active>`, `Session<Closed>` compile-time state machine in `roboticus-db`.
FEAT Delegation scoring engine: `score_agent_fit()`, `composite_fit_ratio()`, `utility_margin_for_delegation()` for principled decomposition gate decisions.
FEAT OpenAPI spec endpoint: `GET /openapi.json` serves OpenAPI 3.1 spec; `GET /docs` provides spec access for Swagger UI viewers.
FEAT Codex CLI plugin: Delegate coding tasks to OpenAI Codex CLI with structured JSON output and approval mode support.
FEAT Dead letter alerting: Atomic counter with configurable threshold, error-level logging, and `GET /api/stats/delivery` endpoint.
FIX Signal adapter async/sync mutex (Critical): Replaced `std::sync::Mutex<VecDeque>` with bounded `tokio::sync::mpsc` channel to eliminate runtime thread blocking.
FIX Signal adapter rate limiting: Added `governor::RateLimiter` (5 req/s default) to prevent signal-cli daemon DoS.
FIX Delivery queue error detection: `is_permanent_error()` now extracts HTTP status codes first (429=transient, 4xx=permanent, 5xx=transient).
FIX Plugin catalog: Registry manifest now includes `plugins` section; empty catalog shows helpful message instead of hard error.
CHORE Formatter single-pass optimization: Replaced 3-allocation `strip→clean→collapse` chain with single-pass `clean_content()` across all formatters.
CHORE Metascore weight redistribution: Adjusted routing weights to accommodate new `category_fit` dimension (0.15).
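
The status-code rule in the delivery queue fix (429 transient, other 4xx permanent, 5xx transient) is small enough to sketch directly. The code below is an illustration only: the helper names, the way a status code is pulled out of an error message, and the fallback for messages without a status code are all assumptions, not the behavior of the real `is_permanent_error()`.

```rust
/// Find the first three-digit token in the message that looks like an HTTP status.
fn extract_status(message: &str) -> Option<u16> {
    message
        .split(|c: char| !c.is_ascii_digit())
        .filter(|token| token.len() == 3)
        .filter_map(|token| token.parse::<u16>().ok())
        .find(|code| (100..=599).contains(code))
}

/// Apply the classification from the changelog entry.
fn is_permanent_status(status: u16) -> bool {
    match status {
        429 => false,       // rate limited: retry later
        400..=499 => true,  // other client errors: retrying cannot help
        500..=599 => false, // server errors: transient
        _ => false,         // assumption: anything else is treated as transient
    }
}

/// Hypothetical wrapper: check the status code first, as the entry describes.
fn is_permanent_error_sketch(message: &str) -> bool {
    match extract_status(message) {
        Some(code) => is_permanent_status(code),
        None => false, // assumption: no status code means "retry"
    }
}
```

Checking 429 before the general 4xx arm matters, because match arms are tried in order and 429 would otherwise be classified as permanent.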

v0.9.9

Terminal UX & Release Hardening (8 changes) · 2026-03-18

Added: 6 changes. Changed: 1 change. Fixed: 1 change. Key changes: New `roboticus tui` terminal application, configurable context budget tiers, integrations management endpoints/CLI, tool output noise filtering, and dashboard configuration/routing UX improvements.

Highlights

  • Terminal UI: Added `roboticus tui` (`roboticus-tui` crate) with chat, logs, status bar, streaming responses, and session resume.
  • Context budget tuning: Added configurable L0-L3 token budgets and per-channel minimum complexity level controls.
  • Integrations management: Added `POST /api/channels/{platform}/test`, dashboard per-channel probes, and `roboticus integrations` CLI commands.
  • Tool output filter chain: Added ANSI/progress/duplicate/whitespace filtering before LLM observation to reduce token noise.
  • Dashboard and routing polish: Exposed unconfigured sections with enable actions and improved routing profile validation/toasts/defaults.
  • Default bind address: Changed defaults from `127.0.0.1` to `localhost` for safer local consistency.
FEAT Terminal user interface (`roboticus tui`): New `roboticus-tui` crate with chat/log/status UX, streaming responses, and session create/resume support.
FEAT Context budget tuning: Configurable `[context_budget]` tiers with dashboard sliders and per-channel minimum complexity level.
FEAT Integrations management: Added channel probe endpoint (`POST /api/channels/{platform}/test`), dashboard integrations panel controls, and `roboticus integrations` CLI group.
FEAT Tool output noise filter: Introduced `ToolOutputFilterChain` with ANSI strip, progress-line filtering, duplicate-line dedupe, and whitespace normalization.
FEAT Dashboard config exposure: Unconfigured sections/channels now render in the dashboard with explicit enable actions.
FEAT Routing profile polish: Added >1.0 weight validation warning, apply toast, and default profile display when unset.
CHORE Default bind address: Switched defaults/docs from `127.0.0.1` to `localhost` (loopback literals retained where RFC-required).
FIX Windows script-runner tests: Added `#[cfg(unix)]` guard around Unix-only permissions test module to prevent Windows compile failures.
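
The four stages of the tool output noise filter (ANSI strip, progress-line filtering, duplicate-line dedupe, whitespace normalization) can be sketched as one pass over the lines of a tool's output. This is not the `ToolOutputFilterChain` implementation: the function names are hypothetical, the progress heuristic (a line ending in `%`) is an assumption, and deduplication here only covers consecutive identical lines.

```rust
/// Remove ANSI CSI escape sequences such as "\x1b[32m".
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\x1b' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes up to and including the final byte (0x40..=0x7e).
            for n in chars.by_ref() {
                if ('\x40'..='\x7e').contains(&n) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Assumed heuristic: percentage-only progress updates end with '%'.
fn is_progress_line(line: &str) -> bool {
    line.ends_with('%')
}

/// Apply all four filters and return the reduced text.
fn filter_tool_output(raw: &str) -> String {
    let cleaned = strip_ansi(raw);
    let mut kept: Vec<String> = Vec::new();
    for line in cleaned.lines() {
        // Whitespace normalization: collapse runs of whitespace and trim the ends.
        let normalized = line.split_whitespace().collect::<Vec<_>>().join(" ");
        if normalized.is_empty() || is_progress_line(&normalized) {
            continue;
        }
        // Drop a line that repeats the previously kept line.
        if kept.last().map(String::as_str) == Some(normalized.as_str()) {
            continue;
        }
        kept.push(normalized);
    }
    kept.join("\n")
}
```

Stripping escape sequences first keeps the later stages simple, since colored duplicates and colored progress lines then compare equal to their plain forms.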

v0.9.8

Platform Refactor & Reliability (22 changes) · 2026-03-16

Added: 7 changes. Fixed: 10 changes. Changed: 5 changes. Key changes: server crate split (`roboticus-cli`/`roboticus-api`/slim server), unified error model, channel adapter helper extraction, model categorization/router spec, and broad hardening around SQL safety, session integrity, config generation, and silent error triage.

Highlights

  • Server crate split: Decomposed the large server crate into `roboticus-cli`, `roboticus-api`, and a slim `roboticus` runtime bootstrap.
  • Error type unification: Consolidated into a nested `RoboticusError` hierarchy with clean `From` conversions.
  • Channel adapter helper extraction: Added shared formatter/chunking/allowlist helpers across channel adapters.
  • Model categorization spec: Added 10-category task taxonomy with routing/orchestrator integration points.
  • SQL safety + session integrity: Hardened `drop_column` identifier handling and fixed session `find_or_create` error swallowing.
  • Config and platform reliability: Fixed Windows TOML path escaping and converted silent failures into explicit logging/warnings.
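
A nested error hierarchy with `From` conversions lets the `?` operator lift a subsystem error into the top-level type without manual mapping. The sketch below shows the general shape using only the standard library. Only the `RoboticusError` name comes from this changelog: the variant names, the sub-error types, and the two functions are hypothetical.

```rust
use std::fmt;

// Hypothetical subsystem errors.
#[derive(Debug)]
enum DbError {
    NotFound(String),
}

#[derive(Debug)]
enum ChannelError {
    RateLimited,
}

/// Top-level error that nests the subsystem errors.
#[derive(Debug)]
enum RoboticusError {
    Db(DbError),
    Channel(ChannelError),
}

impl From<DbError> for RoboticusError {
    fn from(e: DbError) -> Self {
        RoboticusError::Db(e)
    }
}

impl From<ChannelError> for RoboticusError {
    fn from(e: ChannelError) -> Self {
        RoboticusError::Channel(e)
    }
}

impl fmt::Display for RoboticusError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RoboticusError::Db(DbError::NotFound(what)) => write!(f, "database: {what} not found"),
            RoboticusError::Channel(ChannelError::RateLimited) => write!(f, "channel: rate limited"),
        }
    }
}

impl std::error::Error for RoboticusError {}

fn load_session(id: &str) -> Result<String, DbError> {
    if id.is_empty() {
        Err(DbError::NotFound("session".to_string()))
    } else {
        Ok(format!("session:{id}"))
    }
}

fn handle_turn(id: &str) -> Result<String, RoboticusError> {
    // `?` converts the DbError into RoboticusError through the From impl above.
    let session = load_session(id)?;
    Ok(session)
}
```

The hand-written `Display` and `From` impls are only there to keep the sketch dependency-free; a derive-based approach would generate the same conversions.
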
FEAT Server crate split: Split `roboticus-server` into `roboticus-cli`, `roboticus-api`, and slim `roboticus` runtime entrypoint.
FEAT Error type unification: Introduced nested `RoboticusError` hierarchy with `thiserror` + `From` conversion coverage.
FEAT Channel adapter helpers: Added shared `ChannelFormatter::format()`, `chunk_message()`, and allowlist checks.
FEAT Model categorization spec: Added 10-category task taxonomy and integration points for routing/orchestrator flows.
FEAT `--json` listing support: Added structured JSON output to all listing-style CLI commands.
FEAT `/health` alias and `/dashboard` redirect: Added convenience endpoint routing for operators.
FEAT Config backup management: Moved backups to `./backups/` with retention by count and age.
FIX SQL injection hardening in `drop_column`: Enforced identifier validation + safe quoting.
FIX Session `find_or_create`: Stopped swallowing real DB failures by removing `.ok()`-based fallback behavior.
FIX Keystore refresh observability: Converted silent refresh failures into explicit `tracing::warn` logs.
FIX Windows TOML path escaping: Added path normalization for generated config values on Windows.
FIX Silent error triage: Implemented explicit logging/warning tiers for previously silent failure paths.
CHORE Dependency cleanup: Removed 19 unused dependencies left over from crate split transitions.
CHORE `Wallet::test_mock()` feature gate: Restricted to `test-support` feature and excluded from production artifacts.
CHORE Dead code and duplicate test utilities cleanup: Consolidated shared helpers and removed stale paths.
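
Identifier validation plus quoting, as named in the `drop_column` hardening entry, is needed because SQL identifiers cannot be bound as statement parameters. The following is a minimal sketch of that idea, not the project's code: the function names and the exact accepted character set (ASCII letters, digits, underscore, no leading digit) are assumptions.

```rust
/// Accept only conservative identifiers, then wrap them in double quotes.
fn quote_identifier(name: &str) -> Result<String, String> {
    let mut chars = name.chars();
    // First character: ASCII letter or underscore (an empty name fails here).
    let valid_start = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
    // Remaining characters: ASCII letters, digits, or underscore.
    let valid_rest = chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    if !valid_start || !valid_rest {
        return Err(format!("invalid SQL identifier: {name:?}"));
    }
    Ok(format!("\"{name}\""))
}

/// Build the statement only from validated, quoted identifiers.
fn drop_column_sql(table: &str, column: &str) -> Result<String, String> {
    Ok(format!(
        "ALTER TABLE {} DROP COLUMN {}",
        quote_identifier(table)?,
        quote_identifier(column)?
    ))
}
```

Because validation rejects quotes, spaces, and semicolons outright, the quoting step never has to escape anything; it is there so that reserved words used as column names still parse.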

v0.9.7

Bug Fixes & Stability (26 changes) · 2026-03-14

Added: 8 changes. Fixed: 18 changes.

FEAT DB fitness hardening (DF-1–DF-18): 18-item SQLite performance audit resolved: retention pruning for 5 high-growth tables, orphan cleanup sweeps (working memory + embeddings), `auto_vacuum=INCREMENTAL`, 6 missing indexes, episodic dead-entry pruning, cache NULL-expiry fix, `PRAGMA synchronous=NORMAL` under WAL, CHECK constraints on 11 columns, and dead `proxy_stats` table removal.
FEAT Memory hygiene mechanic: `roboticus mechanic` detects and (with `--repair`) purges contaminated memory entries using 7 deterministic LIKE-prefix patterns across 3 tiers, with JSON-structured findings.
FEAT Sandbox boundary management: Filesystem confinement for skill scripts (skills_dir + `$ROBOTICUS_WORKSPACE`, no traversal/symlink escape), configurable network isolation (`unshare(CLONE_NEWNET)` on Linux), memory ceiling via `RLIMIT_AS`, interpreter allowlist via absolute-path resolution, and mechanic sandbox health reporting.
FEAT Filesystem security overhaul: `FilesystemSecurityConfig` with `workspace_only` mode, ~25 default protected path patterns, `tool_allowed_paths` whitelist (auto-populated from Obsidian vault path), macOS `sandbox-exec` write-denial confinement, and dashboard UI toggles.
FEAT Unified pipeline architecture: `IntentRegistry` (22-variant `Intent` enum), `GuardChain` (12 guards with `full()`/`cached()`/`streaming()` presets), `ShortcutDispatcher` (15 handlers replacing 983-line god function), `PipelineConfig` (4 presets: `api`/`streaming`/`channel`/`cron`), and `DedupGuard` RAII replacing 11 manual release patterns. Net ~653 lines removed.
FEAT ChannelFormatter trait: Per-platform output formatting with a static dispatch registry: `TelegramFormatter` (Markdown→MarkdownV2), `DiscordFormatter`, `WhatsAppFormatter`, `SignalFormatter`, `WebFormatter`, and `EmailFormatter`, all wired into the `channel_message.rs` delivery path. 31 unit tests.
FEAT Configurable inference timeouts: Per-provider `timeout_seconds` setting (`[providers.*.timeout_seconds]`) with 300-second default, surfaced in dashboard provider configuration.
FEAT Dashboard session ID copy button: One-click copy-to-clipboard for session IDs in the Sessions panel.
FIX Circuit breaker window reset: `record_failure()` now tracks `window_start` for rolling-window accumulation, so failures spaced ~60s apart correctly accumulate instead of resetting.
FIX Embedding auth for local providers: `EmbeddingConfig.is_local` skips API key resolution and auth headers for Ollama/llama.cpp.
FIX Cron `schedule_kind: "once"` support: Runtime maps "once" → "at" dispatch, calls `DurableScheduler::evaluate_at()`, auto-disables after single execution.
FIX Vault path whitelisting: `tool_allowed_paths` auto-populated from `obsidian.vault_path` during config normalization; workspace-only mode no longer blocks configured external paths.
FIX Fleet activity chart capacity model: Stacked area normalizes per-agent scores by `1/agentCount` with `fixedMax: 1.0`.
FIX Cache guard parity: `cached()` guard set now includes `SubagentClaim` + `LiteraryQuoteRetry` (previously missing).
FIX ExecutionTruthGuard: Tool-results bypass bug removed.
FIX Collapsible if lint: Updated `impl_core.rs` to use `if let` chain (edition 2024).
FIX Wallet RPC rate-limit backoff: `get_all_balances()` detects rate-limit error codes (`-32016`, `-32005`, `429`) and stops iterating remaining tokens instead of repeatedly hitting the provider.
FIX Cron once-type orphan jobs: Jobs with `schedule_kind: "once"` and no `schedule_expr` are now auto-disabled on first encounter instead of emitting a warning every 60s.
FIX Dashboard sidebar footer: Navigation bar footer now stays pinned to the bottom of the viewport (added `height: 100%` to sidebar container).
FIX Dashboard custom model Add button: Custom model text input row now has its own Add button; both Add buttons use a shared class selector.
FIX Telegram double-underscore italic: `__text__` was incorrectly emitted as Telegram underline instead of italic; the formatter now maps it to `_text_`.
FIX Config hot-reload path divergence: `normalize_paths()` and `merge_bundled_providers()` were skipped during hot-reload; reloaded configs now match boot-time normalization.
FIX Routing audit fixes: Fixed the attempt counter not incrementing on retry, `u32` truncation on cost metrics, and misleading timeout error message wording.
FIX Dashboard UI stall during inference: 4 `RwLock` guard-scope fixes release locks before async I/O, preventing cascading reader starvation.
FIX Cron semaphore hot-reload race: The semaphore was not released when the cron runtime reloaded config, causing phantom permit exhaustion. Also removed a dead `LlmService` method and consolidated locks in admin routes.
FIX Agent audit fixes: Fixed a tautological always-true test condition and a timeout hint parsing edge case, and removed an unreachable branch.
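
The circuit breaker fix hinges on remembering when the current failure window began, so that later failures inside the window add to the count. The sketch below shows one way to do that. It is an illustration under assumptions, not the project's breaker: only `record_failure` and `window_start` are names from the entry, and the struct layout, the threshold, and passing `now` explicitly (to keep the logic testable) are hypothetical.

```rust
use std::time::{Duration, Instant};

struct CircuitBreaker {
    window: Duration,
    threshold: u32,
    window_start: Option<Instant>,
    failures: u32,
}

impl CircuitBreaker {
    fn new(window: Duration, threshold: u32) -> Self {
        CircuitBreaker { window, threshold, window_start: None, failures: 0 }
    }

    /// Record a failure observed at `now`; returns true when the breaker should open.
    fn record_failure(&mut self, now: Instant) -> bool {
        match self.window_start {
            // Still inside the current window: keep accumulating.
            Some(start) if now.duration_since(start) <= self.window => self.failures += 1,
            // No window yet, or the previous window expired: start a fresh one.
            _ => {
                self.window_start = Some(now);
                self.failures = 1;
            }
        }
        self.failures >= self.threshold
    }
}
```

With a five-minute window and a threshold of three, failures at 0s, 60s, and 120s trip the breaker on the third call, whereas a failure arriving after the window has lapsed starts counting again from one.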

v0.9.6

New Features (14 changes) · 2026-03-12

Added: 14 changes.

FEAT Compliance-first self-funding control plane: Complete revenue opportunity lifecycle (intake → qualify → score → plan → fulfill → settle) with DB-backed restart safety, strategy-level scoring (confidence/effort/risk/priority/recommendation), feedback persistence per opportunity and summary by strategy, configurable post-settlement asset routing (default `PALM_USD`), EVM swap submission with tx-hash tracking and on-chain receipt reconciliation, tax payout lifecycle mirroring swap tasks, and operator-visible accounting (net profit, attributable costs, retained earnings, tax allocation) across API, CLI, and mechanic surfaces.
FEAT Revenue mechanic integration: Mechanic can probe, reconcile, and repair orphaned or stale revenue jobs and swap/tax reconciliation mismatches via `run_gateway_provider_and_revenue_checks` and `run_gateway_integrated_repair_sweep`.
FEAT Skills catalog: `PluginCatalog` with CLI flows (`roboticus skills catalog list/install/activate`) and API endpoints (`GET/POST /api/skills/catalog`, `/install`, `/activate`). Registry manifest fetch from remote URL.
FEAT Skill registry protocol: Migration 022 adds `version`, `author`, `registry_source` columns to skills table. Multi-registry support via `RegistrySource { name, url, priority, enabled }` with backward-compatible fallback from legacy single-URL `registry_url`.
FEAT Multi-registry fetch: Registry sync iterates all configured sources, namespaces skills as `{registry_name}/{skill_name}` for non-local sources, applies semver comparison to skip redundant downloads, and resolves conflicts by registry priority.
FEAT Learning loop closure: Agent now detects repeating multi-step tool sequences on session close and synthesizes reusable SKILL.md procedure files. `learned_skills` table (migration 021) tracks reinforcement history (success/failure counts, priority). `LearningConfig` exposes tunable thresholds for minimum sequence length, success ratio, priority boost/decay, and skill cap. Inspired by recent work on autonomous tool-use learning in LLM agents ([arXiv:2603.05344](https://arxiv.org/abs/2603.05344)).
FEAT Procedural failure recording: `record_procedural_failure()` (previously dead code in the DB layer) is now called from `ingest_turn()` when tool results indicate failure, closing the procedural memory feedback loop.
FEAT Skill priority adjustment: Governor `tick()` now runs `adjust_learned_skill_priorities()` after episodic decay: learned skills with high success ratios get priority boosts; those with poor ratios are decayed.
FEAT Skill subdirectory loading: `SkillLoader` now recurses into `learned/` subdirectory, loading machine-synthesized skills alongside hand-authored ones.
FEAT Progressive context compaction: 5-stage compaction (`Trim` → `Summarize` → `Archive` → `Evict` → `Emergency`) in `compact_before_archive()` with `CompactionStage::from_excess()` selector.
FEAT Decay-weighted episodic retrieval: `rerank_episodic_by_decay()` applies time-based decay at retrieval time, preventing stale context from dominating memory budget.
FEAT Instruction anti-fade micro-reminders: Event-driven system prompt reinforcement at agent decision points to combat instruction-following drift.
FEAT x402 autonomous payment: LLM HTTP client now handles `402 Payment Required` responses with autonomous on-chain payment and request retry.
FEAT Homebrew & Winget packaging: `release.yml` contains complete `update-homebrew` (SHA256 extraction, formula generation, tap push) and `update-winget` (`vedantmgoyal9/winget-releaser@v2`) jobs. Activation requires tap repo creation and secrets provisioning.
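
Decay-weighted retrieval, as described in the episodic retrieval entry, amounts to discounting each hit's relevance by its age before sorting. The sketch below uses an exponential half-life, which is an assumption: the changelog does not state the decay function, and the struct, field names, and `rerank_by_decay` are hypothetical stand-ins for `rerank_episodic_by_decay()`.

```rust
struct EpisodicHit {
    id: u32,
    relevance: f64, // similarity score from retrieval, higher is better
    age_hours: f64, // time since the memory was recorded
}

/// Re-order hits by relevance discounted with an exponential half-life.
fn rerank_by_decay(mut hits: Vec<EpisodicHit>, half_life_hours: f64) -> Vec<EpisodicHit> {
    // A hit loses half of its weight every `half_life_hours`.
    let score = |h: &EpisodicHit| h.relevance * 0.5f64.powf(h.age_hours / half_life_hours);
    // Sort descending by decayed score; total_cmp gives a total order on f64.
    hits.sort_by(|a, b| score(b).total_cmp(&score(a)));
    hits
}
```

With a 24-hour half-life, a ten-day-old hit with relevance 0.9 ranks below a one-hour-old hit with relevance 0.6, which is the "stale context should not dominate" effect the entry describes.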

v0.9.5

Improvements & More · 2026-03-06

Changed: 8 changes. Fixed: 5 changes. Note: 1 change.

CHORETerminology normalization: `soul_text` → `os_text`, `soul_history` → `os_personality_history` (migration 020) for firmware/OS terminology coherency.
CHORE Behavior soak hardening: `scripts/run-agent-behavior-soak.py` now includes regression checks for filesystem capability truthfulness, subagent capability response quality, and affirmative continuation quality, with rubric updates to score substantive outcomes over brittle phrase matching.
CHORE Roadmap/release traceability: `docs/releases/v0.9.5.md` and `docs/ROADMAP.md` updated with current v0.9.5 prep status for speculative execution, browser runtime support, CLI skill roadmap slice, and behavior continuity validation.
CHORE Architecture documentation: Added explicit v0.9.5-prep control/dataflow coverage for deterministic execution shortcuts and guarded response sanitization in `docs/architecture/roboticus-dataflow.md` and `docs/architecture/roboticus-sequences.md`.
CHORE Browser runtime continuity: Browser action execution now attempts a single stop/start session recovery when CDP disconnect/closed-socket errors are detected, limited to idempotent actions to avoid duplicate side effects on replay.
CHORE Autonomy turn-budget controls: Added configurable agent-level ReAct budget controls (`autonomy_max_react_turns`, `autonomy_max_turn_duration_seconds`) and wired enforcement into the runtime loop.
CHORE CLI adapter response contract: `run_script` now emits stable typed metadata (`adapter`, `schema_version`, `status`, `error_class`) and normalized script error classes for downstream handling.
CHORE Speculative policy invariants: Added explicit test coverage enforcing Safe-only speculative eligibility (Caution/Dangerous/Forbidden remain excluded from speculative execution).
FIX Internal protocol fallback leakage: response sanitization no longer surfaces protocol-placeholder fallback text; empty/degraded sanitized content now resolves through deterministic user-facing quality fallback.
FIX Markdown count execution reliability: execution shortcut path now handles recursive markdown-file count prompts deterministically, including strict numeric-only responses when requested (`count only` / `only the number` style prompts).
FIX Delegation shortcut boundary: markdown-count shortcut no longer hijacks explicitly delegated prompts, preserving delegation intent handling.
FIX Speculative branch cleanup safety: introduced RAII speculation slot guards and abort-path tests to guarantee no slot leakage when speculative tasks are canceled.
FIX CLI skill sandbox isolation coverage: added explicit tests that secret env vars are stripped while only allowlisted runtime vars are propagated under `skills.sandbox_env=true`.
CHORE $50 seed exercise deferred: The revenue infrastructure is proven via integration tests and the seed exercise plan is authored (`docs/releases/v0.9.6-seed-exercise.md`), but the exercise itself is deferred because the economic ecosystem for autonomous bot services is still nascent. The rails are in place; the market is not.
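
The RAII slot-guard fix above can be illustrated with a minimal sketch. This is not the project's implementation: `SlotPool`, `SlotGuard`, and `try_acquire` are hypothetical names chosen for illustration. The idea shown is only that a slot is released in `Drop`, so a canceled or aborted speculative task cannot leak it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical pool that tracks how many speculative slots are in use.
pub struct SlotPool {
    in_use: AtomicUsize,
    capacity: usize,
}

/// Holding a guard means holding one slot; dropping it frees the slot.
pub struct SlotGuard {
    pool: Arc<SlotPool>,
}

impl SlotPool {
    pub fn new(capacity: usize) -> Arc<Self> {
        Arc::new(Self { in_use: AtomicUsize::new(0), capacity })
    }

    /// Number of slots currently held by live guards.
    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }

    /// Claims a slot if one is free, otherwise returns `None`.
    pub fn try_acquire(pool: &Arc<SlotPool>) -> Option<SlotGuard> {
        let mut current = pool.in_use.load(Ordering::Acquire);
        loop {
            if current >= pool.capacity {
                return None;
            }
            // Retry the increment if another task changed the counter first.
            match pool.in_use.compare_exchange(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(SlotGuard { pool: Arc::clone(pool) }),
                Err(observed) => current = observed,
            }
        }
    }
}

impl Drop for SlotGuard {
    fn drop(&mut self) {
        // Runs on normal completion, early return, and unwinding alike.
        self.pool.in_use.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Because the release lives in `Drop`, callers never need an explicit "release" call on the cancel path; letting the guard go out of scope is enough.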

v0.9.4+hotfix.1

Improvements & More 2026-03-05

Added: 3 changes. Changed: 7 changes. Fixed: 2 changes. Security: 1 change. Key changes: Routing observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching. Model shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path). Agent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics. Routing dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.

Highlights

  • Routing observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching.
  • Model shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path).
  • Routing profile roadmap spec: Added `docs/roadmap/0.9.4/features/user-routing-profile-spider-graph.md` and linked roadmap entry.
  • Agent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics.
  • Routing dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.
  • Routing eval validation: `POST /api/models/routing-eval` now validates `cost_weight`, `accuracy_floor`, and `accuracy_min_obs` bounds.
  • Config defaults/tests: routing defaults now use `metascore`; legacy `heuristic` input is accepted and normalized to `metascore` during validation.
  • Cache integrity mode for live agent path: semantic near-match cache reuse is now disabled in the inference pipeline (`lookup_strict`: exact + tool-TTL only) to prevent instruction-mismatched cached responses.
FEAT Routing observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching.
FEAT Model shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path).
FEAT Routing profile roadmap spec: Added `docs/roadmap/0.9.4/features/user-routing-profile-spider-graph.md` and linked roadmap entry.
CHORE Agent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics.
CHORE Routing dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.
CHORE Routing eval validation: `POST /api/models/routing-eval` now validates `cost_weight`, `accuracy_floor`, and `accuracy_min_obs` bounds.
CHORE Config defaults/tests: routing defaults now use `metascore`; legacy `heuristic` input is accepted and normalized to `metascore` during validation.
CHORE Cache integrity mode for live agent path: semantic near-match cache reuse is now disabled in the inference pipeline (`lookup_strict`: exact + tool-TTL only) to prevent instruction-mismatched cached responses.
CHORE Path normalization parity: runtime `PUT /api/config` updates now apply the same tilde (`~`) path expansion as TOML load (`normalize_paths`), including multimodal, device, and knowledge source path fields.
CHORE Explicit config path behavior: `resolve_config_path(Some("~/..."))` now expands to the user home directory instead of preserving a literal `~`.
FIX Live startup migration deadlock on legacy DBs: database initialization/migration order no longer fails on `inference_costs.turn_id` index creation when the column is absent in legacy state.
FIX Migration 13 idempotency: routing v0.9.4 migration path now handles pre-existing `turn_id`/routing columns without `duplicate column` failures.
FIX Strict deny-by-default channels: adapters now reject traffic when allowlists are empty (`deny_on_empty=true`). Alpha update/mechanic flows are expected to repair channel allowlists during upgrade/install.
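
As a rough illustration of the tilde-expansion entries above, the sketch below expands a leading `~` against a home directory that the caller supplies. `expand_tilde` and its `home` parameter are hypothetical and are not the project's `normalize_paths` or `resolve_config_path`; passing the home directory in explicitly is an assumption made here to keep the example deterministic.

```rust
use std::path::{Path, PathBuf};

/// Expands a leading `~` or `~/` against `home`.
/// Anything else, including `~user/...`, is returned unchanged.
pub fn expand_tilde(input: &str, home: &Path) -> PathBuf {
    if input == "~" {
        return home.to_path_buf();
    }
    match input.strip_prefix("~/") {
        Some(rest) => home.join(rest),
        None => PathBuf::from(input),
    }
}
```

Applying one helper like this on both the file-load path and the runtime-update path is one way to get the parity the entry describes, since both entry points then normalize identically.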

v0.9.2

New Features & More 2026-03-02

Added: 15 changes. Changed: 7 changes. Removed: 4 changes. Key changes: Wiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared — all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit. Unified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points. `post_turn_ingest` Tool Results: All call sites now pass actual tool call name + result from the ReAct loop instead of `&[]`. Episodic memory captures tool-use context, improving digest quality. Gate System Note: `build_gate_system_note` now wired in both API and channel paths (previously channel-only).

Highlights

  • Wiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared — all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit.
  • Unified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points.
  • Multi-Tool Parsing: `parse_tool_calls` (plural) correctly parses multiple tool invocations from a single LLM response across all four provider formats.
  • OpenAI Responses + Google Tool Wiring: Bidirectional tool support for OpenAI Responses API and Google Generative AI — tool definitions translated into requests, structured tool calls parsed from responses with `{"tool_call": ...}` shim.
  • Quality Warm Start: `QualityTracker` is seeded from `inference_costs` on startup, eliminating cold-start assumptions for metascore routing.
  • Escalation Read Feedback: `EscalationTracker` acceptance history now feeds routing weight adjustments via `escalation_bias`, closing the feedback loop.
  • Approval Resume: Blocked tool calls are re-executed asynchronously after approval via `execute_tool_call_after_approval`.
  • Hippocampus (2.13): Self-describing schema map with auto-discovery of all system tables. Agent-created tables (`ag_<id>_*`) with access levels, row counts, and guardrails. Compact summary injected into system prompt (~200 tokens) for ambient storage awareness.
FEAT Wiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared, with all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit.
FEAT Unified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points.
FEAT Multi-Tool Parsing: `parse_tool_calls` (plural) correctly parses multiple tool invocations from a single LLM response across all four provider formats.
FEAT OpenAI Responses + Google Tool Wiring: Bidirectional tool support for OpenAI Responses API and Google Generative AI: tool definitions translated into requests, structured tool calls parsed from responses with `{"tool_call": ...}` shim.
FEAT Quality Warm Start: `QualityTracker` is seeded from `inference_costs` on startup, eliminating cold-start assumptions for metascore routing.
FEAT Escalation Read Feedback: `EscalationTracker` acceptance history now feeds routing weight adjustments via `escalation_bias`, closing the feedback loop.
FEAT Approval Resume: Blocked tool calls are re-executed asynchronously after approval via `execute_tool_call_after_approval`.
FEAT Hippocampus (2.13): Self-describing schema map with auto-discovery of all system tables. Agent-created tables (`ag_<id>_*`) with access levels, row counts, and guardrails. Compact summary injected into system prompt (~200 tokens) for ambient storage awareness.
FEAT Agent Data Tools: `CreateTable`, `AlterTable`, `DropTable` registered in ToolRegistry with hippocampus auto-registration, size limits, and reserved-name enforcement.
FEAT Document Ingestion Pipeline (3.5.5): `roboticus ingest <path>` CLI and `POST /api/knowledge/ingest` API. Supports `.md`, `.txt`, `.rs`, `.py`, `.js`, `.ts`, `.pdf` files. Parse → chunk (512 tokens, 64-token overlap) → embed → store in memory system.
FEAT IANA Timezone Support (1.18): Cron scheduler evaluates session reset schedules using IANA timezone identifiers. Conformance tests for DST transitions, sub-minute cron, timezone-prefixed expressions.
FEAT Inference Costs Extension: `latency_ms` (INTEGER), `quality_score` (REAL), `escalation` (BOOLEAN) columns added to `inference_costs` table. All inference calls now record latency and escalation state.
FEAT MCP Server Gateway: First plugin release. `RoboticusMcpHandler` bridges rmcp's `ServerHandler` to the ToolRegistry. External MCP clients (Claude Desktop, Cursor, VS Code) connect via StreamableHTTP, discover tools through `tools/list`, invoke through `tools/call`. All MCP tool calls run with `InputAuthority::External`.
FEAT Golden Test Fixtures: Deterministic golden files for delegation, delegation follow-up, echo follow-up, and echo tool-call pathways.
FEAT Tool-Call Shim Tests: Harness integration tests verifying the full structured tool_call → parse → execute → observation → follow-up pipeline.
CHORE `post_turn_ingest` Tool Results: All call sites now pass actual tool call name + result from the ReAct loop instead of `&[]`. Episodic memory captures tool-use context, improving digest quality.
CHORE Gate System Note: `build_gate_system_note` now wired in both API and channel paths (previously channel-only).
CHORE Shared Confidence Evaluator: `infer_with_fallback` uses the shared `LlmService.confidence` instance instead of creating a local copy.
CHORE Context Pruning: `needs_pruning()` → `soft_trim()` wired in `build_context` when assembled context exceeds the token budget.
CHORE Checkpoint Load: `load_checkpoint` called during inference preparation for session resume (previously write-only).
CHORE Importance Decay: `decay_importance` called from `SessionGovernor.tick()` after digest, preventing stale context accumulation.
CHORE CI Pipeline: Parallelized per-crate test execution and harness quick-test stages for faster CI runtime.
CHORE `SpawnManager`: Dead module removed (`spawning.rs` deleted, zero references). Virtual delegation tool pattern replaced it.
CHORE Dead Routing Surfaces: `uniroute.rs` (ModelVector, QueryRequirements, ModelVectorRegistry) deleted. Dead selector functions (`select_for_complexity`, `select_cheapest_qualified`, `select_for_quality_target`) removed. `ModelRouter` retained as active runtime override/fallback router.
CHORE `router_integration.rs`: Dead test module removed (tested deleted routing code).
CHORE `skills-roadmap-2026.md`: Superseded by `capabilities-roadmap-2026.md`.
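
The chunking step of the ingestion pipeline above (512 tokens with a 64-token overlap) can be sketched as a sliding window. `chunk_with_overlap` is a hypothetical helper, not the project's code, and it assumes the input has already been tokenized into a slice.

```rust
/// Splits `tokens` into windows of at most `size` items, where each window
/// starts `size - overlap` items after the previous one.
pub fn chunk_with_overlap<T: Clone>(tokens: &[T], size: usize, overlap: usize) -> Vec<Vec<T>> {
    assert!(size > 0 && overlap < size, "overlap must be smaller than size");
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + size).min(tokens.len());
        chunks.push(tokens[start..end].to_vec());
        if end == tokens.len() {
            break; // the last window reached the end of the input
        }
        start += step;
    }
    chunks
}
```

With `size = 512` and `overlap = 64`, consecutive chunks share their boundary tokens, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.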

v0.9.1

New Features 2026-03-02

Added: 6 changes. Changed: 2 changes. Key changes: Model Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`. Tiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry. Routing hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore — replacing the previous first-usable-model strategy. Rate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.

Highlights

  • Model Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`.
  • Tiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry.
  • Throttle Event Observability (1.17): New `GET /api/stats/throttle` endpoint exposes live rate-limit counters including global/per-IP/per-actor request counts, throttle tallies, and top-10 offenders. `ThrottleSnapshot` struct provides admin visibility into abuse patterns.
  • Quality Tracking: `QualityTracker` now records observations on every inference success with a heuristic quality signal (response structure, finish reason, latency). Exponential moving average feeds into metascore efficacy dimension.
  • Audit Trail Extensions: `ModelSelectionAudit` now includes `metascore_breakdown` (full per-dimension scores) and `complexity_score` for routing decisions. `ModelCandidateAudit` includes per-candidate metascores.
  • Profile module (`roboticus-llm::profile`): `ModelProfile`, `MetascoreBreakdown`, `build_model_profiles()`, `select_by_metascore()` — 9 unit tests covering local/cloud task routing, cold-start penalties, cost-aware selection, blocked model filtering, and deterministic tie-breaking.
  • Routing hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore — replacing the previous first-usable-model strategy.
  • Rate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.
FEAT Model Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`.
FEAT Tiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry.
FEAT Throttle Event Observability (1.17): New `GET /api/stats/throttle` endpoint exposes live rate-limit counters including global/per-IP/per-actor request counts, throttle tallies, and top-10 offenders. `ThrottleSnapshot` struct provides admin visibility into abuse patterns.
FEAT Quality Tracking: `QualityTracker` now records observations on every inference success with a heuristic quality signal (response structure, finish reason, latency). Exponential moving average feeds into metascore efficacy dimension.
FEAT Audit Trail Extensions: `ModelSelectionAudit` now includes `metascore_breakdown` (full per-dimension scores) and `complexity_score` for routing decisions. `ModelCandidateAudit` includes per-candidate metascores.
FEAT Profile module (`roboticus-llm::profile`): `ModelProfile`, `MetascoreBreakdown`, `build_model_profiles()`, `select_by_metascore()`, with 9 unit tests covering local/cloud task routing, cold-start penalties, cost-aware selection, blocked model filtering, and deterministic tie-breaking.
CHORE Routing hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore, replacing the previous first-usable-model strategy.
CHORE Rate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.
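
The exponential moving average mentioned in the Quality Tracking entry can be sketched as follows. `QualityEma` and the smoothing factor `alpha` are hypothetical; the changelog does not state the actual `QualityTracker` fields or the weight it uses, so treat this as the textbook update rule rather than the project's code.

```rust
/// Hypothetical tracker that keeps an exponential moving average of a
/// quality signal using `ema = alpha * sample + (1 - alpha) * ema`.
pub struct QualityEma {
    alpha: f64,
    value: Option<f64>,
}

impl QualityEma {
    pub fn new(alpha: f64) -> Self {
        assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0, 1]");
        Self { alpha, value: None }
    }

    /// Folds one observation into the average and returns the new value.
    /// The first observation seeds the average directly.
    pub fn observe(&mut self, sample: f64) -> f64 {
        let next = match self.value {
            None => sample,
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
        };
        self.value = Some(next);
        next
    }

    /// Current average, or `None` before any observation (cold start).
    pub fn value(&self) -> Option<f64> {
        self.value
    }
}
```

A larger `alpha` makes the average react faster to recent responses; a smaller one smooths out individual outliers. Returning `None` before the first sample keeps the cold-start case explicit for whatever consumes the score.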

v0.8.9

Bug Fixes & Stability (17 changes) 2026-03-01

Security: 3 changes. Fixed: 14 changes. Key changes: HIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call. HIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit. HIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms. HIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.

Highlights

  • HIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call.
  • HIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit.
  • HIGH: Relaxed atomic ordering: Cross-task flags and counters using `Ordering::Relaxed` upgraded to `Acquire`/`Release`/`AcqRel` to ensure correct visibility guarantees across async task boundaries.
  • HIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms.
  • HIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.
  • HIGH: Dead-letter replay race: Two locks acquired non-atomically during message replay could interleave with concurrent deliveries. Now holds both locks in a single scope.
  • HIGH: ReAct tool errors bypass scan_output: Error messages from tool execution were returned directly to the model without content scanning. Now calls `scan_output()` on tool error strings.
  • HIGH: derive_nickname Unicode panic: `&text[prefix.len()..]` applied a byte offset from a lowercased string to the original, panicking on multi-byte characters. Now uses `char_indices().nth()` for safe boundary detection.
FIX HIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call.
FIX HIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit.
FIX HIGH: Relaxed atomic ordering: Cross-task flags and counters using `Ordering::Relaxed` upgraded to `Acquire`/`Release`/`AcqRel` to ensure correct visibility guarantees across async task boundaries.
FIX HIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms.
FIX HIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.
FIX HIGH: Dead-letter replay race: Two locks acquired non-atomically during message replay could interleave with concurrent deliveries. Now holds both locks in a single scope.
FIX HIGH: ReAct tool errors bypass scan_output: Error messages from tool execution were returned directly to the model without content scanning. Now calls `scan_output()` on tool error strings.
FIX HIGH: derive_nickname Unicode panic: `&text[prefix.len()..]` applied a byte offset from a lowercased string to the original, panicking on multi-byte characters. Now uses `char_indices().nth()` for safe boundary detection.
FIX MED: WebSocket idle timeout missing: `handle_socket` had no timeout, so idle clients held file descriptors and broadcast receivers indefinitely. Now sends ping every 30s with a 90s idle timeout.
FIX MED: Web path bypasses decomposition gate: `evaluate_decomposition_gate` was only called in `process_channel_message`, not in the web API's `agent_message`. Extracted into a shared helper called from both paths.
FIX MED: Agent processing invisible in logs: Neither `agent_message` nor `process_channel_message` logged entry spans. Added `info!` spans with session_id and channel at function entry.
FIX MED: --json flag ignored: The `--json` CLI flag was only threaded to `cmd_defrag`. Now threaded to `cmd_status` and other output-producing commands.
FIX MED: Config capabilities empty: `/api/config/capabilities` returned an empty `immutable_sections` list. Now populated with `["server", "treasury", "a2a", "wallet"]`.
FIX MED: config get returns stale TOML: `roboticus config get` read from the on-disk TOML even when the server was running with different runtime values. Now tries the live API first, falling back to TOML when offline.
FIX MED: A2A missing from channel status: `/api/channels/status` omitted the A2A channel. Now includes a hardcoded A2A entry reading enabled/listening state from server state.
FIX MED: Dashboard scheduler hardcodes agent_id: The scheduler panel used a hardcoded `agent_id: 'roboticus'` instead of the active agent. Now uses `App._activeAgentId`.
FIX LOW: Missing #[must_use] annotations: Added `#[must_use]` to 8 builder/constructor methods across `speculative.rs`, `actions.rs`, and `knowledge.rs` to prevent accidental discard of return values.
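
The `derive_nickname` fix above comes down to never applying a byte offset computed on one string to a different string. A minimal sketch of the safe pattern follows; `strip_prefix_ignore_case` is a hypothetical helper, not the project's function, and comparing characters via `to_lowercase()` is an assumption about how the case-insensitive match is done.

```rust
/// Removes `prefix` from the start of `text`, ignoring case, and returns the
/// remainder. The cut position is found on `text` itself via `char_indices`,
/// so it always lands on a valid UTF-8 boundary.
pub fn strip_prefix_ignore_case<'a>(text: &'a str, prefix: &str) -> Option<&'a str> {
    let mut text_chars = text.chars();
    for p in prefix.chars() {
        let t = text_chars.next()?; // text is shorter than the prefix
        if !t.to_lowercase().eq(p.to_lowercase()) {
            return None;
        }
    }
    // Byte index of the first character after the prefix, measured in `text`.
    let prefix_len = prefix.chars().count();
    let cut = text
        .char_indices()
        .nth(prefix_len)
        .map(|(index, _)| index)
        .unwrap_or(text.len());
    Some(&text[cut..])
}
```

Lowercasing can change a string's byte length, which is why an offset taken from the lowercased copy may fall inside a multi-byte character of the original and cause a slicing panic.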

v0.8.8

Security Hardening (39 changes) 2026-03-01

Security: 13 changes. Fixed: 26 changes. Key changes: HIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history. HIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content. HIGH: Float policy bypass: Policy enforcement on `amount` fields now falls back to `as_f64()` conversion, closing a bypass where float amounts evaded integer-only checks. HIGH: Tool call parsing failures: `parse_tool_call` now uses `rfind` with a candidate loop, correctly parsing tool calls that contain the delimiter character in arguments.

Highlights

  • HIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history.
  • HIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content.
  • HIGH: Provider error info leak: `classify_provider_error` in `run_llm_analysis` now strips internal details from error responses before returning to callers.
  • MED: XSS in sanitize_html: `sanitize_html` now escapes all 5 OWASP-recommended HTML entities (`& < > " '`), closing a reflected XSS vector.
  • MED: Input validation on identifiers: `peer_id`, `group_id`, and `channel` fields now enforce length and character-set constraints, preventing injection of oversized or malformed identifiers.
  • MED: Webhook body size limit: Public webhook router now applies `DefaultBodyLimit` to prevent memory exhaustion from oversized payloads.
  • MED: Analysis route DoS protection: Analysis routes now apply `ConcurrencyLimitLayer(3)` to prevent resource exhaustion from concurrent expensive LLM calls.
  • MED: Config schema leak: `update_config` error responses now return a generic message instead of leaking internal schema details.
FIX HIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history.
FIX HIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content.
FIX HIGH: Provider error info leak: `classify_provider_error` in `run_llm_analysis` now strips internal details from error responses before returning to callers.
FIX MED: XSS in sanitize_html: `sanitize_html` now escapes all 5 OWASP-recommended HTML entities (`& < > " '`), closing a reflected XSS vector.
FIX MED: Input validation on identifiers: `peer_id`, `group_id`, and `channel` fields now enforce length and character-set constraints, preventing injection of oversized or malformed identifiers.
FIX MED: Webhook body size limit: Public webhook router now applies `DefaultBodyLimit` to prevent memory exhaustion from oversized payloads.
FIX MED: Analysis route DoS protection: Analysis routes now apply `ConcurrencyLimitLayer(3)` to prevent resource exhaustion from concurrent expensive LLM calls.
FIX MED: Config schema leak: `update_config` error responses now return a generic message instead of leaking internal schema details.
FIX MED: Feedback comment size limit: `FeedbackRequest.comment` now enforces a 4096-character cap, preventing oversized payloads from reaching storage.
FIX MED: Config allowlist tightening: Removed `extra_headers` from the `get_config` response allowlist, preventing exposure of sensitive header values.
FIX LOW: Unsafe UTF-8 decode: Replaced `from_utf8_unchecked` with safe `from_utf8` to prevent undefined behavior on malformed input.
FIX LOW: Embedding test env isolation: Embedding test uses a unique env var name with a SAFETY comment to prevent cross-test interference.
FIX LOW: Path traversal defense-in-depth: `obsidian_read` now validates paths against directory traversal patterns as an additional defense layer.
FIX HIGH: Float policy bypass: Policy enforcement on `amount` fields now falls back to `as_f64()` conversion, closing a bypass where float amounts evaded integer-only checks.
FIX HIGH: Tool call parsing failures: `parse_tool_call` now uses `rfind` with a candidate loop, correctly parsing tool calls that contain the delimiter character in arguments.
FIX HIGH: Unicode string metric: `common_prefix_ratio` now operates on `chars()` instead of byte slices, producing correct ratios for multi-byte characters.
FIX HIGH: Incorrect P50 latency: `latency_p50` now computes the true median by averaging the two middle values for even-length arrays.
FIX HIGH: Speculation cache collisions: `SpeculationKey` now stores full parameter JSON instead of using `DefaultHasher`, which was not stable across processes and caused incorrect cache hits.
FIX HIGH: WhatsApp adapter panic: `WhatsAppAdapter::new` now returns `Result<Self>` instead of panicking on initialization failures.
FIX HIGH: Export agents silent failure: `export_agents` now matches on `Result` and propagates errors instead of silently dropping them.
FIX HIGH: Inference cost logging: `record_inference_cost` now uses `inspect_err` to log failures instead of silently discarding them with `.ok()`.
FIX MED: Turn count inflation: `turn_count` now only increments on `Think` state transitions, fixing 2-3x count inflation from duplicate counting.
FIX MED: Archive truncation: `compact_before_archive` now fetches all messages instead of being capped at 20, preventing data loss during session archival.
FIX MED: URL decoder corruption: `%XX` decoder now preserves characters on invalid hex sequences instead of silently dropping them.
FIX MED: Task handoff stalls: Handoff logic now skips `Failed` tasks to find the next `Pending` task, preventing the scheduler from stalling on failed work.
FIX MED: Config write propagation: `write_defaults` now propagates errors with `?` instead of silently discarding them with `.ok()`.
FIX MED: Cron validation logging: Invalid cron expressions now log a warning before returning `false`, replacing a silent rejection.
FIX MED: Wallet passphrase fallthrough: An incorrect `ROBOTICUS_WALLET_PASSPHRASE` now produces a hard error instead of silently falling through to the default passphrase.
FIX MED: Config/session export errors: `to_string_pretty` failures in config/session export now return proper error responses instead of empty bodies.
FIX MED: Corrupt skills warning: Corrupt `skills_json` values now log a warning instead of being silently ignored.
FIX MED: Translation request errors: `translate_request` failures now return HTTP 500 with a proper error body instead of an empty response.
FIX MED: Translation response errors: `translate_response` failures now return HTTP 502 with a descriptive message instead of `"(no response)"`.
FIX LOW: Loop detection consolidation: Removed redundant `is_looping` pre-check, consolidating loop detection into a single code path.
FIX LOW: Archive count accuracy: `rotate_agent_scope_sessions` now returns the actual archived count instead of a potentially incorrect value.
FIX LOW: Token parse overflow: Token parsing now uses saturating `u32` casts, capping at `u32::MAX` instead of panicking on overflow.
FIX LOW: Subtask dedup ordering: `split_subtasks` now uses a `HashSet` of already-seen items for order-preserving deduplication instead of unstable dedup.
FIXLOW: Session row corruption logging: Corrupted session rows now log a warning instead of being silently dropped during iteration.
FIXLOW: DB error logging for cost queries: Database errors in turn-query average cost calculations are now logged instead of silently ignored.
FIXLOW: Defrag read error handling: File defragmentation now skips files on read error with a warning instead of substituting an empty string.
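
The URL decoder entry in this list can be illustrated with a small sketch. The function name `percent_decode_lossless` is hypothetical and this is not the runtime's actual decoder; it only shows one way to decode valid `%XX` escapes while leaving a `%` that is not followed by two hex digits in place instead of dropping it.

```rust
/// Hypothetical decoder: decodes valid `%XX` escapes and leaves anything else untouched.
fn percent_decode_lossless(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A valid escape needs two more bytes after the '%'.
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        // Not a valid escape: keep the byte as-is instead of dropping it.
        out.push(bytes[i]);
        i += 1;
    }
    // Decoded bytes may not be valid UTF-8; replace invalid sequences rather than panic.
    String::from_utf8_lossy(&out).into_owned()
}
```

With this sketch, `percent_decode_lossless("a%20b")` yields `a b`, while `percent_decode_lossless("100%zz")` returns the input unchanged.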

v0.8.7

Bug Fixes & Stability (21 changes) 2026-02-28

Fixed: 19 changes. Added: 2 changes. Key changes: CRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail — all cron-scheduled jobs were silently skipped. HIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter. Release notes for v0.8.5 and v0.8.6 (missing from previous releases, blocking release doc gate). Roadmap section 1.24: Built-in CLI Agent Skills (Claude Code + Codex CLI).
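
The `chunk_message` panic described above comes from slicing a string at a byte offset that falls inside a multi-byte character. The following is a minimal sketch of boundary-safe chunking, not the Telegram adapter's code. The names `floor_boundary` and `chunk_on_char_boundaries` are hypothetical, and `floor_boundary` is a hand-written stand-in for `floor_char_boundary()` built on `str::is_char_boundary`.

```rust
/// Largest index <= `index` that lies on a UTF-8 character boundary of `s`.
/// Hand-written stand-in for `str::floor_char_boundary`.
fn floor_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Splits `text` into chunks of at most `max_bytes` bytes without cutting a character in half.
fn chunk_on_char_boundaries(text: &str, max_bytes: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        let mut cut = floor_boundary(rest, max_bytes);
        if cut == 0 {
            // `max_bytes` is smaller than the first character; emit that character whole.
            cut = rest.chars().next().map_or(rest.len(), char::len_utf8);
        }
        let (head, tail) = rest.split_at(cut);
        chunks.push(head);
        rest = tail;
    }
    chunks
}
```

A chunk can be shorter than `max_bytes` when the limit lands inside a character, and concatenating the chunks always reproduces the input.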

Highlights

  • CRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail — all cron-scheduled jobs were silently skipped.
  • HIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter.
  • HIGH: Keystore redact_key_name UTF-8 panic: Byte-level `&key[..3]` slicing panicked on multi-byte key names. Now uses `key.chars().take(3)`.
  • HIGH: LLM forward_stream missing query: auth mode: Streaming requests to providers using query-string authentication (e.g., Google Generative AI) failed because the `query:` prefix was not handled, sending it as a literal HTTP header instead.
  • HIGH: yield_engine U256-to-u64 panic: `real_a_token_balance` panicked via `U256::to::<u64>()` if an aToken balance exceeded `u64::MAX`. Now uses safe `try_into::<u128>()`.
  • HIGH: yield_engine amount_to_raw saturation: `amount_to_raw` silently saturated USDC amounts above ~$18.4B via unchecked `f64 -> u64` cast. Now explicitly clamps.
  • MED: Email adapter SMTP relay panic: `EmailAdapter::new` panicked via `.expect()` on invalid SMTP hostname. Now returns `Result`.
  • MED: Email adapter mutex panics: `push_message`/`recv` used `.expect("mutex poisoned")`. Now uses `.unwrap_or_else(|e| e.into_inner())` for poison recovery, matching other adapters.
FIX CRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail, so all cron-scheduled jobs were silently skipped.
FIX HIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter.
FIX HIGH: Keystore redact_key_name UTF-8 panic: Byte-level `&key[..3]` slicing panicked on multi-byte key names. Now uses `key.chars().take(3)`.
FIX HIGH: LLM forward_stream missing query: auth mode: Streaming requests to providers using query-string authentication (e.g., Google Generative AI) failed because the `query:` prefix was not handled, sending it as a literal HTTP header instead.
FIX HIGH: yield_engine U256-to-u64 panic: `real_a_token_balance` panicked via `U256::to::<u64>()` if an aToken balance exceeded `u64::MAX`. Now uses safe `try_into::<u128>()`.
FIX HIGH: yield_engine amount_to_raw saturation: `amount_to_raw` silently saturated USDC amounts above ~$18.4B via unchecked `f64 -> u64` cast. Now explicitly clamps.
FIX MED: Email adapter SMTP relay panic: `EmailAdapter::new` panicked via `.expect()` on invalid SMTP hostname. Now returns `Result`.
FIX MED: Email adapter mutex panics: `push_message`/`recv` used `.expect("mutex poisoned")`. Now uses `.unwrap_or_else(|e| e.into_inner())` for poison recovery, matching other adapters.
FIX MED: Discord GatewayConnection mutex panics: All 4 accessor methods used `.expect("mutex poisoned")`. Now uses poison recovery matching the rest of the Discord adapter.
FIX MED: CDP client initialization panic: `CdpClient::new` panicked via `.expect()` on TLS cert issues. Now returns `Result`.
FIX MED: Embedding URL double API key: When both Google format and `query:` auth were active, the API key was appended twice. Made the two paths mutually exclusive.
FIX MED: Embedding URL missing percent-encoding: API keys were interpolated into URLs without encoding. Now uses `pct_encode_query_value`.
FIX MED: Hippocampus Unicode/ASCII mismatch: `create_agent_table` allowed Unicode alphanumeric characters but `drop_agent_table` required ASCII-only, creating undeletable tables. Both now require ASCII.
FIX MED: Skills reload counters wrong on failure: `added`/`updated` counters incremented even when DB operations failed. Now only increment on success.
FIX MED: Skills rollback silent failures: File rollback operations silently discarded errors with `let _ =`. Now log errors at error level.
FIX LOW: sanitize_platform mixed byte/char units: Truncation used `.chars().take()` (char count) after a `.len()` (byte count) guard. Now truncates at byte boundary consistently.
FIX LOW: mock_tx_hash f64 saturation: Used `amount * 1e18` (overflows u64 above ~18.4). Changed to USDC scale (1e6).
FIX LOW: Session model column never populated: `update_model()` was not called after LLM routing, leaving the `sessions.model` column perpetually NULL.
FIX LOW: Moonshot/Kimi tier misclassified: `classify()` in `tier.rs` did not match `moonshot` or `kimi` substrings, causing Kimi K2 models to fall through to the T2 default instead of T3.
FEAT Release notes for v0.8.5 and v0.8.6 (missing from previous releases, blocking release doc gate).
FEAT Roadmap section 1.24: Built-in CLI Agent Skills (Claude Code + Codex CLI).
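
The mutex poison-recovery pattern named in the email and Discord entries can be sketched as follows. The queue type and the `poison` helper are illustrative assumptions used only to reproduce a poisoned lock; they are not the adapters' code. The recovery expression itself is the one quoted in the entry.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Appends to a shared queue, recovering the guard if another thread
/// panicked while holding the lock (which marks the mutex as poisoned).
fn push_message(queue: &Mutex<Vec<String>>, msg: &str) {
    let mut guard = queue.lock().unwrap_or_else(|e| e.into_inner());
    guard.push(msg.to_string());
}

/// Illustrative helper: poisons the mutex by panicking in a worker thread
/// while the lock is held.
fn poison(queue: &Arc<Mutex<Vec<String>>>) {
    let q = Arc::clone(queue);
    let _ = thread::spawn(move || {
        let _guard = q.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();
}
```

Recovering with `into_inner()` accepts that the protected data may be in whatever state the panicking thread left it, which is a reasonable trade for a message queue but may not be for data with stricter invariants.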

v0.8.6

Security Hardening & More 2026-02-28

Security: 9 changes. Fixed: 14 changes. Added: 2 changes. Key changes: CRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity — it was unauthenticated and trivially spoofable. CRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses. Windows daemon error propagation: `schtasks /Create` errors now propagate instead of being silently dropped; post-spawn verification added; `schtasks /Delete` errors during uninstall handled correctly. CLI API key headers: Added `--api-key`/`ROBOTICUS_API_KEY` global CLI argument. All 22 bare `reqwest` calls replaced with `http_client()` helper that injects API key as default header.

Highlights

  • CRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity — it was unauthenticated and trivially spoofable.
  • CRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses.
  • HIGH: Rate-limit IP fallback: IP extraction now uses `ConnectInfo<SocketAddr>` (real TCP peer address) instead of a hardcoded `127.0.0.1` fallback.
  • HIGH: ASCII-only identifiers: `validate_identifier` now restricts to ASCII alphanumeric characters, closing Unicode homoglyph and normalization attacks.
  • HIGH: Memory search query cap: `/api/memory/search` query parameter capped at 512 characters to prevent regex-based DoS.
  • HIGH: Error message sanitization: Added SQLite schema-leaking prefixes (`no such table`, `no such column`, etc.) to the error sanitization blocklist.
  • MED: Rate-limit counter ordering: Global rate-limit counter now incremented after per-IP/per-actor checks pass, preventing global exhaustion from blocked IPs.
  • MED: Symlink-safe directory traversal: `collect_findings_recursive` now uses `entry.file_type()` and skips symlinks, preventing symlink-following attacks.
FIX CRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity because it was unauthenticated and trivially spoofable.
FIX CRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses.
FIX HIGH: Rate-limit IP fallback: IP extraction now uses `ConnectInfo<SocketAddr>` (real TCP peer address) instead of a hardcoded `127.0.0.1` fallback.
FIX HIGH: ASCII-only identifiers: `validate_identifier` now restricts to ASCII alphanumeric characters, closing Unicode homoglyph and normalization attacks.
FIX HIGH: Memory search query cap: `/api/memory/search` query parameter capped at 512 characters to prevent regex-based DoS.
FIX HIGH: Error message sanitization: Added SQLite schema-leaking prefixes (`no such table`, `no such column`, etc.) to the error sanitization blocklist.
FIX MED: Rate-limit counter ordering: Global rate-limit counter now incremented after per-IP/per-actor checks pass, preventing global exhaustion from blocked IPs.
FIX MED: Symlink-safe directory traversal: `collect_findings_recursive` now uses `entry.file_type()` and skips symlinks, preventing symlink-following attacks.
FIX MED: WhatsApp HMAC raw byte comparison: HMAC verification now compares raw bytes instead of hex string representations, closing timing side-channels from variable-length hex comparison.
FIX Windows daemon error propagation: `schtasks /Create` errors now propagate instead of being silently dropped; post-spawn verification added; `schtasks /Delete` errors during uninstall handled correctly.
FIX CLI API key headers: Added `--api-key`/`ROBOTICUS_API_KEY` global CLI argument. All 22 bare `reqwest` calls replaced with `http_client()` helper that injects API key as default header.
FIX Flaky test elimination: Replaced TOCTOU ephemeral port test with RFC 5737 TEST-NET-1 address (192.0.2.1) for deterministic unreachable-port testing.
FIX Bundled providers parse failure (F5): Changed `.unwrap_or_default()` to `.expect()`: bundled TOML is build-time data, so a parse failure means the binary is broken and should fail fast.
FIX Update state save errors (F3): Three `state.save().ok()` sites now log errors before discarding, plus update state load now logs parse/read failures.
FIX Legacy Windows service cleanup (F7): `sc.exe stop/delete` errors during legacy cleanup now logged at debug level instead of silently dropped.
FIX OAuth token resolution (F8): `resolve_token().ok()` now logs failures, surfacing OAuth refresh errors that were previously invisible.
FIX Translate request error propagation (F9): `translate_request` errors now return HTTP 400 instead of falling back to an empty JSON body.
FIX Corrupted cost row logging (F10): `filter_map(|r| r.ok())` on cost query rows now logs dropped rows.
FIX Embedding failure logging (F12): Three `embed_single().ok()` sites now log failures, making RAG degradation visible.
FIX Defrag stdout write errors (F14): JSON stdout writes now propagate `io::Error` instead of silently dropping it.
FIX Session nickname update (F19): `update_nickname().ok()` now logs failures.
FIX Recommendation inference cost (F20): `record_inference_cost().ok()` now logs failures.
FIX Agent status query errors: Tool call and turn queries in agent status now log errors at debug level.
FEAT Auth middleware roundtrip tests: wrong key rejection, no-auth passthrough, POST method coverage.
FEAT SSE streaming endpoint validation tests: empty content, oversized content, missing fields.
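
The rate-limit counter ordering fix in this list is easier to see in code. The sketch below is a hypothetical fixed-window limiter with illustrative names and limits, not the runtime's middleware. It shows only the ordering: the global counter is touched after the per-IP check has passed, so a blocked IP cannot consume the global budget.

```rust
use std::collections::HashMap;

/// Hypothetical fixed-window limiter; field names and limits are illustrative.
struct RateLimiter {
    per_ip_limit: u32,
    global_limit: u32,
    per_ip: HashMap<String, u32>,
    global: u32,
}

impl RateLimiter {
    fn new(per_ip_limit: u32, global_limit: u32) -> Self {
        RateLimiter { per_ip_limit, global_limit, per_ip: HashMap::new(), global: 0 }
    }

    /// Returns true if the request is admitted.
    fn check(&mut self, ip: &str) -> bool {
        let count = self.per_ip.entry(ip.to_string()).or_insert(0);
        if *count >= self.per_ip_limit {
            // Rejected per-IP: the global counter is deliberately left untouched.
            return false;
        }
        if self.global >= self.global_limit {
            return false;
        }
        // Both checks passed; only now are the counters incremented.
        *count += 1;
        self.global += 1;
        true
    }
}
```

With a per-IP limit of 2 and a global limit of 5, an IP that sends 100 requests consumes only 2 units of the global budget, leaving 3 for other clients.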

v0.8.5

Bug Fixes & Stability (28 changes) 2026-02-28

Security: 6 changes. Fixed: 22 changes. Key changes: WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely. Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation. reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails. Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
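
The dedicated-thread-plus-`recv_timeout` approach mentioned for BUG-101 can be sketched with the standard library alone. The helper name `run_with_timeout` is hypothetical and no WASM is involved. The sketch shows only how the caller stops waiting after a deadline instead of checking elapsed time after the work has already finished.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `job` on a dedicated thread and waits at most `timeout` for its result.
/// Returns `None` on timeout or if the job panics. The worker thread is not
/// killed here; the caller simply stops waiting for it.
fn run_with_timeout<T, F>(job: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails harmlessly if the receiver has already given up.
        let _ = tx.send(job());
    });
    rx.recv_timeout(timeout).ok()
}
```

A post-hoc check can only report that a job overran once the job returns, so a job that never returns is never reported. Waiting on a channel with a deadline returns control to the caller either way.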

Highlights

  • WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely.
  • Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation.
  • Rate limiter memory bounds (BUG-103): Per-IP and per-actor rate limit maps are now capped at 10,000 and 5,000 entries respectively, preventing unbounded memory growth during distributed floods. Throttle tracking maps are also cleared on window reset.
  • Knowledge/Obsidian bounded reads (BUG-104, BUG-110): `DirectorySource::query()` and `parse_note()` now enforce 10 MB and 5 MB file size limits respectively, preventing OOM on oversized files.
  • Config secret allowlist (BUG-106): Admin config endpoint now uses an allowlist (`ALLOWED_FIELDS`) instead of a blocklist for field filtering, ensuring new secret fields are safe by default.
  • Interview turn cap (BUG-107): Interview sessions now enforce a 200-turn maximum to prevent unbounded memory growth within the 3600s TTL.
  • reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails.
  • Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
FIX WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely.
FIX Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation.
FIX Rate limiter memory bounds (BUG-103): Per-IP and per-actor rate limit maps are now capped at 10,000 and 5,000 entries respectively, preventing unbounded memory growth during distributed floods. Throttle tracking maps are also cleared on window reset.
FIX Knowledge/Obsidian bounded reads (BUG-104, BUG-110): `DirectorySource::query()` and `parse_note()` now enforce 10 MB and 5 MB file size limits respectively, preventing OOM on oversized files.
FIX Config secret allowlist (BUG-106): Admin config endpoint now uses an allowlist (`ALLOWED_FIELDS`) instead of a blocklist for field filtering, ensuring new secret fields are safe by default.
FIX Interview turn cap (BUG-107): Interview sessions now enforce a 200-turn maximum to prevent unbounded memory growth within the 3600s TTL.
FIX reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails.
FIX Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
FIX Heartbeat unreachable panic (BUG-109): `interval_for_tier()` catch-all arm now returns a safe default (`interval_ms * 2`) instead of `unreachable!()`, preventing runtime panics if new `SurvivalTier` variants are added.
FIX Regex recompilation (BUG-111): Obsidian tag and wikilink regexes are now `LazyLock` statics instead of being recompiled on every invocation.
FIX Budget float precision (BUG-112): `record_spending()` now uses epsilon-aware comparison to avoid IEEE 754 rounding errors causing spurious over-budget rejections.
FIX Sub-agent lifecycle failures (SF-15–SF-20): All `let _ =` patterns on `registry.register()`, `start_agent()`, `stop_agent()`, `unregister()`, and `assign_agent()` now log errors at appropriate levels.
FIX API key env var diagnostics (SF-21, SF-22): Empty and missing API key / email password environment variables now produce warn-level log messages instead of silently returning empty strings.
FIX Sub-agent list errors (SF-23): `list_sub_agents` DB errors now propagate at the delegation entry point and log at remaining fallback sites.
FIX Skills list errors (SF-24): `list_skills` DB failure now logged before fallback.
FIX MCP discovery failure (SF-25): MCP client discovery errors at startup now logged at warn level.
FIX Semantic cache load failure (SF-26): Cache load errors now logged before fallback to empty.
FIX Provider key resolution (SF-27): Missing provider keys for non-local providers now produce warn-level diagnostics.
FIX Bundled providers parse failure (SF-28): TOML parse errors for bundled providers now logged.
FIX Config backup restore (SF-29): Failed hot-reload backup restoration now logged at error level.
FIX Migration SQL errors (SF-30): SQL execution failures during migration now surfaced as warnings.
FIX Thinking indicator failures (SF-31): Channel thinking indicator send failures now logged at debug level across all 4 platforms.
FIX Session candidates JSON (SF-32): Model selection candidate deserialization errors now logged.
FIX Telegram API errors (SF-33): Typing indicator and message delete HTTP failures now logged at debug level.
FIX Session counts fallback (SF-34): Sub-agent session count DB errors now logged before fallback.
FIX Subtask JSON parse (SF-35): Malformed `subtasks` parameter (non-array) now produces a warning instead of silently returning empty.
FIX 19 additional MEDIUM silent failures (SF-36–SF-52): Error logging added across oauth, plugin-sdk, retrieval, digest, skills, signal, discord, whatsapp, sessions, defrag, embedding, main CLI, keystore, and obsidian modules.
FIX Migration export cascade (SF-48): Channel export now properly reports file read failures and JSON serialization errors instead of silently producing empty output.
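
The budget float precision entry (BUG-112) refers to cases such as `0.1 + 0.2`, which evaluates to slightly more than `0.3` in IEEE 754 double precision. A minimal sketch of an epsilon-aware check follows. The `Budget` struct and the `EPSILON` value are assumptions for illustration, not the runtime's types or tolerance.

```rust
/// Illustrative tolerance; the runtime's actual value is not stated in the changelog.
const EPSILON: f64 = 1e-9;

/// Hypothetical budget tracker.
struct Budget {
    limit: f64,
    spent: f64,
}

impl Budget {
    /// Records `amount` if the new total stays within the limit,
    /// allowing for floating-point rounding error of up to `EPSILON`.
    fn record_spending(&mut self, amount: f64) -> Result<(), String> {
        let total = self.spent + amount;
        if total > self.limit + EPSILON {
            return Err(format!("over budget: {} > {}", total, self.limit));
        }
        self.spent = total;
        Ok(())
    }
}
```

With a limit of `0.3` and `0.1` already spent, a plain `total > limit` comparison would reject a further `0.2`; the tolerance accepts it while still rejecting genuine overspend.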

v0.8.4

Bug Fixes & Stability & More 2026-02-28

Security: 3 changes. Fixed: 16 changes. Changed: 1 change. Key changes: WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector. Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses. Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`. Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.

Highlights

  • WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector.
  • Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses.
  • Script runner bounded reads: Shebang detection now uses `BufReader::take(512)` instead of `read_to_string`, preventing OOM on oversized script files.
  • Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`.
  • Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.
  • Money::from_dollars NaN panic (BUG-2): `from_dollars` now returns `Result`, rejecting NaN and Infinity inputs instead of panicking via `assert!`.
  • Delivery queue recovery (SF-7): `recover_from_store` is now async with proper `.lock().await`, replacing a `try_lock()` that silently dropped recovered messages.
  • Agent loop detection enforcement (BUG-3): `is_looping()` is now called inside `transition()` and forces `Done` state, preventing callers from bypassing loop detection.
FIX WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector.
FIX Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses.
FIX Script runner bounded reads: Shebang detection now uses `BufReader::take(512)` instead of `read_to_string`, preventing OOM on oversized script files.
FIX Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`.
FIX Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.
FIX Money::from_dollars NaN panic (BUG-2): `from_dollars` now returns `Result`, rejecting NaN and Infinity inputs instead of panicking via `assert!`.
FIX Delivery queue recovery (SF-7): `recover_from_store` is now async with proper `.lock().await`, replacing a `try_lock()` that silently dropped recovered messages.
FIX Agent loop detection enforcement (BUG-3): `is_looping()` is now called inside `transition()` and forces `Done` state, preventing callers from bypassing loop detection.
FIX Digit-leading SQL identifiers (BUG-7): `validate_identifier` now rejects names starting with digits, which would produce invalid SQL.
FIX Embedding API key error message (SF-4): Missing API key env var now returns a clear error message instead of a cryptic 401 via `.unwrap_or_default()`.
FIX ANN index corruption paths (SF-6, SF-10): Corrupt embedding JSON is now logged and skipped; RwLock poison on write returns an error instead of silently recovering with stale data.
FIX Admin dashboard false empties (SF-3): DB read errors in dashboard endpoints are now logged with `inspect_err` before falling back to defaults, enabling diagnosis.
FIX Session tool call queries (SF-9): Tool call endpoints now propagate DB errors with proper HTTP 500 responses instead of returning empty arrays.
FIX EventBus publish logging (SF-5): `let _ =` on channel send replaced with debug-level logging when no subscribers are active.
FIX Delivery queue timestamp fallback (SF-11): Failed timestamp parse now falls back to `UNIX_EPOCH` (safe backoff) instead of `Utc::now()` (immediate retry).
FIX Dead letter false empties (SF-8): `dead_letters_from_store` errors now logged before fallback.
FIX Admin config serialization (SF-12): Config endpoint returns HTTP 500 on serialization failure instead of null body.
FIX Efficiency report serialization (SF-13): Efficiency endpoint returns HTTP 500 on serialization failure instead of null body.
FIX Webhook body bytes (SF-14): Failed body extraction now logs a warning instead of silently discarding the payload.
CHORE Crate publish ordering: Release workflow now publishes crates in correct topological dependency order with increased index propagation wait times, fixing the v0.8.3 publish failure.
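
The `Money::from_dollars` change (BUG-2) replaces an assertion with a fallible constructor. The sketch below shows the general shape under stated assumptions: the cents-based representation and the `MoneyError` type are hypothetical, since the changelog does not describe the real `Money` type's fields or error type.

```rust
/// Hypothetical money type storing whole cents; the real `Money` type may differ.
#[derive(Debug, PartialEq)]
struct Money {
    cents: i64,
}

/// Hypothetical error type for rejected inputs.
#[derive(Debug, PartialEq)]
enum MoneyError {
    NotFinite,
}

impl Money {
    /// Converts a dollar amount to `Money`, rejecting NaN and infinite inputs
    /// with an error instead of panicking.
    fn from_dollars(dollars: f64) -> Result<Money, MoneyError> {
        if !dollars.is_finite() {
            return Err(MoneyError::NotFinite);
        }
        // Float-to-int `as` casts saturate at the `i64` range, so this cannot panic.
        Ok(Money { cents: (dollars * 100.0).round() as i64 })
    }
}
```

Callers then decide how to handle a bad amount, for example by returning a validation error to the client, rather than having the process abort.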

v0.8.3

Security Hardening & More 2026-02-27

Security: 4 changes. Fixed: 4 changes. Added: 1 change. Key changes: Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured — only loopback connections are allowed. Previously, missing API key config silently allowed all traffic. A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window. UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point. Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.

Highlights

  • Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured — only loopback connections are allowed. Previously, missing API key config silently allowed all traffic.
  • A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window.
  • Plugin permission enforcement: New `strict_permissions` and `allowed_permissions` config fields for plugin policy. In strict mode, undeclared permissions are blocked; in permissive mode (default), they produce a warning.
  • Ethereum signature recovery ID: EIP-191 signatures now include the recovery byte (v = 27 or 28), producing correct 65-byte signatures instead of 64-byte truncated ones.
  • UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point.
  • Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.
  • Script plugin unbounded output: stdout/stderr from plugin scripts are now capped at 10 MB via `AsyncReadExt::take()`.
  • Keystore lock ordering: Consolidated two separate mutexes into a single `KeystoreState` mutex, eliminating potential deadlock scenarios.
FIX Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured; only loopback connections are allowed. Previously, missing API key config silently allowed all traffic.
FIX A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window.
FIX Plugin permission enforcement: New `strict_permissions` and `allowed_permissions` config fields for plugin policy. In strict mode, undeclared permissions are blocked; in permissive mode (default), they produce a warning.
FIX Ethereum signature recovery ID: EIP-191 signatures now include the recovery byte (v = 27 or 28), producing correct 65-byte signatures instead of 64-byte truncated ones.
FIX UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point.
FIX Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.
FIX Script plugin unbounded output: stdout/stderr from plugin scripts are now capped at 10 MB via `AsyncReadExt::take()`.
FIX Keystore lock ordering: Consolidated two separate mutexes into a single `KeystoreState` mutex, eliminating potential deadlock scenarios.
FEAT `roboticus defrag` command: New workspace coherence scanner with 6 passes: refs (dead reference elimination), drift (config drift detection), artifacts (orphaned file cleanup), stale (ghost state entry removal), identity (brand consistency), and scripts (script health validation). Supports `--fix` for auto-repair, `--yes` for non-interactive mode, and `--json` for machine-readable output.
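
The A2A replay protection entry describes a nonce registry with TTL-based expiry. One way such a registry can work is sketched below. The `NonceRegistry` type and its method are hypothetical, and the time is passed in explicitly so the behavior is easy to test. The actual A2A implementation is not shown in this changelog.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical nonce registry: a nonce is accepted once, then rejected
/// as a replay until its TTL has elapsed.
struct NonceRegistry {
    ttl: Duration,
    seen: HashMap<String, Instant>,
}

impl NonceRegistry {
    fn new(ttl: Duration) -> Self {
        NonceRegistry { ttl, seen: HashMap::new() }
    }

    /// Returns true if `nonce` is fresh at time `now` (and records it),
    /// or false if it was already seen within the TTL window.
    fn check_and_insert(&mut self, nonce: &str, now: Instant) -> bool {
        // Drop entries older than the TTL so the map does not grow without bound.
        let ttl = self.ttl;
        self.seen.retain(|_, first_seen| now.duration_since(*first_seen) < ttl);
        if self.seen.contains_key(nonce) {
            return false;
        }
        self.seen.insert(nonce.to_string(), now);
        true
    }
}
```

Because entries expire, a replayed message is only caught inside the nonce window, which matches the scope stated in the entry. Messages would also need a timestamp check to reject replays older than the TTL.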

v0.8.2

New Features 2026-02-27

Added: 3 changes. Fixed: 5 changes. Key changes: 100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316. Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`. 29 stabilization bug fixes: Resolved input validation gaps, API error format inconsistencies, query parameter hardening, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML path issues discovered during exhaustive hands-on testing of v0.8.1. HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.

Highlights

  • 100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316.
  • Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`.
  • Winget package distribution: Windows users can install via Winget package manager.
  • 29 stabilization bug fixes: Resolved input validation gaps, API error format inconsistencies, query parameter hardening, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML path issues discovered during exhaustive hands-on testing of v0.8.1.
  • HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.
  • Dashboard SPA cleanup: Removed duplicate trailing content after `</html>` close tag.
  • Model change persistence: Fixed model selection not persisting across server restarts.
  • Config serialization: Fixed TOML config serialization on Windows paths.
FEAT 100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316.
FEAT Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`.
FEAT Winget package distribution: Windows users can install via Winget package manager.
FIX 29 stabilization bug fixes: Resolved input validation gaps, API error format inconsistencies, query parameter hardening, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML path issues discovered during exhaustive hands-on testing of v0.8.1.
FIX HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.
FIX Dashboard SPA cleanup: Removed duplicate trailing content after `</html>` close tag.
FIX Model change persistence: Fixed model selection not persisting across server restarts.
FIX Config serialization: Fixed TOML config serialization on Windows paths.

v0.8.1

Bug Fixes & Stability (14 changes) 2026-02-27

Fixed: 12 changes. Changed: 2 changes. Key changes: 40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages. Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints. CI scripts use POSIX grep: Replaced all `rg` (ripgrep) invocations with standard `grep -E`/`grep -qE` in CI scripts for broader runner compatibility. Windows compilation: Added conditional `allow(unused_mut)` for platform-gated mutation in security audit command.

Highlights

  • 40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages.
  • Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints.
  • JSON error responses: All API error paths now return structured `{"error": "..."}` JSON instead of plain text.
  • Memory search deduplication: FTS memory search no longer returns duplicate entries; results are now structured with category/timestamp metadata.
  • Cron scheduler accuracy: `next_run_at` is now persisted after computation; heartbeat no longer floods logs with virtual job IDs; jobs use actual agent IDs.
  • Cost display precision: Floating-point noise eliminated from cost/efficiency metrics (rounded to 6 decimal places with division-by-zero guard).
  • Skills metadata: `risk_level` is now parameterized (not hardcoded "Caution"); skills track `last_loaded_at` timestamp.
  • CLI resilience: `roboticus check` no longer crashes with raw Rust IO errors; shows friendly messages with config path suggestions.
FIX 40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages.
FIX Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints.
FIX JSON error responses: All API error paths now return structured `{"error": "..."}` JSON instead of plain text.
FIX Memory search deduplication: FTS memory search no longer returns duplicate entries; results are now structured with category/timestamp metadata.
FIX Cron scheduler accuracy: `next_run_at` is now persisted after computation; heartbeat no longer floods logs with virtual job IDs; jobs use actual agent IDs.
FIX Cost display precision: Floating-point noise eliminated from cost/efficiency metrics (rounded to 6 decimal places with division-by-zero guard).
FIX Skills metadata: `risk_level` is now parameterized (not hardcoded "Caution"); skills track `last_loaded_at` timestamp.
FIX CLI resilience: `roboticus check` no longer crashes with raw Rust IO errors; shows friendly messages with config path suggestions.
FIX Dashboard UX: Fixed 14 display bugs including schedule text duplication, raw-seconds uptime, missing pagination, broken status indicators, and external font dependency removal.
FIX Filesystem path exposure: Skills API no longer leaks `source_path`/`script_path` in responses.
FIX Session creation response: `POST /api/sessions` now returns the full session object instead of just the ID.
FIX 404 fallback handler: Unknown API routes now return JSON `{"error": "not found"}` instead of empty 404.
CHORE CI scripts use POSIX grep: Replaced all `rg` (ripgrep) invocations with standard `grep -E`/`grep -qE` in CI scripts for broader runner compatibility.
CHORE Windows compilation: Added conditional `allow(unused_mut)` for platform-gated mutation in security audit command.
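
The cost display precision fix above (rounding to 6 decimal places with a division-by-zero guard) could look roughly like the following. This is a minimal sketch, not the project's code: the function names `round6` and `cost_per_token`, and the choice to report `0.0` when there are no tokens, are assumptions made for illustration.

```rust
/// Round to 6 decimal places to hide floating-point noise in displayed metrics.
/// (Hypothetical helper; the real function name is not given in the changelog.)
fn round6(value: f64) -> f64 {
    (value * 1_000_000.0).round() / 1_000_000.0
}

/// Cost per token with a division-by-zero guard.
/// Assumption: a zero token count is reported as 0.0 rather than NaN or infinity.
fn cost_per_token(total_cost: f64, tokens: u64) -> f64 {
    if tokens == 0 {
        return 0.0;
    }
    round6(total_cost / tokens as f64)
}
```

Without the guard, `total_cost / 0.0` would yield `inf` or `NaN`, which then leaks into JSON responses and dashboard cards.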

v0.8.0

Security Hardening & More 2026-02-26

Security: 17 changes. Fixed: 22 changes. Added: 16 changes. Changed: 4 changes. Key changes: CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin. Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop. Telegram invalid-token resilience: Telegram `404/401` poll failures are now classified as likely invalid/revoked bot-token errors with explicit repair guidance and adaptive backoff to reduce noisy tight-loop logging. Subagent runtime activation sync: Taskable subagents are now auto-started at boot and kept in sync with create/update/toggle/delete operations, fixing the `enabled > 0, running = 0` stall where configured subagents stayed idle.

Highlights

  • CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin.
  • Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop.
  • WalletFile Debug redaction: `WalletFile` no longer derives `Debug`; a manual impl redacts `private_key_hex` to prevent accidental key leakage in logs or panics.
  • Plaintext wallet detection: Loading an unencrypted wallet file now emits a `SECURITY` warning at `warn!` level instead of silently succeeding.
  • Webhook signature enforcement: WhatsApp webhook verification now rejects requests with an error when `app_secret` is unconfigured, instead of silently skipping verification.
  • OAuth token persistence errors surfaced: `OAuthManager::persist()` now returns `Result<()>` and callers log failures at `error!` level instead of silently swallowing write errors.
  • Skill catalog path traversal prevention: Skill download filenames from remote registries are now validated and canonicalized to prevent `../` path traversal.
  • API key URL encoding: The `query:` auth mode now percent-encodes API keys before appending to URLs, preventing malformed requests and log leakage.
FIX CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin.
FIX Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop.
FIX WalletFile Debug redaction: `WalletFile` no longer derives `Debug`; a manual impl redacts `private_key_hex` to prevent accidental key leakage in logs or panics.
FIX Plaintext wallet detection: Loading an unencrypted wallet file now emits a `SECURITY` warning at `warn!` level instead of silently succeeding.
FIX Webhook signature enforcement: WhatsApp webhook verification now rejects requests with an error when `app_secret` is unconfigured, instead of silently skipping verification.
FIX OAuth token persistence errors surfaced: `OAuthManager::persist()` now returns `Result<()>` and callers log failures at `error!` level instead of silently swallowing write errors.
FIX Skill catalog path traversal prevention: Skill download filenames from remote registries are now validated and canonicalized to prevent `../` path traversal.
FIX API key URL encoding: The `query:` auth mode now percent-encodes API keys before appending to URLs, preventing malformed requests and log leakage.
FIX Script runner absolute path rejection: `resolve_script_path` now unconditionally rejects absolute paths instead of accepting them.
FIX Script file permission check: Script runner validates that script files are not world-writable on Unix before execution.
FIX Subagent name validation: Subagent names are now restricted to max 128 characters, alphanumeric + hyphens + underscores only.
FIX Plugin name/version validation: Plugin manifest validation now enforces character restrictions on plugin names and versions matching tool name rules.
FIX Audit log key redaction: Keystore audit log entries now redact key names to first 3 characters instead of logging full key identifiers.
FIX x402 recipient address validation: Payment authorization now validates that recipient addresses match Ethereum address format (0x + 40 hex chars).
FIX JSON merge depth limit: `update_config` recursive merge is now bounded to 10 levels of nesting to prevent stack overflow.
FIX Error message sanitization: `sanitize_error_message` now strips content after common sensitive prefixes (file paths, SQLite errors, stack traces).
FIX Decided-by field sanitization: Approval decision `decided_by` field is now limited to 256 characters with control characters stripped.
FIX Telegram invalid-token resilience: Telegram `404/401` poll failures are now classified as likely invalid/revoked bot-token errors with explicit repair guidance and adaptive backoff to reduce noisy tight-loop logging.
FIX Subagent runtime activation sync: Taskable subagents are now auto-started at boot and kept in sync with create/update/toggle/delete operations, fixing the `enabled > 0, running = 0` stall where configured subagents stayed idle.
FIX FTS duplicate row accumulation: `store_semantic` and `store_working` now delete existing FTS entries before re-inserting, preventing unbounded duplicate growth in `memory_fts` on upserts.
FIX SSE stream UTF-8 corruption: `SseChunkStream` now uses proper incremental UTF-8 decoding instead of `from_utf8_lossy`, preserving multi-byte characters split across HTTP chunks.
FIX SSE buffer unbounded growth: SSE chunk stream buffer is now capped at 10 MB to prevent unbounded memory growth from long SSE lines.
FIX Heartbeat interval recovery: Heartbeat daemon interval now recovers to the original configured value when the survival tier returns to Normal, instead of permanently remaining at the degraded rate.
FIX AgentCardRefresh task activation: `HeartbeatTask::AgentCardRefresh` is now included in `default_tasks()` instead of being a dead variant.
FIX Hippocampus identifier consistency: Table name validation in `create_agent_table` no longer allows hyphens, matching `validate_identifier` behavior.
FIX Negative hours SQL comment injection: `query_transactions` now clamps `hours` to positive values, preventing negative values from producing SQL comments.
FIX PRAGMA identifier quoting: `has_column` now quotes table names in `PRAGMA table_info` statements.
FIX Cron lease identity verification: `release_lease` now requires the `lease_holder` parameter and verifies ownership before releasing.
FIX Coverage gate alignment: Local `justfile` coverage threshold now matches CI at 80% minimum.
FIX `just run-release` binary name: Fixed reference from `roboticus-server` to `roboticus`.
FIX Smoke test default port: `run-smoke.sh` default port corrected from 8787 to 18789.
FIX CORS fallback logging: Invalid CORS origin parse now logs a warning and falls back to `127.0.0.1` loopback instead of silently becoming wildcard `*`.
FIX Crypto function error propagation: `derive_key`, `encrypt_wallet_data` in wallet now return `Result` instead of panicking with `expect`.
FIX CapacityTracker mutex resilience: All `expect("mutex poisoned")` calls replaced with `unwrap_or_else(|e| e.into_inner())` for graceful recovery.
FIX Rate limit / approval mutex resilience: Same mutex poison recovery applied to policy engine and approval manager.
FIX Cron lease/run error logging: `acquire_lease`, `record_run`, and `release_lease` errors are now logged at `warn` level instead of silently discarded.
FIX Interval expression UTF-8 safety: `parse_interval_expr_to_ms` now uses `char_indices()` for correct byte-offset slicing of multi-byte characters.
FIX TOML serialization error propagation: `generate_operator_toml` and `generate_directives_toml` now return `Result<String>` instead of silently returning empty strings.
FIX Floating-point tier threshold: `SurvivalTier::from_balance` uses 0.999 epsilon for the `hours_below_zero` check to handle floating-point rounding.
FEAT v0.8.0 zero-regression release gate: Added canonical `just test-v080-go-live` orchestration and release-blocking CI/release jobs for workspace tests, integration/regression batteries, bounded soak/fuzz checks, CLI+web UAT smoke, and release-doc/provenance consistency checks.
FEAT WASM execution timeout enforcement: WASM plugin execution now tracks elapsed time against the configured `execution_timeout_ms` and logs warnings when exceeded.
FEAT WASM memory bounds validation: WASM input writes check memory size before writing; output reads validate `ptr + len` against module memory bounds.
FEAT Browser evaluate length limit: `BrowserAction::Evaluate` rejects expressions exceeding 100,000 characters.
FEAT Email body size limit: Email adapter truncates message bodies exceeding 1 MB.
FEAT A2a session establishment check: Added `is_established()` method and documentation for session key typestate.
FEAT A2a rate window eviction: Rate limit windows now evict stale entries (>1 hour idle) when exceeding 1,000 tracked peers.
FEAT InboundMessage platform sanitization: Added `sanitize_platform()` to strip control characters and enforce 64-char limit.
FEAT YieldEngine field encapsulation: All fields made private with getter methods.
FEAT TreasuryPolicy field encapsulation: All fields made private with constructor and getter methods.
FEAT Zero-amount deposit/withdraw rejection: `YieldEngine::deposit()` and `withdraw()` now reject amounts <= 0.
FEAT Plugin registry unregister: Added `unregister()` method to fully remove plugin entries.
FEAT Script shebang validation: Extensionless script files now require a recognized shebang line.
FEAT Docker HEALTHCHECK: Dockerfile now includes a health check against `/api/health`.
FEAT Docker build reproducibility: Dockerfile now uses `--locked`, MSRV-pinned Rust image, and dependency layer caching.
FEAT Release CI supply-chain hardening: `cross` installation pinned to versioned release instead of git HEAD.
CHORE WhatsApp client initialization: `reqwest::Client` builder now uses `expect()` instead of `unwrap_or_default()` to surface TLS initialization failures.
CHORE CDP client initialization: Same `expect()` change applied to browser CDP HTTP client.
CHORE Semantic search scan limit: `search_similar` now includes `LIMIT 10000` to bound memory usage pending AnnIndex integration.
CHORE SemanticCache thread safety documentation: Documented `&mut self` requirement and external synchronization expectations.
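
For readers unfamiliar with the mutex resilience entries above, the `unwrap_or_else(|e| e.into_inner())` pattern recovers the guard from a poisoned `std::sync::Mutex` instead of panicking. The sketch below shows the pattern in isolation; the `increment` and `poison` helpers and the `u64` counter are illustrative stand-ins, not the actual `CapacityTracker` code.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment a shared counter, recovering the guard if the mutex is poisoned.
/// `PoisonError::into_inner` hands back the guard so the caller can continue.
fn increment(counter: &Mutex<u64>) -> u64 {
    let mut guard = counter.lock().unwrap_or_else(|e| e.into_inner());
    *guard += 1;
    *guard
}

/// Poison the mutex by panicking in another thread while holding the lock.
/// (Only here to demonstrate the recovery path.)
fn poison(counter: &Arc<Mutex<u64>>) {
    let shared = Arc::clone(counter);
    let _ = thread::spawn(move || {
        let _guard = shared.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();
}
```

The trade-off is that the protected data may reflect a half-finished update from the thread that panicked, so this recovery suits counters and trackers where a slightly stale value is acceptable.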

v0.7.1

Hotfixes & Reliability 2026-02-25

Fixed: 6 changes. Key changes: Windows daemon startup and binary update reliability fixes, dashboard render boundary hardening, and loopback-proxy migration safeguards with explicit deprecation guidance for v0.8.0 removal.

Highlights

  • Windows daemon startup reliability: Replaced the broken `sc.exe` service launch path with a detached user-process daemon flow.
  • Windows binary update guardrail: `roboticus update binary` now blocks in-process self-update on Windows and prints safe manual upgrade steps.
  • Dashboard JS bleed-through fix: Dashboard rendering is clipped to the canonical HTML document boundary.
  • In-process provider routing metadata: `/api/models/available` reports in-process proxy mode and provider diagnostics for clearer operator visibility.
  • Loopback proxy deprecation guidance: `0.7.x` warns that `127.0.0.1:8788/<provider>` is deprecated and will be removed in `v0.8.0`.
FIX Windows daemon startup reliability: Replaced the broken `sc.exe` service launch path with a detached user-process daemon flow.
FIX Windows binary update guardrail: `roboticus update binary` now blocks in-process self-update on Windows and prints safe manual upgrade steps.
FIX Dashboard JS bleed-through fix: Dashboard rendering is clipped to the canonical HTML document boundary.
FIX In-process provider routing metadata: `/api/models/available` reports in-process proxy mode and provider diagnostics for clearer operator visibility.
DOCS Loopback proxy deprecation guidance: `0.7.x` warns that `127.0.0.1:8788/<provider>` is deprecated and will be removed in `v0.8.0`.

v0.7.0

New Features 2026-02-25

Added: 4 changes. Changed: 3 changes. Key changes: Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents. Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details. Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology. Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.

Highlights

  • Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents.
  • Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details.
  • Streaming turn traceability: `POST /api/agent/message/stream` now emits stable `turn_id` values from stream start through completion and records per-turn model-selection audits for streamed responses.
  • Subagent ubiquitous-language architecture doc: Added `docs/architecture/subagent-ubiquitous-language.md` with canonical terminology, gap audit, and dataflow diagrams.
  • Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology.
  • Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.
  • Context forensics UX: Context Explorer now supports live stream-turn handoff and direct forensic drill-down using active `turn_id` metadata.
FEAT Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents.
FEAT Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details.
FEAT Streaming turn traceability: `POST /api/agent/message/stream` now emits stable `turn_id` values from stream start through completion and records per-turn model-selection audits for streamed responses.
FEAT Subagent ubiquitous-language architecture doc: Added `docs/architecture/subagent-ubiquitous-language.md` with canonical terminology, gap audit, and dataflow diagrams.
CHORE Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology.
CHORE Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.
CHORE Context forensics UX: Context Explorer now supports live stream-turn handoff and direct forensic drill-down using active `turn_id` metadata.
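
To make the `auto` and `commander` model modes above concrete, here is one way runtime resolution might be sketched. Only the two mode names come from the changelog; the `ModelMode` enum, the `Pinned` variant for any other configured model name, and the fallback to the router when the commander has not assigned a model are all assumptions for illustration.

```rust
/// How a taskable subagent's model is chosen (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum ModelMode {
    /// `auto`: the router picks the model.
    Auto,
    /// `commander`: the primary agent assigns the model.
    Commander,
    /// Assumption: any other value is treated as an explicit model name.
    Pinned(String),
}

fn parse_model_mode(raw: &str) -> ModelMode {
    match raw.trim() {
        "auto" => ModelMode::Auto,
        "commander" => ModelMode::Commander,
        other => ModelMode::Pinned(other.to_string()),
    }
}

/// Resolve the model to use for one subagent turn.
/// Assumption: `commander` falls back to the router's choice when no assignment exists.
fn resolve_model(mode: &ModelMode, router_choice: &str, commander_choice: Option<&str>) -> String {
    match mode {
        ModelMode::Auto => router_choice.to_string(),
        ModelMode::Commander => commander_choice.unwrap_or(router_choice).to_string(),
        ModelMode::Pinned(name) => name.clone(),
    }
}
```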

v0.6.1

Bug Fixes & Stability 2026-02-24

Fixed: 3 changes. Key changes: Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization. Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.

Highlights

  • Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization.
  • Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.
  • Routing test alignment: Updated router integration expectations to reflect current fallback behavior when primary providers are breaker-blocked.
FIX Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization.
FIX Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.
FIX Routing test alignment: Updated router integration expectations to reflect current fallback behavior when primary providers are breaker-blocked.

v0.6.0

New Features 2026-02-24

Added: 4 changes. Changed: 5 changes. Key changes: Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility. Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips. Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior. Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.

Highlights

  • Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility.
  • Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips.
  • Session scope backfill migration: Added `012_session_scope_backfill_unique.sql` to normalize legacy sessions to explicit scope and enforce unique active scoped sessions.
  • Safe markdown rendering in dashboard sessions: Session chat and Context Explorer now render markdown with strict URL sanitization and no raw HTML execution.
  • Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior.
  • Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.
  • Session scoping defaults to explicit agent scope: `find_or_create()` now uses `agent` scope by default and channel/web paths pass scoped keys for peer/group isolation.
  • Channel session affinity: Channel dedup and session selection now use resolved chat/channel identity instead of platform-only sender affinity.
FEAT Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility.
FEAT Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips.
FEAT Session scope backfill migration: Added `012_session_scope_backfill_unique.sql` to normalize legacy sessions to explicit scope and enforce unique active scoped sessions.
FEAT Safe markdown rendering in dashboard sessions: Session chat and Context Explorer now render markdown with strict URL sanitization and no raw HTML execution.
CHORE Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior.
CHORE Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.
CHORE Session scoping defaults to explicit agent scope: `find_or_create()` now uses `agent` scope by default and channel/web paths pass scoped keys for peer/group isolation.
CHORE Channel session affinity: Channel dedup and session selection now use resolved chat/channel identity instead of platform-only sender affinity.
CHORE Heartbeat now runs SessionGovernor: stale sessions are expired with compaction draft capture; optional hourly rotation is triggered when `session.reset_schedule` is configured.
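
The capacity-weighted routing change above replaces a binary "near capacity, fall back" rule with a continuous score. The sketch below illustrates that idea only; the `Candidate` struct, the 0.0 to 1.0 ranges, and the multiplicative `quality * headroom` formula are assumptions, since the changelog does not state how `select_for_complexity()` combines the two signals.

```rust
/// One routing candidate (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    model: String,
    /// Model quality in 0.0..=1.0 (assumed scale).
    quality: f64,
    /// Fraction of provider capacity still free, 0.0..=1.0 (assumed scale).
    headroom: f64,
}

/// Assumed scoring rule: quality discounted by remaining headroom.
fn score(candidate: &Candidate) -> f64 {
    candidate.quality * candidate.headroom.clamp(0.0, 1.0)
}

/// Pick the highest-scoring candidate, or `None` for an empty slice.
fn select_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates
        .iter()
        .max_by(|a, b| score(a).total_cmp(&score(b)))
}
```

Under this rule a strong model on a nearly saturated provider can lose to a weaker model with plenty of headroom, which is the behavior the entry describes.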

v0.5.0

New Features (25 changes) 2026-02-23

Added: 18 changes. Changed: 7 changes. Key changes: Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support. Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach. All 10 crate READMEs updated to v0.5.0 with expanded descriptions and key types. All 10 `lib.rs` files now have `//!` crate-level doc comments.

Highlights

  • Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support.
  • Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach.
  • Flexible Network Binding: Interface-based binding (`bind_interface`), optional TLS via `axum-server` with rustls, and `advertise_url` for agent card generation.
  • Approval Workflow Loop Integration: Agent pauses on gated tool calls, publishes `pending_approval` events via WebSocket, and resumes after admin approve/deny. Dashboard "Approvals" panel with real-time updates.
  • Browser as Agent Tool: `BrowserTool` adapter wrapping the 12-action `roboticus-browser` crate, registered in `ToolRegistry`. Tool schemas injected into system prompt so the LLM can request browser actions.
  • Context Observatory: Full turn inspector and analytics suite:
  • Turn recording with `context_snapshots` table capturing token allocation, memory tier breakdown, complexity level, and model for every LLM call
  • Turn & Context API: `GET /api/sessions/{id}/turns`, `GET /api/turns/{id}`, `GET /api/turns/{id}/context`, `GET /api/turns/{id}/tools`
FEAT Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support.
FEAT Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach.
FEAT Flexible Network Binding: Interface-based binding (`bind_interface`), optional TLS via `axum-server` with rustls, and `advertise_url` for agent card generation.
FEAT Approval Workflow Loop Integration: Agent pauses on gated tool calls, publishes `pending_approval` events via WebSocket, and resumes after admin approve/deny. Dashboard "Approvals" panel with real-time updates.
FEAT Browser as Agent Tool: `BrowserTool` adapter wrapping the 12-action `roboticus-browser` crate, registered in `ToolRegistry`. Tool schemas injected into system prompt so the LLM can request browser actions.
FEAT Context Observatory: Full turn inspector and analytics suite:
FEAT Turn recording with `context_snapshots` table capturing token allocation, memory tier breakdown, complexity level, and model for every LLM call
FEAT Turn & Context API: `GET /api/sessions/{id}/turns`, `GET /api/turns/{id}`, `GET /api/turns/{id}/context`, `GET /api/turns/{id}/tools`
FEAT Dashboard per-message context expansion (token allocation bar, memory breakdown, reasoning trace, tool calls)
FEAT Context Explorer tab with session selector, turn timeline, and aggregate charts
FEAT Heuristic context analyzer with 12 per-turn rules and 10 session-aggregate rules across Budget, Memory, Prompt, Tools, Cost, and Quality categories
FEAT LLM-powered deep analysis stub for on-demand qualitative context evaluation
FEAT Prompt efficiency metrics per model: output density, budget utilization, memory ROI, cache hit rate, context pressure, cost attribution
FEAT Efficiency dashboard with model comparison cards, time series charts, period selector, and auto-generated cost optimization tips
FEAT Outcome grading: 1-5 star ratings on assistant responses via `turn_feedback` table, with quality-adjusted metrics (cost per quality point, quality by complexity, memory impact analysis)
FEAT Behavioral recommendations engine: ~14 heuristic rules across 7 categories (query crafting, model selection, session management, memory leverage, cost optimization, tool usage, configuration) with evidence and estimated impact
FEAT Streaming LLM Responses: `SseChunkStream` adapter for token-by-token streaming. `POST /api/agent/message/stream` SSE endpoint. WebSocket forwarding via EventBus. Dashboard incremental rendering with typing indicator.
FEAT New reference documents: `docs/CONFIGURATION.md`, `docs/CLI.md`, `docs/API.md`, `docs/DEPLOYMENT.md`, `docs/ENV.md`
CHORE All 10 crate READMEs updated to v0.5.0 with expanded descriptions and key types
CHORE All 10 `lib.rs` files now have `//!` crate-level doc comments
CHORE 10 new dataflow diagrams added to `roboticus-dataflow.md` (approval, browser, context, transform, streaming, addressability, observatory, plugin SDK, OAuth, channel lifecycle)
CHORE 6 new sequence diagrams added to `roboticus-sequences.md` (approval, streaming, turn recording, grading, TLS, CDP)
CHORE All 6 C4 component diagrams updated with ~40 previously undocumented modules
CHORE Documentation standards added to CONTRIBUTING.md
CHORE `cargo doc` CI gate added with `-D warnings` to prevent future documentation drift
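
As an illustration of the first stage of the Response Transform Pipeline above, the sketch below separates `<think>` blocks from the visible reply. It is not the `ReasoningExtractor` implementation: the function name `extract_reasoning`, the return shape, and the decision to leave an unterminated `<think>` tag in the visible text are assumptions.

```rust
/// Split an LLM response into captured reasoning blocks and visible text.
/// Returns (reasoning blocks in order, response with those blocks removed).
fn extract_reasoning(response: &str) -> (Vec<String>, String) {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let mut reasoning = Vec::new();
    let mut visible = String::new();
    let mut rest = response;

    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                // Keep the text before the block, capture the block body.
                visible.push_str(&rest[..start]);
                reasoning.push(after_open[..end].trim().to_string());
                rest = &after_open[end + CLOSE.len()..];
            }
            // Assumption: an unterminated tag is left untouched in the output.
            None => break,
        }
    }

    visible.push_str(rest);
    (reasoning, visible.trim().to_string())
}
```

Later stages (format normalization, content guarding) would then operate only on the visible text, while the captured reasoning can be stored for turn inspection.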

v0.4.3

New Features & More 2026-02-23

Added: 6 changes. Fixed: 3 changes. Changed: 2 changes. Key changes: Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control. Runtime model override via `/model set <model>` — temporarily forces a specific model, bypassing routing. Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen) — providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`. Dashboard "Save to keystore" button now sends `Content-Type: application/json` header — previously failed with "Expected request with 'Content-Type: application/json'".

Highlights

  • Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control
  • Runtime model override via `/model set <model>` — temporarily forces a specific model, bypassing routing
  • Circuit breaker status and reset via `/breaker` and `/breaker reset [provider]` slash commands
  • Breaker-aware model routing — `select_for_complexity` and `select_cheapest_qualified` now skip providers with tripped circuit breakers
  • Pre-flight API key check in `infer_with_fallback` — cloud providers with no configured key are skipped before sending a doomed request
  • Dashboard settings inputs show a dimmed "none" placeholder instead of literal "null" for empty fields
  • Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen) — providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`
  • Dashboard "Save to keystore" button now sends `Content-Type: application/json` header — previously failed with "Expected request with 'Content-Type: application/json'"
FEAT Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control
FEAT Runtime model override via `/model set <model>`: temporarily forces a specific model, bypassing routing
FEAT Circuit breaker status and reset via `/breaker` and `/breaker reset [provider]` slash commands
FEAT Breaker-aware model routing: `select_for_complexity` and `select_cheapest_qualified` now skip providers with tripped circuit breakers
FEAT Pre-flight API key check in `infer_with_fallback`: cloud providers with no configured key are skipped before sending a doomed request
FEAT Dashboard settings inputs show a dimmed "none" placeholder instead of literal "null" for empty fields
FIX Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen); providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`
FIX Dashboard "Save to keystore" button now sends `Content-Type: application/json` header; previously failed with "Expected request with 'Content-Type: application/json'"
FIX Settings form no longer renders `"null"` as a literal value in input fields; empty fields display a styled placeholder and save as `null`
CHORE Merged "Roster" and "Agents" into a single "Agents" page with tabbed Roster/List views
CHORE Removed CLI typing sound effects (`start_typing_sound` / `SoundHandle`) from banner rendering
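
The breaker behavior described in this release can be illustrated with a small state machine. This is a minimal sketch, not the Roboticus implementation: the types `TripReason`, `BreakerState`, `Breaker`, the `cooldown` field, and the helper `select_available` are all hypothetical names chosen for illustration. It shows a transient trip recovering to HalfOpen after a cooldown, a credit trip staying open until an explicit reset, and a selection routine that skips providers whose breaker is open.

```rust
use std::time::{Duration, Instant};

/// Why a breaker tripped. `Credit` models the permanent case (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TripReason {
    Transient,
    Credit,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BreakerState {
    Closed,
    Open { reason: TripReason, since: Instant },
    HalfOpen,
}

#[derive(Debug)]
pub struct Breaker {
    state: BreakerState,
    /// Assumed setting: how long a transient trip stays open before probing.
    cooldown: Duration,
}

impl Breaker {
    pub fn new(cooldown: Duration) -> Self {
        Self { state: BreakerState::Closed, cooldown }
    }

    pub fn trip(&mut self, reason: TripReason, now: Instant) {
        self.state = BreakerState::Open { reason, since: now };
    }

    /// Apply time-based recovery and return the resulting state.
    /// Only transient trips may move to HalfOpen; credit trips never do.
    pub fn poll(&mut self, now: Instant) -> BreakerState {
        if let BreakerState::Open { reason: TripReason::Transient, since } = self.state {
            if now.saturating_duration_since(since) >= self.cooldown {
                self.state = BreakerState::HalfOpen;
            }
        }
        self.state
    }

    /// Explicit reset, analogous to what `/breaker reset` is described as doing.
    pub fn reset(&mut self) {
        self.state = BreakerState::Closed;
    }

    pub fn allows_requests(&mut self, now: Instant) -> bool {
        !matches!(self.poll(now), BreakerState::Open { .. })
    }
}

/// Breaker-aware selection: return the first provider whose breaker admits traffic.
pub fn select_available(candidates: &mut [(String, Breaker)], now: Instant) -> Option<String> {
    candidates
        .iter_mut()
        .find_map(|(name, breaker)| breaker.allows_requests(now).then(|| name.clone()))
}
```

Passing `now` explicitly instead of calling `Instant::now()` inside the breaker keeps the recovery logic deterministic, which makes the permanent-trip rule easy to verify.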

v0.4.2

Bug Fixes & Stability 2026-02-23

Fixed: 3 changes. Key changes: `roboticus daemon start` now verifies the service is actually running after `launchctl load` — previously reported "Daemon started" even when the service crashed immediately. `roboticus daemon install` resolves the config path to absolute before embedding in the plist — previously used the relative path which launchd couldn't resolve.

Highlights

  • `roboticus daemon start` now verifies the service is actually running after `launchctl load` — previously reported "Daemon started" even when the service crashed immediately
  • `roboticus daemon install` resolves the config path to absolute before embedding in the plist — previously used the relative path which launchd couldn't resolve
  • Captures launchctl stderr and checks `LastExitStatus` / PID to give actionable error messages on daemon start failure
FIX `roboticus daemon start` now verifies the service is actually running after `launchctl load`; previously reported "Daemon started" even when the service crashed immediately
FIX `roboticus daemon install` resolves the config path to absolute before embedding in the plist; previously used the relative path which launchd couldn't resolve
FIX Captures launchctl stderr and checks `LastExitStatus` / PID to give actionable error messages on daemon start failure
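
The config path fix can be pictured as resolving the path against the working directory at install time, before it is written into the plist. The following is a sketch under assumptions: `absolutize` and `program_arguments` are hypothetical helpers, the plist fragment is illustrative rather than the file Roboticus actually writes, and XML escaping and `..` normalization are left out.

```rust
use std::path::{Path, PathBuf};

/// Resolve `config` against `cwd` unless it is already absolute.
/// Does not touch the filesystem and does not normalize `.` or `..`.
pub fn absolutize(config: &Path, cwd: &Path) -> PathBuf {
    if config.is_absolute() {
        config.to_path_buf()
    } else {
        cwd.join(config)
    }
}

/// Render a ProgramArguments fragment for a launchd plist (illustrative only).
/// launchd does not start the job in the user's shell directory, so a
/// relative config path would not point at the intended file.
pub fn program_arguments(binary: &Path, config: &Path, cwd: &Path) -> String {
    let config = absolutize(config, cwd);
    format!(
        "<key>ProgramArguments</key>\n<array>\n  <string>{}</string>\n  <string>start</string>\n  <string>--config</string>\n  <string>{}</string>\n</array>",
        binary.display(),
        config.display()
    )
}
```

In real use the `cwd` argument would come from `std::env::current_dir()` at the moment `roboticus daemon install` runs.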

v0.4.1

Security Hardening & More 2026-02-23

Added: 5 changes. Fixed: 7 changes. Security: 4 changes. Key changes: `roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management. Interactive prompt after `roboticus daemon install` asking whether to start immediately. Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs. Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete.

Highlights

  • `roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management
  • Interactive prompt after `roboticus daemon install` asking whether to start immediately
  • `--start` flag on `roboticus daemon install` for non-interactive use
  • Dashboard keystore management: save/remove provider API keys from the settings page
  • Session nicknames in dashboard sessions table with click-to-copy session ID
  • Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs
  • Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete
  • `roboticus daemon install` now actually offers to load the service (previously only wrote the plist/unit file)
FEAT `roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management
FEAT Interactive prompt after `roboticus daemon install` asking whether to start immediately
FEAT `--start` flag on `roboticus daemon install` for non-interactive use
FEAT Dashboard keystore management: save/remove provider API keys from the settings page
FEAT Session nicknames in dashboard sessions table with click-to-copy session ID
FIX Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs
FIX Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete
FIX `roboticus daemon install` now actually offers to load the service (previously only wrote the plist/unit file)
FIX `roboticus daemon uninstall` now stops the running service before removing the file
FIX `roboticus daemon status` distinguishes between "not installed" and "installed but not running"
FIX Registry URL restored to correct `roboticus.ai/registry` path (not subdomain)
FIX Empty env vars no longer falsely reported as "configured" in key status checks
FIX `delete_provider_key` endpoint now validates provider exists before allowing keystore deletion
FIX Unified key resolution via `KeySource` enum eliminates 3 duplicated cascade implementations
FIX `resolve_provider_key` returns `Option<String>` instead of silently sending empty auth headers
FIX Replace secret-looking test placeholders to prevent false GitGuardian alerts
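
The key-resolution fixes above share one idea: an empty value is the same as no value, and the absence of a key is expressed in the return type. Here is a minimal sketch of that idea. The names `KeySource` and `resolve_provider_key` come from the entries above, but the enum variants, the function signatures, the `<PROVIDER>_API_KEY` variable naming, and the keystore-before-environment order are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Where a key was found. The variants are assumed, not taken from Roboticus.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum KeySource {
    Keystore,
    EnvVar(String),
}

/// Treat empty or whitespace-only values as absent.
fn non_empty(value: Option<&String>) -> Option<String> {
    value
        .map(|v| v.trim())
        .filter(|v| !v.is_empty())
        .map(str::to_owned)
}

/// Single cascade: keystore first, then an environment variable (assumed order).
pub fn resolve_with_source(
    provider: &str,
    keystore: &HashMap<String, String>,
    env: &HashMap<String, String>,
) -> Option<(String, KeySource)> {
    if let Some(key) = non_empty(keystore.get(provider)) {
        return Some((key, KeySource::Keystore));
    }
    // Hypothetical naming convention for the environment variable.
    let var = format!("{}_API_KEY", provider.to_uppercase());
    let found = non_empty(env.get(&var));
    found.map(|key| (key, KeySource::EnvVar(var)))
}

/// `None` means "no usable key", so callers cannot send an empty auth header by accident.
pub fn resolve_provider_key(
    provider: &str,
    keystore: &HashMap<String, String>,
    env: &HashMap<String, String>,
) -> Option<String> {
    resolve_with_source(provider, keystore, env).map(|(key, _)| key)
}
```

The environment is passed in as a map so the sketch stays deterministic; a caller could build it from `std::env::vars()`.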

v0.4.0

New Features & More 2026-02-23

Added: 10 changes. Changed: 5 changes. Fixed: 2 changes. Key changes: Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`). Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal). `thinking_threshold_seconds` moved from per-channel (`TelegramConfig`) to `ChannelsConfig` level. Channel message processing is now platform-agnostic via `send_typing_indicator` / `send_thinking_indicator` helpers.

Highlights

  • Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`)
  • Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal)
  • Configurable `thinking_threshold_seconds` on `[channels]` — estimated latency gate for thinking indicator (default: 30s)
  • `send_typing` and `send_ephemeral` on WhatsApp and Discord adapters
  • Latency estimator based on model tier, input length, and circuit-breaker state
  • LLM fallback chain: `infer_with_fallback` helper retries across configured providers on transient errors
  • Permanent error detection in delivery queue — 403/401/400 and "bot blocked" errors dead-letter immediately
  • Config auto-discovery: `roboticus start` checks `~/.roboticus/roboticus.toml` when no `--config` flag is given
FEAT Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`)
FEAT Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal)
FEAT Configurable `thinking_threshold_seconds` on `[channels]`: estimated latency gate for thinking indicator (default: 30s)
FEAT `send_typing` and `send_ephemeral` on WhatsApp and Discord adapters
FEAT Latency estimator based on model tier, input length, and circuit-breaker state
FEAT LLM fallback chain: `infer_with_fallback` helper retries across configured providers on transient errors
FEAT Permanent error detection in delivery queue: 403/401/400 and "bot blocked" errors dead-letter immediately
FEAT Config auto-discovery: `roboticus start` checks `~/.roboticus/roboticus.toml` when no `--config` flag is given
FEAT Obsidian vault integration module with read, search, and write tools
FEAT GitHub Actions release workflow for cross-platform binaries and crates.io publishing
CHORE `thinking_threshold_seconds` moved from per-channel (`TelegramConfig`) to `ChannelsConfig` level
CHORE Channel message processing is now platform-agnostic via `send_typing_indicator` / `send_thinking_indicator` helpers
CHORE Delivery queue `mark_failed` checks for permanent errors before scheduling retries
CHORE Channel router `send_to` and `drain_retry_queue` skip retry enqueue for permanent errors
CHORE Circuit breaker test updated to reflect fallback-first behavior
FIX LLM inference no longer returns a static error when the primary provider is down; it falls through to configured fallbacks
FIX Telegram bot no longer retries messages to chats it was removed from (permanent error dead-lettering)
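
The permanent-error rule in the delivery queue amounts to classifying a failure before deciding whether to retry. The sketch below is one way to express it. `mark_failed` is named in the entries above, but its signature here, the `FailureAction` type, the `max_attempts` parameter, and the literal `"bot blocked"` marker string are assumptions; the text real platforms return may differ.

```rust
/// What the queue should do with a failed delivery (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureAction {
    DeadLetter,
    Retry { attempt: u32 },
}

/// 400/401/403 responses and "bot blocked" messages are treated as permanent:
/// retrying cannot succeed until a human changes something.
pub fn is_permanent(status: Option<u16>, message: &str) -> bool {
    let by_status = matches!(status, Some(400 | 401 | 403));
    // Assumed marker text; case-insensitive substring match.
    let by_text = message.to_lowercase().contains("bot blocked");
    by_status || by_text
}

/// Check for permanent errors first, then fall back to the retry budget.
pub fn mark_failed(
    status: Option<u16>,
    message: &str,
    attempts_so_far: u32,
    max_attempts: u32,
) -> FailureAction {
    if is_permanent(status, message) || attempts_so_far + 1 >= max_attempts {
        FailureAction::DeadLetter
    } else {
        FailureAction::Retry { attempt: attempts_so_far + 1 }
    }
}
```

Checking permanence before the retry budget is what lets a 403 dead-letter on the first failure instead of consuming every remaining attempt.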

v0.3.0

Security Hardening & More 2026-02-23

Security: 8 changes. Fixed: 11 changes. Changed: 10 changes. Added: 1 change. Key changes: Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown. Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags. Telegram adapter now processes all updates in a batch, not just the first. Cron worker dispatches jobs instead of unconditionally marking success.

Highlights

  • Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown
  • Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags
  • Session role validation: reject messages with roles outside `{user, assistant, system, tool}`
  • Channel message authority: trusted sender IDs config for elevated `ChannelAuthority`
  • WhatsApp webhook signature verification via HMAC-SHA256
  • Docker: run as non-root `roboticus` user
  • Wallet: encrypt private keys with machine-derived passphrase; never store plaintext
  • API key `#[serde(skip_serializing)]` prevents accidental serialization leakage
FIX Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown
FIX Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags
FIX Session role validation: reject messages with roles outside `{user, assistant, system, tool}`
FIX Channel message authority: trusted sender IDs config for elevated `ChannelAuthority`
FIX WhatsApp webhook signature verification via HMAC-SHA256
FIX Docker: run as non-root `roboticus` user
FIX Wallet: encrypt private keys with machine-derived passphrase; never store plaintext
FIX API key `#[serde(skip_serializing)]` prevents accidental serialization leakage
FIX Telegram adapter now processes all updates in a batch, not just the first
FIX Cron worker dispatches jobs instead of unconditionally marking success
FIX Cron expressions use the `cron` crate for full syntax support (ranges, lists, steps)
FIX Per-IP rate-limit HashMap evicted on window reset, preventing unbounded growth
FIX Interview sessions capped at 100 with 1-hour TTL; expired sessions evicted
FIX `Cargo.lock` committed; CI builds use `--locked` for reproducible builds
FIX Graceful shutdown handler (SIGINT + SIGTERM) via `with_graceful_shutdown()`
FIX Duplicate migration version numbers renumbered to unique sequential IDs
FIX Migrations wrapped in transactions for atomicity
FIX SQL `LIKE` patterns escape user-supplied wildcards
FIX Memory query endpoints clamp limit to 1000
CHORE Deduplicated `Optional<T>` trait across 5 DB modules; use `rusqlite::OptionalExtension`
CHORE `SessionStatus` and `MessageRole` enums added for future type-safe migration
CHORE Regex allocation in `decode_common_encodings` hoisted to static `LazyLock`
CHORE Silent `.ok()` calls in `ingest_turn()` replaced with `tracing::warn!` logging
CHORE Reusable `reqwest::Client` stored in `Wallet` for connection pooling
CHORE A2A sessions made private with TTL eviction and 256-session cap
CHORE Plugin registry releases lock before tool execution (`Arc<Mutex<Box<dyn Plugin>>>`)
CHORE `CdpSession::set_timeout` now functional (was a documented no-op)
CHORE Daemon logs written to `~/.roboticus/logs/` instead of world-readable `/tmp/`
CHORE Deduplicated `collect_string_values` across policy rules
FEAT Pre-commit hook for fast format checks (`hooks/pre-commit`)
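
Three of the hardening items above are small input-sanitizing rules: escaping `LIKE` wildcards, clamping query limits, and rejecting dangerous URI schemes. The sketch below illustrates each with the standard library only. The function names, the `MAX_LIMIT` constant name, and the choice of backslash as the escape character are assumptions; only the limit of 1000 and the three blocked schemes come from the entries above.

```rust
/// Escape `%`, `_`, and the escape character itself so user input is matched
/// literally. Intended for a query such as `... LIKE ?1 ESCAPE '\'`, with the
/// bound pattern built as `format!("%{}%", escape_like(term))`.
pub fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        if matches!(ch, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

/// Upper bound for memory query results (value taken from the changelog).
pub const MAX_LIMIT: usize = 1000;

/// Clamp a caller-supplied limit so one request cannot ask for unbounded rows.
pub fn clamp_limit(requested: usize) -> usize {
    requested.min(MAX_LIMIT)
}

/// Reject navigation targets using `file:`, `javascript:`, or `data:` schemes.
/// The scheme comparison is case-insensitive and ignores leading whitespace.
pub fn is_blocked_scheme(url: &str) -> bool {
    let lower = url.trim_start().to_ascii_lowercase();
    ["file:", "javascript:", "data:"]
        .iter()
        .any(|scheme| lower.starts_with(*scheme))
}
```

A prefix check is a simplification; a production check would more likely parse the URL and compare the parsed scheme against an allowlist.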

v0.2.0

Alpha Release 2026-02-23

Full roadmap implementation: 35 items across 7 phases. ReAct agent loop, RAG retrieval pipeline, embedding provider integration, ANN index, persistent semantic cache, sub-agent framework, and comprehensive bug fixes from code review.

Highlights

  • ReAct agent loop with idle/loop detection
  • 5-tier hybrid RAG retrieval (FTS5 + vector cosine)
  • Embedding provider integration (OpenAI, Ollama, Google)
  • HNSW approximate nearest neighbor index
  • Persistent semantic cache (SQLite-backed, auto-eviction)
  • Sub-agent framework with isolated tool registries
  • 22 code review issues resolved (6 critical, 12 high, 4 medium)
  • RwLock deadlock fix in circuit breaker path
  • UTF-8 safety, atomic OAuth persistence, poison recovery
FEAT Implement full Roboticus roadmap (35 items across 7 phases)
FEAT Application layer: ReAct agent, RAG retrieval, sub-agents, full server wiring
FEAT Foundation layer: embeddings, keystore, ANN index, cache persistence
FIX Resolve RwLock deadlock in circuit breaker path
FIX Resolve all 22 code review issues (6 CRITICAL, 12 HIGH, 4 MEDIUM)
FIX Replace all placeholder code with real implementations
FIX Auto-restart on port conflict during serve
FIX Update bootstrap sequence to 13 steps with cache-load step
FIX Google batch endpoint, parse error propagation, query auth in embedding
FIX Blocking read in async, UTF-8 safe chunking, dedup release on send failure
FIX Channel L4 filter, survival tier, dedup leaks, interview deadlock
FIX UTF-8 safety, atomic OAuth persist, poison recovery, embedding errors
FIX Wire BOOT_6B node, remove OpenClaw refs, reorder roadmap sections
FIX Lint errors from merge and update coverage baseline
DOCS Update architecture diagrams, roadmap, and crate READMEs
CHORE Bump version to 0.2.0 for Roboticus alpha release

v0.1.0

New Features & More 2026-02-22

Added: 5 changes. Changed: 1 change. Fixed: 1 change. Key changes: Initial Project Roboticus baseline for Roboticus. Multi-crate Rust workspace foundation (runtime crates + integration test crate). Prepared packaging/publish metadata for early release workflows. Early release stabilization fixes for binary packaging, startup wiring, and quality gates.

Highlights

  • Initial Project Roboticus baseline for Roboticus.
  • Multi-crate Rust workspace foundation (runtime crates + integration test crate).
  • Core SQLite persistence layer with schema/migrations and operational defaults.
  • Early HTTP API, CLI surface, and embedded dashboard scaffolding.
  • Initial architecture and reference documentation set.
  • Prepared packaging/publish metadata for early release workflows.
  • Early release stabilization fixes for binary packaging, startup wiring, and quality gates.
FEAT Initial Project Roboticus baseline for Roboticus.
FEAT Multi-crate Rust workspace foundation (runtime crates + integration test crate).
FEAT Core SQLite persistence layer with schema/migrations and operational defaults.
FEAT Early HTTP API, CLI surface, and embedded dashboard scaffolding.
FEAT Initial architecture and reference documentation set.
CHORE Prepared packaging/publish metadata for early release workflows.
FIX Early release stabilization fixes for binary packaging, startup wiring, and quality gates.