Changelog

Release history for the Roboticus autonomous agent runtime. Follows conventional commits.

v1.0.12

Bug Fixes & Stability · 2026-05-08

Fixed: 3 changes. Key changes: Hardened `roboticus upgrade all` against transient GitHub release-asset CDN failures by retrying temporary HTTP failures and falling back from `browser_download_url` to the GitHub release asset API URL with `Accept: application/octet-stream`. Added retry/backoff to the public Linux/macOS and Windows bootstrap installers so fresh installs do not fail on a single transient `SHA256SUMS.txt` or binary download error.

  • [FIX] Hardened `roboticus upgrade all` against transient GitHub release-asset CDN failures by retrying temporary HTTP failures and falling back from `browser_download_url` to the GitHub release asset API URL with `Accept: application/octet-stream`.
  • [FIX] Added retry/backoff to the public Linux/macOS and Windows bootstrap installers so fresh installs do not fail on a single transient `SHA256SUMS.txt` or binary download error.
  • [FIX] Added install/update regression coverage for transient asset recovery across `upgrade all`, `install.sh`, and `install.ps1`.
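
The retry-then-fallback behavior described in this release can be sketched roughly as follows. This is a minimal illustration, not the shipped upgrade code: `fetchFunc`, `isTransient`, and `downloadWithFallback` are hypothetical names, the transport is faked so the example runs offline, and treating transport errors, 429, and 5xx as transient is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// fetchFunc is a hypothetical download hook. It returns the HTTP status of one
// attempt against url, or an error for a transport-level failure.
type fetchFunc func(url string) (status int, err error)

// isTransient reports whether an attempt is worth retrying. Treating transport
// errors, 429, and 5xx as transient is an assumption of this sketch.
func isTransient(status int, err error) bool {
	return err != nil || status == 429 || status >= 500
}

// downloadWithFallback tries each URL in order, giving every URL up to
// attempts tries with exponential backoff. It returns the URL that answered 200.
func downloadWithFallback(urls []string, attempts int, base time.Duration, fetch fetchFunc) (string, error) {
	for _, u := range urls {
		delay := base
		for i := 0; i < attempts; i++ {
			status, err := fetch(u)
			if err == nil && status == 200 {
				return u, nil
			}
			if !isTransient(status, err) {
				break // permanent failure: stop retrying this URL, try the next one
			}
			if i < attempts-1 {
				time.Sleep(delay)
				delay *= 2
			}
		}
	}
	return "", errors.New("all download URLs failed")
}

func main() {
	seen := map[string]int{}
	// Fake transport: the CDN URL always answers 503; the API URL fails once, then succeeds.
	fetch := func(url string) (int, error) {
		seen[url]++
		if url == "browser_download_url" {
			return 503, nil
		}
		if seen[url] == 1 {
			return 502, nil
		}
		return 200, nil
	}
	urls := []string{"browser_download_url", "release_asset_api_url"}
	got, err := downloadWithFallback(urls, 3, 10*time.Millisecond, fetch)
	fmt.Println(got, err, seen)
}
```

Passing the URLs as an ordered list keeps the fallback policy separate from the retry policy, so the same loop could serve both the upgrade command and a bootstrap installer.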

v1.0.11

Bug Fixes & Stability · 2026-05-07

Fixed: 8 changes. Key changes: Closed the web/chat streaming liveness gap where a turn could be marked successful without durable, visible assistant content. Made stream finalization pipeline-owned and fail-closed: empty/unusable streams now emit typed visible errors instead of persisting blank assistant messages.

  • [FIX] Closed the web/chat streaming liveness gap where a turn could be marked successful without durable, visible assistant content.
  • [FIX] Made stream finalization pipeline-owned and fail-closed: empty/unusable streams now emit typed visible errors instead of persisting blank assistant messages.
  • [FIX] Preserved pre-chunk streaming provider errors so credit/quota failures still trip the provider credit breaker before falling back.
  • [FIX] Prevented macOS service-install tests from contaminating the operator's real LaunchAgent by refusing Go test binaries and explicit missing config paths at the daemon install seam.
  • [FIX] Extended the local-state isolation rule from databases to service-manager artifacts such as LaunchAgents, pid files, and operator launch logs.
  • [FIX] Routed repairable maintenance `quick_check` failures through the central DB-owned derived-structure repair seam before failing heartbeat tasks, while preserving fail-closed behavior for authoritative or unknown corruption.
  • [FIX] Raised the CI/release Go toolchain to `1.26.3` so security scans run on the standard-library fixes required by `govulncheck`.
  • [FIX] Synchronized daemon startup worker registration with shutdown waiting so `Start`/`Stop` remains race-detector clean on macOS service lifecycle tests.
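
As an illustration of the fail-closed stream finalization entry in this release, the sketch below returns a typed error when a stream ends without visible content, so a caller cannot persist a blank assistant message. `EmptyStreamError` and `finalizeStream` are hypothetical names, and trimming whitespace as the "visible content" test is an assumption, not the runtime's actual rule.

```go
package main

import (
	"fmt"
	"strings"
)

// EmptyStreamError is a hypothetical typed error for a stream that ended
// without any visible assistant content.
type EmptyStreamError struct {
	Chunks int // number of chunks received before the stream ended
}

func (e *EmptyStreamError) Error() string {
	return fmt.Sprintf("stream produced no visible content (%d chunks)", e.Chunks)
}

// finalizeStream joins the streamed chunks and fails closed: if nothing
// visible remains after trimming whitespace, it returns a typed error and no
// message, so the caller cannot persist a blank assistant turn.
func finalizeStream(chunks []string) (string, error) {
	content := strings.TrimSpace(strings.Join(chunks, ""))
	if content == "" {
		return "", &EmptyStreamError{Chunks: len(chunks)}
	}
	return content, nil
}

func main() {
	for _, chunks := range [][]string{{"Hello", ", world"}, {" ", "\n"}, nil} {
		msg, err := finalizeStream(chunks)
		if err != nil {
			fmt.Println("visible error:", err)
			continue
		}
		fmt.Println("persist:", msg)
	}
}
```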

v1.0.10

Bug Fixes & Stability · 2026-05-07

Fixed: 6 changes. Key changes: Moved macOS local daemon install/start onto the user LaunchAgent path (`~/Library/LaunchAgents`, `gui/$UID`) instead of defaulting local desktop installs through root LaunchDaemons. Isolated behavior soaks from the operator's live `~/.roboticus/state.db` by default and removed report assumptions tied to live DB snapshotting.

  • [FIX] Moved macOS local daemon install/start onto the user LaunchAgent path (`~/Library/LaunchAgents`, `gui/$UID`) instead of defaulting local desktop installs through root LaunchDaemons.
  • [FIX] Isolated behavior soaks from the operator's live `~/.roboticus/state.db` by default and removed report assumptions tied to live DB snapshotting.
  • [FIX] Added mandatory resource-contention warnings and proceed confirmations to behavior soaks and model benchmarks, with explicit CI/release overrides.
  • [FIX] Centralized scheduler inventory follow-up projection so observed cron evidence is not lost at finalization or misread as a cron-creation request.
  • [FIX] Made stale sidecar repair output actionable by naming blocked paths and printing concrete cleanup commands.
  • [FIX] Hardened provider-key alias normalization, OpenAI-compatible tool-call history repair, final-model route attribution, and repairable semantic-memory derived-index handling.

v1.0.9

Bug Fixes & Stability · 2026-05-06

Fixed: 4 changes. Key changes: Published the post-`v1.0.8` behavioral hotfix as `v1.0.9` so installer and upgrade flows can resolve a newer binary artifact instead of remaining on the original `v1.0.8` tag. Hardened the release workflow's changelog validation so release versions are matched literally rather than interpreted as regular expressions.

  • [FIX] Published the post-`v1.0.8` behavioral hotfix as `v1.0.9` so installer and upgrade flows can resolve a newer binary artifact instead of remaining on the original `v1.0.8` tag.
  • [FIX] Hardened the release workflow's changelog validation so release versions are matched literally rather than interpreted as regular expressions.
  • [FIX] Added a normative SemVer policy and moved the planned feature slate formerly parked after `v1.0.9` to `v1.1.0`; patch numbers remain reserved for hotfix-class releases.
  • [FIX] Preserved the final expanded behavior-soak gate evidence for the hotfix: `33/33` scenarios passed with all qualitative scorecard families graded `A`.
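
The literal-versus-regex distinction in the changelog validation fix matters because every `.` in a version string is a wildcard when the string is used as a pattern. The sketch below shows the failure mode and two literal alternatives. It is written in Go purely for illustration; the actual release workflow is not shown in these notes, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchAsRegex treats the version itself as a pattern, so "." matches any
// character. This is the behavior the fix moves away from.
func matchAsRegex(changelog, version string) bool {
	return regexp.MustCompile(version).MatchString(changelog)
}

// matchLiteral compares each trimmed line to the version string exactly.
func matchLiteral(changelog, version string) bool {
	for _, line := range strings.Split(changelog, "\n") {
		if strings.TrimSpace(line) == version {
			return true
		}
	}
	return false
}

// matchQuoted keeps a regex (useful for anchoring to a whole line) but
// escapes the version with regexp.QuoteMeta so it is matched literally.
func matchQuoted(changelog, version string) bool {
	pattern := `(?m)^\s*` + regexp.QuoteMeta(version) + `\s*$`
	return regexp.MustCompile(pattern).MatchString(changelog)
}

func main() {
	// A heading that only looks like v1.0.9 when "." is a wildcard.
	changelog := "Changelog\n\nv1x0y9\n\nnotes"
	fmt.Println("regex:  ", matchAsRegex(changelog, "v1.0.9"))
	fmt.Println("literal:", matchLiteral(changelog, "v1.0.9"))
	fmt.Println("quoted: ", matchQuoted(changelog, "v1.0.9"))
}
```

Only the unescaped regex accepts the bogus `v1x0y9` heading; both literal variants reject it.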

v1.0.8

Improvements · 2026-05-05

Changed: 3 changes. Fixed: 3 changes. Key changes: Turned `v1.0.8` into a diagnostic-trust release focused on benchmark/RCA [...]. Made `web_search` the canonical built-in public-web discovery surface through [...]. Removed release-critical benchmark CLI loopback through Roboticus `/api/**` [...]. Improved provider/RCA classification for request-shape defects, breaker-open [...].

  • [CHORE] Turned `v1.0.8` into a diagnostic-trust release focused on benchmark/RCA
  • [CHORE] Made `web_search` the canonical built-in public-web discovery surface through
  • [CHORE] Tightened release and upgrade expectations around version truth, deterministic
  • [FIX] Removed release-critical benchmark CLI loopback through Roboticus `/api/**`
  • [FIX] Improved provider/RCA classification for request-shape defects, breaker-open
  • [FIX] Hardened behavior/continuity truth around pending-action follow-ups, false

v1.0.7

Improvements · 2026-04-24

Changed: 3 changes. Fixed: 3 changes. Key changes: Turned `v1.0.7` into an explicit behavior-hardening release with canonical [...]. Tightened model/routing truth so lifecycle policy, intent-scoped exercise, [...]. Removed several framework-owned false negatives around tool use and authoring [...]. Fixed multiple observability/RCA truth seams so turn diagnostics, trace flow, [...].

  • [CHORE] Turned `v1.0.7` into an explicit behavior-hardening release with canonical
  • [CHORE] Tightened model/routing truth so lifecycle policy, intent-scoped exercise,
  • [CHORE] Hardened the release/deployment control plane by making release procedure
  • [FIX] Removed several framework-owned false negatives around tool use and authoring
  • [FIX] Fixed multiple observability/RCA truth seams so turn diagnostics, trace flow,
  • [FIX] Closed release-branch and post-merge CI defects uncovered during the

v1.0.6

Improvements · 2026-04-21

Changed: 8 changes. Key changes: The scoped parity systems have final dispositions in [...]. The release is not claiming verbatim Rust recreation where Go now has a [...].

  • [CHORE] The scoped parity systems have final dispositions in
  • [CHORE] The release is not claiming verbatim Rust recreation where Go now has a
  • [CHORE] Prompt compression is explicitly deferred for this release on negative
  • [CHORE] M3.3 LIKE-block deletion: telemetry surface (`AggregateRetrievalPaths`) is in place; the actual deletion in `retrieval_tiers.go` waits for production traces to demonstrate dormancy per the documented retirement procedure.
  • [CHORE] External behavioral soak rerun pending: the earlier pass-5 residuals around introspection and filesystem behavior have been addressed in code via an introspection alias, a runtime-owned capability-introspection fast path, explicit `~` refusal in both tool resolution and policy evaluation, and stripping parsed tool-call JSON from assistant content. Release confidence should still be refreshed with another managed clone/fresh soak pass before declaring the behavioral matrix fully closed.
  • [CHORE] Fresh-state soak lane: the managed `fresh` lane now boots from an isolated generated default config rather than a copied operator config, so clean-state validation is finally meaningful. Long-duration fresh-state soak evidence is still an rc validation artifact rather than something pinned in this notes file.
  • [CHORE] Prompt compression failed the corrected paired-soak gate and is not accepted for v1.0.6: the repaired harness now evaluates history-bearing scenarios that actually exercise the current Go compression surface. On that gate, baseline passed `3/3` while compression passed `0/3`; the compressed lane lost history recall and inflated latency to roughly 975s / 1540s / 1520s on the three history-bearing scenarios. Prompt compression should remain disabled for this release.
  • [CHORE] Prompt-compression soak lanes are now isolated more defensibly: the paired harness now gives each lane its own base URL/port, waits for managed-server teardown between lanes, and defaults to a much longer lane timeout. That does not clear prompt compression, but it removes a bad source of false failure from the evidence path.

v1.0.3

WebSocket-First Dashboard & Structural Debt · 2026-04-12

Replaced all HTTP polling with WebSocket topic subscriptions. Pipeline events push real-time activity to dashboard. Struct-driven settings schema. Plugin/app catalog. 8 workspace/theme bug fixes. 77 files changed.

Highlights

  • WebSocket-first: Zero HTTP polling — all dashboard data pushed via WebSocket topic subscriptions
  • Pipeline events: Agent activity (inference, idle, tool use) shown in real-time on workspace canvas
  • Ticket auth: WS connections authenticated via one-time tickets, API key used exactly once
  • Struct-driven settings: Config schema derived from Go struct via reflect (303 fields)
  • [FEAT] WebSocket topic subscriptions: Dashboard subscribes to topics (workspace, agent.status, models), server pushes snapshots on subscribe and deltas on change
  • [FEAT] Pipeline → EventBus bridge: DashboardNotifier interface publishes agent_working, stream_start, stream_end, agent_idle events
  • [FEAT] WS ticket validation: One-time tickets consumed on WebSocket upgrade, API key never sent in subsequent requests
  • [FEAT] Struct-driven settings schema: GET /api/config/schema returns 303 fields with types, defaults, enums, immutability flags
  • [FEAT] Plugin/app catalog: Registry fallback with install status, theme catalog with variables/textures/fonts
  • [FIX] _catalogThemeVars crash: Variable declared at module scope, reset per render cycle
  • [FIX] Workspace footer pinned: height:100% matching Rust (was calc viewport misfire)
  • [FIX] Workstation layout: Equidistant spacing with 80px dynamic edge clamping at any viewport
  • [FIX] parseThemeColors cached: DOM reads eliminated from 60fps render loop, invalidated on theme change
  • [FIX] Theme previews: Catalog serializes variables/textures/fonts so previews show actual theme colors
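
The one-time ticket idea behind the WS ticket validation entry can be sketched as a small in-memory store: a ticket is issued after the API key is checked once, and the WebSocket upgrade consumes it so it cannot be replayed. `TicketStore`, `Issue`, and `Consume` are hypothetical names; expiry and the HTTP/WebSocket wiring are left out, and nothing here is taken from the actual implementation.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
)

// TicketStore holds tickets that have been issued but not yet used.
type TicketStore struct {
	mu      sync.Mutex
	pending map[string]struct{}
}

func NewTicketStore() *TicketStore {
	return &TicketStore{pending: make(map[string]struct{})}
}

// Issue creates a random 128-bit ticket and records it as pending. In a real
// server this would run only after the caller's API key has been verified.
func (s *TicketStore) Issue() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	ticket := hex.EncodeToString(buf)
	s.mu.Lock()
	s.pending[ticket] = struct{}{}
	s.mu.Unlock()
	return ticket, nil
}

// Consume reports whether the ticket was pending and removes it in the same
// critical section, so a second call with the same ticket returns false.
func (s *TicketStore) Consume(ticket string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.pending[ticket]; !ok {
		return false
	}
	delete(s.pending, ticket)
	return true
}

func main() {
	store := NewTicketStore()
	ticket, err := store.Issue()
	if err != nil {
		panic(err)
	}
	fmt.Println("first use accepted:", store.Consume(ticket))
	fmt.Println("replay accepted:   ", store.Consume(ticket))
}
```

Checking and deleting under one lock is what makes the ticket single-use even when two upgrade requests race with the same value.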

v1.0.2

Parity Sweep & Architecture Gaps · 2026-04-12

Added: 4 bot commands, 5 CLI commands. Fixed: 3 architecture gaps, FTS coverage. Memory system now auto-indexes at ingestion and searches all 5 tiers via FTS5. Model baselining detects memory recall capability.

Highlights

  • Bot command parity: /model, /models, /breaker, /retry with authority gating
  • FTS5 for all tiers: Procedural and relationship memories now searchable via full-text search
  • Auto-indexing: Memories indexed at ingestion, not just during consolidation backfill
  • IntentMemoryRecall: Router can now escalate memory-heavy queries away from weak models
  • [FEAT] Bot commands: /model (show/set/reset override), /models (list chain), /breaker (status/reset), /retry (replay)
  • [FEAT] IntentMemoryRecall: New exercise intent class with confabulation penalty scoring
  • [FEAT] CLI parity: channels health/connect/disconnect, update providers/skills, apps alias
  • [FEAT] @bot_name stripping: Telegram group mentions now parsed correctly
  • [FIX] FTS5 triggers: Procedural + relationship tiers now indexed for full-text search
  • [FIX] Table name normalization: memory_fts and memory_index now use consistent names
  • [FIX] Auto-indexing: memory_index entries created at ingestion (was: consolidation only)
  • [FIX] Gap 2: API routes now set SecurityClaim for audit consistency
  • [FIX] Gap 3: HMAC trust boundary instructions added to system prompt
  • [FIX] Banner version: Startup display now shows actual version from release build

v1.0.1

Memory Recall & Tool Serialization · 2026-04-12

Fixed: 5 bugs. Added: 3 beyond-parity features. Memory recall was fundamentally broken — 5 compounding issues caused confabulation instead of actual recall. Multi-turn tool use with remote providers was silently broken due to message serialization.

Highlights

  • search_memories tool: New agent tool for topic-based memory search (FTS5 + LIKE fallback)
  • Query-aware memory index: Injected index now surfaces topic-matched entries for the current query
  • Two-stage memory injection: Only working memory + recent activity injected directly (Rust parity)
  • Tool serialization fix: Multi-turn tool use with OpenAI, Kimi K2, and other providers now works
  • [FIX] FTS5 episodic retrieval: Union strategy with MATCH clause (was: JOIN without MATCH; old memories invisible)
  • [FIX] Two-stage memory injection: Only working + ambient injected; all other tiers via index (was: full dump causing confabulation)
  • [FIX] Memory index noise filter: Tool output entries (bash, errors, introspection) excluded from injected index
  • [FIX] Confidence inflation: Default 0.8, incremental +0.1 reinforce (was: binary 1.0 reset making all entries indistinguishable)
  • [FIX] OpenAI tool_call_id serialization: Assistant tool-call messages now include explicit `content` field (was: omitted by Go's omitempty)
  • [FEAT] search_memories(query) tool: FTS5 + LIKE fallback search across all 5 memory tiers (beyond Rust parity)
  • [FEAT] Query-aware memory index: BuildMemoryIndex surfaces topic-matched entries in first 1/3 of injected index
  • [FEAT] Anti-confabulation behavioral contract: Explicit rule against fabricating memories in system prompt
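
The `omitempty` issue named in the tool-call serialization fix is easy to reproduce with `encoding/json`: a string field tagged `omitempty` disappears from the output when it is empty. The struct shapes below are illustrative stand-ins loosely modeled on an OpenAI-style assistant message, not the runtime's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// toolCall is a trimmed-down, illustrative tool-call record.
type toolCall struct {
	ID   string `json:"id"`
	Type string `json:"type"`
}

// omittingMessage drops "content" whenever it is the empty string.
type omittingMessage struct {
	Role      string     `json:"role"`
	Content   string     `json:"content,omitempty"`
	ToolCalls []toolCall `json:"tool_calls,omitempty"`
}

// explicitMessage always serializes "content", even when it is empty.
type explicitMessage struct {
	Role      string     `json:"role"`
	Content   string     `json:"content"`
	ToolCalls []toolCall `json:"tool_calls,omitempty"`
}

// encode marshals v to a JSON string and panics on error (fine for a demo).
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// hasContentKey reports whether the encoded message carries a "content" key.
func hasContentKey(encoded string) bool {
	return strings.Contains(encoded, `"content":`)
}

func main() {
	calls := []toolCall{{ID: "call_1", Type: "function"}}
	before := encode(omittingMessage{Role: "assistant", ToolCalls: calls})
	after := encode(explicitMessage{Role: "assistant", ToolCalls: calls})
	fmt.Println(before, hasContentKey(before))
	fmt.Println(after, hasContentKey(after))
}
```

Dropping `omitempty` from the one field is the smallest change that guarantees the key is present; a pointer or custom marshaler would be alternatives if a `null` value were needed instead of an empty string.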

v1.0.0

New Features & More · 2026-04-11

Added: 22 changes. Changed: 14 changes. Fixed: 5 changes. Key changes: Go runtime: Full rewrite from Rust to Go with modernc.org/sqlite (no CGO). ReAct agent loop: 25-turn state machine with idle detection and loop prevention. Prompt ordering: Firmware before personality (matching Rust prompt.rs). Injection defense: 5 markers (Rust set), full content replacement with flagging instead of silent strip.

  • [FEAT] Go runtime: Full rewrite from Rust to Go with modernc.org/sqlite (no CGO)
  • [FEAT] ReAct agent loop: 25-turn state machine with idle detection and loop prevention
  • [FEAT] 25-guard output safety pipeline: Behavioral, truthfulness, quality, and protocol guards
  • [FEAT] Semantic classifier: Embedding-based guard scoring with 5 exemplar categories (NARRATED_DELEGATION, CAPABILITY_DENIAL, TASK_DEFERRAL, FALSE_COMPLETION, FINANCIAL_ACTION_CLAIM)
  • [FEAT] 10+ LLM providers: OpenAI, Anthropic, Google Gemini, Ollama, OpenRouter, Moonshot, vLLM, llama-cpp, sglang, docker-model-runner
  • [FEAT] 6-axis metascore routing: Efficacy, Cost, Availability, Locality, Confidence, Speed with ML router
  • [FEAT] 3-tier semantic cache: Exact hash, tool-aware TTL, cosine similarity
  • [FEAT] 5-tier memory system: Working, Episodic, Semantic, Procedural, Relationship with FTS5 + HNSW
  • [FEAT] 9 channel adapters: Telegram, Discord (WebSocket gateway), Signal, WhatsApp, Email (OAuth2), Voice, Matrix (E2E), A2A (X25519+AES), Web (WebSocket)
  • [FEAT] Discord WebSocket gateway: Full lifecycle with heartbeat, identify/resume, MESSAGE_CREATE dispatch, reconnection with backoff
  • [FEAT] Email OAuth2: XOAUTH2 SASL authentication for Gmail IMAP
  • [FEAT] MCP client/server: stdio + SSE transports, live-tested with Playwright (21 tools)
  • [FEAT] Revenue scoring algorithm: 3-component scoring (confidence/effort/risk) with feedback signals
  • [FEAT] Hybrid FTS5+vector search: FTS5 MATCH combined with cosine similarity via weighted merge
  • [FEAT] Embedded SPA dashboard: 9,155-line single-page app with routing profile persistence
  • [FEAT] 38 CLI commands: Full operator surface including models exercise/baseline/reset
  • [FEAT] EVM wallet: secp256k1, EIP-3009, x402 payments, treasury policy
  • [FEAT] Plugin system: Managed registry with skill pairing and archive packaging
  • [FEAT] TUI: bubbletea + lipgloss terminal interface
  • [FEAT] 4-layer personality: OS, FIRMWARE, OPERATOR, DIRECTIVES (TOML, hot-reloadable)
  • [FEAT] Parity test suite: internal/parity/ package verifying resolved gaps against Rust behavior
  • [FEAT] Release footprint document: docs/releases/v1.0.0-footprint.md tracking all 101 gap closures
  • [CHORE] Prompt ordering: Firmware before personality (matching Rust prompt.rs)
  • [CHORE] Injection defense: 5 markers (Rust set), full content replacement with flagging instead of silent strip
  • [CHORE] Money type: Microdollars replaced with cents (i64, Rust parity), saturating arithmetic
  • [CHORE] Embedding format: JSON text replaced with 4-byte LE IEEE 754 BLOB (Rust parity)
  • [CHORE] N-gram hash: Byte trigrams replaced with rune trigrams, removed signed projection
  • [CHORE] Stop word list: Aligned to Rust's 77-word set (was 63 with wrong mix)
  • [CHORE] Shell validation: Blanket pattern blocking replaced with Rust's specific compound checks
  • [CHORE] Guard behavioral alignment: TaskDeferral (7 tools + semantic), ExecutionTruth (11 intents + FALSE_COMPLETION), InternalJargon (NARRATED_DELEGATION > 0.8), InternalProtocol (3-category, no bracket markers), DeclaredAction (removed Go-unique indicators)
  • [CHORE] Consolidation constants: Extracted magic numbers to named constants (DedupJaccardThreshold, DecayFactor, DecayFloor, PromotionGroupThreshold)
  • [CHORE] Config defaults: Treasury limits aligned to Rust (daily_transfer=2000, hourly=500, reserve=5, inference=50)
  • [CHORE] A2A rate limit: Zero value means unlimited (was default to 30)
  • [CHORE] Matrix timestamp: Server TS replaced with local clock (Rust parity: Utc::now())
  • [CHORE] MCP notifications: Fixed JSON-RPC notification ID bug (notifications must not have "id" field)
  • [CHORE] Skill formatting: Flat list replaced with nested subsections (### Skill N)
  • [FIX] Routing profile persistence: 6-axis profile now stored directly as RoutingProfileData, load prefers persisted values over lossy-derived defaults
  • [FIX] HNSW index: BuildFromStore reads embedding_blob (binary LE) instead of nonexistent embedding_json column
  • [FIX] Post-turn embedding: Writes binary BLOB format instead of JSON text into BLOB column
  • [FIX] Consolidation quiescence gate: Data-moving phases skip when session active within 5 seconds
  • [FIX] Cron pipeline compliance: Cron executor uses RunPipeline (was already correct, verified)
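
The embedding format change in this release (JSON text replaced with a 4-byte little-endian IEEE 754 BLOB) amounts to packing each `float32` into four bytes. A minimal sketch of that encoding and its inverse follows; `encodeEmbedding` and `decodeEmbedding` are hypothetical helper names, and the absence of any header or dimension prefix in the BLOB is an assumption.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// encodeEmbedding packs each float32 as 4 little-endian bytes (IEEE 754).
func encodeEmbedding(vec []float32) []byte {
	buf := make([]byte, 4*len(vec))
	for i, v := range vec {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(v))
	}
	return buf
}

// decodeEmbedding reverses encodeEmbedding. It rejects blobs whose length is
// not a multiple of 4, which would indicate a truncated or foreign payload.
func decodeEmbedding(blob []byte) ([]float32, error) {
	if len(blob)%4 != 0 {
		return nil, errors.New("embedding blob length is not a multiple of 4")
	}
	vec := make([]float32, len(blob)/4)
	for i := range vec {
		vec[i] = math.Float32frombits(binary.LittleEndian.Uint32(blob[4*i:]))
	}
	return vec, nil
}

func main() {
	blob := encodeEmbedding([]float32{1, -2.5, 0.125})
	vec, err := decodeEmbedding(blob)
	fmt.Println(len(blob), vec, err)
}
```

Compared with JSON text, the binary form has a fixed size of four bytes per dimension and round-trips `float32` values exactly.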

v0.11.0

New Features & More · 2026-03-25

Added: 5 changes. Changed: 4 changes. Fixed: 4 changes. Key changes: Agent efficacy push: Memory introspection, selective forgetting, relationship-memory automation, and task operating state are now first-class parts of the shared pipeline. Introspection-driven execution: Task-oriented turns now begin from introspected runtime truth, can compose specialists from a clean slate, and preserve executed state across normalization retries. Delegation and composition flow: Empty-roster specialist requests now progress into composition and delegation instead of collapsing into narrated intent or centralization. Task-path truthfulness: Filesystem/runtime blockers now surface as real blockers instead of canned fallback prose.

  • [FEAT] Agent efficacy push: Memory introspection, selective forgetting, relationship-memory automation, and task operating state are now first-class parts of the shared pipeline.
  • [FEAT] Introspection-driven execution: Task-oriented turns now begin from introspected runtime truth, can compose specialists from a clean slate, and preserve executed state across normalization retries.
  • [FEAT] MCP release-grade management: Shared `/api/mcp/servers` management surface aligned across dashboard and CLI.
  • [FEAT] Skill and subagent utilization telemetry: Usage count and last-used signals exposed for skills and subagents.
  • [FEAT] Release vetting automation: Added `scripts/run-v0110-vetting.sh` and `docs/testing/v0110-vetting-matrix.md` to lock in the v0.11.0 regression contract.
  • [CHORE] Delegation and composition flow: Empty-roster specialist requests now progress into composition and delegation instead of collapsing into narrated intent or centralization.
  • [CHORE] Task-path truthfulness: Filesystem/runtime blockers now surface as real blockers instead of canned fallback prose.
  • [CHORE] Prompt and planner behavior: Introspection is now treated as the first operational step for task work and feeds a shared task operating state/action planner.
  • [CHORE] Release documentation: Active docs, architecture notes, roadmap entries, and release gates now reflect shipped v0.11.0 behavior.
  • [FIX] Pipeline trace drift: Legacy databases missing `pipeline_traces.session_id` and related fields are repaired on boot.
  • [FIX] Prompt Performance persistence: Routing weights and context budget now persist and rehydrate correctly.
  • [FIX] Dashboard regressions: Repaired session archive, raw TOML editing, Observability nav, semantic memory navigation, roster skill drill-down, workspace symmetry, and related operator-surface gaps.
  • [FIX] Analysis/recommendation soak failures: Live deep-analysis surfaces validated against a stronger provider path during soak.

v0.10.0

Correctness, Safety & Operational Maturity (14 changes) · 2026-03-23

Added: 8 changes. Fixed: 4 changes. Changed: 2 changes. Key changes: Model categorization with 29-model capability profiles, skill authoring API, Landlock/Job Object script confinement, typestate session lifecycle, delegation scoring engine, and critical signal adapter fixes.

Highlights

  • Model categorization (Phase 1): 10 task categories, 29 model profiles across 8 providers, category-aware routing with `category_fit` metascore dimension.
  • Skill authoring API: Create, validate, and publish Markdown instruction skills via `POST /api/skills/author`.
  • Landlock & Job Object confinement: Linux filesystem sandboxing and Windows process isolation for script execution.
  • Typestate sessions: Compile-time session lifecycle — `Session<Created>` → `Session<Active>` → `Session<Closed>`.
  • Delegation scoring engine: `score_agent_fit()`, `composite_fit_ratio()`, and `utility_margin_for_delegation()` for decomposition decisions.
  • Signal adapter critical fixes: Replaced `std::sync::Mutex` in async context, added rate limiting, bounded buffer growth.
FEATModel categorization (Phase 1): `TaskCategory` enum (10 types), `classify_task()`, benchmark profiles for 29 models, `CategoryQualityTracker`, and `category_fit` metascore dimension (0.15 weight).
FEATSkill authoring API: `POST /api/skills/author` for creating, validating, and publishing Markdown instruction skills with safety scanning.
FEATLandlock confinement: Linux filesystem sandboxing via `landlock` crate for script execution. Windows Job Object isolation.
FEATTypestate session lifecycle: `Session<Created>`, `Session<Active>`, `Session<Closed>` compile-time state machine in `roboticus-db`.
FEATDelegation scoring engine: `score_agent_fit()`, `composite_fit_ratio()`, `utility_margin_for_delegation()` for principled decomposition gate decisions.
FEATOpenAPI spec endpoint: `GET /openapi.json` serves OpenAPI 3.1 spec; `GET /docs` provides spec access for Swagger UI viewers.
FEATCodex CLI plugin: Delegate coding tasks to OpenAI Codex CLI with structured JSON output and approval mode support.
FEATDead letter alerting: Atomic counter with configurable threshold, error-level logging, and `GET /api/stats/delivery` endpoint.
FIXSignal adapter async/sync mutex (Critical): Replaced `std::sync::Mutex<VecDeque>` with bounded `tokio::sync::mpsc` channel to eliminate runtime thread blocking.
FIXSignal adapter rate limiting: Added `governor::RateLimiter` (5 req/s default) to prevent signal-cli daemon DoS.
FIXDelivery queue error detection: `is_permanent_error()` now extracts HTTP status codes first (429 = transient, other 4xx = permanent, 5xx = transient).
FIXPlugin catalog: Registry manifest now includes `plugins` section; empty catalog shows helpful message instead of hard error.
CHOREFormatter single-pass optimization: Replaced 3-allocation `strip→clean→collapse` chain with single-pass `clean_content()` across all formatters.
CHOREMetascore weight redistribution: Adjusted routing weights to accommodate new `category_fit` dimension (0.15).
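
The typestate session lifecycle listed in this release (`Session<Created>` → `Session<Active>` → `Session<Closed>`) can be illustrated with a minimal sketch. Only the three state names come from the entry above; the fields `id` and `turns` and the methods `activate`, `record_turn`, and `close` are assumptions made for illustration and are not taken from `roboticus-db`.

```rust
use std::marker::PhantomData;

// Zero-sized marker types, one per lifecycle state (names from the changelog).
pub struct Created;
pub struct Active;
pub struct Closed;

/// A session whose lifecycle state lives in the type parameter.
/// The fields are hypothetical; the real struct may look different.
pub struct Session<State> {
    id: String,
    turns: u32,
    _state: PhantomData<State>,
}

impl<State> Session<State> {
    /// The identifier is readable in every state.
    pub fn id(&self) -> &str {
        &self.id
    }

    /// Number of turns recorded while the session was active.
    pub fn turns(&self) -> u32 {
        self.turns
    }
}

impl Session<Created> {
    pub fn new(id: &str) -> Self {
        Session { id: id.to_string(), turns: 0, _state: PhantomData }
    }

    /// Consumes the created session, so it cannot be activated twice.
    pub fn activate(self) -> Session<Active> {
        Session { id: self.id, turns: self.turns, _state: PhantomData }
    }
}

impl Session<Active> {
    /// Only an active session exposes this method.
    pub fn record_turn(&mut self) {
        self.turns += 1;
    }

    /// Consumes the active session; no further turns can be recorded.
    pub fn close(self) -> Session<Closed> {
        Session { id: self.id, turns: self.turns, _state: PhantomData }
    }
}
```

Because each transition takes `self` by value, calling `record_turn()` on a `Session<Created>` or a `Session<Closed>` is a compile error rather than a runtime check.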

v0.9.9

Terminal UX & Release Hardening (8 changes) 2026-03-18

Added: 6 changes. Changed: 1 change. Fixed: 1 change. Key changes: New `roboticus tui` terminal application, configurable context budget tiers, integrations management endpoints/CLI, tool output noise filtering, and dashboard configuration/routing UX improvements.

Highlights

  • Terminal UI: Added `roboticus tui` (`roboticus-tui` crate) with chat, logs, status bar, streaming responses, and session resume.
  • Context budget tuning: Added configurable L0-L3 token budgets and per-channel minimum complexity level controls.
  • Integrations management: Added `POST /api/channels/{platform}/test`, dashboard per-channel probes, and `roboticus integrations` CLI commands.
  • Tool output filter chain: Added ANSI/progress/duplicate/whitespace filtering before LLM observation to reduce token noise.
  • Dashboard and routing polish: Exposed unconfigured sections with enable actions and improved routing profile validation/toasts/defaults.
  • Default bind address: Changed defaults from `127.0.0.1` to `localhost` for safer local consistency.
FEATTerminal user interface (`roboticus tui`): New `roboticus-tui` crate with chat/log/status UX, streaming responses, and session create/resume support.
FEATContext budget tuning: Configurable `[context_budget]` tiers with dashboard sliders and per-channel minimum complexity level.
FEATIntegrations management: Added channel probe endpoint (`POST /api/channels/{platform}/test`), dashboard integrations panel controls, and `roboticus integrations` CLI group.
FEATTool output noise filter: Introduced `ToolOutputFilterChain` with ANSI strip, progress-line filtering, duplicate-line dedupe, and whitespace normalization.
FEATDashboard config exposure: Unconfigured sections/channels now render in the dashboard with explicit enable actions.
FEATRouting profile polish: Added a validation warning for weights above 1.0, a toast when a profile is applied, and display of the default profile when none is set.
CHOREDefault bind address: Switched defaults/docs from `127.0.0.1` to `localhost` (loopback literals retained where RFC-required).
FIXWindows script-runner tests: Added `#[cfg(unix)]` guard around Unix-only permissions test module to prevent Windows compile failures.
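
The tool output noise filter in this release names four stages: ANSI strip, progress-line filtering, duplicate-line dedupe, and whitespace normalization. The sketch below shows one way such a chain could be composed using only the standard library. The stage functions, their heuristics (for example, treating an interior carriage return as a progress redraw), and the `standard()`/`apply()` methods are assumptions for illustration; only the name `ToolOutputFilterChain` appears in the changelog.

```rust
/// One stage of the chain: takes tool output and returns a cleaned copy.
type Filter = fn(&str) -> String;

/// Hypothetical sketch of a filter chain; not the shipped implementation.
pub struct ToolOutputFilterChain {
    filters: Vec<Filter>,
}

/// Removes CSI escape sequences such as "\x1b[32m" (colors, cursor moves).
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes up to and including the final byte.
            for n in chars.by_ref() {
                if ('@'..='~').contains(&n) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Assumed heuristic: a line redrawn with '\r' keeps only its final state.
fn collapse_progress(input: &str) -> String {
    input
        .lines()
        .map(|line| line.rsplit('\r').next().unwrap_or(line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Trims trailing whitespace and drops blank lines.
fn normalize_whitespace(input: &str) -> String {
    input
        .lines()
        .map(str::trim_end)
        .filter(|line| !line.is_empty())
        .collect::<Vec<_>>()
        .join("\n")
}

/// Drops a line when it repeats the line directly before it.
fn dedupe_adjacent(input: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    for line in input.lines() {
        if out.last() != Some(&line) {
            out.push(line);
        }
    }
    out.join("\n")
}

impl ToolOutputFilterChain {
    /// Builds the chain in an assumed order; the real order is not documented here.
    pub fn standard() -> Self {
        let filters: [Filter; 4] =
            [strip_ansi, collapse_progress, normalize_whitespace, dedupe_adjacent];
        Self { filters: filters.to_vec() }
    }

    /// Runs every stage in sequence over the raw output.
    pub fn apply(&self, raw: &str) -> String {
        self.filters.iter().fold(raw.to_string(), |text, f| f(&text))
    }
}
```

Running the stages in a fixed order matters in this sketch: whitespace is normalized before deduplication so that lines differing only in trailing spaces collapse into one.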

v0.9.8

Platform Refactor & Reliability (22 changes) 2026-03-16

Added: 7 changes. Fixed: 10 changes. Changed: 5 changes. Key changes: server crate split (`roboticus-cli`/`roboticus-api`/slim server), unified error model, channel adapter helper extraction, model categorization/router spec, and broad hardening around SQL safety, session integrity, config generation, and silent error triage.

Highlights

  • Server crate split: Decomposed the large server crate into `roboticus-cli`, `roboticus-api`, and a slim `roboticus` runtime bootstrap.
  • Error type unification: Consolidated into a nested `RoboticusError` hierarchy with clean `From` conversions.
  • Channel adapter helper extraction: Added shared formatter/chunking/allowlist helpers across channel adapters.
  • Model categorization spec: Added 10-category task taxonomy with routing/orchestrator integration points.
  • SQL safety + session integrity: Hardened `drop_column` identifier handling and fixed session `find_or_create` error swallowing.
  • Config and platform reliability: Fixed Windows TOML path escaping and converted silent failures into explicit logging/warnings.
FEATServer crate split: Split `roboticus-server` into `roboticus-cli`, `roboticus-api`, and slim `roboticus` runtime entrypoint.
FEATError type unification: Introduced nested `RoboticusError` hierarchy with `thiserror` + `From` conversion coverage.
FEATChannel adapter helpers: Added shared `ChannelFormatter::format()`, `chunk_message()`, and allowlist checks.
FEATModel categorization spec: Added 10-category task taxonomy and integration points for routing/orchestrator flows.
FEAT`--json` listing support: Added structured JSON output to all listing-style CLI commands.
FEAT`/health` alias and `/dashboard` redirect: Added convenience endpoint routing for operators.
FEATConfig backup management: Moved backups to `./backups/` with retention by count and age.
FIXSQL injection hardening in `drop_column`: Enforced identifier validation + safe quoting.
FIXSession `find_or_create`: Stopped swallowing real DB failures by removing `.ok()`-based fallback behavior.
FIXKeystore refresh observability: Converted silent refresh failures into explicit `tracing::warn` logs.
FIXWindows TOML path escaping: Added path normalization for generated config values on Windows.
FIXSilent error triage: Implemented explicit logging/warning tiers for previously silent failure paths.
CHOREDependency cleanup: Removed 19 unused dependencies left over from crate split transitions.
CHORE`Wallet::test_mock()` feature gate: Restricted to `test-support` feature and excluded from production artifacts.
CHOREDead code and duplicate test utilities cleanup: Consolidated shared helpers and removed stale paths.
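
The `drop_column` hardening in this release is described as identifier validation plus safe quoting. A minimal sketch of that combination follows. The function names `quote_identifier` and `drop_column_sql`, the `InvalidIdentifier` error type, and the exact validation rule (ASCII letters, digits, and underscores, not starting with a digit) are assumptions for illustration, not the project's actual code.

```rust
/// Error returned when a name is not accepted as an SQL identifier.
#[derive(Debug, PartialEq)]
pub struct InvalidIdentifier(pub String);

/// Validates a table or column name, then wraps it in double quotes.
/// The accepted character set is an assumption made for this sketch.
pub fn quote_identifier(name: &str) -> Result<String, InvalidIdentifier> {
    let mut chars = name.chars();
    // First character: ASCII letter or underscore (this also rejects "").
    let starts_ok = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
    // Remaining characters: ASCII letters, digits, or underscores.
    let rest_ok = chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    if !starts_ok || !rest_ok {
        return Err(InvalidIdentifier(name.to_string()));
    }
    // Validation already excludes quotes; doubling them anyway keeps the
    // quoting step safe even if the validation rule is later relaxed.
    Ok(format!("\"{}\"", name.replace('"', "\"\"")))
}

/// Builds the DDL statement only from validated, quoted identifiers.
pub fn drop_column_sql(table: &str, column: &str) -> Result<String, InvalidIdentifier> {
    Ok(format!(
        "ALTER TABLE {} DROP COLUMN {}",
        quote_identifier(table)?,
        quote_identifier(column)?
    ))
}
```

Identifiers cannot be bound as query parameters, which is why a sketch like this validates and quotes them before they are interpolated into the statement text.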

v0.9.7

Bug Fixes & Stability (26 changes) 2026-03-14

Added: 8 changes. Fixed: 18 changes. Key changes: DB fitness hardening (DF-1–DF-18): 18-item SQLite performance audit resolved — retention pruning for 5 high-growth tables, orphan cleanup sweeps (working memory + embeddings), `auto_vacuum=INCREMENTAL`, 6 missing indexes, episodic dead-entry pruning, cache NULL-expiry fix, `PRAGMA synchronous=NORMAL` under WAL, CHECK constraints on 11 columns, and dead `proxy_stats` table removal. Memory hygiene mechanic: `roboticus mechanic` detects and (with `--repair`) purges contaminated memory entries using 7 deterministic LIKE-prefix patterns across 3 tiers, with JSON-structured findings. Circuit breaker window reset: `record_failure()` now tracks `window_start` for rolling-window accumulation — failures spaced ~60s apart correctly accumulate instead of resetting. Embedding auth for local providers: `EmbeddingConfig.is_local` skips API key resolution and auth headers for Ollama/llama.cpp.

Highlights

  • DB fitness hardening (DF-1–DF-18): 18-item SQLite performance audit resolved — retention pruning for 5 high-growth tables, orphan cleanup sweeps (working memory + embeddings), `auto_vacuum=INCREMENTAL`, 6 missing indexes, episodic dead-entry pruning, cache NULL-expiry fix, `PRAGMA synchronous=NORMAL` under WAL, CHECK constraints on 11 columns, and dead `proxy_stats` table removal.
  • Memory hygiene mechanic: `roboticus mechanic` detects and (with `--repair`) purges contaminated memory entries using 7 deterministic LIKE-prefix patterns across 3 tiers, with JSON-structured findings.
  • Sandbox boundary management: Filesystem confinement for skill scripts (skills_dir + `$ROBOTICUS_WORKSPACE`, no traversal/symlink escape), configurable network isolation (`unshare(CLONE_NEWNET)` on Linux), memory ceiling via `RLIMIT_AS`, interpreter allowlist via absolute-path resolution, and mechanic sandbox health reporting.
  • Filesystem security overhaul: `FilesystemSecurityConfig` with `workspace_only` mode, ~25 default protected path patterns, `tool_allowed_paths` whitelist (auto-populated from Obsidian vault path), macOS `sandbox-exec` write-denial confinement, and dashboard UI toggles.
  • Unified pipeline architecture: `IntentRegistry` (22-variant `Intent` enum), `GuardChain` (12 guards with `full()`/`cached()`/`streaming()` presets), `ShortcutDispatcher` (15 handlers replacing 983-line god function), `PipelineConfig` (4 presets: `api`/`streaming`/`channel`/`cron`), and `DedupGuard` RAII replacing 11 manual release patterns. Net ~653 lines removed.
  • ChannelFormatter trait: Per-platform output formatting with static dispatch registry — `TelegramFormatter` (Markdown→MarkdownV2), `DiscordFormatter`, `WhatsAppFormatter`, `SignalFormatter`, `WebFormatter`, `EmailFormatter` — wired into `channel_message.rs` delivery path. 31 unit tests.
  • Configurable inference timeouts: Per-provider `timeout_seconds` setting (`[providers.*.timeout_seconds]`) with 300-second default, surfaced in dashboard provider configuration.
  • Dashboard session ID copy button: One-click copy-to-clipboard for session IDs in the Sessions panel.
FEATDB fitness hardening (DF-1–DF-18): 18-item SQLite performance audit resolved — retention pruning for 5 high-growth tables, orphan cleanup sweeps (working memory + embeddings), `auto_vacuum=INCREMENTAL`, 6 missing indexes, episodic dead-entry pruning, cache NULL-expiry fix, `PRAGMA synchronous=NORMAL` under WAL, CHECK constraints on 11 columns, and dead `proxy_stats` table removal.
FEATMemory hygiene mechanic: `roboticus mechanic` detects and (with `--repair`) purges contaminated memory entries using 7 deterministic LIKE-prefix patterns across 3 tiers, with JSON-structured findings.
FEATSandbox boundary management: Filesystem confinement for skill scripts (skills_dir + `$ROBOTICUS_WORKSPACE`, no traversal/symlink escape), configurable network isolation (`unshare(CLONE_NEWNET)` on Linux), memory ceiling via `RLIMIT_AS`, interpreter allowlist via absolute-path resolution, and mechanic sandbox health reporting.
FEATFilesystem security overhaul: `FilesystemSecurityConfig` with `workspace_only` mode, ~25 default protected path patterns, `tool_allowed_paths` whitelist (auto-populated from Obsidian vault path), macOS `sandbox-exec` write-denial confinement, and dashboard UI toggles.
FEATUnified pipeline architecture: `IntentRegistry` (22-variant `Intent` enum), `GuardChain` (12 guards with `full()`/`cached()`/`streaming()` presets), `ShortcutDispatcher` (15 handlers replacing 983-line god function), `PipelineConfig` (4 presets: `api`/`streaming`/`channel`/`cron`), and `DedupGuard` RAII replacing 11 manual release patterns. Net ~653 lines removed.
FEATChannelFormatter trait: Per-platform output formatting with static dispatch registry — `TelegramFormatter` (Markdown→MarkdownV2), `DiscordFormatter`, `WhatsAppFormatter`, `SignalFormatter`, `WebFormatter`, `EmailFormatter` — wired into `channel_message.rs` delivery path. 31 unit tests.
FEATConfigurable inference timeouts: Per-provider `timeout_seconds` setting (`[providers.*.timeout_seconds]`) with 300-second default, surfaced in dashboard provider configuration.
FEATDashboard session ID copy button: One-click copy-to-clipboard for session IDs in the Sessions panel.
FIXCircuit breaker window reset: `record_failure()` now tracks `window_start` for rolling-window accumulation — failures spaced ~60s apart correctly accumulate instead of resetting.
FIXEmbedding auth for local providers: `EmbeddingConfig.is_local` skips API key resolution and auth headers for Ollama/llama.cpp.
FIXCron `schedule_kind: "once"` support: Runtime maps "once" → "at" dispatch, calls `DurableScheduler::evaluate_at()`, auto-disables after single execution.
FIXVault path whitelisting: `tool_allowed_paths` auto-populated from `obsidian.vault_path` during config normalization — workspace-only mode no longer blocks configured external paths.
FIXFleet activity chart capacity model: Stacked area normalizes per-agent scores by `1/agentCount` with `fixedMax: 1.0`.
FIXCache guard parity: `cached()` guard set now includes `SubagentClaim` + `LiteraryQuoteRetry` (previously missing).
FIXExecutionTruthGuard: Tool-results bypass bug removed.
FIXCollapsible if lint: Updated `impl_core.rs` to use `if let` chain (edition 2024).
FIXWallet RPC rate-limit backoff: `get_all_balances()` detects rate-limit error codes (`-32016`, `-32005`, `429`) and stops iterating remaining tokens instead of repeatedly hitting the provider.
FIXCron once-type orphan jobs: Jobs with `schedule_kind: "once"` and no `schedule_expr` are now auto-disabled on first encounter instead of emitting a warning every 60s.
FIXDashboard sidebar footer: Navigation bar footer now stays pinned to the bottom of the viewport (added `height: 100%` to sidebar container).
FIXDashboard custom model Add button: Custom model text input row now has its own Add button; both Add buttons use a shared class selector.
FIXTelegram double-underscore italic: `__text__` was incorrectly emitted as Telegram underline instead of italic; the formatter now maps it to `_text_`.
FIXConfig hot-reload path divergence: `normalize_paths()` and `merge_bundled_providers()` were skipped during hot-reload — reloaded configs now match boot-time normalization.
FIXRouting audit fixes: Fixed the attempt counter not incrementing on retry, `u32` truncation on cost metrics, and misleading wording in the timeout error message.
FIXDashboard UI stall during inference: 4 `RwLock` guard-scope fixes release locks before async I/O, preventing cascading reader starvation.
FIXCron semaphore hot-reload race: Fixed a race in which the semaphore was not released when the cron runtime reloaded config, causing phantom permit exhaustion. Also removed a dead `LlmService` method and consolidated locks in admin routes.
FIXAgent audit fixes: Corrected a tautological always-true test condition, fixed a timeout hint parsing edge case, and removed an unreachable branch.
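
The circuit breaker fix in this release says `record_failure()` now tracks `window_start` so that failures spaced about 60 seconds apart accumulate. The sketch below shows one way that behavior could work. Passing `now` as a parameter, the `threshold` and `window` fields, and the `is_open()` method are assumptions chosen to keep the example deterministic; they are not the project's actual signatures.

```rust
use std::time::{Duration, Instant};

/// Hypothetical breaker that counts failures inside a window anchored at
/// the first failure, rather than at the most recent one.
pub struct CircuitBreaker {
    threshold: u32,
    window: Duration,
    failures: u32,
    window_start: Option<Instant>,
}

impl CircuitBreaker {
    pub fn new(threshold: u32, window: Duration) -> Self {
        Self { threshold, window, failures: 0, window_start: None }
    }

    /// Records a failure observed at `now`; returns true once the breaker is open.
    pub fn record_failure(&mut self, now: Instant) -> bool {
        match self.window_start {
            // Still inside the current window: keep accumulating.
            Some(start) if now.duration_since(start) <= self.window => self.failures += 1,
            // No window yet, or the old one expired: start a new one here.
            _ => {
                self.window_start = Some(now);
                self.failures = 1;
            }
        }
        self.is_open()
    }

    /// The breaker is open when the failure count reaches the threshold.
    pub fn is_open(&self) -> bool {
        self.failures >= self.threshold
    }
}
```

With a 300-second window and a threshold of 3, three failures at 0 s, 60 s, and 120 s open the breaker in this sketch, while a failure that arrives after the window has expired starts a fresh count.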

v0.9.6

New Features (14 changes) 2026-03-12

Added: 14 changes. Key changes: Compliance-first self-funding control plane: Complete revenue opportunity lifecycle (intake → qualify → score → plan → fulfill → settle) with DB-backed restart safety, strategy-level scoring (confidence/effort/risk/priority/recommendation), feedback persistence per opportunity and summary by strategy, configurable post-settlement asset routing (default `PALM_USD`), EVM swap submission with tx-hash tracking and on-chain receipt reconciliation, tax payout lifecycle mirroring swap tasks, and operator-visible accounting (net profit, attributable costs, retained earnings, tax allocation) across API, CLI, and mechanic surfaces. Revenue mechanic integration: Mechanic can probe, reconcile, and repair orphaned or stale revenue jobs and swap/tax reconciliation mismatches via `run_gateway_provider_and_revenue_checks` and `run_gateway_integrated_repair_sweep`.

Highlights

  • Compliance-first self-funding control plane: Complete revenue opportunity lifecycle (intake → qualify → score → plan → fulfill → settle) with DB-backed restart safety, strategy-level scoring (confidence/effort/risk/priority/recommendation), feedback persistence per opportunity and summary by strategy, configurable post-settlement asset routing (default `PALM_USD`), EVM swap submission with tx-hash tracking and on-chain receipt reconciliation, tax payout lifecycle mirroring swap tasks, and operator-visible accounting (net profit, attributable costs, retained earnings, tax allocation) across API, CLI, and mechanic surfaces.
  • Revenue mechanic integration: Mechanic can probe, reconcile, and repair orphaned or stale revenue jobs and swap/tax reconciliation mismatches via `run_gateway_provider_and_revenue_checks` and `run_gateway_integrated_repair_sweep`.
  • Skills catalog: `PluginCatalog` with CLI flows (`roboticus skills catalog list/install/activate`) and API endpoints (`GET/POST /api/skills/catalog`, `/install`, `/activate`). Registry manifest fetch from remote URL.
  • Skill registry protocol: Migration 022 adds `version`, `author`, `registry_source` columns to skills table. Multi-registry support via `RegistrySource { name, url, priority, enabled }` with backward-compatible fallback from legacy single-URL `registry_url`.
  • Multi-registry fetch: Registry sync iterates all configured sources, namespaces skills as `{registry_name}/{skill_name}` for non-local sources, applies semver comparison to skip redundant downloads, and resolves conflicts by registry priority.
  • Learning loop closure: Agent now detects repeating multi-step tool sequences on session close and synthesizes reusable SKILL.md procedure files. `learned_skills` table (migration 021) tracks reinforcement history (success/failure counts, priority). `LearningConfig` exposes tuneable thresholds for minimum sequence length, success ratio, priority boost/decay, and skill cap. Inspired by recent work on autonomous tool-use learning in LLM agents ([arXiv:2603.05344](https://arxiv.org/abs/2603.05344)).
  • Procedural failure recording: `record_procedural_failure()` (previously dead code in the DB layer) is now called from `ingest_turn()` when tool results indicate failure, closing the procedural memory feedback loop.
  • Skill priority adjustment: Governor `tick()` now runs `adjust_learned_skill_priorities()` after episodic decay — learned skills with high success ratios get priority boosts; those with poor ratios get decayed.
FEATCompliance-first self-funding control plane: Complete revenue opportunity lifecycle (intake → qualify → score → plan → fulfill → settle) with DB-backed restart safety, strategy-level scoring (confidence/effort/risk/priority/recommendation), feedback persistence per opportunity and summary by strategy, configurable post-settlement asset routing (default `PALM_USD`), EVM swap submission with tx-hash tracking and on-chain receipt reconciliation, tax payout lifecycle mirroring swap tasks, and operator-visible accounting (net profit, attributable costs, retained earnings, tax allocation) across API, CLI, and mechanic surfaces.
FEATRevenue mechanic integration: Mechanic can probe, reconcile, and repair orphaned or stale revenue jobs and swap/tax reconciliation mismatches via `run_gateway_provider_and_revenue_checks` and `run_gateway_integrated_repair_sweep`.
FEATSkills catalog: `PluginCatalog` with CLI flows (`roboticus skills catalog list/install/activate`) and API endpoints (`GET/POST /api/skills/catalog`, `/install`, `/activate`). Registry manifest fetch from remote URL.
FEATSkill registry protocol: Migration 022 adds `version`, `author`, `registry_source` columns to skills table. Multi-registry support via `RegistrySource { name, url, priority, enabled }` with backward-compatible fallback from legacy single-URL `registry_url`.
FEATMulti-registry fetch: Registry sync iterates all configured sources, namespaces skills as `{registry_name}/{skill_name}` for non-local sources, applies semver comparison to skip redundant downloads, and resolves conflicts by registry priority.
FEATLearning loop closure: Agent now detects repeating multi-step tool sequences on session close and synthesizes reusable SKILL.md procedure files. `learned_skills` table (migration 021) tracks reinforcement history (success/failure counts, priority). `LearningConfig` exposes tuneable thresholds for minimum sequence length, success ratio, priority boost/decay, and skill cap. Inspired by recent work on autonomous tool-use learning in LLM agents ([arXiv:2603.05344](https://arxiv.org/abs/2603.05344)).
FEATProcedural failure recording: `record_procedural_failure()` (previously dead code in the DB layer) is now called from `ingest_turn()` when tool results indicate failure, closing the procedural memory feedback loop.
FEATSkill priority adjustment: Governor `tick()` now runs `adjust_learned_skill_priorities()` after episodic decay — learned skills with high success ratios get priority boosts; those with poor ratios get decayed.
FEATSkill subdirectory loading: `SkillLoader` now recurses into `learned/` subdirectory, loading machine-synthesized skills alongside hand-authored ones.
FEATProgressive context compaction: 5-stage compaction (`Trim` → `Summarize` → `Archive` → `Evict` → `Emergency`) in `compact_before_archive()` with `CompactionStage::from_excess()` selector.
FEATDecay-weighted episodic retrieval: `rerank_episodic_by_decay()` applies time-based decay at retrieval time, preventing stale context from dominating memory budget.
FEATInstruction anti-fade micro-reminders: Event-driven system prompt reinforcement at agent decision points to combat instruction-following drift.
FEATx402 autonomous payment: LLM HTTP client now handles `402 Payment Required` responses with autonomous on-chain payment and request retry.
FEATHomebrew & Winget packaging: `release.yml` contains complete `update-homebrew` (SHA256 extraction, formula generation, tap push) and `update-winget` (`vedantmgoyal9/winget-releaser@v2`) jobs. Activation requires tap repo creation and secrets provisioning.
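
The decay-weighted episodic retrieval entry describes applying time-based decay at retrieval time so that stale context does not dominate the memory budget. The changelog does not state the decay function, so the sketch below assumes an exponential half-life; the `EpisodicHit` struct, its fields, and the function names `decayed_score` and `rerank_by_decay` are likewise hypothetical and are not the project's `rerank_episodic_by_decay()`.

```rust
/// Hypothetical retrieval candidate: a relevance score plus the age of the memory.
pub struct EpisodicHit {
    pub id: u32,
    pub relevance: f64,
    pub age_hours: f64,
}

/// Assumed decay model: the score halves every `half_life_hours`.
pub fn decayed_score(hit: &EpisodicHit, half_life_hours: f64) -> f64 {
    hit.relevance * 0.5_f64.powf(hit.age_hours / half_life_hours)
}

/// Sorts candidates by decayed score, highest first.
pub fn rerank_by_decay(mut hits: Vec<EpisodicHit>, half_life_hours: f64) -> Vec<EpisodicHit> {
    hits.sort_by(|a, b| {
        decayed_score(b, half_life_hours).total_cmp(&decayed_score(a, half_life_hours))
    });
    hits
}
```

Under this assumed model with a 24-hour half-life, a three-day-old memory with relevance 0.9 scores 0.9 × 0.5³ = 0.1125 and ranks below an hour-old memory with relevance 0.6.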

v0.9.5

Improvements & More 2026-03-06

Changed: 8 changes. Fixed: 5 changes. Note: 1 change. Key changes: Terminology normalization: `soul_text` → `os_text`, `soul_history` → `os_personality_history` (migration 020) for firmware/OS terminology coherency. Behavior soak hardening: `scripts/run-agent-behavior-soak.py` now includes regression checks for filesystem capability truthfulness, subagent capability response quality, and affirmative continuation quality, with rubric updates to score substantive outcomes over brittle phrase matching. Internal protocol fallback leakage: response sanitization no longer surfaces protocol-placeholder fallback text; empty/degraded sanitized content now resolves through deterministic user-facing quality fallback. Markdown count execution reliability: execution shortcut path now handles recursive markdown-file count prompts deterministically, including strict numeric-only responses when requested (`count only` / `only the number` style prompts).

Highlights

  • Terminology normalization: `soul_text` → `os_text`, `soul_history` → `os_personality_history` (migration 020) for firmware/OS terminology coherency.
  • Behavior soak hardening: `scripts/run-agent-behavior-soak.py` now includes regression checks for filesystem capability truthfulness, subagent capability response quality, and affirmative continuation quality, with rubric updates to score substantive outcomes over brittle phrase matching.
  • Roadmap/release traceability: `docs/releases/v0.9.5.md` and `docs/ROADMAP.md` updated with current v0.9.5 prep status for speculative execution, browser runtime support, CLI skill roadmap slice, and behavior continuity validation.
  • Architecture documentation: Added explicit v0.9.5-prep control/dataflow coverage for deterministic execution shortcuts and guarded response sanitization in `docs/architecture/roboticus-dataflow.md` and `docs/architecture/roboticus-sequences.md`.
  • Browser runtime continuity: Browser action execution now attempts a single stop/start session recovery when CDP disconnect/closed-socket errors are detected, limited to idempotent actions to avoid duplicate side effects on replay.
  • Autonomy turn-budget controls: Added configurable agent-level ReAct budget controls (`autonomy_max_react_turns`, `autonomy_max_turn_duration_seconds`) and wired enforcement into the runtime loop.
  • CLI adapter response contract: `run_script` now emits stable typed metadata (`adapter`, `schema_version`, `status`, `error_class`) and normalized script error classes for downstream handling.
  • Speculative policy invariants: Added explicit test coverage enforcing Safe-only speculative eligibility (Caution/Dangerous/Forbidden remain excluded from speculative execution).
CHORETerminology normalization: `soul_text` → `os_text`, `soul_history` → `os_personality_history` (migration 020) for firmware/OS terminology coherency.
CHOREBehavior soak hardening: `scripts/run-agent-behavior-soak.py` now includes regression checks for filesystem capability truthfulness, subagent capability response quality, and affirmative continuation quality, with rubric updates to score substantive outcomes over brittle phrase matching.
CHORERoadmap/release traceability: `docs/releases/v0.9.5.md` and `docs/ROADMAP.md` updated with current v0.9.5 prep status for speculative execution, browser runtime support, CLI skill roadmap slice, and behavior continuity validation.
CHOREArchitecture documentation: Added explicit v0.9.5-prep control/dataflow coverage for deterministic execution shortcuts and guarded response sanitization in `docs/architecture/roboticus-dataflow.md` and `docs/architecture/roboticus-sequences.md`.
CHOREBrowser runtime continuity: Browser action execution now attempts a single stop/start session recovery when CDP disconnect/closed-socket errors are detected, limited to idempotent actions to avoid duplicate side effects on replay.
CHOREAutonomy turn-budget controls: Added configurable agent-level ReAct budget controls (`autonomy_max_react_turns`, `autonomy_max_turn_duration_seconds`) and wired enforcement into the runtime loop.
CHORECLI adapter response contract: `run_script` now emits stable typed metadata (`adapter`, `schema_version`, `status`, `error_class`) and normalized script error classes for downstream handling.
CHORESpeculative policy invariants: Added explicit test coverage enforcing Safe-only speculative eligibility (Caution/Dangerous/Forbidden remain excluded from speculative execution).
FIXInternal protocol fallback leakage: response sanitization no longer surfaces protocol-placeholder fallback text; empty/degraded sanitized content now resolves through deterministic user-facing quality fallback.
FIXMarkdown count execution reliability: execution shortcut path now handles recursive markdown-file count prompts deterministically, including strict numeric-only responses when requested (`count only` / `only the number` style prompts).
FIXDelegation shortcut boundary: markdown-count shortcut no longer hijacks explicitly delegated prompts, preserving delegation intent handling.
FIXSpeculative branch cleanup safety: introduced RAII speculation slot guards and abort-path tests to guarantee no slot leakage when speculative tasks are canceled.
FIXCLI skill sandbox isolation coverage: added explicit tests that secret env vars are stripped while only allowlisted runtime vars are propagated under `skills.sandbox_env=true`.
CHORE$50 seed exercise deferred: The revenue infrastructure is proven via integration tests and the seed exercise plan is authored (`docs/releases/v0.9.6-seed-exercise.md`), but the exercise itself is deferred — the economic ecosystem for autonomous bot services is still nascent. The rails are in place; the market is not.
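
The speculative branch cleanup fix in this release mentions RAII speculation slot guards that prevent slot leakage when speculative tasks are canceled. The sketch below illustrates the general pattern with the standard library only. The type names `SpeculationSlots` and `SlotGuard` and the methods `try_acquire` and `in_use` are assumptions for illustration, not the project's actual types.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical pool of speculation slots with a fixed capacity.
pub struct SpeculationSlots {
    in_use: Arc<AtomicUsize>,
    capacity: usize,
}

/// Holding a guard means holding one slot; dropping it gives the slot back.
pub struct SlotGuard {
    in_use: Arc<AtomicUsize>,
}

impl SpeculationSlots {
    pub fn new(capacity: usize) -> Self {
        Self { in_use: Arc::new(AtomicUsize::new(0)), capacity }
    }

    /// Claims a slot if one is free, using a compare-and-swap retry loop.
    pub fn try_acquire(&self) -> Option<SlotGuard> {
        let mut current = self.in_use.load(Ordering::Acquire);
        loop {
            if current >= self.capacity {
                return None;
            }
            match self.in_use.compare_exchange(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(SlotGuard { in_use: Arc::clone(&self.in_use) }),
                // Another thread changed the count; retry with the fresh value.
                Err(actual) => current = actual,
            }
        }
    }

    /// Number of slots currently held.
    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Acquire)
    }
}

impl Drop for SlotGuard {
    /// Runs on every exit path, including early return, panic unwinding,
    /// and a canceled task dropping its state.
    fn drop(&mut self) {
        self.in_use.fetch_sub(1, Ordering::AcqRel);
    }
}
```

Tying the release to `Drop` means an aborted speculative task cannot forget to free its slot, which is the property the abort-path tests in this release are described as checking.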

v0.9.4+hotfix.1

Improvements & More 2026-03-05

Added: 3 changes. Changed: 7 changes. Fixed: 2 changes. Security: 1 change. Key changes: Routing observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching. Model shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path). Agent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics. Routing dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.

Highlights

  • Routing observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching.
  • Model shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path).
  • Routing profile roadmap spec: Added `docs/roadmap/0.9.4/features/user-routing-profile-spider-graph.md` and linked roadmap entry.
  • Agent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics.
  • Routing dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.
  • Routing eval validation: `POST /api/models/routing-eval` now validates `cost_weight`, `accuracy_floor`, and `accuracy_min_obs` bounds.
  • Config defaults/tests: routing defaults now use `metascore`; legacy `heuristic` input is accepted and normalized to `metascore` during validation.
  • Cache integrity mode for live agent path: semantic near-match cache reuse is now disabled in the inference pipeline (`lookup_strict`: exact + tool-TTL only) to prevent instruction-mismatched cached responses.
FEATRouting observability UX: Metrics dashboard now includes an explorable model-decision graph and a routing-profile spider graph (correctness/cost/speed) with runtime apply support via safe config patching.
FEATModel shift telemetry: Non-streaming inference pipeline now emits websocket `model_shift` events when execution model differs from selected model (fallback or cache continuity path).
FEATRouting profile roadmap spec: Added `docs/roadmap/0.9.4/features/user-routing-profile-spider-graph.md` and linked roadmap entry.
CHOREAgent message contract: `/api/agent/message` responses now expose both routing-time and execution-time model fields (`selected_model`, `model`, `model_shift_from`) for continuity diagnostics.
CHORERouting dataset privacy default: `GET /api/models/routing-dataset` now redacts `user_excerpt` by default; explicit opt-in is required to include excerpts.
CHORERouting eval validation: `POST /api/models/routing-eval` now validates `cost_weight`, `accuracy_floor`, and `accuracy_min_obs` bounds.
CHOREConfig defaults/tests: routing defaults now use `metascore`; legacy `heuristic` input is accepted and normalized to `metascore` during validation.
CHORECache integrity mode for live agent path: semantic near-match cache reuse is now disabled in the inference pipeline (`lookup_strict`: exact + tool-TTL only) to prevent instruction-mismatched cached responses.
CHOREPath normalization parity: runtime `PUT /api/config` updates now apply the same tilde (`~`) path expansion as TOML load (`normalize_paths`), including multimodal, device, and knowledge source path fields.
CHOREExplicit config path behavior: `resolve_config_path(Some("~/..."))` now expands to the user home directory instead of preserving a literal `~`.
FIXLive startup migration deadlock on legacy DBs: database initialization/migration order no longer fails on `inference_costs.turn_id` index creation when the column is absent in legacy state.
FIXMigration 13 idempotency: routing v0.9.4 migration path now handles pre-existing `turn_id`/routing columns without `duplicate column` failures.
FIXStrict deny-by-default channels: adapters now reject traffic when allowlists are empty (`deny_on_empty=true`). Alpha update/mechanic flows are expected to repair channel allowlists during upgrade/install.
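
The strict deny-by-default change in this release means an empty allowlist rejects traffic when `deny_on_empty=true`. A minimal sketch of that decision follows. The `Allowlist` struct, its constructor, and the `permits` method are hypothetical; only the `deny_on_empty` setting name comes from the entry above, and the behavior shown for `deny_on_empty=false` (allow everyone) is an assumption.

```rust
/// Hypothetical channel allowlist; not the adapters' real type.
pub struct Allowlist {
    entries: Vec<String>,
    deny_on_empty: bool,
}

impl Allowlist {
    pub fn new(entries: &[&str], deny_on_empty: bool) -> Self {
        Self {
            entries: entries.iter().map(|e| e.to_string()).collect(),
            deny_on_empty,
        }
    }

    /// Decides whether traffic from `sender` is accepted.
    pub fn permits(&self, sender: &str) -> bool {
        if self.entries.is_empty() {
            // Fail closed when configured to: an empty list admits no one.
            // Allowing everyone otherwise is an assumption of this sketch.
            return !self.deny_on_empty;
        }
        self.entries.iter().any(|e| e.as_str() == sender)
    }
}
```

With `deny_on_empty` set, a freshly installed channel with no configured senders accepts nothing until its allowlist is populated, which matches the note that update and mechanic flows are expected to repair allowlists.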

v0.9.2

New Features & More 2026-03-02

Added: 15 changes. Changed: 7 changes. Removed: 4 changes. Key changes: Wiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared — all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit. Unified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points. `post_turn_ingest` Tool Results: All call sites now pass actual tool call name + result from the ReAct loop instead of `&[]`. Episodic memory captures tool-use context, improving digest quality. Gate System Note: `build_gate_system_note` now wired in both API and channel paths (previously channel-only).

Highlights

  • Wiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared — all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit.
  • Unified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points.
  • Multi-Tool Parsing: `parse_tool_calls` (plural) correctly parses multiple tool invocations from a single LLM response across all four provider formats.
  • OpenAI Responses + Google Tool Wiring: Bidirectional tool support for OpenAI Responses API and Google Generative AI — tool definitions translated into requests, structured tool calls parsed from responses with `{"tool_call": ...}` shim.
  • Quality Warm Start: `QualityTracker` is seeded from `inference_costs` on startup, eliminating cold-start assumptions for metascore routing.
  • Escalation Read Feedback: `EscalationTracker` acceptance history now feeds routing weight adjustments via `escalation_bias`, closing the feedback loop.
  • Approval Resume: Blocked tool calls are re-executed asynchronously after approval via `execute_tool_call_after_approval`.
  • Hippocampus (2.13): Self-describing schema map with auto-discovery of all system tables. Agent-created tables (`ag_<id>_*`) with access levels, row counts, and guardrails. Compact summary injected into system prompt (~200 tokens) for ambient storage awareness.
FEATWiring Remediation (Phase 0): Comprehensive Tier 1–3 wiring audit remediation. 14 gates cleared — all functional wires verified against code. See `docs/audit/wiring-audit-v0.9.md` for the full re-audit.
FEATUnified Request Pipeline: API (`agent_message`) and channel (`process_channel_message`) paths now share `prepare_inference` + `execute_inference_pipeline` in `core.rs`, eliminating 6+ behavioral asymmetries between entry points.
FEATMulti-Tool Parsing: `parse_tool_calls` (plural) correctly parses multiple tool invocations from a single LLM response across all four provider formats.
FEATOpenAI Responses + Google Tool Wiring: Bidirectional tool support for OpenAI Responses API and Google Generative AI — tool definitions translated into requests, structured tool calls parsed from responses with `{"tool_call": ...}` shim.
FEATQuality Warm Start: `QualityTracker` is seeded from `inference_costs` on startup, eliminating cold-start assumptions for metascore routing.
FEATEscalation Read Feedback: `EscalationTracker` acceptance history now feeds routing weight adjustments via `escalation_bias`, closing the feedback loop.
FEATApproval Resume: Blocked tool calls are re-executed asynchronously after approval via `execute_tool_call_after_approval`.
FEATHippocampus (2.13): Self-describing schema map with auto-discovery of all system tables. Agent-created tables (`ag_<id>_*`) with access levels, row counts, and guardrails. Compact summary injected into system prompt (~200 tokens) for ambient storage awareness.
FEATAgent Data Tools: `CreateTable`, `AlterTable`, `DropTable` registered in ToolRegistry with hippocampus auto-registration, size limits, and reserved-name enforcement.
FEATDocument Ingestion Pipeline (3.5.5): `roboticus ingest <path>` CLI and `POST /api/knowledge/ingest` API. Supports `.md`, `.txt`, `.rs`, `.py`, `.js`, `.ts`, `.pdf` files. Parse → chunk (512 tokens, 64-token overlap) → embed → store in memory system.
FEATIANA Timezone Support (1.18): Cron scheduler evaluates session reset schedules using IANA timezone identifiers. Conformance tests for DST transitions, sub-minute cron, timezone-prefixed expressions.
FEATInference Costs Extension: `latency_ms` (INTEGER), `quality_score` (REAL), `escalation` (BOOLEAN) columns added to `inference_costs` table. All inference calls now record latency and escalation state.
FEATMCP Server Gateway: First plugin release. `RoboticusMcpHandler` bridges rmcp's `ServerHandler` to the ToolRegistry. External MCP clients (Claude Desktop, Cursor, VS Code) connect via StreamableHTTP, discover tools through `tools/list`, invoke through `tools/call`. All MCP tool calls run with `InputAuthority::External`.
FEATGolden Test Fixtures: Deterministic golden files for delegation, delegation follow-up, echo follow-up, and echo tool-call pathways.
FEATTool-Call Shim Tests: Harness integration tests verifying the full structured tool_call → parse → execute → observation → follow-up pipeline.
CHORE`post_turn_ingest` Tool Results: All call sites now pass actual tool call name + result from the ReAct loop instead of `&[]`. Episodic memory captures tool-use context, improving digest quality.
CHOREGate System Note: `build_gate_system_note` now wired in both API and channel paths (previously channel-only).
CHOREShared Confidence Evaluator: `infer_with_fallback` uses the shared `LlmService.confidence` instance instead of creating a local copy.
CHOREContext Pruning: `needs_pruning()` → `soft_trim()` wired in `build_context` when assembled context exceeds the token budget.
CHORECheckpoint Load: `load_checkpoint` called during inference preparation for session resume (previously write-only).
CHOREImportance Decay: `decay_importance` called from `SessionGovernor.tick()` after digest, preventing stale context accumulation.
CHORECI Pipeline: Parallelized per-crate test execution and harness quick-test stages for faster CI runtime.
CHORE`SpawnManager`: Dead module removed (`spawning.rs` deleted, zero references). Virtual delegation tool pattern replaced it.
CHOREDead Routing Surfaces: `uniroute.rs` (ModelVector, QueryRequirements, ModelVectorRegistry) deleted. Dead selector functions (`select_for_complexity`, `select_cheapest_qualified`, `select_for_quality_target`) removed. `ModelRouter` retained as active runtime override/fallback router.
CHORE`router_integration.rs`: Dead test module removed (tested deleted routing code).
CHORE`skills-roadmap-2026.md`: Superseded by `capabilities-roadmap-2026.md`.
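
The chunking step of the document ingestion pipeline (512 tokens with a 64-token overlap) can be sketched as a sliding window. This is an assumption-laden illustration: `chunk_with_overlap` is a hypothetical name, the input is assumed to be already tokenized, and the actual pipeline may handle the final partial chunk differently.

```rust
/// Splits `tokens` into windows of at most `size` items, where each window
/// repeats the last `overlap` items of the previous one.
/// Hypothetical helper for illustration only.
pub fn chunk_with_overlap<T: Clone>(tokens: &[T], size: usize, overlap: usize) -> Vec<Vec<T>> {
    assert!(size > 0 && overlap < size, "overlap must be smaller than chunk size");
    // Each new window starts `size - overlap` items after the previous one.
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + size).min(tokens.len());
        chunks.push(tokens[start..end].to_vec());
        // Stop once a window reaches the end of the input.
        if end == tokens.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

With `size = 512` and `overlap = 64`, consecutive windows start 448 tokens apart, so a 1000-token document yields three chunks.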

v0.9.1

New Features 2026-03-02

Added: 6 changes. Changed: 2 changes. Key changes: Model Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`. Tiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry. Routing hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore — replacing the previous first-usable-model strategy. Rate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.

Highlights

  • Model Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`.
  • Tiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry.
  • Throttle Event Observability (1.17): New `GET /api/stats/throttle` endpoint exposes live rate-limit counters including global/per-IP/per-actor request counts, throttle tallies, and top-10 offenders. `ThrottleSnapshot` struct provides admin visibility into abuse patterns.
  • Quality Tracking: `QualityTracker` now records observations on every inference success with a heuristic quality signal (response structure, finish reason, latency). Exponential moving average feeds into metascore efficacy dimension.
  • Audit Trail Extensions: `ModelSelectionAudit` now includes `metascore_breakdown` (full per-dimension scores) and `complexity_score` for routing decisions. `ModelCandidateAudit` includes per-candidate metascores.
  • Profile module (`roboticus-llm::profile`): `ModelProfile`, `MetascoreBreakdown`, `build_model_profiles()`, `select_by_metascore()` — 9 unit tests covering local/cloud task routing, cold-start penalties, cost-aware selection, blocked model filtering, and deterministic tie-breaking.
  • Routing hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore — replacing the previous first-usable-model strategy.
  • Rate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.
FEATModel Metascore Routing (2.19 core): Unified per-model scoring replaces availability-first routing. `ModelProfile` combines static provider attributes (cost, tier, locality) with dynamic observations (quality, capacity headroom, circuit breaker health). `metascore()` produces a transparent 5-dimension breakdown (efficacy, cost, availability, locality, confidence) with configurable weights for cost-aware mode. `select_by_metascore()` is now the primary routing decision in `select_routed_model_with_audit()`.
FEATTiered Inference Pipeline (2.3): `ConfidenceEvaluator` scores local model responses using token probability, response length, and self-reported uncertainty signals. Responses below the confidence floor trigger automatic escalation to the next model in the fallback chain. `EscalationTracker` records escalation events for capacity/cost telemetry.
FEATThrottle Event Observability (1.17): New `GET /api/stats/throttle` endpoint exposes live rate-limit counters including global/per-IP/per-actor request counts, throttle tallies, and top-10 offenders. `ThrottleSnapshot` struct provides admin visibility into abuse patterns.
FEATQuality Tracking: `QualityTracker` now records observations on every inference success with a heuristic quality signal (response structure, finish reason, latency). Exponential moving average feeds into metascore efficacy dimension.
FEATAudit Trail Extensions: `ModelSelectionAudit` now includes `metascore_breakdown` (full per-dimension scores) and `complexity_score` for routing decisions. `ModelCandidateAudit` includes per-candidate metascores.
FEATProfile module (`roboticus-llm::profile`): `ModelProfile`, `MetascoreBreakdown`, `build_model_profiles()`, `select_by_metascore()` — 9 unit tests covering local/cloud task routing, cold-start penalties, cost-aware selection, blocked model filtering, and deterministic tie-breaking.
CHORERouting hot path: `select_routed_model_with_audit()` now extracts features from user content, classifies task complexity, builds model profiles, and selects via metascore — replacing the previous first-usable-model strategy.
CHORERate limiter architecture: `GlobalRateLimitLayer` is now constructed once at startup and shared between the axum middleware stack and `AppState`, enabling admin observability of the same rate-limit counters the middleware uses.
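
The Quality Tracking entry mentions an exponential moving average feeding the efficacy dimension. A minimal sketch of that kind of smoothing follows. `QualityEma`, its `alpha` parameter, and the choice to seed the average with the first observation are assumptions made for illustration, not the actual `QualityTracker` internals.

```rust
/// Exponential moving average over a stream of quality observations.
/// `alpha` in (0, 1]: higher values weight recent observations more heavily.
/// Hypothetical type for illustration only.
pub struct QualityEma {
    alpha: f64,
    value: Option<f64>,
}

impl QualityEma {
    pub fn new(alpha: f64) -> Self {
        assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0, 1]");
        Self { alpha, value: None }
    }

    /// Folds one observation into the average and returns the updated value.
    pub fn observe(&mut self, sample: f64) -> f64 {
        let next = match self.value {
            // The first observation seeds the average directly.
            None => sample,
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
        };
        self.value = Some(next);
        next
    }
}
```

An average of this shape reacts to recent quality changes without storing the full observation history.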

v0.8.9

Bug Fixes & Stability (17 changes) 2026-03-01

Security: 3 changes. Fixed: 14 changes. Key changes: HIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call. HIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit. HIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms. HIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.

Highlights

  • HIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call.
  • HIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit.
  • HIGH: Relaxed atomic ordering: Cross-task flags and counters using `Ordering::Relaxed` upgraded to `Acquire`/`Release`/`AcqRel` to ensure correct visibility guarantees across async task boundaries.
  • HIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms.
  • HIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.
  • HIGH: Dead-letter replay race: Two locks acquired non-atomically during message replay could interleave with concurrent deliveries. Now holds both locks in a single scope.
  • HIGH: ReAct tool errors bypass scan_output: Error messages from tool execution were returned directly to the model without content scanning. Now calls `scan_output()` on tool error strings.
  • HIGH: derive_nickname Unicode panic: `&text[prefix.len()..]` applied a byte offset from a lowercased string to the original, panicking on multi-byte characters. Now uses `char_indices().nth()` for safe boundary detection.
FIXHIGH: RwLock held across LLM call: Config read-lock was held for the entire duration of streaming LLM calls, blocking all config writes. Now clones needed values and drops the lock before the network call.
FIXHIGH: CSS selector injection: Browser `click` and `type_text` actions now validate CSS selectors, rejecting inputs containing `{`/`}` (which can escape selector context into rule injection) and enforcing a 500-character length limit.
FIXHIGH: Relaxed atomic ordering: Cross-task flags and counters using `Ordering::Relaxed` upgraded to `Acquire`/`Release`/`AcqRel` to ensure correct visibility guarantees across async task boundaries.
FIXHIGH: SSE streaming drops tool-use deltas: OpenAI-format SSE chunks with `content: null` (common in function-call and tool-use deltas) were silently dropped. Now emits an empty-string delta, matching the Anthropic and Google format arms.
FIXHIGH: Done event schema mismatch: The SSE `stream_done` event used `"content"` key while all streaming chunks used `"delta"`, causing clients to miss the done signal. Now consistently uses `"delta"`.
FIXHIGH: Dead-letter replay race: Two locks acquired non-atomically during message replay could interleave with concurrent deliveries. Now holds both locks in a single scope.
FIXHIGH: ReAct tool errors bypass scan_output: Error messages from tool execution were returned directly to the model without content scanning. Now calls `scan_output()` on tool error strings.
FIXHIGH: derive_nickname Unicode panic: `&text[prefix.len()..]` applied a byte offset from a lowercased string to the original, panicking on multi-byte characters. Now uses `char_indices().nth()` for safe boundary detection.
FIXMED: WebSocket idle timeout missing: `handle_socket` had no timeout — idle clients held file descriptors and broadcast receivers indefinitely. Now sends ping every 30s with a 90s idle timeout.
FIXMED: Web path bypasses decomposition gate: `evaluate_decomposition_gate` was only called in `process_channel_message`, not in the web API's `agent_message`. Extracted into a shared helper called from both paths.
FIXMED: Agent processing invisible in logs: Neither `agent_message` nor `process_channel_message` logged entry spans. Added `info!` spans with session_id and channel at function entry.
FIXMED: --json flag ignored: The `--json` CLI flag was only threaded to `cmd_defrag`. Now threaded to `cmd_status` and other output-producing commands.
FIXMED: Config capabilities empty: `/api/config/capabilities` returned an empty `immutable_sections` list. Now populated with `["server", "treasury", "a2a", "wallet"]`.
FIXMED: config get returns stale TOML: `roboticus config get` read from the on-disk TOML even when the server was running with different runtime values. Now tries the live API first, falling back to TOML when offline.
FIXMED: A2A missing from channel status: `/api/channels/status` omitted the A2A channel. Now includes a hardcoded A2A entry reading enabled/listening state from server state.
FIXMED: Dashboard scheduler hardcodes agent_id: The scheduler panel used a hardcoded `agent_id: 'roboticus'` instead of the active agent. Now uses `App._activeAgentId`.
FIXLOW: Missing #[must_use] annotations: Added `#[must_use]` to 8 builder/constructor methods across `speculative.rs`, `actions.rs`, and `knowledge.rs` to prevent accidental discard of return values.
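
The `derive_nickname` fix above relies on `char_indices().nth()` to find a safe slice boundary. The sketch below shows that technique in isolation. `skip_chars` is a hypothetical helper, not the actual function body. It counts characters and lets the iterator report the matching byte offset, instead of reusing a byte length taken from a different string.

```rust
/// Returns `text` with its first `n` characters removed, or "" if it has fewer.
/// Hypothetical helper for illustration only.
pub fn skip_chars(text: &str, n: usize) -> &str {
    match text.char_indices().nth(n) {
        // `idx` is the byte offset where character `n` starts, always a valid boundary.
        Some((idx, _)) => &text[idx..],
        // Fewer than `n + 1` characters: nothing remains after the prefix.
        None => "",
    }
}
```

Slicing with a raw byte offset such as `&text[6..]` would panic whenever offset 6 falls inside a multi-byte character. The version above cannot land mid-character.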

v0.8.8

Security Hardening (39 changes) 2026-03-01

Security: 13 changes. Fixed: 26 changes. Key changes: HIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history. HIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content. HIGH: Float policy bypass: Policy enforcement on `amount` fields now falls back to `as_f64()` conversion, closing a bypass where float amounts evaded integer-only checks. HIGH: Tool call parsing failures: `parse_tool_call` now uses `rfind` with a candidate loop, correctly parsing tool calls that contain the delimiter character in arguments.

Highlights

  • HIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history.
  • HIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content.
  • HIGH: Provider error info leak: `classify_provider_error` in `run_llm_analysis` now strips internal details from error responses before returning to callers.
  • MED: XSS in sanitize_html: `sanitize_html` now escapes all 5 OWASP-recommended HTML entities (`& < > " '`), closing a reflected XSS vector.
  • MED: Input validation on identifiers: `peer_id`, `group_id`, and `channel` fields now enforce length and character-set constraints, preventing injection of oversized or malformed identifiers.
  • MED: Webhook body size limit: Public webhook router now applies `DefaultBodyLimit` to prevent memory exhaustion from oversized payloads.
  • MED: Analysis route DoS protection: Analysis routes now apply `ConcurrencyLimitLayer(3)` to prevent resource exhaustion from concurrent expensive LLM calls.
  • MED: Config schema leak: `update_config` error responses now return a generic message instead of leaking internal schema details.
FIXHIGH: WebSocket API key leak: Replaced `?token=` query-string authentication on WebSocket upgrade with a ticket-based flow, preventing API keys from appearing in server logs, proxy logs, and browser history.
FIXHIGH: Prompt injection in tips: `get_turn_tips` and `get_session_insights` now sanitize LLM-generated tips before rendering, preventing stored prompt injection via malicious session content.
FIXHIGH: Provider error info leak: `classify_provider_error` in `run_llm_analysis` now strips internal details from error responses before returning to callers.
FIXMED: XSS in sanitize_html: `sanitize_html` now escapes all 5 OWASP-recommended HTML entities (`& < > " '`), closing a reflected XSS vector.
FIXMED: Input validation on identifiers: `peer_id`, `group_id`, and `channel` fields now enforce length and character-set constraints, preventing injection of oversized or malformed identifiers.
FIXMED: Webhook body size limit: Public webhook router now applies `DefaultBodyLimit` to prevent memory exhaustion from oversized payloads.
FIXMED: Analysis route DoS protection: Analysis routes now apply `ConcurrencyLimitLayer(3)` to prevent resource exhaustion from concurrent expensive LLM calls.
FIXMED: Config schema leak: `update_config` error responses now return a generic message instead of leaking internal schema details.
FIXMED: Feedback comment size limit: `FeedbackRequest.comment` now enforces a 4096-character cap, preventing oversized payloads from reaching storage.
FIXMED: Config allowlist tightening: Removed `extra_headers` from the `get_config` response allowlist, preventing exposure of sensitive header values.
FIXLOW: Unsafe UTF-8 decode: Replaced `from_utf8_unchecked` with safe `from_utf8` to prevent undefined behavior on malformed input.
FIXLOW: Embedding test env isolation: Embedding test uses a unique env var name with a SAFETY comment to prevent cross-test interference.
FIXLOW: Path traversal defense-in-depth: `obsidian_read` now validates paths against directory traversal patterns as an additional defense layer.
FIXHIGH: Float policy bypass: Policy enforcement on `amount` fields now falls back to `as_f64()` conversion, closing a bypass where float amounts evaded integer-only checks.
FIXHIGH: Tool call parsing failures: `parse_tool_call` now uses `rfind` with a candidate loop, correctly parsing tool calls that contain the delimiter character in arguments.
FIXHIGH: Unicode string metric: `common_prefix_ratio` now operates on `chars()` instead of byte slices, producing correct ratios for multi-byte characters.
FIXHIGH: Incorrect P50 latency: `latency_p50` now computes the true median by averaging the two middle values for even-length arrays.
FIXHIGH: Speculation cache collisions: `SpeculationKey` now stores full parameter JSON instead of using `DefaultHasher`, which was not stable across processes and caused incorrect cache hits.
FIXHIGH: WhatsApp adapter panic: `WhatsAppAdapter::new` now returns `Result<Self>` instead of panicking on initialization failures.
FIXHIGH: Export agents silent failure: `export_agents` now matches on `Result` and propagates errors instead of silently dropping them.
FIXHIGH: Inference cost logging: `record_inference_cost` now uses `inspect_err` to log failures instead of silently discarding them with `.ok()`.
FIXMED: Turn count inflation: `turn_count` now only increments on `Think` state transitions, fixing 2-3x count inflation from duplicate counting.
FIXMED: Archive truncation: `compact_before_archive` now fetches all messages instead of being capped at 20, preventing data loss during session archival.
FIXMED: URL decoder corruption: `%XX` decoder now preserves characters on invalid hex sequences instead of silently dropping them.
FIXMED: Task handoff stalls: Handoff logic now skips `Failed` tasks to find the next `Pending` task, preventing the scheduler from stalling on failed work.
FIXMED: Config write propagation: `write_defaults` now propagates errors with `?` instead of silently discarding them with `.ok()`.
FIXMED: Cron validation logging: Invalid cron expressions now log a warning before returning `false`, replacing a silent rejection.
FIXMED: Wallet passphrase fallthrough: An incorrect `ROBOTICUS_WALLET_PASSPHRASE` now produces a hard error instead of silently falling through to the default passphrase.
FIXMED: Config/session export errors: `to_string_pretty` failures in config/session export now return proper error responses instead of empty bodies.
FIXMED: Corrupt skills warning: Corrupt `skills_json` values now log a warning instead of being silently ignored.
FIXMED: Translation request errors: `translate_request` failures now return HTTP 500 with a proper error body instead of an empty response.
FIXMED: Translation response errors: `translate_response` failures now return HTTP 502 with a descriptive message instead of `"(no response)"`.
FIXLOW: Loop detection consolidation: Removed redundant `is_looping` pre-check, consolidating loop detection into a single code path.
FIXLOW: Archive count accuracy: `rotate_agent_scope_sessions` now returns the actual archived count instead of a potentially incorrect value.
FIXLOW: Token parse overflow: Token parsing now uses saturating `u32` casts, capping at `u32::MAX` instead of panicking on overflow.
FIXLOW: Subtask dedup ordering: `split_subtasks` now uses a `HashSet` for order-preserving deduplication instead of unstable dedup.
FIXLOW: Session row corruption logging: Corrupted session rows now log a warning instead of being silently dropped during iteration.
FIXLOW: DB error logging for cost queries: Database errors in turn-query average cost calculations are now logged instead of silently ignored.
FIXLOW: Defrag read error handling: File defragmentation now skips files on read error with a warning instead of substituting an empty string.
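
The P50 latency fix above describes averaging the two middle values for even-length arrays. A minimal sketch of that median calculation is shown here. The signature (millisecond samples as `u64`, result as `Option<f64>`) is an assumption and may not match the real `latency_p50`.

```rust
/// Median of the samples, or `None` when there are no samples.
/// For an even count, the two middle values are averaged.
/// Sketch only; the signature is an assumption.
pub fn latency_p50(samples: &[u64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        // Even count: average the two values around the midpoint.
        Some((sorted[mid - 1] as f64 + sorted[mid] as f64) / 2.0)
    } else {
        // Odd count: the middle element is the median.
        Some(sorted[mid] as f64)
    }
}
```

For example, the samples `[10, 20, 30, 40]` give 25.0, whereas picking only the upper middle element would report 30.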

v0.8.7

Bug Fixes & Stability (21 changes) 2026-02-28

Fixed: 19 changes. Added: 2 changes. Key changes: CRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail — all cron-scheduled jobs were silently skipped. HIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter. Release notes for v0.8.5 and v0.8.6 (missing from previous releases, blocking release doc gate). Roadmap section 1.24: Built-in CLI Agent Skills (Claude Code + Codex CLI).

Highlights

  • CRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail — all cron-scheduled jobs were silently skipped.
  • HIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter.
  • HIGH: Keystore redact_key_name UTF-8 panic: Byte-level `&key[..3]` slicing panicked on multi-byte key names. Now uses `key.chars().take(3)`.
  • HIGH: LLM forward_stream missing `query:` auth mode: Streaming requests to providers using query-string authentication (e.g., Google Generative AI) failed because the `query:` prefix was not handled, sending it as a literal HTTP header instead.
  • HIGH: yield_engine U256-to-u64 panic: `real_a_token_balance` panicked via `U256::to::<u64>()` if an aToken balance exceeded `u64::MAX`. Now uses safe `try_into::<u128>()`.
  • HIGH: yield_engine amount_to_raw saturation: `amount_to_raw` silently saturated USDC amounts above ~$18.4B via unchecked `f64 -> u64` cast. Now explicitly clamps.
  • MED: Email adapter SMTP relay panic: `EmailAdapter::new` panicked via `.expect()` on invalid SMTP hostname. Now returns `Result`.
  • MED: Email adapter mutex panics: `push_message`/`recv` used `.expect("mutex poisoned")`. Now uses `.unwrap_or_else(|e| e.into_inner())` for poison recovery, matching other adapters.
FIXCRIT: Cron jobs silently never firing: `run_cron_worker` timestamp format lacked timezone suffix (`Z`), causing `evaluate_cron` RFC 3339 parse to always fail — all cron-scheduled jobs were silently skipped.
FIXHIGH: Telegram chunk_message UTF-8 panic: Byte-level string slicing in `chunk_message` panicked on multi-byte characters (emoji, CJK). Now uses `floor_char_boundary()` matching the Discord adapter.
FIXHIGH: Keystore redact_key_name UTF-8 panic: Byte-level `&key[..3]` slicing panicked on multi-byte key names. Now uses `key.chars().take(3)`.
FIXHIGH: LLM forward_stream missing `query:` auth mode: Streaming requests to providers using query-string authentication (e.g., Google Generative AI) failed because the `query:` prefix was not handled, sending it as a literal HTTP header instead.
FIXHIGH: yield_engine U256-to-u64 panic: `real_a_token_balance` panicked via `U256::to::<u64>()` if an aToken balance exceeded `u64::MAX`. Now uses safe `try_into::<u128>()`.
FIXHIGH: yield_engine amount_to_raw saturation: `amount_to_raw` silently saturated USDC amounts above ~$18.4B via unchecked `f64 -> u64` cast. Now explicitly clamps.
FIXMED: Email adapter SMTP relay panic: `EmailAdapter::new` panicked via `.expect()` on invalid SMTP hostname. Now returns `Result`.
FIXMED: Email adapter mutex panics: `push_message`/`recv` used `.expect("mutex poisoned")`. Now uses `.unwrap_or_else(|e| e.into_inner())` for poison recovery, matching other adapters.
FIXMED: Discord GatewayConnection mutex panics: All 4 accessor methods used `.expect("mutex poisoned")`. Now uses poison recovery matching the rest of the Discord adapter.
FIXMED: CDP client initialization panic: `CdpClient::new` panicked via `.expect()` on TLS cert issues. Now returns `Result`.
FIXMED: Embedding URL double API key: When both Google format and `query:` auth were active, the API key was appended twice. Made the two paths mutually exclusive.
FIXMED: Embedding URL missing percent-encoding: API keys were interpolated into URLs without encoding. Now uses `pct_encode_query_value`.
FIXMED: Hippocampus Unicode/ASCII mismatch: `create_agent_table` allowed Unicode alphanumeric characters but `drop_agent_table` required ASCII-only, creating undeletable tables. Both now require ASCII.
FIXMED: Skills reload counters wrong on failure: `added`/`updated` counters incremented even when DB operations failed. Now only increment on success.
FIXMED: Skills rollback silent failures: File rollback operations used `let _ =` silently. Now log errors at error level.
FIXLOW: sanitize_platform mixed byte/char units: Truncation used `.chars().take()` (char count) after a `.len()` (byte count) guard. Now truncates at byte boundary consistently.
FIXLOW: mock_tx_hash f64 saturation: Used `amount * 1e18` (overflows u64 above ~18.4). Changed to USDC scale (1e6).
FIXLOW: Session model column never populated: `update_model()` was not called after LLM routing, leaving the `sessions.model` column perpetually NULL.
FIXLOW: Moonshot/Kimi tier misclassified: `classify()` in `tier.rs` did not match `moonshot` or `kimi` substrings, causing Kimi K2 models to fall through to the T2 default instead of T3.
FEATRelease notes for v0.8.5 and v0.8.6 (missing from previous releases, blocking release doc gate).
FEATRoadmap section 1.24: Built-in CLI Agent Skills (Claude Code + Codex CLI).
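
The mutex fixes in this release replace `.expect("mutex poisoned")` with `.unwrap_or_else(|e| e.into_inner())`. The sketch below shows that recovery pattern using only the standard library. `push_message` here is a simplified stand-in with an assumed signature, not the email adapter's real method.

```rust
use std::sync::Mutex;

/// Appends `msg` to the queue and returns the new queue length.
/// If an earlier holder of the lock panicked, the poisoned guard is
/// recovered with `into_inner()` instead of panicking again.
/// Simplified stand-in for illustration only.
pub fn push_message(queue: &Mutex<Vec<String>>, msg: &str) -> usize {
    let mut guard = queue.lock().unwrap_or_else(|e| e.into_inner());
    guard.push(msg.to_string());
    guard.len()
}
```

This pattern suits data that stays structurally valid even if a writer panicked partway through, such as a simple message queue. For data with stricter invariants, surfacing the poison error may be the safer choice.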

v0.8.6

Security Hardening & More 2026-02-28

Security: 9 changes. Fixed: 14 changes. Added: 2 changes. Key changes: CRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity — it was unauthenticated and trivially spoofable. CRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses. Windows daemon error propagation: `schtasks /Create` errors now propagate instead of being silently dropped; post-spawn verification added; `schtasks /Delete` errors during uninstall handled correctly. CLI API key headers: Added `--api-key`/`ROBOTICUS_API_KEY` global CLI argument. All 22 bare `reqwest` calls replaced with `http_client()` helper that injects API key as default header.

Highlights

  • CRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity — it was unauthenticated and trivially spoofable.
  • CRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses.
  • HIGH: Rate-limit IP fallback: IP extraction now uses `ConnectInfo<SocketAddr>` (real TCP peer address) instead of a hardcoded `127.0.0.1` fallback.
  • HIGH: ASCII-only identifiers: `validate_identifier` now restricts to ASCII alphanumeric characters, closing Unicode homoglyph and normalization attacks.
  • HIGH: Memory search query cap: `/api/memory/search` query parameter capped at 512 characters to prevent regex-based DoS.
  • HIGH: Error message sanitization: Added SQLite schema-leaking prefixes (`no such table`, `no such column`, etc.) to the error sanitization blocklist.
  • MED: Rate-limit counter ordering: Global rate-limit counter now incremented after per-IP/per-actor checks pass, preventing global exhaustion from blocked IPs.
  • MED: Symlink-safe directory traversal: `collect_findings_recursive` now uses `entry.file_type()` and skips symlinks, preventing symlink-following attacks.
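
The ASCII-only identifier rule can be illustrated with the standard `char` predicates: `is_alphanumeric` accepts any Unicode letter or digit, while `is_ascii_alphanumeric` does not. The sketch below is an assumption about the shape of such a check, not the actual `validate_identifier`; the function name and the decision to allow `_` are hypothetical.

```rust
/// Hypothetical validator: accepts only non-empty ASCII alphanumerics and `_`.
/// Allowing `_` is an assumption; the changelog only mentions ASCII alphanumerics.
fn is_ascii_identifier(name: &str) -> bool {
    !name.is_empty() && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}
```

A name such as `"ag\u{0435}nt"` (with a Cyrillic letter in place of the Latin `e`) passes a Unicode-aware check but is rejected here.
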
FIXCRIT: Unauthenticated rate-limit actor identity: Removed `x-user-id` header as rate-limit actor identity — it was unauthenticated and trivially spoofable.
FIXCRIT: Stable token fingerprinting: Replaced `DefaultHasher` with SHA-256 for token fingerprinting, since `DefaultHasher` is not stable across processes and could cause cache/rate-limit bypasses.
FIXHIGH: Rate-limit IP fallback: IP extraction now uses `ConnectInfo<SocketAddr>` (real TCP peer address) instead of a hardcoded `127.0.0.1` fallback.
FIXHIGH: ASCII-only identifiers: `validate_identifier` now restricts to ASCII alphanumeric characters, closing Unicode homoglyph and normalization attacks.
FIXHIGH: Memory search query cap: `/api/memory/search` query parameter capped at 512 characters to prevent regex-based DoS.
FIXHIGH: Error message sanitization: Added SQLite schema-leaking prefixes (`no such table`, `no such column`, etc.) to the error sanitization blocklist.
FIXMED: Rate-limit counter ordering: Global rate-limit counter now incremented after per-IP/per-actor checks pass, preventing global exhaustion from blocked IPs.
FIXMED: Symlink-safe directory traversal: `collect_findings_recursive` now uses `entry.file_type()` and skips symlinks, preventing symlink-following attacks.
FIXMED: WhatsApp HMAC raw byte comparison: HMAC verification now compares raw bytes instead of hex string representations, closing timing side-channels from variable-length hex comparison.
FIXWindows daemon error propagation: `schtasks /Create` errors now propagate instead of being silently dropped; post-spawn verification added; `schtasks /Delete` errors during uninstall handled correctly.
FIXCLI API key headers: Added `--api-key`/`ROBOTICUS_API_KEY` global CLI argument. All 22 bare `reqwest` calls replaced with `http_client()` helper that injects API key as default header.
FIXFlaky test elimination: Replaced TOCTOU ephemeral port test with RFC 5737 TEST-NET-1 address (192.0.2.1) for deterministic unreachable-port testing.
FIXBundled providers parse failure (F5): Changed `.unwrap_or_default()` to `.expect()` — bundled TOML is build-time data; parse failure means the binary is broken and should panic fast.
FIXUpdate state save errors (F3): Three `state.save().ok()` sites now log errors before discarding, plus update state load now logs parse/read failures.
FIXLegacy Windows service cleanup (F7): `sc.exe stop/delete` errors during legacy cleanup now logged at debug level instead of silently dropped.
FIXOAuth token resolution (F8): `resolve_token().ok()` now logs failures, surfacing OAuth refresh errors that were previously invisible.
FIXTranslate request error propagation (F9): `translate_request` errors now return HTTP 400 instead of falling back to an empty JSON body.
FIXCorrupted cost row logging (F10): `filter_map(|r| r.ok())` on cost query rows now logs dropped rows.
FIXEmbedding failure logging (F12): Three `embed_single().ok()` sites now log failures, making RAG degradation visible.
FIXDefrag stdout write errors (F14): JSON stdout writes now propagate `io::Error` instead of silently dropping.
FIXSession nickname update (F19): `update_nickname().ok()` now logs failures.
FIXRecommendation inference cost (F20): `record_inference_cost().ok()` now logs failures.
FIXAgent status query errors: Tool call and turn queries in agent status now log errors at debug level.
FEATAuth middleware roundtrip tests: wrong key rejection, no-auth passthrough, POST method coverage.
FEATSSE streaming endpoint validation tests: empty content, oversized content, missing fields.
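
The rate-limit counter ordering fix in this release can be sketched as follows. This is a simplified illustration under stated assumptions: the `RateLimiter` type, its fields, and the limits are hypothetical, and window resets are omitted.

```rust
use std::collections::HashMap;

/// Hypothetical limiter; all names and limits are illustrative.
struct RateLimiter {
    per_ip_limit: u32,
    global_limit: u32,
    per_ip: HashMap<String, u32>,
    global: u32,
}

impl RateLimiter {
    fn new(per_ip_limit: u32, global_limit: u32) -> Self {
        Self { per_ip_limit, global_limit, per_ip: HashMap::new(), global: 0 }
    }

    /// Returns true if the request is admitted.
    fn allow(&mut self, ip: &str) -> bool {
        let count = self.per_ip.entry(ip.to_string()).or_insert(0);
        // Per-IP check first: a blocked IP returns here and never touches the global counter.
        if *count >= self.per_ip_limit {
            return false;
        }
        if self.global >= self.global_limit {
            return false;
        }
        // Only admitted requests consume per-IP and global budget.
        *count += 1;
        self.global += 1;
        true
    }
}
```

With a per-IP limit of 2 and a global limit of 5, a flood of 100 requests from one address consumes only 2 units of global budget, so other addresses are still served.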

v0.8.5

Bug Fixes & Stability (28 changes) 2026-02-28

Security: 6 changes. Fixed: 22 changes. Key changes: WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely. Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation. reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails. Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.

Highlights

  • WASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely.
  • Script runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation.
  • Rate limiter memory bounds (BUG-103): Per-IP and per-actor rate limit maps are now capped at 10,000 and 5,000 entries respectively, preventing unbounded memory growth during distributed floods. Throttle tracking maps are also cleared on window reset.
  • Knowledge/Obsidian bounded reads (BUG-104, BUG-110): `DirectorySource::query()` and `parse_note()` now enforce 10 MB and 5 MB file size limits respectively, preventing OOM on oversized files.
  • Config secret allowlist (BUG-106): Admin config endpoint now uses an allowlist (`ALLOWED_FIELDS`) instead of a blocklist for field filtering, ensuring new secret fields are safe by default.
  • Interview turn cap (BUG-107): Interview sessions now enforce a 200-turn maximum to prevent unbounded memory growth within the 3600s TTL.
  • reqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails.
  • Signal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
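
The dedicated-thread-plus-`recv_timeout` approach named in BUG-101 can be sketched with the standard library alone. The helper name `run_with_timeout` is hypothetical, and the sketch only shows the waiting side: the caller stops waiting at the deadline, but the worker thread itself is not terminated here.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: run `work` on its own thread and wait at most `timeout`.
/// Returns `None` if no result arrived in time.
fn run_with_timeout<T, F>(work: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the caller already timed out and dropped `rx`.
        let _ = tx.send(work());
    });
    // The deadline is enforced while waiting, not checked after the work returns.
    rx.recv_timeout(timeout).ok()
}
```

Unlike a post-hoc elapsed-time check, the caller regains control at the deadline even if `work` never returns.
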
FIXWASM preemptive timeout (BUG-101): WASM plugin execution now runs on a dedicated thread with `recv_timeout`, providing true preemptive timeout instead of the previous post-hoc elapsed-time check that allowed malicious modules to run indefinitely.
FIXScript runner orphan kill (BUG-102): Script runner now captures the child PID before `wait_with_output()` and sends `kill -9` on timeout, preventing orphan process accumulation.
FIXRate limiter memory bounds (BUG-103): Per-IP and per-actor rate limit maps are now capped at 10,000 and 5,000 entries respectively, preventing unbounded memory growth during distributed floods. Throttle tracking maps are also cleared on window reset.
FIXKnowledge/Obsidian bounded reads (BUG-104, BUG-110): `DirectorySource::query()` and `parse_note()` now enforce 10 MB and 5 MB file size limits respectively, preventing OOM on oversized files.
FIXConfig secret allowlist (BUG-106): Admin config endpoint now uses an allowlist (`ALLOWED_FIELDS`) instead of a blocklist for field filtering, ensuring new secret fields are safe by default.
FIXInterview turn cap (BUG-107): Interview sessions now enforce a 200-turn maximum to prevent unbounded memory growth within the 3600s TTL.
FIXreqwest Client panic (BUG-105): `VectorDbSource::new()` and `GraphSource::new()` now return `Result` instead of panicking via `.expect()` when TLS initialization fails.
FIXSignal handler crash (BUG-108): SIGTERM handler installation now falls back to SIGINT-only mode instead of crashing via `.expect()` in containerized environments.
FIXHeartbeat unreachable panic (BUG-109): `interval_for_tier()` catch-all arm now returns a safe default (`interval_ms * 2`) instead of `unreachable!()`, preventing runtime panics if new `SurvivalTier` variants are added.
FIXRegex recompilation (BUG-111): Obsidian tag and wikilink regexes are now `LazyLock` statics instead of being recompiled on every invocation.
FIXBudget float precision (BUG-112): `record_spending()` now uses epsilon-aware comparison to avoid IEEE 754 rounding errors causing spurious over-budget rejections.
FIXSub-agent lifecycle failures (SF-15–SF-20): All `let _ =` patterns on `registry.register()`, `start_agent()`, `stop_agent()`, `unregister()`, and `assign_agent()` now log errors at appropriate levels.
FIXAPI key env var diagnostics (SF-21, SF-22): Empty and missing API key / email password environment variables now produce warn-level log messages instead of silently returning empty strings.
FIXSub-agent list errors (SF-23): `list_sub_agents` DB errors now propagate at the delegation entry point and log at remaining fallback sites.
FIXSkills list errors (SF-24): `list_skills` DB failure now logged before fallback.
FIXMCP discovery failure (SF-25): MCP client discovery errors at startup now logged at warn level.
FIXSemantic cache load failure (SF-26): Cache load errors now logged before fallback to empty.
FIXProvider key resolution (SF-27): Missing provider keys for non-local providers now produce warn-level diagnostics.
FIXBundled providers parse failure (SF-28): TOML parse errors for bundled providers now logged.
FIXConfig backup restore (SF-29): Failed hot-reload backup restoration now logged at error level.
FIXMigration SQL errors (SF-30): SQL execution failures during migration now surfaced as warnings.
FIXThinking indicator failures (SF-31): Channel thinking indicator send failures now logged at debug level across all 4 platforms.
FIXSession candidates JSON (SF-32): Model selection candidate deserialization errors now logged.
FIXTelegram API errors (SF-33): Typing indicator and message delete HTTP failures now logged at debug level.
FIXSession counts fallback (SF-34): Sub-agent session count DB errors now logged before fallback.
FIXSubtask JSON parse (SF-35): Malformed `subtasks` parameter (non-array) now produces a warning instead of silently returning empty.
FIX19 additional MEDIUM silent failures (SF-36–SF-52): Error logging added across oauth, plugin-sdk, retrieval, digest, skills, signal, discord, whatsapp, sessions, defrag, embedding, main CLI, keystore, and obsidian modules.
FIXMigration export cascade (SF-48): Channel export now properly reports file read failures and JSON serialization errors instead of silently producing empty output.
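
Most of the SF entries above follow one pattern: a `Result` that used to be discarded with `.ok()` or `let _ =` is now logged before the fallback is taken. A minimal sketch of that pattern, assuming a hypothetical `ok_or_log` helper and using a `Vec<String>` as a stand-in for the real logger:

```rust
use std::fmt::Display;

/// Hypothetical helper: like `Result::ok`, but records the error first.
/// `log` stands in for a real logging backend.
fn ok_or_log<T, E: Display>(result: Result<T, E>, context: &str, log: &mut Vec<String>) -> Option<T> {
    match result {
        Ok(value) => Some(value),
        Err(err) => {
            // The caller still falls back to `None`, but the failure is now visible.
            log.push(format!("{context}: {err}"));
            None
        }
    }
}
```

The caller's control flow is unchanged; the only difference is that the failure leaves a trace.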

v0.8.4

Bug Fixes & Stability & More 2026-02-28

Security: 3 changes. Fixed: 16 changes. Changed: 1 change. Key changes: WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector. Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses. Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`. Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.

Highlights

  • WebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector.
  • Hippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses.
  • Script runner bounded reads: Shebang detection now uses `BufReader::take(512)` instead of `read_to_string`, preventing OOM on oversized script files.
  • Agent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`.
  • Governor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.
  • Money::from_dollars NaN panic (BUG-2): `from_dollars` now returns `Result`, rejecting NaN and Infinity inputs instead of panicking via `assert!`.
  • Delivery queue recovery (SF-7): `recover_from_store` is now async with proper `.lock().await`, replacing a `try_lock()` that silently dropped recovered messages.
  • Agent loop detection enforcement (BUG-3): `is_looping()` is now called inside `transition()` and forces `Done` state, preventing callers from bypassing loop detection.
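
The `Money::from_dollars` change (BUG-2) replaces a panic with a fallible constructor. The sketch below shows the general shape only; the integer-cents representation, the `MoneyError` type, and the rounding choice are assumptions, not the project's actual definitions.

```rust
#[derive(Debug, PartialEq)]
struct Money {
    cents: i64, // assumed representation, for illustration only
}

#[derive(Debug, PartialEq)]
enum MoneyError {
    NotFinite,
}

impl Money {
    /// Rejects NaN and infinities with an error instead of asserting.
    fn from_dollars(dollars: f64) -> Result<Money, MoneyError> {
        if !dollars.is_finite() {
            return Err(MoneyError::NotFinite);
        }
        Ok(Money { cents: (dollars * 100.0).round() as i64 })
    }
}
```

Callers can then handle a bad input as an ordinary error rather than risking a process-wide panic.
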
FIXWebSocket message size limit: Unauthenticated WebSocket connections now enforce a 4 KiB inbound message limit and no longer echo full message bodies, closing a ~3x memory amplification DoS vector.
FIXHippocampus TOCTOU fix: `drop_agent_table` auth check and DROP are now wrapped in a single transaction, preventing race-condition bypasses.
FIXScript runner bounded reads: Shebang detection now uses `BufReader::take(512)` instead of `read_to_string`, preventing OOM on oversized script files.
FIXAgent amnesia on DB error (SF-2): `list_messages` calls in agent routes now propagate errors instead of silently returning empty history via `.unwrap_or_default()`.
FIXGovernor silent write failures (SF-1): Session expiry and compaction errors are now logged at warn/error level; `tick()` returns an accurate expired count instead of silently swallowing failures with `.ok()`.
FIXMoney::from_dollars NaN panic (BUG-2): `from_dollars` now returns `Result`, rejecting NaN and Infinity inputs instead of panicking via `assert!`.
FIXDelivery queue recovery (SF-7): `recover_from_store` is now async with proper `.lock().await`, replacing a `try_lock()` that silently dropped recovered messages.
FIXAgent loop detection enforcement (BUG-3): `is_looping()` is now called inside `transition()` and forces `Done` state, preventing callers from bypassing loop detection.
FIXDigit-leading SQL identifiers (BUG-7): `validate_identifier` now rejects names starting with digits, which would produce invalid SQL.
FIXEmbedding API key error message (SF-4): Missing API key env var now returns a clear error message instead of a cryptic 401 via `.unwrap_or_default()`.
FIXANN index corruption paths (SF-6, SF-10): Corrupt embedding JSON is now logged and skipped; RwLock poison on write returns an error instead of silently recovering with stale data.
FIXAdmin dashboard false empties (SF-3): DB read errors in dashboard endpoints are now logged with `inspect_err` before falling back to defaults, enabling diagnosis.
FIXSession tool call queries (SF-9): Tool call endpoints now propagate DB errors with proper HTTP 500 responses instead of returning empty arrays.
FIXEventBus publish logging (SF-5): `let _ =` on channel send replaced with debug-level logging when no subscribers are active.
FIXDelivery queue timestamp fallback (SF-11): Failed timestamp parse now falls back to `UNIX_EPOCH` (safe backoff) instead of `Utc::now()` (immediate retry).
FIXDead letter false empties (SF-8): `dead_letters_from_store` errors now logged before fallback.
FIXAdmin config serialization (SF-12): Config endpoint returns HTTP 500 on serialization failure instead of a null body.
FIXEfficiency report serialization (SF-13): Efficiency endpoint returns HTTP 500 on serialization failure instead of a null body.
FIXWebhook body bytes (SF-14): Failed body extraction now logs a warning instead of silently discarding the payload.
CHORECrate publish ordering: Release workflow now publishes crates in correct topological dependency order with increased index propagation wait times, fixing the v0.8.3 publish failure.
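
The bounded shebang read in this release relies on `Read::take`, which caps how many bytes a reader will yield. A minimal sketch under stated assumptions: the function name `read_shebang` and its return type are hypothetical, and it accepts any `Read` so it can be exercised without touching the filesystem.

```rust
use std::io::Read;

/// Hypothetical helper: inspect at most the first 512 bytes for a `#!` line.
fn read_shebang<R: Read>(reader: R) -> std::io::Result<Option<String>> {
    let mut head = Vec::new();
    // `take(512)` bounds the read regardless of how large the underlying file is.
    reader.take(512).read_to_end(&mut head)?;
    if !head.starts_with(b"#!") {
        return Ok(None);
    }
    let line_end = head.iter().position(|&b| b == b'\n').unwrap_or(head.len());
    Ok(Some(String::from_utf8_lossy(&head[2..line_end]).trim().to_string()))
}
```

Passing a `BufReader` over a file works the same way, since `BufReader` implements `Read`.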

v0.8.3

Security Hardening & More 2026-02-27

Security: 4 changes. Fixed: 4 changes. Added: 1 change. Key changes: Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured — only loopback connections are allowed. Previously, missing API key config silently allowed all traffic. A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window. UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point. Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.

Highlights

  • Auth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured — only loopback connections are allowed. Previously, missing API key config silently allowed all traffic.
  • A2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window.
  • Plugin permission enforcement: New `strict_permissions` and `allowed_permissions` config fields for plugin policy. In strict mode, undeclared permissions are blocked; in permissive mode (default), they produce a warning.
  • Ethereum signature recovery ID: EIP-191 signatures now include the recovery byte (v = 27 or 28), producing correct 65-byte signatures instead of 64-byte truncated ones.
  • UTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point.
  • Script plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.
  • Script plugin unbounded output: stdout/stderr from plugin scripts are now capped at 10 MB via `AsyncReadExt::take()`.
  • Keystore lock ordering: Consolidated two separate mutexes into a single `KeystoreState` mutex, eliminating potential deadlock scenarios.
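
The UTF-8 truncation panic comes from slicing a `&str` at a byte index that falls inside a multi-byte character. The sketch below shows the same idea as `floor_char_boundary()` using the long-stable `is_char_boundary`; the function name `truncate_at_char_boundary` is hypothetical.

```rust
/// Hypothetical helper: truncate to at most `max_bytes`, backing up to the
/// nearest character boundary so the slice never splits a code point.
fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

For `"héllo"`, a limit of 2 bytes lands inside `é`, so the result backs up to `"h"` instead of panicking.
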
FIXAuth bypass when no API key: Requests to non-exempt API routes now fail closed when no API key is configured — only loopback connections are allowed. Previously, missing API key config silently allowed all traffic.
FIXA2A replay protection: Added nonce registry with TTL-based expiry to the A2A protocol, preventing message replay attacks within the nonce window.
FIXPlugin permission enforcement: New `strict_permissions` and `allowed_permissions` config fields for plugin policy. In strict mode, undeclared permissions are blocked; in permissive mode (default), they produce a warning.
FIXEthereum signature recovery ID: EIP-191 signatures now include the recovery byte (v = 27 or 28), producing correct 65-byte signatures instead of 64-byte truncated ones.
FIXUTF-8 panic in memory truncation: Replaced unsafe byte-level string slicing with `floor_char_boundary()` to prevent panics on multi-byte characters (emoji, CJK) near the 200-char truncation point.
FIXScript plugin zombie processes: Script timeout now explicitly kills the child process and reaps it, preventing zombie accumulation.
FIXScript plugin unbounded output: stdout/stderr from plugin scripts are now capped at 10 MB via `AsyncReadExt::take()`.
FIXKeystore lock ordering: Consolidated two separate mutexes into a single `KeystoreState` mutex, eliminating potential deadlock scenarios.
FEAT`roboticus defrag` command: New workspace coherence scanner with 6 passes — refs (dead reference elimination), drift (config drift detection), artifacts (orphaned file cleanup), stale (ghost state entry removal), identity (brand consistency), and scripts (script health validation). Supports `--fix` for auto-repair, `--yes` for non-interactive mode, and `--json` for machine-readable output.
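
The A2A replay protection entry describes a nonce registry with TTL-based expiry. One way such a registry could look is sketched below; the type, method names, and the choice to pass `now` explicitly (which keeps the example deterministic) are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical registry: remembers each nonce until its TTL elapses.
struct NonceRegistry {
    ttl: Duration,
    seen: HashMap<String, Instant>,
}

impl NonceRegistry {
    fn new(ttl: Duration) -> Self {
        Self { ttl, seen: HashMap::new() }
    }

    /// Returns true for a fresh nonce, false for a replay within the window.
    fn check_and_record(&mut self, nonce: &str, now: Instant) -> bool {
        let ttl = self.ttl;
        // Drop expired entries so the map does not grow without bound.
        self.seen.retain(|_, first_seen| now.duration_since(*first_seen) < ttl);
        if self.seen.contains_key(nonce) {
            return false;
        }
        self.seen.insert(nonce.to_string(), now);
        true
    }
}
```

A nonce is rejected while its entry is live and becomes usable again only after the TTL has passed, which matches the "within the nonce window" wording of the entry.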

v0.8.2

New Features 2026-02-27

Added: 3 changes. Fixed: 5 changes. Key changes: 100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316. Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`. 29 stabilization bug fixes: Resolved input validation gaps, API error format inconsistencies, query parameter hardening, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML path issues discovered during exhaustive hands-on testing of v0.8.1. HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.

Highlights

  • 100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316.
  • Homebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`.
  • Winget package distribution: Windows users can install via Winget package manager.
  • 29 stabilization bug fixes: Resolved issues in input validation, API error format consistency, query parameter handling, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML paths, all discovered during exhaustive hands-on testing of v0.8.1.
  • HTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.
  • Dashboard SPA cleanup: Removed duplicate trailing content after `</html>` close tag.
  • Model change persistence: Fixed model selection not persisting across server restarts.
  • Config serialization: Fixed TOML config serialization on Windows paths.
FEAT100+ API route integration tests: Comprehensive test coverage for sessions, turns, interviews, feedback, skills, model selection, channels, webhooks, dead letters, admin, memory, cron, context, and approvals endpoints. Tests exercise both success and error paths including validation, 404s, auth, and edge cases. Workspace test count now at 3,316.
FEATHomebrew tap distribution: macOS/Linux users can install via `brew install robot-accomplice/tap/roboticus`.
FEATWinget package distribution: Windows users can install via Winget package manager.
FIX29 stabilization bug fixes: Resolved issues in input validation, API error format consistency, query parameter handling, security headers, dashboard trailing content, model persistence, cron field naming, and Windows TOML paths, all discovered during exhaustive hands-on testing of v0.8.1.
FIXHTML injection prevention: Closed remaining sanitization coverage gaps in API write endpoints.
FIXDashboard SPA cleanup: Removed duplicate trailing content after `</html>` close tag.
FIXModel change persistence: Fixed model selection not persisting across server restarts.
FIXConfig serialization: Fixed TOML config serialization on Windows paths.

v0.8.1

Bug Fixes & Stability (14 changes) 2026-02-27

Fixed: 12 changes. Changed: 2 changes. Key changes: 40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages. Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints. CI scripts use POSIX grep: Replaced all `rg` (ripgrep) invocations with standard `grep -E`/`grep -qE` in CI scripts for broader runner compatibility. Windows compilation: Added conditional `allow(unused_mut)` for platform-gated mutation in security audit command.

Highlights

  • 40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages.
  • Input validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints.
  • JSON error responses: All API error paths now return structured `{"error": "..."}` JSON instead of plain text.
  • Memory search deduplication: FTS memory search no longer returns duplicate entries; results are now structured with category/timestamp metadata.
  • Cron scheduler accuracy: `next_run_at` is now persisted after computation; heartbeat no longer floods logs with virtual job IDs; jobs use actual agent IDs.
  • Cost display precision: Floating-point noise eliminated from cost/efficiency metrics (rounded to 6 decimal places with division-by-zero guard).
  • Skills metadata: `risk_level` is now parameterized (not hardcoded "Caution"); skills track `last_loaded_at` timestamp.
  • CLI resilience: `roboticus check` no longer crashes with raw Rust IO errors; shows friendly messages with config path suggestions.
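
The cost display precision fix combines two small steps: guard the division, then round to 6 decimal places. A minimal sketch, assuming a hypothetical `cost_per_token` function; returning `0.0` when there are no tokens is an assumption about the fallback value.

```rust
/// Hypothetical metric: average cost per token, rounded to 6 decimal places.
fn cost_per_token(total_cost: f64, tokens: u64) -> f64 {
    // Division-by-zero guard: avoid producing NaN or infinity in the display.
    if tokens == 0 {
        return 0.0;
    }
    let raw = total_cost / tokens as f64;
    (raw * 1e6).round() / 1e6
}
```

Rounding removes floating-point noise such as `0.30000000000000004`, which becomes `0.3`.
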
FIX40 smoke/UAT bug fixes: Resolved 40 bugs (5 critical, 6 high, 15 medium, 14 low/UX) discovered during comprehensive smoke testing of all 85 REST routes, 32 CLI commands, and 13 dashboard pages.
FIXInput validation hardening: Added field-length limits, HTML sanitization, and null-byte rejection across all API write endpoints.
FIXJSON error responses: All API error paths now return structured `{"error": "..."}` JSON instead of plain text.
FIXMemory search deduplication: FTS memory search no longer returns duplicate entries; results are now structured with category/timestamp metadata.
FIXCron scheduler accuracy: `next_run_at` is now persisted after computation; heartbeat no longer floods logs with virtual job IDs; jobs use actual agent IDs.
FIXCost display precision: Floating-point noise eliminated from cost/efficiency metrics (rounded to 6 decimal places with division-by-zero guard).
FIXSkills metadata: `risk_level` is now parameterized (not hardcoded "Caution"); skills track `last_loaded_at` timestamp.
FIXCLI resilience: `roboticus check` no longer crashes with raw Rust IO errors; shows friendly messages with config path suggestions.
FIXDashboard UX: Fixed 14 display bugs including schedule text duplication, raw-seconds uptime, missing pagination, broken status indicators, and a dependency on external fonts.
FIXFilesystem path exposure: Skills API no longer leaks `source_path`/`script_path` in responses.
FIXSession creation response: `POST /api/sessions` now returns the full session object instead of just the ID.
FIX404 fallback handler: Unknown API routes now return JSON `{"error": "not found"}` instead of an empty 404 response.
CHORECI scripts use POSIX grep: Replaced all `rg` (ripgrep) invocations with standard `grep -E`/`grep -qE` in CI scripts for broader runner compatibility.
CHOREWindows compilation: Added conditional `allow(unused_mut)` for platform-gated mutation in security audit command.

v0.8.0

Security Hardening & More 2026-02-26

Security: 17 changes. Fixed: 22 changes. Added: 16 changes. Changed: 4 changes. Key changes: CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin. Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop. Telegram invalid-token resilience: Telegram `404/401` poll failures are now classified as likely invalid/revoked bot-token errors with explicit repair guidance and adaptive backoff to reduce noisy tight-loop logging. Subagent runtime activation sync: Taskable subagents are now auto-started at boot and kept in sync with create/update/toggle/delete operations, fixing the `enabled > 0, running = 0` stall where configured subagents stayed idle.

Highlights

  • CORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin.
  • Wallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop.
  • WalletFile Debug redaction: `WalletFile` no longer derives `Debug`; a manual impl redacts `private_key_hex` to prevent accidental key leakage in logs or panics.
  • Plaintext wallet detection: Loading an unencrypted wallet file now emits a `SECURITY` warning at `warn!` level instead of silently succeeding.
  • Webhook signature enforcement: WhatsApp webhook verification now rejects requests with an error when `app_secret` is unconfigured, instead of silently skipping verification.
  • OAuth token persistence errors surfaced: `OAuthManager::persist()` now returns `Result<()>` and callers log failures at `error!` level instead of silently swallowing write errors.
  • Skill catalog path traversal prevention: Skill download filenames from remote registries are now validated and canonicalized to prevent `../` path traversal.
  • API key URL encoding: The `query:` auth mode now percent-encodes API keys before appending to URLs, preventing malformed requests and log leakage.
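
The `WalletFile` Debug redaction can be illustrated with a hand-written `fmt::Debug` impl. The struct below is a stand-in: only `private_key_hex` is named in the changelog, and the `address` field and the `<redacted>` placeholder text are assumptions.

```rust
use std::fmt;

/// Stand-in for the real type; `address` is a hypothetical extra field.
struct WalletFile {
    address: String,
    private_key_hex: String,
}

// Manual impl instead of `#[derive(Debug)]`, so the key never reaches logs or panics.
impl fmt::Debug for WalletFile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("WalletFile")
            .field("address", &self.address)
            .field("private_key_hex", &"<redacted>")
            .finish()
    }
}
```

Formatting the value with `{:?}` still shows the struct shape, which keeps log lines useful without exposing key material.
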
FIXCORS hardening: Removed wildcard `Access-Control-Allow-Origin: *` fallback when no API key is configured; CORS now always restricts to the configured bind address origin.
FIXWallet key zeroing: Decrypted API keys in the keystore and child agent wallet secrets are now wrapped in `Zeroizing<String>` so key material is zeroed on drop.
FIXWalletFile Debug redaction: `WalletFile` no longer derives `Debug`; a manual impl redacts `private_key_hex` to prevent accidental key leakage in logs or panics.
FIXPlaintext wallet detection: Loading an unencrypted wallet file now emits a `SECURITY` warning at `warn!` level instead of silently succeeding.
FIXWebhook signature enforcement: WhatsApp webhook verification now rejects requests with an error when `app_secret` is unconfigured, instead of silently skipping verification.
FIXOAuth token persistence errors surfaced: `OAuthManager::persist()` now returns `Result<()>` and callers log failures at `error!` level instead of silently swallowing write errors.
FIXSkill catalog path traversal prevention: Skill download filenames from remote registries are now validated and canonicalized to prevent `../` path traversal.
FIXAPI key URL encoding: The `query:` auth mode now percent-encodes API keys before appending to URLs, preventing malformed requests and log leakage.
FIXScript runner absolute path rejection: `resolve_script_path` now unconditionally rejects absolute paths instead of accepting them.
FIXScript file permission check: Script runner validates that script files are not world-writable on Unix before execution.
FIXSubagent name validation: Subagent names are now limited to 128 characters and may contain only alphanumeric characters, hyphens, and underscores.
FIXPlugin name/version validation: Plugin manifest validation now enforces character restrictions on plugin names and versions matching tool name rules.
FIXAudit log key redaction: Keystore audit log entries now redact key names to first 3 characters instead of logging full key identifiers.
FIXx402 recipient address validation: Payment authorization now validates that recipient addresses match Ethereum address format (0x + 40 hex chars).
FIXJSON merge depth limit: `update_config` recursive merge is now bounded to 10 levels of nesting to prevent stack overflow.
FIXError message sanitization: `sanitize_error_message` now strips content after common sensitive prefixes (file paths, SQLite errors, stack traces).
FIXDecided-by field sanitization: Approval decision `decided_by` field is now limited to 256 characters with control characters stripped.
FIXTelegram invalid-token resilience: Telegram `404/401` poll failures are now classified as likely invalid/revoked bot-token errors with explicit repair guidance and adaptive backoff to reduce noisy tight-loop logging.
FIXSubagent runtime activation sync: Taskable subagents are now auto-started at boot and kept in sync with create/update/toggle/delete operations, fixing the `enabled > 0, running = 0` stall where configured subagents stayed idle.
FIXFTS duplicate row accumulation: `store_semantic` and `store_working` now delete existing FTS entries before re-inserting, preventing unbounded duplicate growth in `memory_fts` on upserts.
FIXSSE stream UTF-8 corruption: `SseChunkStream` now uses proper incremental UTF-8 decoding instead of `from_utf8_lossy`, preserving multi-byte characters split across HTTP chunks.
FIXSSE buffer unbounded growth: SSE chunk stream buffer is now capped at 10 MB to prevent unbounded memory growth from long SSE lines.
FIXHeartbeat interval recovery: Heartbeat daemon interval now recovers to the original configured value when the survival tier returns to Normal, instead of permanently remaining at the degraded rate.
FIXAgentCardRefresh task activation: `HeartbeatTask::AgentCardRefresh` is now included in `default_tasks()` instead of being a dead variant.
FIXHippocampus identifier consistency: Table name validation in `create_agent_table` no longer allows hyphens, matching `validate_identifier` behavior.
FIXNegative hours SQL comment injection: `query_transactions` now clamps `hours` to positive values, preventing negative values from producing SQL comments.
FIXPRAGMA identifier quoting: `has_column` now quotes table names in `PRAGMA table_info` statements.
FIXCron lease identity verification: `release_lease` now requires the `lease_holder` parameter and verifies ownership before releasing.
FIXCoverage gate alignment: Local `justfile` coverage threshold now matches CI at 80% minimum.
FIX`just run-release` binary name: Fixed reference from `roboticus-server` to `roboticus`.
FIXSmoke test default port: `run-smoke.sh` default port corrected from 8787 to 18789.
FIXCORS fallback logging: Invalid CORS origin parse now logs a warning and falls back to `127.0.0.1` loopback instead of silently becoming wildcard `*`.
FIXCrypto function error propagation: `derive_key`, `encrypt_wallet_data` in wallet now return `Result` instead of panicking with `expect`.
FIXCapacityTracker mutex resilience: All `expect("mutex poisoned")` calls replaced with `unwrap_or_else(|e| e.into_inner())` for graceful recovery.
FIXRate limit / approval mutex resilience: Same mutex poison recovery applied to policy engine and approval manager.
FIXCron lease/run error logging: `acquire_lease`, `record_run`, and `release_lease` errors are now logged at `warn` level instead of silently discarded.
FIXInterval expression UTF-8 safety: `parse_interval_expr_to_ms` now uses `char_indices()` for correct byte-offset slicing of multi-byte characters.
FIXTOML serialization error propagation: `generate_operator_toml` and `generate_directives_toml` now return `Result<String>` instead of silently returning empty strings.
FIXFloating-point tier threshold: `SurvivalTier::from_balance` uses 0.999 epsilon for the `hours_below_zero` check to handle floating-point rounding.
FEATv0.8.0 zero-regression release gate: Added canonical `just test-v080-go-live` orchestration and release-blocking CI/release jobs for workspace tests, integration/regression batteries, bounded soak/fuzz checks, CLI+web UAT smoke, and release-doc/provenance consistency checks.
FEATWASM execution timeout enforcement: WASM plugin execution now tracks elapsed time against the configured `execution_timeout_ms` and logs warnings when exceeded.
FEATWASM memory bounds validation: WASM input writes check memory size before writing; output reads validate `ptr + len` against module memory bounds.
FEATBrowser evaluate length limit: `BrowserAction::Evaluate` rejects expressions exceeding 100,000 characters.
FEATEmail body size limit: Email adapter truncates message bodies exceeding 1 MB.
FEATA2A session establishment check: Added `is_established()` method and documentation for session key typestate.
FEATA2A rate window eviction: Rate limit windows now evict stale entries (>1 hour idle) when exceeding 1,000 tracked peers.
FEATInboundMessage platform sanitization: Added `sanitize_platform()` to strip control characters and enforce 64-char limit.
FEATYieldEngine field encapsulation: All fields made private with getter methods.
FEATTreasuryPolicy field encapsulation: All fields made private with constructor and getter methods.
FEATZero-amount deposit/withdraw rejection: `YieldEngine::deposit()` and `withdraw()` now reject amounts <= 0.
FEATPlugin registry unregister: Added `unregister()` method to fully remove plugin entries.
FEATScript shebang validation: Extensionless script files now require a recognized shebang line.
FEATDocker HEALTHCHECK: Dockerfile now includes a health check against `/api/health`.
FEATDocker build reproducibility: Dockerfile now uses `--locked`, MSRV-pinned Rust image, and dependency layer caching.
FEATRelease CI supply-chain hardening: `cross` installation pinned to versioned release instead of git HEAD.
CHOREWhatsApp client initialization: `reqwest::Client` builder now uses `expect()` instead of `unwrap_or_default()` to surface TLS initialization failures.
CHORECDP client initialization: Same `expect()` change applied to browser CDP HTTP client.
CHORESemantic search scan limit: `search_similar` now includes `LIMIT 10000` to bound memory usage pending AnnIndex integration.
CHORESemanticCache thread safety documentation: Documented `&mut self` requirement and external synchronization expectations.
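
The mutex resilience entries in this release replace `expect("mutex poisoned")` with `unwrap_or_else(|e| e.into_inner())`. The following is a minimal, std-only sketch of that pattern and not the project's actual code; the helper names `lock_recovering` and `demo_poison_recovery` are hypothetical.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Lock a mutex and recover the guard even if a previous holder panicked.
/// (Hypothetical helper name, used only for this illustration.)
fn lock_recovering<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Poison a mutex by panicking in another thread while holding the lock,
/// then show that the protected value is still usable afterwards.
fn demo_poison_recovery() -> u64 {
    let counter = Arc::new(Mutex::new(41_u64));
    let worker = Arc::clone(&counter);

    // The panic message goes to stderr via the default panic hook.
    let _ = thread::spawn(move || {
        let _guard = worker.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();

    // The lock is now poisoned; `lock().expect(..)` would panic here.
    assert!(counter.is_poisoned());

    let mut guard = lock_recovering(&counter);
    *guard += 1;
    *guard
}
```

Recovering the guard trades strictness for availability: the protected data may reflect a partially completed update from the panicking thread, so this approach fits counters and caches better than state with multi-field invariants.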

v0.7.1

Hotfixes & Reliability2026-02-25

Fixed: 6 changes. Key changes: Windows daemon startup and binary update reliability fixes, dashboard render boundary hardening, and loopback-proxy migration safeguards with explicit deprecation guidance for v0.8.0 removal.

Highlights

  • Windows daemon startup reliability: Replaced the broken `sc.exe` service launch path with a detached user-process daemon flow.
  • Windows binary update guardrail: `roboticus update binary` now blocks in-process self-update on Windows and prints safe manual upgrade steps.
  • Dashboard JS bleed-through fix: Dashboard rendering is clipped to the canonical HTML document boundary.
  • In-process provider routing metadata: `/api/models/available` reports in-process proxy mode and provider diagnostics for clearer operator visibility.
  • Loopback proxy deprecation guidance: `0.7.x` warns that `127.0.0.1:8788/<provider>` is deprecated and will be removed in `v0.8.0`.
FIXWindows daemon startup reliability: Replaced the broken `sc.exe` service launch path with a detached user-process daemon flow.
FIXWindows binary update guardrail: `roboticus update binary` now blocks in-process self-update on Windows and prints safe manual upgrade steps.
FIXDashboard JS bleed-through fix: Dashboard rendering is clipped to the canonical HTML document boundary.
FIXIn-process provider routing metadata: `/api/models/available` reports in-process proxy mode and provider diagnostics for clearer operator visibility.
DOCSLoopback proxy deprecation guidance: `0.7.x` warns that `127.0.0.1:8788/<provider>` is deprecated and will be removed in `v0.8.0`.

v0.7.0

New Features2026-02-25

Added: 4 changes. Changed: 3 changes. Key changes: Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents. Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details. Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology. Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.

Highlights

  • Subagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents.
  • Model-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details.
  • Streaming turn traceability: `POST /api/agent/message/stream` now emits stable `turn_id` values from stream start through completion and records per-turn model-selection audits for streamed responses.
  • Subagent ubiquitous-language architecture doc: Added `docs/architecture/subagent-ubiquitous-language.md` with canonical terminology, gap audit, and dataflow diagrams.
  • Roster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology.
  • Subagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.
  • Context forensics UX: Context Explorer now supports live stream-turn handoff and direct forensic drill-down using active `turn_id` metadata.
FEATSubagent contract enforcement: Added explicit `subagent` vs `model-proxy` role validation, fixed-skills persistence/validation, and strict rejection of personality payloads for taskable subagents.
FEATModel-selection forensics pipeline: Added persistent `model_selection_events` storage, turn-linked forensics APIs (`GET /api/turns/{id}/model-selection`, `GET /api/models/selections`), and live dashboard views for candidate evaluation details.
FEATStreaming turn traceability: `POST /api/agent/message/stream` now emits stable `turn_id` values from stream start through completion and records per-turn model-selection audits for streamed responses.
FEATSubagent ubiquitous-language architecture doc: Added `docs/architecture/subagent-ubiquitous-language.md` with canonical terminology, gap audit, and dataflow diagrams.
CHORERoster and status semantics: `/api/roster`, `/api/agent/status`, and dashboard agent views now distinguish taskable subagents from model proxies and report taskable counts with clearer operator-facing terminology.
CHORESubagent model assignment options: Added support for `auto` (router-controlled) and `commander` (primary-agent-assigned) model modes for taskable subagents, including runtime model resolution behavior.
CHOREContext forensics UX: Context Explorer now supports live stream-turn handoff and direct forensic drill-down using active `turn_id` metadata.
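
As a rough illustration of the subagent contract described above, the sketch below accepts only the `subagent` and `model-proxy` roles and rejects a personality payload on a taskable subagent. The types `AgentSpec`, `Role`, and `ContractError` are assumptions made for this example; the changelog does not show the real data model.

```rust
/// Roles named in the changelog; the enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Subagent,
    ModelProxy,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum ContractError {
    UnknownRole(String),
    PersonalityNotAllowed,
}

/// Hypothetical request shape: a role string plus an optional personality payload.
struct AgentSpec<'a> {
    role: &'a str,
    personality: Option<&'a str>,
}

/// Validate the role and enforce "no personality on taskable subagents".
fn validate(spec: &AgentSpec) -> Result<Role, ContractError> {
    let role = match spec.role {
        "subagent" => Role::Subagent,
        "model-proxy" => Role::ModelProxy,
        other => return Err(ContractError::UnknownRole(other.to_string())),
    };
    if role == Role::Subagent && spec.personality.is_some() {
        return Err(ContractError::PersonalityNotAllowed);
    }
    Ok(role)
}
```

Parsing the role into an enum before any other check keeps the rest of the validation free of string comparisons.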

v0.6.1

Bug Fixes & Stability2026-02-24

Fixed: 3 changes. Key changes: Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization. Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.

Highlights

  • Release integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization.
  • Session creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.
  • Routing test alignment: Updated router integration expectations to reflect current fallback behavior when primary providers are breaker-blocked.
FIXRelease integrity follow-up: Merged post-tag regression fixes from the 0.6.0 release branch into `develop`, including web peer-scope identity validation, dashboard WebSocket token encoding, and release-gate compile/test stabilization.
FIXSession creation stability: Restored explicit default agent scope behavior in DB session creation paths to avoid `500` failures in session lifecycle APIs/tests.
FIXRouting test alignment: Updated router integration expectations to reflect current fallback behavior when primary providers are breaker-blocked.

v0.6.0

New Features2026-02-24

Added: 4 changes. Changed: 5 changes. Key changes: Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility. Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips. Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior. Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.

Highlights

  • Capacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility.
  • Capacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips.
  • Session scope backfill migration: Added `012_session_scope_backfill_unique.sql` to normalize legacy sessions to explicit scope and enforce unique active scoped sessions.
  • Safe markdown rendering in dashboard sessions: Session chat and Context Explorer now render markdown with strict URL sanitization and no raw HTML execution.
  • Routing quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior.
  • Inference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.
  • Session scoping defaults to explicit agent scope: `find_or_create()` now uses `agent` scope by default and channel/web paths pass scoped keys for peer/group isolation.
  • Channel session affinity: Channel dedup and session selection now use resolved chat/channel identity instead of platform-only sender affinity.
FEATCapacity headroom telemetry: New `GET /api/stats/capacity` endpoint exposes per-provider headroom, utilization, and sustained-pressure flags for operator visibility.
FEATCapacity-aware circuit preemption: Circuit breakers now accept soft capacity pressure signals and expose preemptive `half_open` state before hard failure trips.
FEATSession scope backfill migration: Added `012_session_scope_backfill_unique.sql` to normalize legacy sessions to explicit scope and enforce unique active scoped sessions.
FEATSafe markdown rendering in dashboard sessions: Session chat and Context Explorer now render markdown with strict URL sanitization and no raw HTML execution.
CHORERouting quality now capacity-weighted: `select_for_complexity()` scores candidates by model quality and provider headroom, rather than binary near-capacity fallback behavior.
CHOREInference feedback loop now records capacity usage: both non-stream and stream response paths record provider token/request usage and update capacity pressure signals.
CHORESession scoping defaults to explicit agent scope: `find_or_create()` now uses `agent` scope by default and channel/web paths pass scoped keys for peer/group isolation.
CHOREChannel session affinity: Channel dedup and session selection now use resolved chat/channel identity instead of platform-only sender affinity.
CHOREHeartbeat now runs SessionGovernor: stale sessions are expired with compaction draft capture; optional hourly rotation is triggered when `session.reset_schedule` is configured.
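
To make "capacity-weighted" routing concrete, here is a small sketch that scores each candidate by model quality multiplied by provider headroom and picks the highest score. The multiplicative formula, the `Candidate` struct, and the function names are assumptions for illustration; the changelog does not state how `select_for_complexity()` actually combines the two signals.

```rust
/// Hypothetical candidate record: quality and headroom are both in [0, 1].
#[derive(Debug, Clone)]
struct Candidate {
    model: &'static str,
    quality: f64,
    headroom: f64,
}

/// Assumed scoring rule: quality scaled by remaining provider headroom.
fn score(c: &Candidate) -> f64 {
    c.quality * c.headroom.clamp(0.0, 1.0)
}

/// Pick the candidate with the highest score, or `None` for an empty slice.
fn select_best(candidates: &[Candidate]) -> Option<&Candidate> {
    candidates
        .iter()
        .max_by(|a, b| score(a).total_cmp(&score(b)))
}
```

With a binary near-capacity cutoff, a high-quality model at 10% headroom would either always win or be excluded outright; a continuous score lets a slightly weaker model with ample headroom win instead.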

v0.5.0

New Features (25 changes)2026-02-23

Added: 18 changes. Changed: 7 changes. Key changes: Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support. Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach. All 10 crate READMEs updated to v0.5.0 with expanded descriptions and key types. All 10 `lib.rs` files now have `//!` crate-level doc comments.

Highlights

  • Addressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support.
  • Response Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach.
  • Flexible Network Binding: Interface-based binding (`bind_interface`), optional TLS via `axum-server` with rustls, and `advertise_url` for agent card generation.
  • Approval Workflow Loop Integration: Agent pauses on gated tool calls, publishes `pending_approval` events via WebSocket, and resumes after admin approve/deny. Dashboard "Approvals" panel with real-time updates.
  • Browser as Agent Tool: `BrowserTool` adapter wrapping the 12-action `roboticus-browser` crate, registered in `ToolRegistry`. Tool schemas injected into system prompt so the LLM can request browser actions.
  • Context Observatory: Full turn inspector and analytics suite:
  • Turn recording with `context_snapshots` table capturing token allocation, memory tier breakdown, complexity level, and model for every LLM call
  • Turn & Context API: `GET /api/sessions/{id}/turns`, `GET /api/turns/{id}`, `GET /api/turns/{id}/context`, `GET /api/turns/{id}/tools`
FEATAddressability Filter: Composable filter chain for group chat addressability detection. Agent only responds when mentioned by name, replied to, or in a DM. Configurable via `[addressability]` config section with alias names support.
FEATResponse Transform Pipeline: Three-stage pipeline applied to all LLM responses -- `ReasoningExtractor` (captures `<think>` blocks), `FormatNormalizer` (whitespace/fence cleanup), `ContentGuard` (injection defense). Replaces the previous inline `scan_output` approach.
FEATFlexible Network Binding: Interface-based binding (`bind_interface`), optional TLS via `axum-server` with rustls, and `advertise_url` for agent card generation.
FEATApproval Workflow Loop Integration: Agent pauses on gated tool calls, publishes `pending_approval` events via WebSocket, and resumes after admin approve/deny. Dashboard "Approvals" panel with real-time updates.
FEATBrowser as Agent Tool: `BrowserTool` adapter wrapping the 12-action `roboticus-browser` crate, registered in `ToolRegistry`. Tool schemas injected into system prompt so the LLM can request browser actions.
FEATContext Observatory: Full turn inspector and analytics suite:
FEATTurn recording with `context_snapshots` table capturing token allocation, memory tier breakdown, complexity level, and model for every LLM call
FEATTurn & Context API: `GET /api/sessions/{id}/turns`, `GET /api/turns/{id}`, `GET /api/turns/{id}/context`, `GET /api/turns/{id}/tools`
FEATDashboard per-message context expansion (token allocation bar, memory breakdown, reasoning trace, tool calls)
FEATContext Explorer tab with session selector, turn timeline, and aggregate charts
FEATHeuristic context analyzer with 12 per-turn rules and 10 session-aggregate rules across Budget, Memory, Prompt, Tools, Cost, and Quality categories
FEATLLM-powered deep analysis stub for on-demand qualitative context evaluation
FEATPrompt efficiency metrics per model: output density, budget utilization, memory ROI, cache hit rate, context pressure, cost attribution
FEATEfficiency dashboard with model comparison cards, time series charts, period selector, and auto-generated cost optimization tips
FEATOutcome grading: 1-5 star ratings on assistant responses via `turn_feedback` table, with quality-adjusted metrics (cost per quality point, quality by complexity, memory impact analysis)
FEATBehavioral recommendations engine: ~14 heuristic rules across 7 categories (query crafting, model selection, session management, memory leverage, cost optimization, tool usage, configuration) with evidence and estimated impact
FEATStreaming LLM Responses: `SseChunkStream` adapter for token-by-token streaming. `POST /api/agent/message/stream` SSE endpoint. WebSocket forwarding via EventBus. Dashboard incremental rendering with typing indicator.
FEATNew reference documents: `docs/CONFIGURATION.md`, `docs/CLI.md`, `docs/API.md`, `docs/DEPLOYMENT.md`, `docs/ENV.md`
CHOREAll 10 crate READMEs updated to v0.5.0 with expanded descriptions and key types
CHOREAll 10 `lib.rs` files now have `//!` crate-level doc comments
CHORE10 new dataflow diagrams added to `roboticus-dataflow.md` (approval, browser, context, transform, streaming, addressability, observatory, plugin SDK, OAuth, channel lifecycle)
CHORE6 new sequence diagrams added to `roboticus-sequences.md` (approval, streaming, turn recording, grading, TLS, CDP)
CHOREAll 6 C4 component diagrams updated with ~40 previously undocumented modules
CHOREDocumentation standards added to CONTRIBUTING.md
CHORE`cargo doc` CI gate added with `-D warnings` to prevent future documentation drift
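
The first stage of the Response Transform Pipeline, `ReasoningExtractor`, is described as capturing `<think>` blocks. A minimal sketch of that idea follows, assuming plain `<think>...</think>` delimiters and treating an unterminated block as reasoning; the function name `extract_reasoning` and that fallback are illustrative choices, not the crate's documented behavior.

```rust
/// Split `<think>...</think>` blocks out of a model response.
/// Returns `(visible_text, captured_reasoning_blocks)`.
fn extract_reasoning(response: &str) -> (String, Vec<String>) {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let mut visible = String::new();
    let mut reasoning = Vec::new();
    let mut rest = response;

    while let Some(start) = rest.find(OPEN) {
        // Text before the opening tag stays visible.
        visible.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                reasoning.push(after_open[..end].trim().to_string());
                rest = &after_open[end + CLOSE.len()..];
            }
            None => {
                // Assumption: an unterminated block is treated as reasoning,
                // so partial thoughts never reach the user.
                reasoning.push(after_open.trim().to_string());
                rest = "";
            }
        }
    }
    visible.push_str(rest);
    (visible.trim().to_string(), reasoning)
}
```

Keeping the captured reasoning separate from the visible text is what allows later stages to normalize and guard only the user-facing output.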

v0.4.3

New Features & More2026-02-23

Added: 6 changes. Fixed: 3 changes. Changed: 2 changes. Key changes: Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control. Runtime model override via `/model set <model>` — temporarily forces a specific model, bypassing routing. Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen) — providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`. Dashboard "Save to keystore" button now sends `Content-Type: application/json` header — previously failed with "Expected request with 'Content-Type: application/json'".

Highlights

  • Slash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control
  • Runtime model override via `/model set <model>` — temporarily forces a specific model, bypassing routing
  • Circuit breaker status and reset via `/breaker` and `/breaker reset [provider]` slash commands
  • Breaker-aware model routing — `select_for_complexity` and `select_cheapest_qualified` now skip providers with tripped circuit breakers
  • Pre-flight API key check in `infer_with_fallback` — cloud providers with no configured key are skipped before sending a doomed request
  • Dashboard settings inputs show a dimmed "none" placeholder instead of literal "null" for empty fields
  • Credit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen) — providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`
  • Dashboard "Save to keystore" button now sends `Content-Type: application/json` header — previously failed with "Expected request with 'Content-Type: application/json'"
FEATSlash commands for agent chat: `/model`, `/models`, `/breaker`, `/retry` for runtime LLM control
FEATRuntime model override via `/model set <model>` — temporarily forces a specific model, bypassing routing
FEATCircuit breaker status and reset via `/breaker` and `/breaker reset [provider]` slash commands
FEATBreaker-aware model routing — `select_for_complexity` and `select_cheapest_qualified` now skip providers with tripped circuit breakers
FEATPre-flight API key check in `infer_with_fallback` — cloud providers with no configured key are skipped before sending a doomed request
FEATDashboard settings inputs show a dimmed "none" placeholder instead of literal "null" for empty fields
FIXCredit/billing errors now permanently trip the circuit breaker (no auto-recovery to HalfOpen) — providers with exhausted credits are never probed again until explicitly reset via `/breaker reset`
FIXDashboard "Save to keystore" button now sends `Content-Type: application/json` header — previously failed with "Expected request with 'Content-Type: application/json'"
FIXSettings form no longer renders `"null"` as a literal value in input fields; empty fields display a styled placeholder and save as `null`
CHOREMerged "Roster" and "Agents" into a single "Agents" page with tabbed Roster/List views
CHORERemoved CLI typing sound effects (`start_typing_sound` / `SoundHandle`) from banner rendering
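
The credit/billing change above can be pictured as a breaker that normally moves from open to half-open after a cooldown, but stays open when the failure was a billing error until someone resets it. The sketch below is a simplified model under stated assumptions: it trips on the first failure, and the names `Breaker`, `FailureKind`, and `on_cooldown_elapsed` are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BreakerState {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical failure classification for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FailureKind {
    Transient,
    Credit,
}

#[derive(Debug)]
struct Breaker {
    state: BreakerState,
    permanent: bool,
}

impl Breaker {
    fn new() -> Self {
        Self { state: BreakerState::Closed, permanent: false }
    }

    /// Simplification: a single failure trips the breaker.
    fn record_failure(&mut self, kind: FailureKind) {
        self.state = BreakerState::Open;
        if kind == FailureKind::Credit {
            self.permanent = true;
        }
    }

    /// Cooldown expiry only probes again when the trip was not permanent.
    fn on_cooldown_elapsed(&mut self) {
        if self.state == BreakerState::Open && !self.permanent {
            self.state = BreakerState::HalfOpen;
        }
    }

    /// Models an explicit operator reset such as `/breaker reset`.
    fn reset(&mut self) {
        *self = Self::new();
    }

    fn allows_requests(&self) -> bool {
        self.state != BreakerState::Open
    }
}
```

Tracking permanence as a separate flag keeps the three familiar breaker states intact while making the "never probe until reset" rule explicit.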

v0.4.2

Bug Fixes & Stability2026-02-23

Fixed: 3 changes. Key changes: `roboticus daemon start` now verifies the service is actually running after `launchctl load` — previously reported "Daemon started" even when the service crashed immediately. `roboticus daemon install` resolves the config path to absolute before embedding in the plist — previously used the relative path which launchd couldn't resolve.

Highlights

  • `roboticus daemon start` now verifies the service is actually running after `launchctl load` — previously reported "Daemon started" even when the service crashed immediately
  • `roboticus daemon install` resolves the config path to absolute before embedding in the plist — previously used the relative path which launchd couldn't resolve
  • Captures launchctl stderr and checks `LastExitStatus` / PID to give actionable error messages on daemon start failure
FIX`roboticus daemon start` now verifies the service is actually running after `launchctl load` — previously reported "Daemon started" even when the service crashed immediately
FIX`roboticus daemon install` resolves the config path to absolute before embedding in the plist — previously used the relative path which launchd couldn't resolve
FIXCaptures launchctl stderr and checks `LastExitStatus` / PID to give actionable error messages on daemon start failure

v0.4.1

Security Hardening & More2026-02-23

Added: 5 changes. Fixed: 7 changes. Security: 4 changes. Key changes: `roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management. Interactive prompt after `roboticus daemon install` asking whether to start immediately. Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs. Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete.

Highlights

  • `roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management
  • Interactive prompt after `roboticus daemon install` asking whether to start immediately
  • `--start` flag on `roboticus daemon install` for non-interactive use
  • Dashboard keystore management: save/remove provider API keys from the settings page
  • Session nicknames in dashboard sessions table with click-to-copy session ID
  • Replaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs
  • Added `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete
  • `roboticus daemon install` now actually offers to load the service (previously only wrote the plist/unit file)
FEAT`roboticus daemon start|stop|restart` subcommands for full daemon lifecycle management
FEATInteractive prompt after `roboticus daemon install` asking whether to start immediately
FEAT`--start` flag on `roboticus daemon install` for non-interactive use
FEATDashboard keystore management: save/remove provider API keys from the settings page
FEATSession nicknames in dashboard sessions table with click-to-copy session ID
FIXReplaced stale `[providers.local]` (localhost:8080) with `[providers.moonshot]` in bundled and registry provider configs
FIXAdded `moonshot/kimi-k2.5` to dashboard known-models list for settings autocomplete
FIX`roboticus daemon install` now actually offers to load the service (previously only wrote the plist/unit file)
FIX`roboticus daemon uninstall` now stops the running service before removing the file
FIX`roboticus daemon status` distinguishes between "not installed" and "installed but not running"
FIXRegistry URL restored to correct `roboticus.ai/registry` path (not subdomain)
FIXEmpty env vars no longer falsely reported as "configured" in key status checks
FIX`delete_provider_key` endpoint now validates provider exists before allowing keystore deletion
FIXUnified key resolution via `KeySource` enum eliminates 3 duplicated cascade implementations
FIX`resolve_provider_key` returns `Option<String>` instead of silently sending empty auth headers
FIXReplaced secret-looking test placeholders to prevent false GitGuardian alerts
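
Two of the fixes above share one idea: an empty value is not a configured key, and key resolution should return `Option<String>` so callers cannot send an empty auth header by accident. The sketch below shows that idea using in-memory maps in place of the real keystore and process environment. The function `resolve_key`, the keystore-before-environment order, and the variable name `MOONSHOT_API_KEY` are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Treat missing, empty, and whitespace-only values as "not configured".
fn non_empty(value: Option<&String>) -> Option<String> {
    value
        .map(|s| s.trim())
        .filter(|s| !s.is_empty())
        .map(str::to_string)
}

/// Resolve a provider key from a keystore first, then from the environment.
/// Both sources are passed in as maps so the sketch stays self-contained.
fn resolve_key(
    keystore: &HashMap<String, String>,
    env: &HashMap<String, String>,
    provider: &str,
    env_var: &str,
) -> Option<String> {
    non_empty(keystore.get(provider)).or_else(|| non_empty(env.get(env_var)))
}
```

Returning `None` forces the caller to decide what to do when no key exists, for example skipping the provider instead of issuing a request that is certain to fail.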

v0.4.0

New Features & More2026-02-23

Added: 10 changes. Changed: 5 changes. Fixed: 2 changes. Key changes: Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`). Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal). `thinking_threshold_seconds` moved from per-channel (`TelegramConfig`) to `ChannelsConfig` level. Channel message processing is now platform-agnostic via `send_typing_indicator` / `send_thinking_indicator` helpers.

Highlights

  • Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`)
  • Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal)
  • Configurable `thinking_threshold_seconds` on `[channels]` — estimated latency gate for thinking indicator (default: 30s)
  • `send_typing` and `send_ephemeral` on WhatsApp and Discord adapters
  • Latency estimator based on model tier, input length, and circuit-breaker state
  • LLM fallback chain: `infer_with_fallback` helper retries across configured providers on transient errors
  • Permanent error detection in delivery queue — 403/401/400 and "bot blocked" errors dead-letter immediately
  • Config auto-discovery: `roboticus start` checks `~/.roboticus/roboticus.toml` when no `--config` flag is given
FEAT Signal channel adapter backed by signal-cli JSON-RPC daemon (`roboticus-channels::signal`)
FEAT Unified thinking indicator (🤖🧠…) for all chat channels (Telegram, WhatsApp, Discord, Signal)
FEAT Configurable `thinking_threshold_seconds` on `[channels]`: the estimated-latency gate for the thinking indicator (default: 30s)
FEAT `send_typing` and `send_ephemeral` on WhatsApp and Discord adapters
FEAT Latency estimator based on model tier, input length, and circuit-breaker state
FEAT LLM fallback chain: `infer_with_fallback` helper retries across configured providers on transient errors
FEAT Permanent error detection in delivery queue: 403/401/400 and "bot blocked" errors dead-letter immediately
FEAT Config auto-discovery: `roboticus start` checks `~/.roboticus/roboticus.toml` when no `--config` flag is given
FEAT Obsidian vault integration module with read, search, and write tools
FEAT GitHub Actions release workflow for cross-platform binaries and crates.io publishing
CHORE `thinking_threshold_seconds` moved from per-channel (`TelegramConfig`) to `ChannelsConfig` level
CHORE Channel message processing is now platform-agnostic via `send_typing_indicator` / `send_thinking_indicator` helpers
CHORE Delivery queue `mark_failed` checks for permanent errors before scheduling retries
CHORE Channel router `send_to` and `drain_retry_queue` skip retry enqueue for permanent errors
CHORE Circuit breaker test updated to reflect fallback-first behavior
FIX LLM inference no longer returns a static error when the primary provider is down; it falls through to configured fallbacks
FIX Telegram bot no longer retries messages to chats it was removed from (permanent error dead-lettering)
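
The permanent-error dead-lettering in this release can be pictured with a minimal sketch. The names below (`DeliveryOutcome`, `is_permanent_error`, `classify_failure`) are hypothetical and are not taken from the Roboticus source. Only the 400/401/403 status codes and the "bot blocked" match come from the entries above; the substring check and the `max_attempts` cap are assumptions made for illustration.

```rust
/// What the queue should do with a failed delivery (hypothetical type).
#[derive(Debug, PartialEq)]
enum DeliveryOutcome {
    /// Retrying cannot help; move the message to the dead-letter store.
    DeadLetter,
    /// The failure looks transient; schedule the given attempt number.
    Retry { attempt: u32 },
}

/// Returns true for failures that no retry can fix.
/// The status codes and the "bot blocked" phrase come from the changelog;
/// matching by case-insensitive substring is an assumption.
fn is_permanent_error(status: Option<u16>, message: &str) -> bool {
    let permanent_status = matches!(status, Some(400 | 401 | 403));
    let bot_blocked = message.to_lowercase().contains("bot blocked");
    permanent_status || bot_blocked
}

/// Decides between dead-lettering and retrying.
/// Permanent errors are checked before any retry is scheduled.
fn classify_failure(
    status: Option<u16>,
    message: &str,
    attempt: u32,
    max_attempts: u32, // assumed retry cap, not from the changelog
) -> DeliveryOutcome {
    if is_permanent_error(status, message) || attempt >= max_attempts {
        DeliveryOutcome::DeadLetter
    } else {
        DeliveryOutcome::Retry { attempt: attempt + 1 }
    }
}
```

Under these assumptions, `classify_failure(Some(403), "Forbidden", 0, 5)` dead-letters on the first failure, while `classify_failure(Some(503), "Service Unavailable", 0, 5)` schedules attempt 1.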

v0.3.0

Security Hardening & More · 2026-02-23

Security: 8 changes. Fixed: 11 changes. Changed: 10 changes. Added: 1 change. Key changes: Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown. Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags. Telegram adapter now processes all updates in a batch, not just the first. Cron worker dispatches jobs instead of unconditionally marking success.

Highlights

  • Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown
  • Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags
  • Session role validation: reject messages with roles outside `{user, assistant, system, tool}`
  • Channel message authority: trusted sender IDs config for elevated `ChannelAuthority`
  • WhatsApp webhook signature verification via HMAC-SHA256
  • Docker: run as non-root `roboticus` user
  • Wallet: encrypt private keys with machine-derived passphrase; never store plaintext
  • API key `#[serde(skip_serializing)]` prevents accidental serialization leakage
FIX Plugin sandbox: validate tool names against allowlist; reject path-traversal payloads; add `shutdown_all` for graceful teardown
FIX Browser restrictions: block `file://`, `javascript:`, `data:` URI schemes in CDP navigation; harden Chrome launch flags
FIX Session role validation: reject messages with roles outside `{user, assistant, system, tool}`
FIX Channel message authority: trusted sender IDs config for elevated `ChannelAuthority`
FIX WhatsApp webhook signature verification via HMAC-SHA256
FIX Docker: run as non-root `roboticus` user
FIX Wallet: encrypt private keys with machine-derived passphrase; never store plaintext
FIX API key `#[serde(skip_serializing)]` prevents accidental serialization leakage
FIX Telegram adapter now processes all updates in a batch, not just the first
FIX Cron worker dispatches jobs instead of unconditionally marking success
FIX Cron expressions use the `cron` crate for full syntax support (ranges, lists, steps)
FIX Per-IP rate-limit HashMap evicted on window reset, preventing unbounded growth
FIX Interview sessions capped at 100 with 1-hour TTL; expired sessions evicted
FIX `Cargo.lock` committed; CI builds use `--locked` for reproducible builds
FIX Graceful shutdown handler (SIGINT + SIGTERM) via `with_graceful_shutdown()`
FIX Duplicate migration version numbers renumbered to unique sequential IDs
FIX Migrations wrapped in transactions for atomicity
FIX SQL `LIKE` patterns escape user-supplied wildcards
FIX Memory query endpoints clamp limit to 1000
CHORE Deduplicated `Optional<T>` trait across 5 DB modules; use `rusqlite::OptionalExtension`
CHORE `SessionStatus` and `MessageRole` enums added for future type-safe migration
CHORE Regex allocation in `decode_common_encodings` hoisted to static `LazyLock`
CHORE Silent `.ok()` calls in `ingest_turn()` replaced with `tracing::warn!` logging
CHORE Reusable `reqwest::Client` stored in `Wallet` for connection pooling
CHORE A2A sessions made private with TTL eviction and 256-session cap
CHORE Plugin registry releases lock before tool execution (`Arc<Mutex<Box<dyn Plugin>>>`)
CHORE `CdpSession::set_timeout` now functional (was a documented no-op)
CHORE Daemon logs written to `~/.roboticus/logs/` instead of world-readable `/tmp/`
CHORE Deduplicated `collect_string_values` across policy rules
FEAT Pre-commit hook for fast format checks (`hooks/pre-commit`)
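
As an illustration of the `LIKE` escaping fix listed above, here is a minimal sketch. The helper names `escape_like` and `contains_pattern` and the choice of backslash as the escape character are assumptions, not the project's actual code. SQLite treats a character as an escape only when the statement includes a matching `ESCAPE` clause.

```rust
/// Escapes LIKE wildcards in user input so `%` and `_` match literally.
/// Hypothetical helper: the escape character (`\`) is an assumption and
/// must match the `ESCAPE '\'` clause used in the SQL statement.
fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        // The escape character itself is escaped along with both wildcards.
        if matches!(ch, '\\' | '%' | '_') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

/// Builds a "contains" pattern from untrusted text.
/// Only the surrounding `%` wildcards remain active after escaping.
fn contains_pattern(input: &str) -> String {
    format!("%{}%", escape_like(input))
}
```

With this sketch, `contains_pattern("100%_done")` yields `%100\%\_done%`, which would be bound as a parameter to a query of the form `... WHERE col LIKE ?1 ESCAPE '\'` (the column name here is a placeholder).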

v0.2.0

Alpha Release · 2026-02-23

Full roadmap implementation — 35 items across 7 phases. ReAct agent loop, RAG retrieval pipeline, embedding provider integration, ANN index, persistent semantic cache, sub-agent framework, and comprehensive bug fixes from code review.

Highlights

  • ReAct agent loop with idle/loop detection
  • 5-tier hybrid RAG retrieval (FTS5 + vector cosine)
  • Embedding provider integration (OpenAI, Ollama, Google)
  • HNSW approximate nearest neighbor index
  • Persistent semantic cache (SQLite-backed, auto-eviction)
  • Sub-agent framework with isolated tool registries
  • 22 code review issues resolved (6 critical, 12 high, 4 medium)
  • RwLock deadlock fix in circuit breaker path
  • UTF-8 safety, atomic OAuth persistence, poison recovery
FEAT Implement full Roboticus roadmap (35 items across 7 phases)
FEAT Application layer: ReAct agent, RAG retrieval, sub-agents, full server wiring
FEAT Foundation layer: embeddings, keystore, ANN index, cache persistence
FIX Resolve RwLock deadlock in circuit breaker path
FIX Resolve all 22 code review issues (6 CRITICAL, 12 HIGH, 4 MEDIUM)
FIX Replace all placeholder code with real implementations
FIX Auto-restart on port conflict during serve
FIX Update bootstrap sequence to 13 steps with cache-load step
FIX Google batch endpoint, parse error propagation, query auth in embedding
FIX Blocking read in async, UTF-8 safe chunking, dedup release on send failure
FIX Channel L4 filter, survival tier, dedup leaks, interview deadlock
FIX UTF-8 safety, atomic OAuth persist, poison recovery, embedding errors
FIX Wire BOOT_6B node, remove OpenClaw refs, reorder roadmap sections
FIX Lint errors from merge and update coverage baseline
DOCS Update architecture diagrams, roadmap, and crate READMEs
CHORE Bump version to 0.2.0 for Roboticus alpha release
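
The "UTF-8 safe chunking" fix above points at a common Rust pitfall: slicing a `&str` at an arbitrary byte offset panics when the offset falls inside a multi-byte character. The sketch below shows one way to avoid that with `str::is_char_boundary`. The function name `chunk_utf8` and its byte-budget interface are assumptions for illustration, not the project's implementation.

```rust
/// Splits `text` into chunks of at most `max_bytes` bytes without ever
/// cutting through a multi-byte UTF-8 character (hypothetical helper).
fn chunk_utf8(text: &str, max_bytes: usize) -> Vec<&str> {
    // A UTF-8 scalar value occupies at most 4 bytes, so a budget of 4 or
    // more guarantees every iteration consumes at least one character.
    assert!(max_bytes >= 4, "max_bytes must be at least 4");
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        let mut end = (start + max_bytes).min(text.len());
        // Back up to the nearest character boundary at or before `end`.
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        chunks.push(&text[start..end]);
        start = end;
    }
    chunks
}
```

For example, `chunk_utf8("aaa€", 4)` returns `["aaa", "€"]`: a naive split at byte 4 would land inside the three-byte `€`, so the boundary is moved back to byte 3.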

v0.1.0

New Features & More · 2026-02-22

Added: 5 changes. Changed: 1 change. Fixed: 1 change. Key changes: Initial Roboticus project baseline. Multi-crate Rust workspace foundation (runtime crates + integration test crate). Prepared packaging/publish metadata for early release workflows. Early release stabilization fixes for binary packaging, startup wiring, and quality gates.

Highlights

  • Initial Roboticus project baseline.
  • Multi-crate Rust workspace foundation (runtime crates + integration test crate).
  • Core SQLite persistence layer with schema/migrations and operational defaults.
  • Early HTTP API, CLI surface, and embedded dashboard scaffolding.
  • Initial architecture and reference documentation set.
  • Prepared packaging/publish metadata for early release workflows.
  • Early release stabilization fixes for binary packaging, startup wiring, and quality gates.
FEAT Initial Roboticus project baseline.
FEAT Multi-crate Rust workspace foundation (runtime crates + integration test crate).
FEAT Core SQLite persistence layer with schema/migrations and operational defaults.
FEAT Early HTTP API, CLI surface, and embedded dashboard scaffolding.
FEAT Initial architecture and reference documentation set.
CHORE Prepared packaging/publish metadata for early release workflows.
FIX Early release stabilization fixes for binary packaging, startup wiring, and quality gates.