Skip to content

Instantly share code, notes, and snippets.

@lhl
Created March 30, 2026 04:02
Show Gist options
  • Select an option

  • Save lhl/2c7403c298b5edd45095a7c5a7ed94f5 to your computer and use it in GitHub Desktop.

Select an option

Save lhl/2c7403c298b5edd45095a7c5a7ed94f5 to your computer and use it in GitHub Desktop.

ANALYSIS: IronClaw — NEAR AI's Rust Rewrite of OpenClaw

Date: 2026-03-29

Reference snapshot: reference/ironclaw/ (branch: staging, tip ≈ 2026-03-29)

Prior art: See ANALYSIS-openclaw-followon-2026-03-27.md and docs/_dev/v0.3/ANALYSIS-v0.1-vs-openclaw.md for the TypeScript OpenClaw baseline.

Security framework: Comparison grounded in ANALYSIS-shisad-security-design-analysis.md (shisad security gap analysis) and ~/agentic-security/ANALYSIS.md (78-paper survey with 5-layer defense stack taxonomy).


Executive Summary

IronClaw is a ground-up Rust reimplementation of the OpenClaw personal AI assistant, developed by NEAR AI under the lead of Illia Polosukhin (NEAR Protocol co-founder, co-author of Attention Is All You Need). First commit landed February 2, 2026; it was announced publicly at NEARCON 2026 on February 24. It has accumulated 837 commits across 20+ tagged releases in under 8 weeks — an impressive development velocity. The project is now at v0.22.0 with ~279K lines of Rust.

IronClaw is not a port. It is a strategic rewrite that uses OpenClaw's feature surface as a parity target while making fundamentally different architectural choices — most notably WASM-based tool/channel sandboxing, dual-backend persistence (PostgreSQL + libSQL), an extracted safety crate, and a credential isolation model where secrets never enter untrusted code. It ships with 11 WASM tools, 5 WASM channels, prompt-based skills with trust gating, Docker sandbox with proxy-mediated egress, and a comprehensive network security posture documented in a 550-line internal audit (src/NETWORK_SECURITY.md).

Why this matters for shisad: IronClaw is the first full-stack agent framework we've seen that treats security as a first-class engineering concern — shipped credential isolation, WASM sandboxing, fuzz-tested safety modules, and a thorough network surface audit. However, mapped against the agentic-security survey's 5-layer defense stack (§8), IronClaw invests heavily in Layers 2 and 5 (access control and detection/filtering) while having no Layer 1 defense (no instruction/data separation, no taint tracking, no privilege separation in the reasoning loop). Per "The Attacker Moves Second" (joint OpenAI/Anthropic/DeepMind, Oct 2025), detection-only defenses fail at >90% ASR under adaptive attack. IronClaw's safety crate is strong engineering but architecturally insufficient. shisad's designed primitives (COMMAND/TASK, PEP, taint tracking) are the right answer — but they need to be shipped. IronClaw's "working code beats design docs" advantage is real and growing.


1. Project Identity & Provenance

Field Value
Name IronClaw
Tagline "Your secure personal AI assistant, always on your side"
Organization NEAR AI (nearai)
Repository github.com/nearai/ironclaw
License MIT OR Apache-2.0
Language Rust (Edition 2024, MSRV 1.92)
Current version 0.22.0 (2026-03-25)
First commit 2026-02-02
Relationship to OpenClaw Inspired by / tracks feature parity with; not a fork or port

Key Contributors

Contributor Commits Notes
Illia Polosukhin ~491 (multiple aliases) NEAR co-founder, Transformer paper co-author
Henry Park 285 Primary Rust engineer (henry.park@near.ai)
Zaki Manian ~155 (multiple aliases) Cosmos ecosystem figure
Coffee 47 Chinese-speaking contributor
Nick Pismenkov 40
Claude (Anthropic) 24 AI-assisted development
italic-jinxin 31
Nige 30

Total unique contributors: 30+ across the US, Russia, China, Japan, Turkey, and elsewhere. The multilingual READMEs (English, Chinese, Russian, Japanese) reflect this international team.

Notable Context

  • Announced at NEARCON 2026 (2026-02-24, NEAR AI tweet). Positioned as a NEAR AI product, not an OpenClaw derivative.
  • Rust ecosystem — most agent framework development is happening in TypeScript/Python; a ground-up Rust rewrite is unusual and signals long-term investment.
  • NEAR Protocol / crypto association — IronClaw launched alongside NEAR AI's blockchain/AI narrative. This is a significant negative signal for some enterprise audiences; for others (web3-native teams) it's a draw.
  • No CVE history yet — unlike OpenClaw, IronClaw hasn't had public security incidents. At 279K LOC and 131 dependencies, this is more a function of age than quality.

2. Development Velocity

Commit Cadence

Month Commits
February 2026 262
March 2026 (to date) 575
Total 837

This is ~15 commits/day sustained over 8 weeks, accelerating from ~9/day in February to ~19/day in March.

Release Cadence

20 tagged releases (v0.2.0 through v0.21.0) in 56 days — roughly one release every 2.8 days. Recent releases (v0.19.0 through v0.22.0) shipped within a 12-day window, suggesting sprint-and-ship cycles.

Security Commit Density

182 out of 837 commits (21.7%) touch security-related code (matching keywords: security, auth, credential, secret, inject, sandbox, leak, vuln, csrf, xss). This is unusually high and suggests security is treated as a continuous concern, not a periodic audit.


3. Codebase Profile

Scale

Component Files Lines of Code
src/ (core) 342 229,044
tests/ 66 24,203
crates/ (extracted libs) 14 5,244
tools-src/ (WASM tools) 30 11,690
channels-src/ (WASM channels) 5 8,271
Total Rust 457 278,908

Dependencies

  • 131 direct dependencies in root Cargo.toml
  • 8,882 lines in Cargo.lock (substantial dependency tree)
  • Notable deps: wasmtime 28 (WASM sandbox), bollard (Docker API), aes-gcm/hkdf/blake3/subtle (crypto), axum (HTTP), rig-core (multi-LLM)
  • Platform-specific: security-framework (macOS Keychain), secret-service/zbus (Linux GNOME Keyring/KWallet), pty-process (Unix PTY)

Supply Chain Controls

cargo-deny configuration (deny.toml):

  • Advisory tracking: 7 known advisories tracked with explicit justification for each ignore
  • License allowlist: Permissive licenses only (MIT, Apache-2.0, BSD, ISC, etc.)
  • Source restrictions: Unknown registries denied, unknown git denied, only crates.io allowed
  • Bans: Wildcard version specs denied, multiple versions warned

4. Architecture Overview

Core Design

┌─────────────────────────────────────────────────┐
│                  Channels                       │
│  CLI/TUI │ Web Gateway │ WASM (Telegram, etc.)  │
│  HTTP Webhook │ Signal │ REPL                   │
└─────────────────┬───────────────────────────────┘
                  │ IncomingMessage
          ┌───────▼───────┐
          │  Agent Loop   │ ← Sessions, Scheduler, Context
          │  (Dispatcher) │
          └───────┬───────┘
                  │ Tool calls
    ┌─────────────┼─────────────────┐
    │             │                 │
┌───▼────┐  ┌─────▼──────┐  ┌───────▼──────┐
│Built-in│  │WASM Sandbox│  │Docker Sandbox│
│ Tools  │  │(wasmtime)  │  │(bollard)     │
└────────┘  └────────────┘  └──────────────┘
                   │                 │
            ┌──────▼──────┐   ┌──────▼───────┐
            │ Credential  │   │  HTTP Proxy  │
            │  Injector   │   │  (allowlist) │
            └─────────────┘   └──────────────┘

Key Subsystems

Subsystem Source Description
Agent loop src/agent/ Message dispatch, job scheduling, session management, context compaction
Channels src/channels/ Multi-channel input: CLI/TUI (Ratatui), Web (axum+SSE/WS), HTTP webhooks, WASM channels, Signal, REPL
Tools src/tools/ Built-in tools + WASM sandbox (wasmtime) + MCP client + dynamic tool builder
Safety crates/ironclaw_safety/ Extracted crate: prompt injection defense, input validation, secret leak detection, policy enforcement
Sandbox src/sandbox/ Docker container orchestration with HTTP proxy for egress control
Secrets src/secrets/ AES-256-GCM encryption with OS keychain master key (macOS Keychain, Linux secret-service)
LLM src/llm/ 13+ providers (NEAR AI, Anthropic, OpenAI, Gemini, Ollama, Bedrock, Mistral, Tinfoil, GitHub Copilot, OpenRouter)
Persistence src/db/ Dual-backend: PostgreSQL (with pgvector) + libSQL/Turso
Workspace/Memory src/workspace/ Hybrid search (FTS + vector via RRF), identity files, heartbeat system
Skills src/skills/ SKILL.md prompt extensions with trust model (Trusted vs Installed) and token budgeting
Routines src/routines/ Cron scheduling, event triggers, heartbeat (30-min default)
Hooks src/hooks/ 6 lifecycle points: BeforeInbound, BeforeToolCall, BeforeOutbound, OnSessionStart, OnSessionEnd, TransformResponse
Registry src/registry/ Extension catalog: manifest validation, artifact download, WASM bundle install
Orchestrator src/orchestrator/ Internal API for Docker sandbox containers (per-job auth, LLM proxy, credential grants)

5. Security Architecture — Deep Dive

This is the most interesting aspect for shisad. IronClaw has a layered security model that goes significantly beyond what OpenClaw offers.

5.1 Credential Isolation (Host-Boundary Injection)

The core principle: untrusted code never sees credential values.

Two enforcement points:

  1. WASM tools (src/tools/wasm/credential_injector.rs): The host runtime intercepts HTTP requests from WASM guests and injects credentials (Bearer tokens, API keys, custom headers, query parameters) at the host boundary. The WASM module receives a placeholder reference, not the actual secret. Injection supports multiple methods:

    • Authorization: Bearer <token> header
    • HTTP Basic auth
    • Custom header (e.g., X-API-Key)
    • Query parameter
  2. Docker sandbox (src/sandbox/proxy/http.rs): Containers route all HTTP through a localhost proxy. The proxy injects credentials into plain HTTP requests to allowed hosts. For HTTPS (CONNECT tunnels), credentials cannot be injected (no MITM by design) — containers must use the orchestrator's /worker/{job_id}/credentials endpoint.

Per-job credential grants (src/orchestrator/auth.rs):

  • Credentials are granted as (secret_name, env_var) pairs scoped to a specific job
  • Stored alongside the job's bearer token in an in-memory TokenStore
  • Revoked when the job token is revoked (container cleanup)
  • Decrypted on-demand only when the worker requests them

5.2 WASM Sandbox (wasmtime Component Model)

Source: src/tools/wasm/, wit/

Tools and channels execute as WASM components with explicit capability declarations:

  • Endpoint allowlisting (src/tools/wasm/allowlist.rs): Each tool declares allowed HTTP endpoints in capabilities.json. The validator enforces:

    • Host matching (exact or wildcard)
    • Path prefix matching
    • HTTP method restriction
    • HTTPS required by default
    • Userinfo rejection (user:pass@host blocked)
    • Path traversal normalization and blocking (../, %2e%2e/)
    • Invalid percent-encoding rejection
  • Resource limits (src/tools/wasm/limits.rs): Fuel metering for CPU, memory caps, execution time limits

  • Rate limiting (src/tools/wasm/rate_limiter.rs): Per-tool sliding-window rate limits

  • Secret leak detection (crates/ironclaw_safety/src/leak_detector.rs): Aho-Corasick multi-pattern scanner runs on both outbound requests and inbound responses

5.3 Docker Sandbox

Source: src/sandbox/

Containers run with defense-in-depth hardening:

Control Setting
Capabilities Drop ALL, add only CHOWN
Privilege escalation no-new-privileges:true
Root filesystem Read-only (except FullAccess policy)
User Non-root (UID 1000:1000)
Network Bridge mode (isolated), egress via proxy
Tmpfs /tmp (512 MB), /home/sandbox/.cargo/registry (1 GB)
Auto-remove Enabled
Output limits Configurable max stdout/stderr
Timeout Enforced with forced container removal

Egress proxy (src/sandbox/proxy/):

  • Domain allowlisting (fail-closed: empty allowlist = deny all)
  • Credential injection for HTTP (not HTTPS — by design)
  • Hop-by-hop header stripping
  • CONNECT tunnel timeout (30 min)

5.4 Extracted Safety Crate

crates/ironclaw_safety/ (4,612 LOC) provides:

Module LOC Function
leak_detector.rs 1,336 Secret pattern scanning (Aho-Corasick) on requests/responses
validator.rs 776 Input validation, prompt injection pattern detection
sanitizer.rs 725 Content sanitization and escaping
credential_detect.rs 637 Credential pattern recognition in text
lib.rs 603 SafetyLayer orchestration, policy pipeline
policy.rs 535 Policy rules with severity levels (Block/Warn/Review/Sanitize)

Fuzz testing for the safety crate: 5 dedicated fuzz targets:

  • fuzz_config_env — configuration parsing
  • fuzz_credential_detect — credential pattern matching
  • fuzz_leak_detector — leak scanning
  • fuzz_safety_sanitizer — sanitization
  • fuzz_safety_validator — input validation

Plus 1 additional fuzz target in the root: fuzz_tool_params (tool parameter parsing).

Benchmarks: benches/safety_check.rs and benches/safety_pipeline.rs for hot-path performance validation.

5.5 Network Security Posture

IronClaw maintains a 550-line internal network security audit (src/NETWORK_SECURITY.md) documenting:

5 network listeners:

Listener Default Bind Auth Rate Limit
Web Gateway (:3000) 127.0.0.1 Bearer token (constant-time) 30/60s
HTTP Webhook (:8080) 0.0.0.0 Shared secret (constant-time) 60/min
Orchestrator API (:50051) 127.0.0.1 (macOS) / 0.0.0.0 (Linux) Per-job bearer (constant-time) None
OAuth Callback (:9876) 127.0.0.1 None (ephemeral, 5-min timeout) N/A
Sandbox HTTP Proxy (:0) 127.0.0.1 None (loopback-only) N/A

All token comparisons use subtle::ConstantTimeEq — no timing side-channels.

Built-in HTTP tool SSRF protections:

  • HTTPS-only
  • Localhost blocked
  • Private IP blocked (RFC 1918, loopback, link-local, multicast, cloud metadata 169.254.169.254)
  • DNS rebinding defense (resolved IPs checked)
  • Redirect blocking (3xx returns error)
  • Response size limit (5 MB)
  • Outbound leak scan
  • Requires user approval

Open findings documented:

  • F-2: No TLS at application layer (expected: reverse proxy in production)
  • F-3: Orchestrator binds 0.0.0.0 on Linux (mitigated by per-job tokens)
  • F-6: SSE/WS connection limit (100 max)
  • F-7: No orchestrator rate limiting (mitigated by token scoping + timeout)
  • F-8: No orchestrator graceful shutdown

5.6 Secrets Management

Source: src/secrets/

  • Encryption: AES-256-GCM for stored secrets
  • Key derivation: HKDF
  • Master key storage: OS keychain (macOS Keychain via security-framework, Linux GNOME Keyring/KWallet via secret-service/zbus)
  • Crypto primitives: subtle for constant-time comparisons, blake3 for hashing, ed25519-dalek for signatures

5.7 Cross-Channel Authorization

Recent commits show active work on preventing cross-channel approval thread hijacking (#1485, #1701):

  • Source channel persisted to DB
  • Cross-channel authorization checks
  • Approval thread hijack prevention

5.8 Additional Security Measures

  • Tool error sanitization before LLM injection (#1639)
  • API response error redaction — internal error details stripped (#1711, #1702)
  • PTY injection prevention — replaced script -qfc with pty-process for injection-safe PTY (#1678)
  • Webhook auth enforcement — Feishu webhook authentication required (#1638)
  • LLM API key handling — keys stored in encrypted secrets store, not plaintext (#1625)
  • Sensitive path detection — unified protection across shell and file tools

6. Tools & Extensions Inventory

WASM Tools (tools-src/)

Tool Description Security Surface
github GitHub API integration OAuth credentials, API egress
gmail Gmail API OAuth credentials, email egress
google-calendar Google Calendar API OAuth credentials
google-docs Google Docs API OAuth credentials, document access
google-drive Google Drive API OAuth credentials, file access
google-sheets Google Sheets API OAuth credentials
google-slides Google Slides API OAuth credentials
slack Slack API OAuth/bot token, messaging egress
telegram Telegram Bot API Bot token, messaging egress
web-search Web search Search API key, egress
llm-context LLM context management LLM API credentials

WASM Channels (channels-src/)

Channel Description
discord Discord gateway (WebSocket)
feishu Feishu/Lark messaging
slack Slack events
telegram Telegram Bot API long-polling
whatsapp WhatsApp Business API

Built-in Tools (src/tools/builtin/)

echo, time, json, http, web_fetch, file, shell, memory, message, job, routine, extension_tools, skill_tools, secrets_tools

Skills (skills/)

Skill Description
delegation Task delegation patterns
ironclaw-workflow-orchestrator Multi-agent workflow
local-test Local testing skill
review-checklist Code review automation
routine-advisor Routine setup guidance
web-ui-test Web UI testing

7. Security Comparison: IronClaw vs shisad vs Academic SOTA

Cross-references: ANALYSIS-shisad-security-design-analysis.md (shisad security gap analysis mapped to 78-paper agentic-security taxonomy), ~/agentic-security/ANALYSIS.md (the taxonomy itself), ~/agentic-security/analysis/RESEARCH-secure-agentic-frameworks-landscape.md (framework implementations).

7.1 Defense Stack Mapping

The agentic-security survey (ANALYSIS.md §6) defines a 5-layer recommended defense stack. This table maps IronClaw, shisad, and the leading academic systems against each layer.

Layer Description IronClaw shisad CaMeL Progent
L1: Architecture Separate trusted planning from untrusted data (non-negotiable) Partial: single agent loop processes both trusted and untrusted content in the same context; no COMMAND/TASK split; no typed data separation Strong (design): COMMAND/TASK privilege separation; stateless context forking; artifact-based handoffs; ArtifactLedger (designed, not yet shipped) Strong: P-LLM/Q-LLM dual split; capability-tagged variables; Q-LLM never gets tool access N/A (not an architecture)
L2: Access Control Authorization in control plane, not prompts (non-negotiable) Strong (shipped): WASM capability declarations; Docker per-job bearer tokens; proxy-mediated domain allowlists; tool-level rate limiting Strong (design): 8-layer PEP pipeline; policy monotonicity; provenance-aware egress; per-call enforcement Partial: security policies as Python functions evaluated before each tool call Strong: JSON Schema policy DSL; deterministic per-call enforcement; 0% ASR
L3: Model Hardening Instruction-data separation at model level Minimal: no evidence of instruction hierarchy or model-level hardening Moderate: spotlighting + three-tier context placement; treats model as untrusted (correct framing) Implicit: Q-LLM isolation means the compromised model can't reach tools N/A
L4: Runtime Monitoring Verify-before-commit, drift detection Minimal: hooks system (6 lifecycle points) provides extension points but no shipped verification logic; no plan commitment or trace analysis Strong (design): plan commitment; differential execution (3-tier); trace verification; graduated response ladder; consensus voting Moderate: capability token checking per execution step Moderate: policy violation logging
L5: Detection/Filtering Sanitize inputs, detect injection Moderate (shipped): extracted safety crate with pattern-based injection detection, leak scanning (Aho-Corasick), content sanitization, 5 fuzz targets Strong (design): double-pass content firewall (ingress + TASK→COMMAND summary barrier); ingress normalization/classification/sanitization N/A (delegates to architecture) N/A (delegates to policies)
Cross-cutting: Memory Trust Split memory into trust zones Minimal: flat workspace memory; identity files (SOUL.md, USER.md) injected into system prompt without trust differentiation Strong (design): trust-tiered memory; gated writes; provenance retention; reversible updates; memory treated as attack surface N/A (stateless) N/A

7.2 Primitive-Level Comparison (IronClaw vs shisad)

This extends the ANALYSIS-shisad-security-design-analysis.md §8 framework to include IronClaw.

7.2.1 Instruction/Data Boundary

shisad: Explicit control plane / data plane separation. Three trust levels (TRUSTED / SEMI_TRUSTED / UNTRUSTED) with immutable taint labels. Content placed in different prompt tiers based on trust. Taint propagates through processing — summaries of untrusted content are SEMI_TRUSTED, not TRUSTED.

IronClaw: No explicit instruction/data separation at the architecture level. The agent loop processes user messages and tool outputs in the same context. Identity files (SOUL.md, AGENTS.md) are injected alongside conversation history without structural boundary enforcement. The safety crate's validator.rs performs pattern-based injection detection, but this is a classifier (Layer 5) not an architectural boundary (Layer 1).

Assessment: This is IronClaw's most significant architectural gap. The agentic-security survey's core finding is that probabilistic defenses (classifiers, pattern matching) fail under adaptive attack (>90% ASR per "The Attacker Moves Second"). IronClaw's safety crate falls into this category. shisad's taint tracking and COMMAND/TASK separation address the problem architecturally.

7.2.2 Privilege Separation

shisad: COMMAND agent (orchestrator, stays clean) dispatches ephemeral TASK agents (workers, process untrusted content). Scoped task envelopes define tool access, egress, and time limits. Structured return boundary controls what crosses from TASK to COMMAND. Analogous to CaMeL's P-LLM/Q-LLM split.

IronClaw: Docker sandbox provides process-level isolation for heavy workloads. WASM sandbox provides lightweight isolation for tools. However, the agent itself is a single loop — no equivalent of COMMAND/TASK separation. The agent that plans is the same agent that processes untrusted tool output. The worker system (src/worker/) is for job execution, not privilege separation.

Assessment: IronClaw has strong execution isolation (WASM/Docker) but no reasoning isolation. A web page processed by a WASM tool returns its output into the same agent context that makes tool-call decisions. shisad's COMMAND/TASK split prevents this: untrusted content never enters the orchestrator's context.

7.2.3 Per-Call Enforcement

shisad: 8-layer PEP pipeline evaluates every tool call against: schema validation, tool allowlist, capability check, taint-flow check, credential scan, egress destination check, risk scoring, rate limiting. Operates on metadata only — prompt injection in content cannot influence PEP decisions.

IronClaw: Tool calls go through the WASM sandbox's capability system (declared in capabilities.json) or Docker's proxy-mediated domain allowlist. The hooks system provides 6 lifecycle points (BeforeInbound, BeforeToolCall, BeforeOutbound, etc.) but these are extension points, not a deterministic enforcement pipeline. No evidence of taint-flow checking, credential scanning, or risk scoring at the tool-call level. The requires_approval() flag on individual tools (e.g., the HTTP tool) is a per-tool boolean, not a per-call analysis.

Assessment: shisad's PEP is fundamentally richer. IronClaw's enforcement is at the sandbox boundary (what can this tool access?), not at the decision level (should this specific call proceed given its provenance and arguments?). This is a critical difference: IronClaw can prevent a tool from reaching unauthorized endpoints but cannot detect when a legitimate tool is being misused by injected instructions.

7.2.4 Credential Isolation

shisad: Credential proxying via opaque credential_ref — LLM never sees secrets. Future plan: proxy-level injection.

IronClaw: Shipped and working. Two implementations:

  • WASM: CredentialInjector intercepts HTTP requests from WASM guests and injects credentials at the host boundary (Bearer, Basic, custom header, query param). WASM code receives placeholder references.
  • Docker: Sandbox HTTP proxy injects credentials into plain HTTP requests to allowed hosts. For HTTPS, containers must use the orchestrator's per-job credential endpoint.
  • Per-job credential grants with automatic revocation on job cleanup.

Assessment: IronClaw is concretely ahead on credential isolation. shisad has the right design (and the proxy-level injection plan maps closely to IronClaw's approach), but IronClaw has working code. This is the single most adoptable pattern from IronClaw.

7.2.5 Egress Control

shisad: Provenance-aware egress in the PEP: "Who asked for this URL?" User-requested URLs auto-approve; untrusted-content-derived URLs require confirmation; unattributed URLs are blocked. Five-level graduated response. This is shisad's most novel contribution (§8.2.1 of ANALYSIS-shisad-security-design-analysis.md) — no other system tracks egress provenance this way.

IronClaw: Domain allowlisting at two boundaries:

  • WASM: Endpoint allowlist per tool (host, path prefix, HTTP method, HTTPS required)
  • Docker: Proxy-mediated domain allowlist (fail-closed: empty = deny all)
  • Built-in HTTP tool: SSRF protections (private IP blocking, DNS rebinding defense, cloud metadata blocking, redirect blocking)
  • Leak detection scans outbound requests for secrets

Assessment: Both have strong egress controls but with fundamentally different models. IronClaw's is a static allowlist ("this tool can reach these hosts"). shisad's is dynamic and provenance-aware ("this request traces to untrusted content, so it needs confirmation even though the destination is allowed"). IronClaw's approach prevents unauthorized destinations but cannot detect when an authorized destination is being used for exfiltration driven by injected instructions. shisad's approach can, in principle, catch that case.

7.2.6 Prompt Injection Defense

shisad: Multi-layer: spotlighting + three-tier context placement (inner), content firewall with double-pass (ingress + summary barrier), taint tracking + PEP enforcement (structural guarantee). Treats detection as helpful outer layer, not security boundary. Adversarial test suite with prompt injection scenarios.

IronClaw: Extracted safety crate (ironclaw_safety, 4,612 LOC) with pattern-based detection (validator.rs), content sanitization (sanitizer.rs), credential detection (credential_detect.rs), leak detection (leak_detector.rs), and policy severity levels (policy.rs). 5 fuzz targets. 2 benchmarks. Integrated into the agent pipeline.

Assessment: IronClaw has more mature implementation (shipped, fuzzed, benchmarked) of what is fundamentally a Layer 5 defense (detection/filtering). shisad has a more mature architecture (Layer 1) that provides structural guarantees even when detection fails. Per "The Attacker Moves Second", detection-based defenses alone are insufficient under adaptive attack. IronClaw's safety crate would need to be complemented with architectural separation to resist adaptive adversaries.

7.2.7 Memory Security

shisad: Memory treated as an attack surface. Trust-tiered storage. Gated writes with provenance retention. Reversible updates. Content from untrusted sources stored with immutable taint labels that persist across sessions.

IronClaw: Flat workspace with hybrid search (FTS + vector). Identity files (SOUL.md, AGENTS.md, USER.md, IDENTITY.md, HEARTBEAT.md) injected into system prompt. Skills have a trust model (Trusted vs Installed) with token budgeting and attenuation. No evidence of memory trust zones, taint-on-write, or provenance tracking for stored content.

Assessment: Memory poisoning is identified as a critical gap in the agentic-security survey (§7). IronClaw's identity files (especially agent-editable SOUL.md) are a known attack surface — the Zenity Labs SOUL.md attack against OpenClaw demonstrated this class of vulnerability. shisad's trust-tiered memory with provenance is the more robust approach.

7.3 What IronClaw Gets Right (shisad should adopt)

  1. Credential isolation is shipped, not planned. IronClaw's host-boundary injection (WASM CredentialInjector + Docker proxy injection + per-job grants with revocation) is a concrete implementation of what shisad has designed. The credential_ref → proxy-level injection upgrade path in shisad's plan maps directly to IronClaw's proxy approach. Priority: high.

  2. Extracted safety module with fuzz testing. The ironclaw_safety crate (independent crate, own fuzz targets, own benchmarks, clean API boundary) is a strong engineering pattern. Even though detection is Layer 5 (not a security boundary), having it be fast, well-tested, and independently verifiable is valuable. shisad should consider extracting its content firewall and DLP components into a standalone, fuzz-tested module. Priority: medium.

  3. Network security audit document. IronClaw's NETWORK_SECURITY.md (550 lines: threat model, 5-listener inventory, auth mechanisms, SSRF protections, open findings, review checklists) is the most thorough network surface audit we've seen in an open-source agent framework. shisad should create an equivalent document. Priority: medium.

  4. Dual sandbox model. WASM for lightweight tool isolation + Docker for heavy/untrusted execution is a complementary approach. WASM gives fast, capability-gated execution; Docker gives full process isolation with network-level controls. Neither alone covers all use cases. Priority: medium (design consideration for shisad's tool execution model).

  5. Per-job bearer tokens with automatic revocation. Cryptographically random, job-scoped, constant-time comparison, ephemeral (in-memory only), revoked on job cleanup. A clean pattern for the orchestrator→worker boundary. Priority: low (implementation detail, but well-executed).

  6. Safety benchmarks on hot paths. benches/safety_check.rs and benches/safety_pipeline.rs ensure that safety enforcement doesn't become a latency bottleneck. If shisad's PEP pipeline adds overhead to every tool call, benchmarking the critical path is important. Priority: medium.

7.4 What IronClaw Gets Wrong (or is missing)

  1. No instruction/data boundary. The agent processes user instructions and tool outputs (including content from untrusted web pages) in the same context. This is the fundamental architectural gap that "The Attacker Moves Second" showed is insufficient — pattern-based detection alone doesn't survive adaptive attack. IronClaw's safety crate is a good Layer 5 defense but is not backed by a Layer 1 architectural guarantee.

  2. No taint tracking or provenance. Content flows freely through the system without provenance labels. A summary of a web page enters the same trust domain as a user message. The PEP-equivalent enforcement (WASM capabilities, Docker allowlists) cannot distinguish "the user asked to fetch reuters.com" from "injected instructions said to fetch evil.com" because both arrive as tool calls from the same agent context.

  3. No plan commitment or trace verification. The agent loop has no mechanism to commit to a plan before seeing untrusted content, or to verify that subsequent tool calls are justified by the committed plan. This leaves it vulnerable to the "ROP-style composition" attacks identified in the survey — chaining individually-allowed tool calls into a malicious sequence.

  4. Identity files as attack surface. SOUL.md, AGENTS.md, IDENTITY.md, and HEARTBEAT.md are injected into the system prompt. If the workspace memory is poisoned (the Zenity Labs OpenClaw attack), these files become a persistence mechanism for prompt injection. IronClaw trusts "Trusted skills" (user-placed in ~/.ironclaw/skills/) at full tool access — there is no scanning or validation for user-placed content, which assumes the user's machine is uncompromised.

  5. No differential execution. No mechanism to detect whether untrusted content is influencing the agent's behavior by comparing outputs with and without the suspect content. shisad's three-tier differential execution design (and MELON/AgentSentry from the literature) addresses this gap.

  6. No graduated response ladder. IronClaw has binary controls: approved or not (requires_approval() returning true/false). No equivalent of shisad's auto-approve → confirm → deny → lockdown gradient. Risk-based shell command approval (Low/Medium/High) is a recent addition but applies only to shell commands, not to the general tool-call pipeline.

  7. Single-pass detection. The safety crate runs at ingress but there is no equivalent of shisad's "summary barrier" — the second firewall pass at the TASK→COMMAND boundary that catches "taint laundering through summarization." Since IronClaw has no TASK→COMMAND boundary at all, this gap follows from gap #1.

7.5 Feature Comparison Matrix

Dimension shisad IronClaw Assessment
Language Python Rust Different tradeoffs; Rust gives memory safety + performance, Python gives ecosystem + iteration speed
L1: Architecture COMMAND/TASK split; taint tracking; stateless context forking Single agent loop; no privilege separation in reasoning shisad architecturally stronger
L2: Access control 8-layer PEP; provenance-aware egress; policy monotonicity WASM capabilities + Docker allowlists + hooks shisad richer enforcement; IronClaw more concrete implementation
L3: Model hardening Spotlighting + 3-tier context; treats model as untrusted Pattern-based injection detection in safety crate Both treat model as fallible; different approaches
L4: Runtime monitoring Plan commitment; differential execution; graduated response Hooks system (extension points); no shipped verification shisad ahead in design; neither has shipped runtime verification
L5: Detection/filtering Double-pass firewall (ingress + summary barrier) Extracted safety crate with fuzz testing + benchmarks IronClaw ahead in implementation maturity; shisad ahead in architecture
Credential isolation Designed (credential_ref + future proxy) Shipped (WASM injector + Docker proxy + per-job grants) IronClaw ahead
Sandbox execution Policy enforcement pipeline WASM + Docker dual sandbox IronClaw ahead
Leak detection Designed Shipped (Aho-Corasick, request/response scanning) IronClaw ahead
Network security audit Informal 550-line internal audit with findings + checklists IronClaw ahead
Memory trust zones Designed (trust-tiered, provenance-retained) Flat workspace with identity file injection shisad ahead
Adversarial testing Dedicated tests/adversarial/ with PI scenarios Safety crate fuzz testing (6 targets) Different focus; both have gaps
Behavioral correctness Hard gate (tests must pass for milestone closure) E2E tests but no equivalent "product works" gate shisad ahead
Multi-channel Designed Shipped (Telegram, Discord, Slack, Signal, WhatsApp, Web, CLI) IronClaw ahead
Observability Append-only audit logs; training-ready trace recorder Pluggable observers (noop, log, multi) Similar maturity

8. Landscape Risk Assessment

Relevance to shisad: Medium-High

IronClaw occupies the same space as shisad — a security-first AI agent framework. Key factors:

  • NEAR AI backing — funded organization with 30+ active contributors including Illia Polosukhin (Transformer paper co-author, NEAR co-founder) and Zaki Manian (Cosmos)
  • Velocity — 837 commits in 8 weeks, 20 releases; impressive pace for a ground-up Rust rewrite
  • Rust credibility — "rewrite it in Rust for security" is a compelling narrative for security-conscious users
  • Working security features — credential isolation, WASM sandbox, leak detection, and fuzz testing are all shipped, not designed
  • Multi-channel — already supports Telegram, Discord, Slack, Signal, WhatsApp, Web, CLI
  • International reach — docs in 4 languages suggests deliberate global push
  • OpenClaw import — actively targeting OpenClaw's existing user base for migration

Differentiators in shisad's Favor:

  • Architecture gap is fundamental. IronClaw has no instruction/data separation, no taint tracking, no privilege separation in the reasoning loop. Per the agentic-security survey's core finding, this means IronClaw's safety defenses will not survive adaptive attack. This is not a feature gap — it's a design-level limitation that cannot be fixed incrementally.
  • No formal threat model. IronClaw's NETWORK_SECURITY.md is excellent for network-layer threats but there is no equivalent document for the agent-level threat model (prompt injection, memory poisoning, confused deputy, exfiltration via tool calls). The agentic-security taxonomy identifies these as the primary threats.
  • Safety crate is Layer 5 only. Pattern-based detection with no architectural backing. "The Attacker Moves Second" showed >90% bypass rates against detection-only defenses.
  • Dependency risk. 279K LOC, 131 direct dependencies, 7 tracked advisories already. First public CVE is a matter of when, not if.
  • NEAR AI association. Crypto/web3 branding may be a liability in enterprise security contexts.
  • Pre-1.0 instability. v0.22.0 with breaking API changes still happening; not production-stable yet.

Net Assessment:

IronClaw is strong on implementation maturity (shipped features, working code, multi-channel, multi-provider) but architecturally weaker on the core agent security problem (prompt injection → exfiltration). If shisad ships its designed primitives (PEP, COMMAND/TASK, taint tracking, credential proxy), the architectural advantage is decisive. If shisad's designs remain unimplemented, IronClaw's "working code beats design docs" position becomes compelling regardless of architectural quality.


9. Recommendations

P0 — Close the implementation gap on shipped IronClaw features:

  1. Ship credential isolation. IronClaw's host-boundary injection is a concrete implementation of shisad's credential_ref + future proxy design. Every day this remains unshipped, IronClaw can point to working code vs shisad's design docs. Adopt the proxy-level injection pattern (intercept tool HTTP, inject credentials at host boundary) directly.

  2. Ship leak detection. IronClaw's Aho-Corasick scanner on outbound requests/responses is straightforward and shipped. shisad's egress pipeline should include equivalent outbound secret scanning.

  3. Ship the PEP enforcement pipeline end-to-end. The 8-layer PEP is shisad's strongest architectural advantage over IronClaw. It must be wired, not designed. Priority: scheduler→PEP integration (the highest-risk enforcement bypass per ANALYSIS-shisad-security-design-analysis.md §4.1).

P1 — Adopt IronClaw's engineering patterns:

  1. Create a network security audit document modeled on IronClaw's NETWORK_SECURITY.md format: threat model, listener inventory, auth mechanisms, SSRF protections, open findings with severity, review checklists for PRs.

  2. Extract and fuzz-test the safety module. Follow IronClaw's pattern of an extracted, independently testable safety module with dedicated fuzz targets and hot-path benchmarks. Even as a Layer 5 defense, it should be fast and well-tested.

  3. Evaluate WASM-based tool sandboxing as a complement to the PEP. IronClaw's wasmtime component model with capability declarations provides execution isolation that the PEP (a policy layer) does not. The PEP decides whether a tool call should proceed; a WASM sandbox constrains what the tool can do if it misbehaves. These are complementary.

P2 — Strategic positioning:

  1. Frame the architectural difference clearly. IronClaw invests in sandbox-level security (WASM isolation, Docker hardening, network proxying). shisad invests in reasoning-level security (taint tracking, privilege separation, per-call provenance-aware enforcement). The agentic-security survey's finding is that sandbox-level defenses are necessary but not sufficient — the agent's reasoning must also be protected. This is shisad's thesis.

  2. Monitor IronClaw's CVE trajectory. At 279K LOC and 131 deps, the first public vulnerability is likely. When it happens, analyze whether it's in the safety crate (expected — classifier bypass), the sandbox boundary (serious), or the core agent loop (confirms the architectural gap).

  3. Track NEAR AI's roadmap. IronClaw's FEATURE_PARITY.md and CHANGELOG show the development direction. Watch for: instruction/data separation (would close the architectural gap), plan commitment (would add L4 defense), memory trust zones (would address poisoning).


Appendix A: Technology Stack

Layer Technology
Language Rust 1.92, Edition 2024
Async runtime tokio (full features)
HTTP framework axum 0.8 + tower
WASM runtime wasmtime 28 (component model)
Docker API bollard 0.18
Database (primary) PostgreSQL + pgvector + deadpool
Database (embedded) libSQL/Turso
Cryptography aes-gcm, hkdf, blake3, ed25519-dalek, subtle
Keychain security-framework (macOS), secret-service/zbus (Linux)
LLM abstraction rig-core 0.30
TUI Ratatui + crossterm
CLI clap 4
Serialization serde + serde_json
Testing testcontainers (PG), Playwright (e2e), cargo-fuzz, criterion (benches)
CI/CD GitHub Actions, cargo-dist (multi-platform release)
Supply chain cargo-deny (advisories, licenses, bans, sources)

Appendix B: Release History (Selected)

Version Date Notable
v0.2.0 ~2026-02-05 Early tagged release
v0.9.0 ~2026-02-20 First feature-complete milestone?
v0.15.0 ~2026-03-10 Mid-march acceleration
v0.19.0 2026-03-17
v0.20.0 2026-03-19
v0.21.0 2026-03-20
v0.22.0 2026-03-25 Current: multi-tenant, OAuth hardening

Appendix C: Test Infrastructure

  • 66 integration test files in tests/ covering: workspace, heartbeat, WS gateway, pairing, WASM channels, multi-tenant, provider chaos, shell risk regression, safety layer e2e, OpenClaw import, tool schema validation, identity scope isolation
  • Playwright e2e tests (tests/e2e/) for web UI flows
  • 6 fuzz targets (5 in safety crate + 1 root)
  • 2 criterion benchmarks (safety_check, safety_pipeline)
  • Snapshot testing via insta crate
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment