Date: 2026-03-29
Reference snapshot: reference/ironclaw/ (branch: staging, tip ≈ 2026-03-29)

Prior art: See ANALYSIS-openclaw-followon-2026-03-27.md and docs/_dev/v0.3/ANALYSIS-v0.1-vs-openclaw.md for the TypeScript OpenClaw baseline.

Security framework: Comparison grounded in ANALYSIS-shisad-security-design-analysis.md (shisad security gap analysis) and ~/agentic-security/ANALYSIS.md (78-paper survey with 5-layer defense stack taxonomy).
IronClaw is a ground-up Rust reimplementation of the OpenClaw personal AI assistant, developed by NEAR AI under the lead of Illia Polosukhin (NEAR Protocol co-founder, co-author of Attention Is All You Need). First commit landed February 2, 2026; it was announced publicly at NEARCON 2026 on February 24. It has accumulated 837 commits across 20+ tagged releases in under 8 weeks — an impressive development velocity. The project is now at v0.22.0 with ~279K lines of Rust.
IronClaw is not a port. It is a strategic rewrite that uses OpenClaw's feature surface as a parity target while making fundamentally different architectural choices — most notably WASM-based tool/channel sandboxing, dual-backend persistence (PostgreSQL + libSQL), an extracted safety crate, and a credential isolation model where secrets never enter untrusted code. It ships with 11 WASM tools, 5 WASM channels, prompt-based skills with trust gating, Docker sandbox with proxy-mediated egress, and a comprehensive network security posture documented in a 550-line internal audit (src/NETWORK_SECURITY.md).
Why this matters for shisad: IronClaw is the first full-stack agent framework we've seen that treats security as a first-class engineering concern — shipped credential isolation, WASM sandboxing, fuzz-tested safety modules, and a thorough network surface audit. However, mapped against the agentic-security survey's 5-layer defense stack (§8), IronClaw invests heavily in Layers 2 and 5 (access control and detection/filtering) while having no Layer 1 defense (no instruction/data separation, no taint tracking, no privilege separation in the reasoning loop). Per "The Attacker Moves Second" (joint OpenAI/Anthropic/DeepMind, Oct 2025), detection-only defenses fail at >90% ASR under adaptive attack. IronClaw's safety crate is strong engineering but architecturally insufficient. shisad's designed primitives (COMMAND/TASK, PEP, taint tracking) are the right answer — but they need to be shipped. IronClaw's "working code beats design docs" advantage is real and growing.
| Field | Value |
|---|---|
| Name | IronClaw |
| Tagline | "Your secure personal AI assistant, always on your side" |
| Organization | NEAR AI (nearai) |
| Repository | github.com/nearai/ironclaw |
| License | MIT OR Apache-2.0 |
| Language | Rust (Edition 2024, MSRV 1.92) |
| Current version | 0.22.0 (2026-03-25) |
| First commit | 2026-02-02 |
| Relationship to OpenClaw | Inspired by / tracks feature parity with; not a fork or port |
| Contributor | Commits | Notes |
|---|---|---|
| Illia Polosukhin | ~491 (multiple aliases) | NEAR co-founder, Transformer paper co-author |
| Henry Park | 285 | Primary Rust engineer (henry.park@near.ai) |
| Zaki Manian | ~155 (multiple aliases) | Cosmos ecosystem figure |
| Coffee | 47 | Chinese-speaking contributor |
| Nick Pismenkov | 40 | |
| italic-jinxin | 31 | |
| Nige | 30 | |
| Claude (Anthropic) | 24 | AI-assisted development |
Total unique contributors: 30+ across the US, Russia, China, Japan, Turkey, and elsewhere. The multilingual READMEs (English, Chinese, Russian, Japanese) reflect this international team.
- Announced at NEARCON 2026 (2026-02-24, NEAR AI tweet). Positioned as a NEAR AI product, not an OpenClaw derivative.
- Rust ecosystem — most agent framework development is happening in TypeScript/Python; a ground-up Rust rewrite is unusual and signals long-term investment.
- NEAR Protocol / crypto association — IronClaw launched alongside NEAR AI's blockchain/AI narrative. This is a significant negative signal for some enterprise audiences; for others (web3-native teams) it's a draw.
- No CVE history yet — unlike OpenClaw, IronClaw hasn't had public security incidents. At 279K LOC and 131 dependencies, this is more a function of age than quality.
| Month | Commits |
|---|---|
| February 2026 | 262 |
| March 2026 (to date) | 575 |
| Total | 837 |
This is ~15 commits/day sustained over 8 weeks, accelerating from ~10/day in February (262 commits in the 27 days from February 2) to ~20/day in March (575 commits in 29 days).
20 tagged releases (v0.2.0 through v0.21.0) in 56 days — roughly one release every 2.8 days. Recent releases (v0.19.0 through v0.22.0) shipped within a 12-day window, suggesting sprint-and-ship cycles.
182 out of 837 commits (21.7%) touch security-related code (matching keywords: security, auth, credential, secret, inject, sandbox, leak, vuln, csrf, xss). This is unusually high and suggests security is treated as a continuous concern, not a periodic audit.
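The keyword count above can be reproduced with a simple classifier. This is a minimal sketch, assuming case-insensitive substring matching against commit text; the exact matching rule used for the 182 figure is not stated, and the sample commit subjects are hypothetical.

```python
# Keywords taken from the text above; the matching rule is an assumption.
SECURITY_KEYWORDS = (
    "security", "auth", "credential", "secret", "inject",
    "sandbox", "leak", "vuln", "csrf", "xss",
)


def is_security_related(text: str) -> bool:
    """Case-insensitive substring match against the keyword list."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in SECURITY_KEYWORDS)


def security_share(messages):
    """Return (matching count, percentage of all messages)."""
    hits = sum(1 for message in messages if is_security_related(message))
    pct = 100.0 * hits / len(messages) if messages else 0.0
    return hits, pct


# Hypothetical commit subjects, for illustration only.
sample = [
    "fix(auth): constant-time token comparison",
    "docs: update README",
    "feat(sandbox): read-only root filesystem",
    "chore: bump version",
]
hits, pct = security_share(sample)
print(hits, f"{pct:.1f}%")            # 2 50.0%
print(f"{100 * 182 / 837:.1f}%")      # 21.7%, the figure quoted above
```

Substring matching over-counts (for example, "author" contains "auth"), so a share computed this way is best read as an upper bound.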
| Component | Files | Lines of Code |
|---|---|---|
| src/ (core) | 342 | 229,044 |
| tests/ | 66 | 24,203 |
| crates/ (extracted libs) | 14 | 5,244 |
| tools-src/ (WASM tools) | 30 | 11,690 |
| channels-src/ (WASM channels) | 5 | 8,271 |
| Total Rust | 457 | 278,908 |
- 131 direct dependencies in root Cargo.toml
- 8,882 lines in Cargo.lock (substantial dependency tree)
- Notable deps: wasmtime 28 (WASM sandbox), bollard (Docker API), aes-gcm/hkdf/blake3/subtle (crypto), axum (HTTP), rig-core (multi-LLM)
- Platform-specific: security-framework (macOS Keychain), secret-service/zbus (Linux GNOME Keyring/KWallet), pty-process (Unix PTY)
cargo-deny configuration (deny.toml):
- Advisory tracking: 7 known advisories tracked with explicit justification for each ignore
- License allowlist: Permissive licenses only (MIT, Apache-2.0, BSD, ISC, etc.)
- Source restrictions: Unknown registries denied, unknown git denied, only crates.io allowed
- Bans: Wildcard version specs denied, multiple versions warned
┌─────────────────────────────────────────────────┐
│                    Channels                     │
│ CLI/TUI │ Web Gateway │ WASM (Telegram, etc.)   │
│ HTTP Webhook │ Signal │ REPL                    │
└─────────────────┬───────────────────────────────┘
                  │ IncomingMessage
          ┌───────▼───────┐
          │  Agent Loop   │ ← Sessions, Scheduler, Context
          │ (Dispatcher)  │
          └───────┬───────┘
                  │ Tool calls
    ┌─────────────┼─────────────────┐
    │             │                 │
┌───▼────┐  ┌─────▼──────┐  ┌───────▼──────┐
│Built-in│  │WASM Sandbox│  │Docker Sandbox│
│ Tools  │  │(wasmtime)  │  │(bollard)     │
└────────┘  └────────────┘  └──────────────┘
                  │                 │
           ┌──────▼──────┐   ┌──────▼───────┐
           │ Credential  │   │  HTTP Proxy  │
           │  Injector   │   │ (allowlist)  │
           └─────────────┘   └──────────────┘
| Subsystem | Source | Description |
|---|---|---|
| Agent loop | src/agent/ | Message dispatch, job scheduling, session management, context compaction |
| Channels | src/channels/ | Multi-channel input: CLI/TUI (Ratatui), Web (axum+SSE/WS), HTTP webhooks, WASM channels, Signal, REPL |
| Tools | src/tools/ | Built-in tools + WASM sandbox (wasmtime) + MCP client + dynamic tool builder |
| Safety | crates/ironclaw_safety/ | Extracted crate: prompt injection defense, input validation, secret leak detection, policy enforcement |
| Sandbox | src/sandbox/ | Docker container orchestration with HTTP proxy for egress control |
| Secrets | src/secrets/ | AES-256-GCM encryption with OS keychain master key (macOS Keychain, Linux secret-service) |
| LLM | src/llm/ | 13+ providers (NEAR AI, Anthropic, OpenAI, Gemini, Ollama, Bedrock, Mistral, Tinfoil, GitHub Copilot, OpenRouter) |
| Persistence | src/db/ | Dual-backend: PostgreSQL (with pgvector) + libSQL/Turso |
| Workspace/Memory | src/workspace/ | Hybrid search (FTS + vector via RRF), identity files, heartbeat system |
| Skills | src/skills/ | SKILL.md prompt extensions with trust model (Trusted vs Installed) and token budgeting |
| Routines | src/routines/ | Cron scheduling, event triggers, heartbeat (30-min default) |
| Hooks | src/hooks/ | 6 lifecycle points: BeforeInbound, BeforeToolCall, BeforeOutbound, OnSessionStart, OnSessionEnd, TransformResponse |
| Registry | src/registry/ | Extension catalog: manifest validation, artifact download, WASM bundle install |
| Orchestrator | src/orchestrator/ | Internal API for Docker sandbox containers (per-job auth, LLM proxy, credential grants) |
This is the most interesting aspect for shisad. IronClaw has a layered security model that goes significantly beyond what OpenClaw offers.
The core principle: untrusted code never sees credential values.
Two enforcement points:
- WASM tools (src/tools/wasm/credential_injector.rs): The host runtime intercepts HTTP requests from WASM guests and injects credentials (Bearer tokens, API keys, custom headers, query parameters) at the host boundary. The WASM module receives a placeholder reference, not the actual secret. Injection supports multiple methods:
  - Authorization: Bearer <token> header
  - HTTP Basic auth
  - Custom header (e.g., X-API-Key)
  - Query parameter
- Docker sandbox (src/sandbox/proxy/http.rs): Containers route all HTTP through a localhost proxy. The proxy injects credentials into plain HTTP requests to allowed hosts. For HTTPS (CONNECT tunnels), credentials cannot be injected (no MITM by design); containers must use the orchestrator's /worker/{job_id}/credentials endpoint.
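The host-boundary injection idea can be illustrated in a few lines. This is a sketch in Python rather than IronClaw's Rust, and every name in it (OutboundRequest, HOST_SECRETS, inject_credentials, the "key" query parameter) is hypothetical; it only shows that the guest names a secret while the host attaches the value.

```python
import base64
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class OutboundRequest:
    """What the untrusted guest hands to the host: no secret values inside."""
    url: str
    headers: dict = field(default_factory=dict)


# Host-side secret store (hypothetical). Guest code never receives this dict.
HOST_SECRETS = {"github_token": "ghp_example_not_a_real_token"}


def inject_credentials(req, secret_name, method="bearer",
                       header_name="X-API-Key", username=""):
    """Return a copy of the request with the secret attached on the host side."""
    secret = HOST_SECRETS[secret_name]
    headers = dict(req.headers)
    url = req.url
    if method == "bearer":
        headers["Authorization"] = f"Bearer {secret}"
    elif method == "basic":
        raw = f"{username}:{secret}".encode()
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode()
    elif method == "header":
        headers[header_name] = secret
    elif method == "query":
        separator = "&" if "?" in url else "?"
        url = f"{url}{separator}{urlencode({'key': secret})}"
    else:
        raise ValueError(f"unknown injection method: {method}")
    return OutboundRequest(url, headers)


# The guest builds a request and refers to the secret only by name.
guest_request = OutboundRequest("https://api.github.com/user")
sent = inject_credentials(guest_request, "github_token")
print(sorted(sent.headers))      # ['Authorization']
print(guest_request.headers)     # {} (the guest's copy is untouched)
```

Returning a new request object instead of mutating the guest's one keeps the secret out of any structure the guest can still read.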
Per-job credential grants (src/orchestrator/auth.rs):
- Credentials are granted as (secret_name, env_var) pairs scoped to a specific job
- Stored alongside the job's bearer token in an in-memory TokenStore
- Revoked when the job token is revoked (container cleanup)
- Decrypted on-demand only when the worker requests them
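The grant lifecycle described above can be sketched as follows. This is an illustrative Python stand-in, not IronClaw's TokenStore: the method names are assumptions, and hmac.compare_digest plays the role that a constant-time comparison plays in the Rust code.

```python
import hmac
import secrets


class TokenStore:
    """In-memory map of job_id -> (bearer token, credential grants)."""

    def __init__(self):
        self._jobs = {}

    def issue(self, job_id, grants):
        """Create a random job-scoped token and record its grants."""
        token = secrets.token_urlsafe(32)
        self._jobs[job_id] = (token, list(grants))
        return token

    def grants_for(self, job_id, presented_token):
        """Return the grants only if the presented token matches."""
        entry = self._jobs.get(job_id)
        if entry is None:
            return None
        token, grants = entry
        # Constant-time comparison avoids leaking match length via timing.
        if not hmac.compare_digest(token.encode(), presented_token.encode()):
            return None
        return grants

    def revoke(self, job_id):
        """Dropping the job entry revokes the token and its grants together."""
        self._jobs.pop(job_id, None)


store = TokenStore()
token = store.issue("job-1", [("github_token", "GITHUB_TOKEN")])
print(store.grants_for("job-1", token))      # [('github_token', 'GITHUB_TOKEN')]
print(store.grants_for("job-1", "wrong"))    # None
store.revoke("job-1")
print(store.grants_for("job-1", token))      # None
```

Because grants live in the same entry as the token, there is no window in which a revoked token still maps to credentials.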
Source: src/tools/wasm/, wit/
Tools and channels execute as WASM components with explicit capability declarations:
- Endpoint allowlisting (src/tools/wasm/allowlist.rs): Each tool declares allowed HTTP endpoints in capabilities.json. The validator enforces:
  - Host matching (exact or wildcard)
  - Path prefix matching
  - HTTP method restriction
  - HTTPS required by default
  - Userinfo rejection (user:pass@host blocked)
  - Path traversal normalization and blocking (../, %2e%2e/)
  - Invalid percent-encoding rejection
- Resource limits (src/tools/wasm/limits.rs): Fuel metering for CPU, memory caps, execution time limits
- Rate limiting (src/tools/wasm/rate_limiter.rs): Per-tool sliding-window rate limits
- Secret leak detection (crates/ironclaw_safety/src/leak_detector.rs): Aho-Corasick multi-pattern scanner runs on both outbound requests and inbound responses
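The endpoint checks listed above can be sketched as one validator. This is a simplified Python illustration, not the logic in allowlist.rs: the Endpoint type and is_allowed function are hypothetical, and the wildcard semantics (shell-style matching) are an assumption.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from urllib.parse import unquote, urlsplit


@dataclass(frozen=True)
class Endpoint:
    host: str                      # exact host or wildcard such as "*.example.com"
    path_prefix: str = "/"
    methods: tuple = ("GET",)


# A "%" that is not followed by two hex digits is invalid percent-encoding.
_BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def is_allowed(method, url, allowlist):
    """Apply the checks in order; any failure rejects the request."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme != "https":                      # HTTPS required
        return False
    if parts.username is not None or parts.password is not None:
        return False                                 # user:pass@host blocked
    if _BAD_PERCENT.search(parts.path):
        return False                                 # invalid percent-encoding
    path = unquote(parts.path)                       # decode %2e%2e before checking
    if ".." in path.split("/"):
        return False                                 # path traversal
    host = (parts.hostname or "").lower()
    return any(
        fnmatchcase(host, entry.host.lower())
        and path.startswith(entry.path_prefix)
        and method.upper() in entry.methods
        for entry in allowlist
    )


allow = [Endpoint("api.github.com", "/repos/", ("GET", "POST"))]
print(is_allowed("GET", "https://api.github.com/repos/nearai/ironclaw", allow))   # True
print(is_allowed("GET", "https://api.github.com/repos/%2e%2e/admin", allow))      # False
print(is_allowed("GET", "https://user:pass@api.github.com/repos/x", allow))       # False
```

The sketch decodes the path once, so doubly encoded traversal sequences would need an extra normalization pass; a trailing slash on the prefix keeps "/repos/" from matching "/repos-other".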
Source: src/sandbox/
Containers run with defense-in-depth hardening:
| Control | Setting |
|---|---|
| Capabilities | Drop ALL, add only CHOWN |
| Privilege escalation | no-new-privileges:true |
| Root filesystem | Read-only (except FullAccess policy) |
| User | Non-root (UID 1000:1000) |
| Network | Bridge mode (isolated), egress via proxy |
| Tmpfs | /tmp (512 MB), /home/sandbox/.cargo/registry (1 GB) |
| Auto-remove | Enabled |
| Output limits | Configurable max stdout/stderr |
| Timeout | Enforced with forced container removal |
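For readers more familiar with the Docker Engine API than with bollard, the controls in the table map roughly onto a container-create payload like the one below. This is a sketch: the field names are those of the Docker Engine API, but the function, image name, and command are hypothetical, and how IronClaw actually assembles its configuration is not shown in the source.

```python
import json


def hardened_container_config(image, command, read_only=True):
    """Build a container-create payload mirroring the hardening table."""
    return {
        "Image": image,
        "Cmd": command,
        "User": "1000:1000",                          # non-root user
        "HostConfig": {
            "CapDrop": ["ALL"],                       # drop every capability...
            "CapAdd": ["CHOWN"],                      # ...then add back only CHOWN
            "SecurityOpt": ["no-new-privileges:true"],
            "ReadonlyRootfs": read_only,              # False models the FullAccess policy
            "NetworkMode": "bridge",
            "Tmpfs": {
                "/tmp": "size=512m",
                "/home/sandbox/.cargo/registry": "size=1g",
            },
            "AutoRemove": True,
        },
    }


# Hypothetical image and command, for illustration only.
config = hardened_container_config("example/sandbox:latest", ["cargo", "test"])
print(json.dumps(config["HostConfig"], indent=2))
```

Output limits and timeouts are not container-create settings; they would be enforced by the orchestrating process around the container.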
Egress proxy (src/sandbox/proxy/):
- Domain allowlisting (fail-closed: empty allowlist = deny all)
- Credential injection for HTTP (not HTTPS — by design)
- Hop-by-hop header stripping
- CONNECT tunnel timeout (30 min)
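Two of the proxy behaviors above, fail-closed allowlisting and hop-by-hop header stripping, are small enough to sketch. The function names are hypothetical, and matching subdomains of an allowed domain is an assumption; the source does not say how the proxy matches hosts.

```python
# Hop-by-hop headers as defined for HTTP/1.1 proxies.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})


def host_allowed(host, allowlist):
    """Fail closed: with an empty allowlist nothing matches, so all egress is denied."""
    host = host.lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowlist)


def strip_hop_by_hop(headers):
    """Drop hop-by-hop headers, plus any header named in the Connection header."""
    listed = {
        token.strip().lower()
        for name, value in headers.items()
        if name.lower() == "connection"
        for token in value.split(",")
    }
    drop = HOP_BY_HOP | listed
    return {name: value for name, value in headers.items() if name.lower() not in drop}


print(host_allowed("api.example.com", []))                  # False
print(host_allowed("api.example.com", ["example.com"]))     # True
print(host_allowed("evilexample.com", ["example.com"]))     # False
print(strip_hop_by_hop({
    "Host": "api.example.com",
    "Connection": "close, X-Internal",
    "X-Internal": "1",
    "Proxy-Authorization": "Basic abc",
    "Accept": "*/*",
}))   # {'Host': 'api.example.com', 'Accept': '*/*'}
```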
crates/ironclaw_safety/ (4,612 LOC) provides:
| Module | LOC | Function |
|---|---|---|
| leak_detector.rs | 1,336 | Secret pattern scanning (Aho-Corasick) on requests/responses |
| validator.rs | 776 | Input validation, prompt injection pattern detection |
| sanitizer.rs | 725 | Content sanitization and escaping |
| credential_detect.rs | 637 | Credential pattern recognition in text |
| lib.rs | 603 | SafetyLayer orchestration, policy pipeline |
| policy.rs | 535 | Policy rules with severity levels (Block/Warn/Review/Sanitize) |
Fuzz testing for the safety crate uses 5 dedicated fuzz targets:
- fuzz_config_env: configuration parsing
- fuzz_credential_detect: credential pattern matching
- fuzz_leak_detector: leak scanning
- fuzz_safety_sanitizer: sanitization
- fuzz_safety_validator: input validation
Plus 1 additional fuzz target in the root: fuzz_tool_params (tool parameter parsing).
Benchmarks: benches/safety_check.rs and benches/safety_pipeline.rs for hot-path performance validation.
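To make the leak detector's core technique concrete, here is a compact Aho-Corasick matcher: it scans a text once and reports every occurrence of any pattern. It is a teaching sketch in pure Python, not the ironclaw_safety implementation, and the secret prefixes used as patterns are illustrative assumptions.

```python
from collections import deque


class AhoCorasick:
    """Multi-pattern matcher: one pass over the text finds all patterns."""

    def __init__(self, patterns):
        self.goto = [{}]      # trie transitions per state
        self.fail = [0]       # failure links
        self.out = [[]]       # patterns that end at each state
        for pattern in patterns:
            self._add(pattern)
        self._build_failure_links()

    def _add(self, pattern):
        state = 0
        for ch in pattern:
            nxt = self.goto[state].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = nxt
            state = nxt
        self.out[state].append(pattern)

    def _build_failure_links(self):
        # Breadth-first so that shallower states are finished first.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit matches that end at the failure target.
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def find(self, text):
        """Return (start_index, pattern) for every match in the text."""
        state, hits = 0, []
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pattern in self.out[state]:
                hits.append((i - len(pattern) + 1, pattern))
        return hits


# Illustrative secret prefixes (assumptions, not IronClaw's pattern set).
detector = AhoCorasick(["ghp_", "AKIA", "sk-live-"])
print(detector.find("token=ghp_abc and AKIAXYZ"))   # [(6, 'ghp_'), (18, 'AKIA')]
```

The same detector instance can be applied to an outbound request body and to the inbound response, which is the two-direction scanning the text describes.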
IronClaw maintains a 550-line internal network security audit (src/NETWORK_SECURITY.md) documenting:
5 network listeners:
| Listener | Default Bind | Auth | Rate Limit |
|---|---|---|---|
| Web Gateway (:3000) | 127.0.0.1 | Bearer token (constant-time) | 30/60s |
| HTTP Webhook (:8080) | 0.0.0.0 | Shared secret (constant-time) | 60/min |
| Orchestrator API (:50051) | 127.0.0.1 (macOS) / 0.0.0.0 (Linux) | Per-job bearer (constant-time) | None |
| OAuth Callback (:9876) | 127.0.0.1 | None (ephemeral, 5-min timeout) | N/A |
| Sandbox HTTP Proxy (:0) | 127.0.0.1 | None (loopback-only) | N/A |
All token comparisons use subtle::ConstantTimeEq — no timing side-channels.
Built-in HTTP tool SSRF protections:
- HTTPS-only
- Localhost blocked
- Private IP blocked (RFC 1918, loopback, link-local, multicast, cloud metadata 169.254.169.254)
- DNS rebinding defense (resolved IPs checked)
- Redirect blocking (3xx returns error)
- Response size limit (5 MB)
- Outbound leak scan
- Requires user approval
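The address checks in this list can be sketched with Python's standard ipaddress module. This is an illustration under assumptions: the function names are hypothetical, and resolved addresses are passed in as a parameter so the example runs without DNS. The rebinding defense only holds if the caller then connects to exactly the addresses it checked.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_ip(ip):
    """True for private, loopback, link-local (incl. 169.254.169.254) and similar ranges."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_multicast
        or addr.is_unspecified
        or addr.is_reserved
    )


def check_request(url, resolved_ips):
    """Allow only HTTPS to hosts whose every resolved address is public."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False
    if not resolved_ips:
        return False                      # fail closed when resolution yields nothing
    # Checking the resolved addresses, not the hostname, is what defeats DNS rebinding.
    return not any(is_blocked_ip(ip) for ip in resolved_ips)


print(check_request("https://example.com/", ["93.184.216.34"]))               # True
print(check_request("https://rebind.example/", ["169.254.169.254"]))          # False
print(check_request("https://mixed.example/", ["93.184.216.34", "10.0.0.5"])) # False
```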
Open findings documented:
- F-2: No TLS at application layer (expected: reverse proxy in production)
- F-3: Orchestrator binds 0.0.0.0 on Linux (mitigated by per-job tokens)
- F-6: SSE/WS connection limit (100 max)
- F-7: No orchestrator rate limiting (mitigated by token scoping + timeout)
- F-8: No orchestrator graceful shutdown
Source: src/secrets/
- Encryption: AES-256-GCM for stored secrets
- Key derivation: HKDF
- Master key storage: OS keychain (macOS Keychain via security-framework, Linux GNOME Keyring/KWallet via secret-service/zbus)
- Crypto primitives: subtle for constant-time comparisons, blake3 for hashing, ed25519-dalek for signatures
Recent commits show active work on preventing cross-channel approval thread hijacking (#1485, #1701):
- Source channel persisted to DB
- Cross-channel authorization checks
- Approval thread hijack prevention
- Tool error sanitization before LLM injection (#1639)
- API response error redaction: internal error details stripped (#1711, #1702)
- PTY injection prevention: replaced script -qfc with pty-process for injection-safe PTY (#1678)
- Webhook auth enforcement: Feishu webhook authentication required (#1638)
- LLM API key handling: keys stored in encrypted secrets store, not plaintext (#1625)
- Sensitive path detection: unified protection across shell and file tools
| Tool | Description | Security Surface |
|---|---|---|
| github | GitHub API integration | OAuth credentials, API egress |
| gmail | Gmail API | OAuth credentials, email egress |
| google-calendar | Google Calendar API | OAuth credentials |
| google-docs | Google Docs API | OAuth credentials, document access |
| google-drive | Google Drive API | OAuth credentials, file access |
| google-sheets | Google Sheets API | OAuth credentials |
| google-slides | Google Slides API | OAuth credentials |
| slack | Slack API | OAuth/bot token, messaging egress |
| telegram | Telegram Bot API | Bot token, messaging egress |
| web-search | Web search | Search API key, egress |
| llm-context | LLM context management | LLM API credentials |
| Channel | Description |
|---|---|
| discord | Discord gateway (WebSocket) |
| feishu | Feishu/Lark messaging |
| slack | Slack events |
| telegram | Telegram Bot API long-polling |
| whatsapp | WhatsApp Business API |
echo, time, json, http, web_fetch, file, shell, memory, message, job, routine, extension_tools, skill_tools, secrets_tools
| Skill | Description |
|---|---|
| delegation | Task delegation patterns |
| ironclaw-workflow-orchestrator | Multi-agent workflow |
| local-test | Local testing skill |
| review-checklist | Code review automation |
| routine-advisor | Routine setup guidance |
| web-ui-test | Web UI testing |
Cross-references: ANALYSIS-shisad-security-design-analysis.md (shisad security gap analysis mapped to 78-paper agentic-security taxonomy), ~/agentic-security/ANALYSIS.md (the taxonomy itself), ~/agentic-security/analysis/RESEARCH-secure-agentic-frameworks-landscape.md (framework implementations).
The agentic-security survey (ANALYSIS.md §6) defines a 5-layer recommended defense stack. This table maps IronClaw, shisad, and the leading academic systems against each layer.
| Layer | Description | IronClaw | shisad | CaMeL | Progent |
|---|---|---|---|---|---|
| L1: Architecture | Separate trusted planning from untrusted data (non-negotiable) | Partial: single agent loop processes both trusted and untrusted content in the same context; no COMMAND/TASK split; no typed data separation | Strong (design): COMMAND/TASK privilege separation; stateless context forking; artifact-based handoffs; ArtifactLedger (designed, not yet shipped) | Strong: P-LLM/Q-LLM dual split; capability-tagged variables; Q-LLM never gets tool access | N/A (not an architecture) |
| L2: Access Control | Authorization in control plane, not prompts (non-negotiable) | Strong (shipped): WASM capability declarations; Docker per-job bearer tokens; proxy-mediated domain allowlists; tool-level rate limiting | Strong (design): 8-layer PEP pipeline; policy monotonicity; provenance-aware egress; per-call enforcement | Partial: security policies as Python functions evaluated before each tool call | Strong: JSON Schema policy DSL; deterministic per-call enforcement; 0% ASR |
| L3: Model Hardening | Instruction-data separation at model level | Minimal: no evidence of instruction hierarchy or model-level hardening | Moderate: spotlighting + three-tier context placement; treats model as untrusted (correct framing) | Implicit: Q-LLM isolation means the compromised model can't reach tools | N/A |
| L4: Runtime Monitoring | Verify-before-commit, drift detection | Minimal: hooks system (6 lifecycle points) provides extension points but no shipped verification logic; no plan commitment or trace analysis | Strong (design): plan commitment; differential execution (3-tier); trace verification; graduated response ladder; consensus voting | Moderate: capability token checking per execution step | Moderate: policy violation logging |
| L5: Detection/Filtering | Sanitize inputs, detect injection | Moderate (shipped): extracted safety crate with pattern-based injection detection, leak scanning (Aho-Corasick), content sanitization, 5 fuzz targets | Strong (design): double-pass content firewall (ingress + TASK→COMMAND summary barrier); ingress normalization/classification/sanitization | N/A (delegates to architecture) | N/A (delegates to policies) |
| Cross-cutting: Memory Trust | Split memory into trust zones | Minimal: flat workspace memory; identity files (SOUL.md, USER.md) injected into system prompt without trust differentiation | Strong (design): trust-tiered memory; gated writes; provenance retention; reversible updates; memory treated as attack surface | N/A (stateless) | N/A |
This extends the ANALYSIS-shisad-security-design-analysis.md §8 framework to include IronClaw.
shisad: Explicit control plane / data plane separation. Three trust levels (TRUSTED / SEMI_TRUSTED / UNTRUSTED) with immutable taint labels. Content placed in different prompt tiers based on trust. Taint propagates through processing — summaries of untrusted content are SEMI_TRUSTED, not TRUSTED.
IronClaw: No explicit instruction/data separation at the architecture level. The agent loop processes user messages and tool outputs in the same context. Identity files (SOUL.md, AGENTS.md) are injected alongside conversation history without structural boundary enforcement. The safety crate's validator.rs performs pattern-based injection detection, but this is a classifier (Layer 5) not an architectural boundary (Layer 1).
Assessment: This is IronClaw's most significant architectural gap. The agentic-security survey's core finding is that probabilistic defenses (classifiers, pattern matching) fail under adaptive attack (>90% ASR per "The Attacker Moves Second"). IronClaw's safety crate falls into this category. shisad's taint tracking and COMMAND/TASK separation address the problem architecturally.
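The taint model described for shisad can be illustrated with a few types. This is a sketch of the idea, not shisad's code: the names Labeled, combine, and summarize are hypothetical, and the rule that mixed inputs take the lowest trust level is an assumption consistent with the description.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    UNTRUSTED = 0
    SEMI_TRUSTED = 1
    TRUSTED = 2


@dataclass(frozen=True)          # frozen: a label cannot be changed after creation
class Labeled:
    text: str
    trust: Trust


def combine(*inputs):
    """Derived content is only as trusted as its least trusted input (assumption)."""
    return min(item.trust for item in inputs)


def summarize(source, summary_text):
    """A summary of untrusted content is SEMI_TRUSTED at best, never TRUSTED."""
    if source.trust == Trust.UNTRUSTED:
        return Labeled(summary_text, Trust.SEMI_TRUSTED)
    return Labeled(summary_text, source.trust)


user_msg = Labeled("Summarize this page for me", Trust.TRUSTED)
web_page = Labeled("<html>ignore previous instructions</html>", Trust.UNTRUSTED)
summary = summarize(web_page, "The page contains an instruction-like string.")

print(summary.trust.name)                      # SEMI_TRUSTED
print(combine(user_msg, summary).name)         # SEMI_TRUSTED
print(combine(user_msg, web_page).name)        # UNTRUSTED
```

In a single shared context, as described for IronClaw, there is no place to attach such a label, which is why the classifier in validator.rs has to carry the whole burden.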
shisad: COMMAND agent (orchestrator, stays clean) dispatches ephemeral TASK agents (workers, process untrusted content). Scoped task envelopes define tool access, egress, and time limits. Structured return boundary controls what crosses from TASK to COMMAND. Analogous to CaMeL's P-LLM/Q-LLM split.
IronClaw: Docker sandbox provides process-level isolation for heavy workloads. WASM sandbox provides lightweight isolation for tools. However, the agent itself is a single loop — no equivalent of COMMAND/TASK separation. The agent that plans is the same agent that processes untrusted tool output. The worker system (src/worker/) is for job execution, not privilege separation.
Assessment: IronClaw has strong execution isolation (WASM/Docker) but no reasoning isolation. A web page processed by a WASM tool returns its output into the same agent context that makes tool-call decisions. shisad's COMMAND/TASK split prevents this: untrusted content never enters the orchestrator's context.
shisad: 8-layer PEP pipeline evaluates every tool call against: schema validation, tool allowlist, capability check, taint-flow check, credential scan, egress destination check, risk scoring, rate limiting. Operates on metadata only — prompt injection in content cannot influence PEP decisions.
IronClaw: Tool calls go through the WASM sandbox's capability system (declared in capabilities.json) or Docker's proxy-mediated domain allowlist. The hooks system provides 6 lifecycle points (BeforeInbound, BeforeToolCall, BeforeOutbound, etc.) but these are extension points, not a deterministic enforcement pipeline. No evidence of taint-flow checking, credential scanning, or risk scoring at the tool-call level. The requires_approval() flag on individual tools (e.g., the HTTP tool) is a per-tool boolean, not a per-call analysis.
Assessment: shisad's PEP is fundamentally richer. IronClaw's enforcement is at the sandbox boundary (what can this tool access?), not at the decision level (should this specific call proceed given its provenance and arguments?). This is a critical difference: IronClaw can prevent a tool from reaching unauthorized endpoints but cannot detect when a legitimate tool is being misused by injected instructions.
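The difference between boundary enforcement and per-call enforcement is easier to see in code. Below is a deliberately small sketch of an eight-stage pipeline in the order the text lists; it is not shisad's PEP, and every type, field, threshold, and secret prefix in it is a hypothetical stand-in.

```python
from dataclasses import dataclass

SECRET_PREFIXES = ("ghp_", "sk-", "AKIA")      # illustrative patterns only


@dataclass
class ToolCall:
    tool: str
    args: dict
    taint: str = "TRUSTED"       # provenance label of the content that led to this call
    destination: str = ""        # egress host, if the call leaves the machine
    risk: int = 0


@dataclass
class Policy:
    allowed_tools: set
    capabilities: dict           # tool name -> set of permitted argument names
    egress_hosts: set
    max_risk: int = 5
    max_calls: int = 10
    calls_seen: int = 0


# Each stage returns True when the call passes; order follows the text above.
CHECKS = [
    ("schema", lambda c, p: isinstance(c.args, dict)),
    ("allowlist", lambda c, p: c.tool in p.allowed_tools),
    ("capability", lambda c, p: set(c.args) <= p.capabilities.get(c.tool, set())),
    ("taint_flow", lambda c, p: not (c.taint == "UNTRUSTED" and c.destination)),
    ("credentials", lambda c, p: not any(
        str(v).startswith(SECRET_PREFIXES) for v in c.args.values())),
    ("egress", lambda c, p: not c.destination or c.destination in p.egress_hosts),
    ("risk", lambda c, p: c.risk <= p.max_risk),
    ("rate_limit", lambda c, p: p.calls_seen < p.max_calls),
]


def evaluate(call, policy):
    """Run the stages in order; the first failing stage denies the call."""
    for name, passes in CHECKS:
        if not passes(call, policy):
            return ("deny", name)
    policy.calls_seen += 1
    return ("allow", None)


policy = Policy({"http_get"}, {"http_get": {"url"}}, {"api.github.com"})
clean = ToolCall("http_get", {"url": "https://api.github.com/user"},
                 destination="api.github.com")
tainted = ToolCall("http_get", {"url": "https://api.github.com/user"},
                   taint="UNTRUSTED", destination="api.github.com")
print(evaluate(clean, policy))      # ('allow', None)
print(evaluate(tainted, policy))    # ('deny', 'taint_flow')
```

Both calls target an allowed host with allowed arguments, so a sandbox-boundary check alone would pass both; only the provenance label distinguishes them.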
shisad: Credential proxying via opaque credential_ref — LLM never sees secrets. Future plan: proxy-level injection.
IronClaw: Shipped and working. Two implementations:
- WASM: CredentialInjector intercepts HTTP requests from WASM guests and injects credentials at the host boundary (Bearer, Basic, custom header, query param). WASM code receives placeholder references.
- Docker: Sandbox HTTP proxy injects credentials into plain HTTP requests to allowed hosts. For HTTPS, containers must use the orchestrator's per-job credential endpoint.
- Per-job credential grants with automatic revocation on job cleanup.
Assessment: IronClaw is concretely ahead on credential isolation. shisad has the right design (and the proxy-level injection plan maps closely to IronClaw's approach), but IronClaw has working code. This is the single most adoptable pattern from IronClaw.
shisad: Provenance-aware egress in the PEP: "Who asked for this URL?" User-requested URLs auto-approve; untrusted-content-derived URLs require confirmation; unattributed URLs are blocked. Five-level graduated response. This is shisad's most novel contribution (§8.2.1 of ANALYSIS-shisad-security-design-analysis.md) — no other system tracks egress provenance this way.
IronClaw: Domain allowlisting at two boundaries:
- WASM: Endpoint allowlist per tool (host, path prefix, HTTP method, HTTPS required)
- Docker: Proxy-mediated domain allowlist (fail-closed: empty = deny all)
- Built-in HTTP tool: SSRF protections (private IP blocking, DNS rebinding defense, cloud metadata blocking, redirect blocking)
- Leak detection scans outbound requests for secrets
Assessment: Both have strong egress controls but with fundamentally different models. IronClaw's is a static allowlist ("this tool can reach these hosts"). shisad's is dynamic and provenance-aware ("this request traces to untrusted content, so it needs confirmation even though the destination is allowed"). IronClaw's approach prevents unauthorized destinations but cannot detect when an authorized destination is being used for exfiltration driven by injected instructions. shisad's approach can, in principle, catch that case.
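The contrast between the two egress models fits in a short sketch. The names below are hypothetical, and only the three provenance outcomes stated in the text are modeled; the remaining levels of shisad's five-level response are not described here, so they are left out.

```python
from enum import Enum
from urllib.parse import urlsplit


class Provenance(Enum):
    USER = "user"                            # the user asked for this URL
    UNTRUSTED_CONTENT = "untrusted_content"  # URL came out of fetched or tool content
    UNATTRIBUTED = "unattributed"            # origin cannot be traced


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    BLOCK = "block"


def static_allowlist_permits(url, allowed_hosts):
    """Static model: only the destination matters."""
    host = (urlsplit(url).hostname or "").lower()
    return host in allowed_hosts


def provenance_decision(provenance):
    """Provenance-aware model: who asked for the URL decides the outcome."""
    if provenance is Provenance.USER:
        return Decision.ALLOW
    if provenance is Provenance.UNTRUSTED_CONTENT:
        return Decision.CONFIRM
    return Decision.BLOCK


url = "https://reuters.com/world"
allowed = {"reuters.com"}

# The static check gives the same answer regardless of why the request exists.
print(static_allowlist_permits(url, allowed))                        # True

# The provenance check separates the three cases for the same destination.
print(provenance_decision(Provenance.USER).value)                    # allow
print(provenance_decision(Provenance.UNTRUSTED_CONTENT).value)       # confirm
print(provenance_decision(Provenance.UNATTRIBUTED).value)            # block
```

The two checks are complementary rather than exclusive: a system could require both an allowed destination and an acceptable provenance before a request leaves.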
shisad: Multi-layer: spotlighting + three-tier context placement (inner), content firewall with double-pass (ingress + summary barrier), taint tracking + PEP enforcement (structural guarantee). Treats detection as helpful outer layer, not security boundary. Adversarial test suite with prompt injection scenarios.
IronClaw: Extracted safety crate (ironclaw_safety, 4,612 LOC) with pattern-based detection (validator.rs), content sanitization (sanitizer.rs), credential detection (credential_detect.rs), leak detection (leak_detector.rs), and policy severity levels (policy.rs). 5 fuzz targets. 2 benchmarks. Integrated into the agent pipeline.
Assessment: IronClaw has more mature implementation (shipped, fuzzed, benchmarked) of what is fundamentally a Layer 5 defense (detection/filtering). shisad has a more mature architecture (Layer 1) that provides structural guarantees even when detection fails. Per "The Attacker Moves Second", detection-based defenses alone are insufficient under adaptive attack. IronClaw's safety crate would need to be complemented with architectural separation to resist adaptive adversaries.
shisad: Memory treated as an attack surface. Trust-tiered storage. Gated writes with provenance retention. Reversible updates. Content from untrusted sources stored with immutable taint labels that persist across sessions.
IronClaw: Flat workspace with hybrid search (FTS + vector). Identity files (SOUL.md, AGENTS.md, USER.md, IDENTITY.md, HEARTBEAT.md) injected into system prompt. Skills have a trust model (Trusted vs Installed) with token budgeting and attenuation. No evidence of memory trust zones, taint-on-write, or provenance tracking for stored content.
Assessment: Memory poisoning is identified as a critical gap in the agentic-security survey (§7). IronClaw's identity files (especially agent-editable SOUL.md) are a known attack surface — the Zenity Labs SOUL.md attack against OpenClaw demonstrated this class of vulnerability. shisad's trust-tiered memory with provenance is the more robust approach.
1. Credential isolation is shipped, not planned. IronClaw's host-boundary injection (WASM CredentialInjector + Docker proxy injection + per-job grants with revocation) is a concrete implementation of what shisad has designed. The credential_ref → proxy-level injection upgrade path in shisad's plan maps directly to IronClaw's proxy approach. Priority: high.
2. Extracted safety module with fuzz testing. The ironclaw_safety crate (independent crate, own fuzz targets, own benchmarks, clean API boundary) is a strong engineering pattern. Even though detection is Layer 5 (not a security boundary), having it be fast, well-tested, and independently verifiable is valuable. shisad should consider extracting its content firewall and DLP components into a standalone, fuzz-tested module. Priority: medium.
3. Network security audit document. IronClaw's NETWORK_SECURITY.md (550 lines: threat model, 5-listener inventory, auth mechanisms, SSRF protections, open findings, review checklists) is the most thorough network surface audit we've seen in an open-source agent framework. shisad should create an equivalent document. Priority: medium.
4. Dual sandbox model. WASM for lightweight tool isolation + Docker for heavy/untrusted execution is a complementary approach. WASM gives fast, capability-gated execution; Docker gives full process isolation with network-level controls. Neither alone covers all use cases. Priority: medium (design consideration for shisad's tool execution model).
5. Per-job bearer tokens with automatic revocation. Cryptographically random, job-scoped, constant-time comparison, ephemeral (in-memory only), revoked on job cleanup. A clean pattern for the orchestrator→worker boundary. Priority: low (implementation detail, but well-executed).
6. Safety benchmarks on hot paths. benches/safety_check.rs and benches/safety_pipeline.rs ensure that safety enforcement doesn't become a latency bottleneck. If shisad's PEP pipeline adds overhead to every tool call, benchmarking the critical path is important. Priority: medium.
1. No instruction/data boundary. The agent processes user instructions and tool outputs (including content from untrusted web pages) in the same context. This is the fundamental architectural gap: "The Attacker Moves Second" showed that pattern-based detection alone does not survive adaptive attack. IronClaw's safety crate is a good Layer 5 defense but is not backed by a Layer 1 architectural guarantee.
2. No taint tracking or provenance. Content flows freely through the system without provenance labels. A summary of a web page enters the same trust domain as a user message. The PEP-equivalent enforcement (WASM capabilities, Docker allowlists) cannot distinguish "the user asked to fetch reuters.com" from "injected instructions said to fetch evil.com" because both arrive as tool calls from the same agent context.
3. No plan commitment or trace verification. The agent loop has no mechanism to commit to a plan before seeing untrusted content, or to verify that subsequent tool calls are justified by the committed plan. This leaves it vulnerable to the "ROP-style composition" attacks identified in the survey, in which individually allowed tool calls are chained into a malicious sequence.
4. Identity files as attack surface. SOUL.md, AGENTS.md, IDENTITY.md, and HEARTBEAT.md are injected into the system prompt. If the workspace memory is poisoned (as in the Zenity Labs OpenClaw attack), these files become a persistence mechanism for prompt injection. IronClaw also trusts "Trusted skills" (user-placed in ~/.ironclaw/skills/) at full tool access; there is no scanning or validation for user-placed content, which assumes the user's machine is uncompromised.
5. No differential execution. There is no mechanism to detect whether untrusted content is influencing the agent's behavior by comparing outputs with and without the suspect content. shisad's three-tier differential execution design (and MELON/AgentSentry from the literature) addresses this gap.
6. No graduated response ladder. IronClaw has binary controls: approved or not (requires_approval() returning true/false). There is no equivalent of shisad's auto-approve → confirm → deny → lockdown gradient. Risk-based shell command approval (Low/Medium/High) is a recent addition but applies only to shell commands, not to the general tool-call pipeline.
7. Single-pass detection. The safety crate runs at ingress, but there is no equivalent of shisad's "summary barrier", the second firewall pass at the TASK→COMMAND boundary that catches "taint laundering through summarization." Since IronClaw has no TASK→COMMAND boundary at all, this gap follows from gap #1.
| Dimension | shisad | IronClaw | Assessment |
|---|---|---|---|
| Language | Python | Rust | Different tradeoffs; Rust gives memory safety + performance, Python gives ecosystem + iteration speed |
| L1: Architecture | COMMAND/TASK split; taint tracking; stateless context forking | Single agent loop; no privilege separation in reasoning | shisad architecturally stronger |
| L2: Access control | 8-layer PEP; provenance-aware egress; policy monotonicity | WASM capabilities + Docker allowlists + hooks | shisad richer enforcement; IronClaw more concrete implementation |
| L3: Model hardening | Spotlighting + 3-tier context; treats model as untrusted | Pattern-based injection detection in safety crate | Both treat model as fallible; different approaches |
| L4: Runtime monitoring | Plan commitment; differential execution; graduated response | Hooks system (extension points); no shipped verification | shisad ahead in design; neither has shipped runtime verification |
| L5: Detection/filtering | Double-pass firewall (ingress + summary barrier) | Extracted safety crate with fuzz testing + benchmarks | IronClaw ahead in implementation maturity; shisad ahead in architecture |
| Credential isolation | Designed (credential_ref + future proxy) | Shipped (WASM injector + Docker proxy + per-job grants) | IronClaw ahead |
| Sandbox execution | Policy enforcement pipeline | WASM + Docker dual sandbox | IronClaw ahead |
| Leak detection | Designed | Shipped (Aho-Corasick, request/response scanning) | IronClaw ahead |
| Network security audit | Informal | 550-line internal audit with findings + checklists | IronClaw ahead |
| Memory trust zones | Designed (trust-tiered, provenance-retained) | Flat workspace with identity file injection | shisad ahead |
| Adversarial testing | Dedicated `tests/adversarial/` with PI scenarios | Safety crate fuzz testing (6 targets) | Different focus; both have gaps |
| Behavioral correctness | Hard gate (tests must pass for milestone closure) | E2E tests but no equivalent "product works" gate | shisad ahead |
| Multi-channel | Designed | Shipped (Telegram, Discord, Slack, Signal, WhatsApp, Web, CLI) | IronClaw ahead |
| Observability | Append-only audit logs; training-ready trace recorder | Pluggable observers (noop, log, multi) | Similar maturity |
IronClaw occupies the same space as shisad — a security-first AI agent framework. Key factors:
- NEAR AI backing: funded organization with 30+ active contributors including Illia Polosukhin (Transformer paper co-author, NEAR co-founder) and Zaki Manian (Cosmos)
- Velocity: 837 commits and 20+ tagged releases in under 8 weeks; impressive pace for a ground-up Rust rewrite
- Rust credibility: "rewrite it in Rust for security" is a compelling narrative for security-conscious users
- Working security features: credential isolation, WASM sandbox, leak detection, and fuzz testing are all shipped, not designed
- Multi-channel: already supports Telegram, Discord, Slack, Signal, WhatsApp, Web, CLI
- International reach: docs in 4 languages suggest a deliberate global push
- OpenClaw import: actively targeting OpenClaw's existing user base for migration
- Architecture gap is fundamental. IronClaw has no instruction/data separation, no taint tracking, no privilege separation in the reasoning loop. Per the agentic-security survey's core finding, this means IronClaw's safety defenses will not survive adaptive attack. This is not a feature gap — it's a design-level limitation that cannot be fixed incrementally.
- No formal threat model. IronClaw's NETWORK_SECURITY.md is excellent for network-layer threats but there is no equivalent document for the agent-level threat model (prompt injection, memory poisoning, confused deputy, exfiltration via tool calls). The agentic-security taxonomy identifies these as the primary threats.
- Safety crate is Layer 5 only. Pattern-based detection with no architectural backing. "The Attacker Moves Second" showed >90% bypass rates against detection-only defenses.
- Dependency risk. 279K LOC, 131 direct dependencies, 7 tracked advisories already. First public CVE is a matter of when, not if.
- NEAR AI association. Crypto/web3 branding may be a liability in enterprise security contexts.
- Pre-1.0 instability. v0.22.0 with breaking API changes still happening; not production-stable yet.
IronClaw is strong on implementation maturity (shipped features, working code, multi-channel, multi-provider) but architecturally weaker on the core agent security problem (prompt injection → exfiltration). If shisad ships its designed primitives (PEP, COMMAND/TASK, taint tracking, credential proxy), the architectural advantage is decisive. If shisad's designs remain unimplemented, IronClaw's "working code beats design docs" position becomes compelling regardless of architectural quality.
- Ship credential isolation. IronClaw's host-boundary injection is a concrete implementation of shisad's `credential_ref` + future proxy design. Every day this remains unshipped, IronClaw can point to working code vs shisad's design docs. Adopt the proxy-level injection pattern (intercept tool HTTP, inject credentials at host boundary) directly.
- Ship leak detection. IronClaw's Aho-Corasick scanner on outbound requests/responses is straightforward and shipped. shisad's egress pipeline should include equivalent outbound secret scanning.
- Ship the PEP enforcement pipeline end-to-end. The 8-layer PEP is shisad's strongest architectural advantage over IronClaw. It must be wired, not designed. Priority: scheduler→PEP integration (the highest-risk enforcement bypass per `ANALYSIS-shisad-security-design-analysis.md` §4.1).
- Create a network security audit document modeled on IronClaw's `NETWORK_SECURITY.md` format: threat model, listener inventory, auth mechanisms, SSRF protections, open findings with severity, review checklists for PRs.
- Extract and fuzz-test the safety module. Follow IronClaw's pattern of an extracted, independently testable safety module with dedicated fuzz targets and hot-path benchmarks. Even as a Layer 5 defense, it should be fast and well-tested.
- Evaluate WASM-based tool sandboxing as a complement to the PEP. IronClaw's wasmtime component model with capability declarations provides execution isolation that the PEP (a policy layer) does not. The PEP decides whether a tool call should proceed; a WASM sandbox constrains what the tool can do if it misbehaves. These are complementary.
- Frame the architectural difference clearly. IronClaw invests in sandbox-level security (WASM isolation, Docker hardening, network proxying). shisad invests in reasoning-level security (taint tracking, privilege separation, per-call provenance-aware enforcement). The agentic-security survey's finding is that sandbox-level defenses are necessary but not sufficient: the agent's reasoning must also be protected. This is shisad's thesis.
- Monitor IronClaw's CVE trajectory. At 279K LOC and 131 deps, the first public vulnerability is likely. When it happens, analyze whether it's in the safety crate (expected: classifier bypass), the sandbox boundary (serious), or the core agent loop (confirms the architectural gap).
- Track NEAR AI's roadmap. IronClaw's FEATURE_PARITY.md and CHANGELOG show the development direction. Watch for: instruction/data separation (would close the architectural gap), plan commitment (would add L4 defense), memory trust zones (would address poisoning).
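
The first two recommendations (host-boundary credential injection and outbound leak scanning) can be sketched together in a few lines of Python. This is a sketch under assumptions, not shisad's or IronClaw's actual API: the `{{cred:NAME}}` placeholder syntax, `SecretStore`, `inject_credentials`, and `scan_for_leaks` are hypothetical names. The scanner also swaps in plain substring search for brevity, whereas IronClaw's shipped scanner uses Aho-Corasick, which scales better to many patterns on a hot path.

```python
import re
from dataclasses import dataclass, field

# Hypothetical placeholder syntax: tool code sees only this, never the secret.
_PLACEHOLDER = re.compile(r"\{\{cred:([A-Za-z0-9_]+)\}\}")


@dataclass
class SecretStore:
    """Host-side secret store; untrusted tool code never holds a reference to it."""
    _secrets: dict[str, str] = field(default_factory=dict)

    def put(self, name: str, value: str) -> None:
        self._secrets[name] = value

    def get(self, name: str) -> str:
        return self._secrets[name]  # raises KeyError for unknown names

    def values(self) -> list[str]:
        return list(self._secrets.values())


def inject_credentials(headers: dict[str, str], store: SecretStore) -> dict[str, str]:
    """Resolve placeholders in header values at the host boundary."""
    return {
        key: _PLACEHOLDER.sub(lambda m: store.get(m.group(1)), value)
        for key, value in headers.items()
    }


def scan_for_leaks(text: str, store: SecretStore) -> bool:
    """Return True if any known, non-empty secret value appears verbatim in text."""
    return any(secret and secret in text for secret in store.values())
```

In this arrangement the tool would emit `{"Authorization": "Bearer {{cred:API_KEY}}"}`, the host would call `inject_credentials` immediately before sending, and `scan_for_leaks` would run on tool-authored request bodies and on responses before they are returned to the tool, so a secret that shows up anywhere other than the injected header is flagged.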
| Layer | Technology |
|---|---|
| Language | Rust 1.92, Edition 2024 |
| Async runtime | tokio (full features) |
| HTTP framework | axum 0.8 + tower |
| WASM runtime | wasmtime 28 (component model) |
| Docker API | bollard 0.18 |
| Database (primary) | PostgreSQL + pgvector + deadpool |
| Database (embedded) | libSQL/Turso |
| Cryptography | aes-gcm, hkdf, blake3, ed25519-dalek, subtle |
| Keychain | security-framework (macOS), secret-service/zbus (Linux) |
| LLM abstraction | rig-core 0.30 |
| TUI | Ratatui + crossterm |
| CLI | clap 4 |
| Serialization | serde + serde_json |
| Testing | testcontainers (PG), Playwright (e2e), cargo-fuzz, criterion (benches) |
| CI/CD | GitHub Actions, cargo-dist (multi-platform release) |
| Supply chain | cargo-deny (advisories, licenses, bans, sources) |
| Version | Date | Notable |
|---|---|---|
| v0.2.0 | ~2026-02-05 | Early tagged release |
| v0.9.0 | ~2026-02-20 | First feature-complete milestone? |
| v0.15.0 | ~2026-03-10 | Mid-March acceleration |
| v0.19.0 | 2026-03-17 | |
| v0.20.0 | 2026-03-19 | |
| v0.21.0 | 2026-03-20 | |
| v0.22.0 | 2026-03-25 | Current: multi-tenant, OAuth hardening |
- 66 integration test files in `tests/` covering: workspace, heartbeat, WS gateway, pairing, WASM channels, multi-tenant, provider chaos, shell risk regression, safety layer e2e, OpenClaw import, tool schema validation, identity scope isolation
- Playwright e2e tests (`tests/e2e/`) for web UI flows
- 6 fuzz targets (5 in safety crate + 1 root)
- 2 criterion benchmarks (`safety_check`, `safety_pipeline`)
- Snapshot testing via `insta` crate