A Claude Agent Skill that measures how long individual Claude Code tool calls take — Bash, Edit, MCP tools, everything.
It uses Claude Code's built-in OpenTelemetry instrumentation with the console exporter, so no Docker collector or observability backend is needed. Spans land in stdout, and a small awk pipeline extracts the tool.execution durations.
Why
When you're scripting lots of tiny tool calls, it helps to know the per-call floor. For example, the benchmark below shows that a Bash "echo hello" call carries roughly 18× the overhead of an MCP stdio tool call, which changes how you'd design a multi-step workflow.
Bash averages 3,410 ms per call; the MCP screenshot averages 190 ms. That ~18× gap is all overhead in the Bash tool path (subprocess spawn, hooks, output capture, IPC back to the CLI) — not platform cost. If you need many tiny shell operations, chain them with && in one Bash call.
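The ratio quoted above follows directly from the two averages. Here is a minimal sketch of the arithmetic, where `overhead_ratio` is a hypothetical helper name and the per-call overhead is assumed to be roughly constant from call to call:

```shell
# Hypothetical helper: ratio of two average durations given in ms.
overhead_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Averages quoted in the text: Bash 3,410 ms, MCP screenshot 190 ms.
overhead_ratio 3410 190   # prints 17.9

# Rough cost of 10 separate Bash calls, assuming each pays the same overhead.
echo "$((10 * 3410)) ms"  # prints 34100 ms
```

Under the same assumption, one chained call would pay that overhead once, plus the run time of the commands themselves.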
Files in this gist
SKILL.md — the skill manifest and instructions (this is what Claude Code reads)
run-experiment.sh — runs a prompt N times and extracts per-run tool.execution durations
Measure how long Claude Code tool calls take using the built-in OpenTelemetry console exporter. Use when asked to profile a tool, benchmark a tool, measure tool duration, time a tool call, compare tool overhead, or investigate why a tool feels slow.
Time individual tool calls (Bash, Edit, MCP, etc.) via Claude Code's built-in OTel instrumentation. No Docker, no collector — console exporter dumps spans to stdout, then a small awk extractor pulls out tool.execution durations.
When to reach for this
User says "measure how long X takes" where X is a Claude tool
Comparing overhead between two tools (e.g. Bash vs an MCP tool)
Checking if a tool call's cost justifies a workflow choice (e.g. batching vs. many small calls)
Not for: profiling user scripts, profiling LLM latency (it's captured but isn't the focus), production observability (use a real OTel collector instead).
Quick start
Run the prompt 5 times, extract durations, print stats:
bash ~/.claude/skills/claude-tool-profiling/run-experiment.sh \
5 \
"Run echo hello via Bash, then say done" \
Bash
Args: <runs> <prompt> <tool_name_substring>.
Output: per-run tool.execution ms, then min/max/avg across runs.
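The min/max/avg summary can be reproduced with a few lines of awk. This is a sketch of the idea, not the contents of run-experiment.sh; `summarize_ms` is a hypothetical name and the input values are made up for illustration.

```shell
# Hypothetical helper: read one duration (ms) per line on stdin
# and print the run count, min, max and average. Blank lines are skipped.
summarize_ms() {
  awk '
    NF == 0 { next }
    {
      v = $1 + 0                      # force numeric comparison
      if (n == 0 || v < min) min = v
      if (n == 0 || v > max) max = v
      sum += v; n++
    }
    END {
      if (n > 0) printf "runs=%d min=%d max=%d avg=%.1f\n", n, min, max, sum / n
    }
  '
}

# Made-up sample durations, one per run.
printf '%s\n' 100 300 200 | summarize_ms   # prints runs=3 min=100 max=300 avg=200.0
```

With only a handful of runs the average is sensitive to a single slow call, so it is worth reading min and max alongside it.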
How it works
Claude Code emits OTel spans when these env vars are set:
Bash has a ~3s floor per call from subprocess + hook + IPC overhead. MCP stdio tools don't pay this cost. Rule of thumb: for many tiny shell ops, chain with && in a single Bash call rather than making many calls.
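As an illustration of the rule of thumb, the steps below all run inside one shell invocation, so they would pay the per-call overhead once if sent as a single Bash tool call. The function name and file name are placeholders.

```shell
# One invocation, several operations. && stops the chain at the first failure.
chain_demo() {
  dir=$(mktemp -d) &&
    echo hello > "$dir/a.txt" &&
    cat "$dir/a.txt" &&
    rm -r "$dir"
}

chain_demo   # prints hello
```

Because && short-circuits, a failing step also skips everything after it, which is usually what you want when later steps depend on earlier ones.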
Interpreting results
A full run decomposes into:
claude_code.interaction ← wall clock
├── claude_code.llm_request × N ← model thinking (usually dominates)
└── claude_code.tool × M ← each tool call
├── .blocked_on_user ← permission wait (~ms)
└── .execution ← the number that matters
If .execution is fast but wall time is slow, the overhead is in the LLM, not the tool. Don't blame the tool for round-trip latency.
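To check where the time goes, subtract the summed .execution durations from the interaction wall clock. This is a sketch with made-up numbers. `non_tool_ms` is an illustrative name, and the remainder it prints also includes permission waits and anything else inside the interaction, not only LLM requests.

```shell
# Hypothetical helper: wall-clock ms minus the sum of tool execution ms.
# Usage: non_tool_ms <wall_ms> <exec_ms>...
# Integer milliseconds only.
non_tool_ms() {
  wall=$1; shift
  total=0
  for ms in "$@"; do
    total=$((total + ms))
  done
  echo "$((wall - total))"
}

# Made-up example: a 12,000 ms interaction containing two tool executions.
non_tool_ms 12000 190 210   # prints 11600
```

If the remainder dwarfs the tool total, as in this example, speeding up the tool will barely change the wall time.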