Use this skill when:
- get_page_text returns empty or only navigation chrome
- read_page returns an empty accessibility tree
- The page is known to use canvas, WebGL, or a custom rendering engine
- You need to visually read or interact with content that DOM tools cannot access
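The conditions above can be partly checked from javascript_tool before reaching for screenshots. The following is a minimal heuristic sketch, not part of either tool set: `looksCanvasRendered`, `measurePage`, and the default thresholds (200 characters of text, 50% canvas coverage) are hypothetical names and values chosen for illustration and would need tuning per site.

```javascript
// Hypothetical helper: decide whether a page is probably canvas-rendered.
// Inputs are plain numbers so the logic can be checked outside a browser.
function looksCanvasRendered({ textLength, canvasArea, viewportArea }, opts = {}) {
  const minText = opts.minText ?? 200;          // assumed threshold
  const minCoverage = opts.minCoverage ?? 0.5;  // assumed threshold
  if (viewportArea <= 0) return false;
  const coverage = canvasArea / viewportArea;
  // Little DOM text plus a large canvas suggests the content is drawn, not marked up.
  return textLength < minText && coverage >= minCoverage;
}

// Collects the measurements in the page (only meaningful where a DOM exists).
function measurePage(doc, win) {
  let canvasArea = 0;
  for (const c of doc.querySelectorAll('canvas')) {
    const r = c.getBoundingClientRect();
    canvasArea += r.width * r.height;
  }
  return {
    textLength: (doc.body.innerText || '').trim().length,
    canvasArea,
    viewportArea: win.innerWidth * win.innerHeight,
  };
}

// Usage inside javascript_tool:
//   looksCanvasRendered(measurePage(document, window))
```

A `true` result is only a hint that DOM tools will come back empty; a screenshot remains the way to confirm what is actually on screen.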
Some web UIs render all visible content onto an HTML <canvas> element rather than the DOM. The Chrome MCP's text and accessibility tools see nothing useful — but the content IS on screen. The solution combines two separate tool sets with different capabilities.
Chrome MCP talks directly to the browser via the Claude extension. Use this for:
- Navigating to URLs
- Reading normal HTML pages (get_page_text, read_page)
- Running JavaScript in the page (javascript_tool)
- Clicking DOM elements, filling forms
Computer-use controls the OS and sees the actual screen pixels. Use this for:
- Taking screenshots that capture canvas content
- Zooming into screen regions to read fine detail
- Clicking visible elements on screen (left_click with coordinates from a screenshot)
These two tool sets complement each other — use Chrome MCP to drive the browser and computer-use to see and interact with what's rendered.
Request computer-use access to Chrome first:
mcp__computer-use__request_access
apps: ["Google Chrome"]
reason: <one sentence describing what you need to see>
Check the response for windowLocations — it tells you which monitor Chrome is on.
If Chrome is on a secondary monitor, switch to it before screenshotting:
mcp__computer-use__switch_display
display: "ASUS VS239 (2)" ← whatever name appeared in windowLocations
Navigate to the target page with Chrome MCP:
mcp__Claude_in_Chrome__navigate
tabId: <id>
url: <target url>
Then take a screenshot with computer-use:
mcp__computer-use__screenshot
Full-screen screenshots are often too small to read tables or fine text. Zoom into a region — coordinates are always from the last full screenshot:
mcp__computer-use__zoom
region: [x0, y0, x1, y1]
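When a target is located in a zoomed image, its position has to be converted back to full-screenshot coordinates before clicking. A minimal sketch of that arithmetic follows. `zoomPointToScreenshot` is a hypothetical helper, and it assumes the zoomed image is a linear scaling of the requested region and that its pixel size is known.

```javascript
// Hypothetical helper: map a point seen in a zoomed image back to
// full-screenshot coordinates.
// region   - [x0, y0, x1, y1] as passed to zoom (full-screenshot coordinates)
// zoomSize - pixel size of the zoomed image (assumed to be known)
// point    - [x, y] of the target inside the zoomed image
function zoomPointToScreenshot(region, zoomSize, point) {
  const [x0, y0, x1, y1] = region;
  const [zx, zy] = point;
  const scaleX = (x1 - x0) / zoomSize.width;
  const scaleY = (y1 - y0) / zoomSize.height;
  return [Math.round(x0 + zx * scaleX), Math.round(y0 + zy * scaleY)];
}

// Example: a 400x200 region shown as an 800x400 image (2x magnification).
// The centre of the zoomed image maps to the centre of the region.
const target = zoomPointToScreenshot(
  [100, 300, 500, 500],
  { width: 800, height: 400 },
  [400, 200]
);
console.log(target); // [300, 400]
```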
Option A: JavaScript click (for nav elements that are real HTML under the canvas):
Even on canvas-heavy pages, tabs, sidebar links, and buttons are often real HTML elements that trigger canvas redraws. Find and click them by label:
const all = document.querySelectorAll('*');
let el = null;
for (const e of all) {
  if (e.children.length === 0 && e.innerText && e.innerText.trim() === 'TARGET LABEL') {
    el = e; break;
  }
}
if (el) { el.click(); 'clicked: ' + el.outerHTML.substring(0, 150); }
else { 'not found' }
Option B: computer-use left_click (for clicking canvas pixels directly):
Use coordinates from the most recent screenshot to click anywhere on screen, including inside a canvas:
mcp__computer-use__left_click
coordinate: [x, y]
After any click, take a fresh screenshot to see what changed.
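The exact-match lookup in Option A misses labels that differ only in casing or whitespace. A more tolerant matcher might look like the sketch below. `normalizeLabel` and `findLeafByLabel` are hypothetical names, and the function accepts any iterable of element-like objects so it can be exercised outside the browser.

```javascript
// Collapse runs of whitespace and lowercase, so "  Sales   Report " matches "sales report".
function normalizeLabel(s) {
  return String(s ?? '').replace(/\s+/g, ' ').trim().toLowerCase();
}

// Return the first leaf element whose visible text matches the label, or null.
function findLeafByLabel(elements, label) {
  const wanted = normalizeLabel(label);
  for (const e of elements) {
    // Leaf elements only, as in Option A, to avoid matching a container
    // that merely wraps the labelled control.
    if (e.children.length === 0 && e.innerText && normalizeLabel(e.innerText) === wanted) {
      return e;
    }
  }
  return null;
}

// In the page:
//   const el = findLeafByLabel(document.querySelectorAll('*'), 'TARGET LABEL');
//   if (el) el.click();
```

Looser matching raises the chance of hitting the wrong element when two controls share similar labels, so inspecting the matched element's outerHTML before clicking is still worthwhile.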
If the canvas reads from a backend endpoint, bypass the UI entirely:
(async () => {
  const r = await fetch('/some/data/endpoint', {credentials: 'include'});
  window._scraped = await r.text();
})();
// Second javascript_tool call:
window._scraped

| Situation | Tool |
|---|---|
| Page renders as normal HTML | Chrome MCP get_page_text or read_page |
| Need to see canvas content | computer-use screenshot |
| Need to read small text | computer-use zoom |
| Click a nav/tab that's a real HTML element | Chrome MCP javascript_tool → el.click() |
| Click a pixel location on the canvas | computer-use left_click with screenshot coordinates |
| Navigate to a URL | Chrome MCP navigate |
| Chrome is on a different monitor | computer-use switch_display |
- Navigate first, screenshot second. Use Chrome MCP to get to the right page, then screenshot to read it.
- Zoom coordinates are always from the last full screenshot, never from a previous zoom.
- JS dropdowns may not be real <select> elements on canvas pages. If querying selects returns unrelated results, the control is drawn on the canvas. Read it visually and use left_click to interact.
- Canvas redraws are asynchronous. After a click, wait briefly and take a fresh screenshot before reading the result.
- Coordinates from screenshots map to screen pixels — use zoom to locate exact positions before clicking.
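The "wait briefly" advice can be made less arbitrary by polling until some observable value stops changing. The sketch below shows one way to do that. `waitUntilStable` and its option names are hypothetical, and the caller supplies the sampler. In the page, a sampler such as `() => document.querySelector('canvas').toDataURL()` may work, but that is an assumption: it throws on cross-origin-tainted canvases and can return a blank image for WebGL contexts that do not preserve the drawing buffer.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `sample` until it returns the same value `stableReads` times in a row,
// or give up after `timeoutMs`. Values are compared with ===.
async function waitUntilStable(
  sample,
  { intervalMs = 100, stableReads = 3, timeoutMs = 5000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  let last = await sample();
  let same = 1;
  while (same < stableReads) {
    if (Date.now() >= deadline) return { stable: false, value: last };
    await sleep(intervalMs);
    const next = await sample();
    same = next === last ? same + 1 : 1; // reset the streak on any change
    last = next;
  }
  return { stable: true, value: last };
}
```

Because javascript_tool calls may not wait on a promise, the result can be stored on `window` and read in a second call, following the same two-call pattern used for `window._scraped`.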