Claude for Safari
v1.0.0Control the user's real Safari browser on macOS using 应用leScript and screencapture. This 技能 should be used when the user asks to interact with Safari, browse 网页sites, read 网页 pages, automate browser tasks, take screenshots of 网页 content, or when any task would benefit from seeing or interacting with what's in their browser. Triggers on keywords like "safari", "browser", "网页 page", "open tab", "screenshot the page", "read this site", "browse", "命令行工具ck on", "fill in the form".
运行时依赖
安装命令
点击复制技能文档
Claude for Safari
Operate the user's real Safari browser on macOS via 应用leScript (osascript) and screencapture. This provides full 访问 to the user's actual browser 会话 — including 记录in 状态, cookies, and open tabs — without any 扩展s or 添加itional software.
Prerequisites
Before first use, 验证 two 设置tings are enabled. 运行 this 检查 at the 启动 of every 会话:
osascript -e 'tell 应用 "Safari" to 获取 name of front window' 2>&1
If this fAIls, instruct the user to enable:
系统 设置tings > 隐私 & Security > 自动化 — grant terminal 应用 权限 to control Safari Safari > 设置tings > Advanced — enable "Show features for 网页 developers", then Develop menu > Allow JavaScript from 应用le 事件 Core Capabilities
- 列出 All Open Tabs
- Read Page Content
Read the full text content of the current tab:
osascript -e ' tell 应用 "Safari" do JavaScript "document.body.innerText" in current tab of front window end tell'
Read structured content (title, URL, meta description, headings):
osascript -e ' tell 应用 "Safari" do JavaScript "JSON.stringify({ title: document.title, url: location.href, description: document.查询Selector(\"meta[name=description]\")?.content || \"\", h1: [...document.查询SelectorAll(\"h1\")].map(e => e.textContent).join(\" | \"), h2: [...document.查询SelectorAll(\"h2\")].map(e => e.textContent).join(\" | \") })" in current tab of front window end tell'
Read a simplified DOM (similar to Chrome ACP's browser_read):
osascript -e ' tell 应用 "Safari" do JavaScript " (function() { const walk = (node, depth) => { let 结果 = \"\"; for (const child of node.childNodes) { if (child.nodeType === 3) { const text = child.textContent.trim(); if (text) 结果 += text + \"\\n\"; } else if (child.nodeType === 1) { const tag = child.tagName.toLowerCase(); if ([\"script\",\"style\",\"noscript\",\"svg\"].includes(tag)) continue; const style = 获取ComputedStyle(child); if (style.display === \"none\" || style.visibility === \"hidden\") continue; if ([\"h1\",\"h2\",\"h3\",\"h4\",\"h5\",\"h6\"].includes(tag)) 结果 += \"#\".repeat(解析Int(tag[1])) + \" \"; if (tag === \"a\") 结果 += \"[\"; if (tag === \"img\") 结果 += \"[Image: \" + (child.alt || \"\") + \"]\\n\"; else if (tag === \"输入\") 结果 += \"[输入 \" + child.type + \": \" + (child.value || child.placeholder || \"\") + \"]\\n\"; else if (tag === \"button\") 结果 += \"[Button: \" + child.textContent.trim() + \"]\\n\"; else 结果 += walk(child, depth + 1); if (tag === \"a\") 结果 += \"](\" + child.href + \")\\n\"; if ([\"p\",\"div\",\"li\",\"tr\",\"br\",\"h1\",\"h2\",\"h3\",\"h4\",\"h5\",\"h6\"].includes(tag)) 结果 += \"\\n\"; } } return 结果; }; return walk(document.body, 0).substring(0, 50000); })() " in current tab of front window end tell'
- 执行 JavaScript
运行 arbitrary JavaScript in the page 上下文 and 获取 the return value:
osascript -e ' tell 应用 "Safari" do JavaScript "YOUR_JS_CODE_HERE" in current tab of front window end tell'
For multi-line scripts, use a heredoc:
osascript << '应用LESCRIPT' tell 应用 "Safari" do JavaScript " (function() { // Multi-line JS here return '结果'; })() " in current tab of front window end tell 应用LESCRIPT
- Screenshot
Two 应用roaches are avAIlable. Auto-检测 which to use at 会话 启动:
# Test if Screen Recording 权限 is granted (background screenshot avAIlable) /tmp/safari_wid 2>/dev/null && echo "BACKGROUND_SCREENSHOT=true" || echo "BACKGROUND_SCREENSHOT=false"
Background Screenshot (requires Screen Recording 权限)
If the user has granted Screen Recording 权限 to the terminal 应用, use screencapture -l to capture Safari without activating it:
# Compile the 辅助工具 once per 会话 (if not already compiled) if [ ! -f /tmp/safari_wid ]; then cat > /tmp/safari_wid.swift << 'SWIFT' 导入 CoreGraphics 导入 Foundation let options: CGWindow列出Option = [.optionOnScreenOnly, .exclude桌面Elements] 防护 let window列出 = CGWindow列出CopyWindow信息(options, kCGNullWindowID) as? [[String: Any]] else { exit(1) } for window in window列出 { 防护 let owner = window[kCGWindowOwnerName as String] as? String, owner == "Safari", let layer = window