Version
- Initial release of Agent Browser skill.
- Offers fast, ref-based headless browser automation for AI agents.
- Uses accessibility tree snapshots for deterministic element selection.
- Supports multi-step workflows, session isolation, and state persistence.
- Includes detailed examples and recommended best practices.
Fast browser automation using accessibility tree snapshots with refs for deterministic element selection.
Why Use agent-browser Over the Built-in Browser Tool
Use agent-browser when:
- Automating multi-step workflows
- You need deterministic element selection
- Performance is critical
- Working with complex SPAs
- You need session isolation
Use the built-in browser tool when:
- You need screenshots/PDFs for analysis
- Visual inspection is required
- Browser extension integration is needed
Core Workflow
# 1. Navigate and snapshot
agent-browser open https://example.com
agent-browser snapshot -i --json
# 2. Parse refs from JSON, then interact
agent-browser click @e2
agent-browser fill @e3 "text"
# 3. Re-snapshot after page changes
agent-browser snapshot -i --json
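The snapshot, interact, re-snapshot loop above can be driven from a script. The following is a minimal Python sketch, not part of agent-browser itself: it assumes the JSON output shape documented later in this document (`success`, `data.refs`), and the names `cli_runner`, `CannedRunner`, `snapshot_refs`, and `click_by_name` are hypothetical helpers introduced for illustration. The runner is injectable so the flow can be dry-run without a browser.

```python
import json
import subprocess
from typing import Callable, Dict, List

# A runner takes an argv list and returns the command's stdout as text.
Runner = Callable[[List[str]], str]


def cli_runner(argv: List[str]) -> str:
    """Run the real CLI; requires agent-browser to be installed and on PATH."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


class CannedRunner:
    """Stand-in runner for dry runs: records argv, replays canned snapshot JSON."""

    def __init__(self, snapshot_stdout: str) -> None:
        self.calls: List[List[str]] = []
        self.snapshot_stdout = snapshot_stdout

    def __call__(self, argv: List[str]) -> str:
        self.calls.append(argv)
        return self.snapshot_stdout if argv[1] == "snapshot" else ""


def snapshot_refs(run: Runner) -> Dict[str, dict]:
    """Steps 1 and 3: take an interactive snapshot and return its refs map."""
    payload = json.loads(run(["agent-browser", "snapshot", "-i", "--json"]))
    if not payload.get("success"):
        raise RuntimeError("snapshot failed")
    return payload["data"]["refs"]


def click_by_name(run: Runner, name: str) -> Dict[str, dict]:
    """Step 2: click the element with this accessible name, then re-snapshot."""
    for ref_id, info in snapshot_refs(run).items():
        if info.get("name") == name:
            run(["agent-browser", "click", "@" + ref_id])
            # Refs may be stale once the page changes, so refresh them.
            return snapshot_refs(run)
    raise LookupError(f"no element named {name!r} in snapshot")
```

With the real CLI, a call would look like `click_by_name(cli_runner, "Submit")` after an `agent-browser open` command; passing a `CannedRunner` instead exercises the same logic offline.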
Key Commands
Navigation
agent-browser open <url>
agent-browser back | forward | reload | close
Snapshot (Always use -i --json)
agent-browser snapshot -i --json # Interactive elements, JSON output
agent-browser snapshot -i -c -d 5 --json # + compact, depth limit
agent-browser snapshot -s "#main" -i # Scope to selector
Interactions (Ref-based)
agent-browser click @e2
agent-browser fill @e3 "text"
agent-browser type @e3 "text"
agent-browser hover @e4
agent-browser check @e5 | uncheck @e5
agent-browser select @e6 "value"
agent-browser press "Enter"
agent-browser scroll down 500
agent-browser drag @e7 @e8
Get Information
agent-browser get text @e1 --json
agent-browser get html @e2 --json
agent-browser get value @e3 --json
agent-browser get attr @e4 "href" --json
agent-browser get title --json
agent-browser get url --json
agent-browser get count ".item" --json
Check State
agent-browser is visible @e2 --json
agent-browser is enabled @e3 --json
agent-browser is checked @e4 --json
Wait
agent-browser wait @e2 # Wait for element
agent-browser wait 1000 # Wait ms
agent-browser wait --text "Welcome" # Wait for text
agent-browser wait --url "/dashboard" # Wait for URL
agent-browser wait --load networkidle # Wait for network
agent-browser wait --fn "window.ready === true"
Sessions (Isolated Browsers)
agent-browser --session admin open site.com
agent-browser --session user open site.com
agent-browser session list
# Or via env: AGENT_BROWSER_SESSION=admin agent-browser ...
State Persistence
agent-browser state save auth.json # Save cookies/storage
agent-browser state load auth.json # Load (skip login)
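One way to use the two commands above is to load a saved state file when it exists and otherwise log in once and save. A minimal Python sketch under that assumption follows; `auth_plan` is a hypothetical helper that only builds the command lists, and the login steps passed to it are placeholders that would come from a real snapshot.

```python
import os
from typing import Callable, List


def auth_plan(
    state_file: str,
    login_steps: List[List[str]],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[List[str]]:
    """Return the argv lists to run: load saved state, or log in and save it."""
    if exists(state_file):
        # Saved cookies/storage are available, so the login flow is skipped.
        return [["agent-browser", "state", "load", state_file]]
    # No saved state yet: run the login steps, then persist the result.
    return login_steps + [["agent-browser", "state", "save", state_file]]
```

Each returned list can be passed to `subprocess.run` in order. The `exists` parameter is there only so the decision can be checked without touching the filesystem.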
Screenshots & PDFs
agent-browser screenshot page.png
agent-browser screenshot --full page.png
agent-browser pdf page.pdf
Network Control
agent-browser network route "/ads/" --abort # Block
agent-browser network route "/api/" --body '{"x":1}' # Mock
agent-browser network requests --filter api # View
Cookies & Storage
agent-browser cookies # Get all
agent-browser cookies set name value
agent-browser storage local key # Get localStorage
agent-browser storage local set key val
Tabs & Frames
agent-browser tab new https://example.com
agent-browser tab 2 # Switch to tab
agent-browser frame @e5 # Switch to iframe
agent-browser frame main # Back to main
Snapshot Output Format
{
  "success": true,
  "data": {
    "snapshot": "...",
    "refs": {
      "e1": {"role": "heading", "name": "Example Domain"},
      "e2": {"role": "button", "name": "Submit"},
      "e3": {"role": "textbox", "name": "Email"}
    }
  }
}
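The refs map above can be searched by role and accessible name to decide which ref to pass to commands such as click or fill. This is a minimal Python sketch that assumes exactly the JSON shape shown above; `find_ref` is a hypothetical helper, not an agent-browser API.

```python
import json
from typing import Optional

# Sample output copied from the snapshot format shown above.
SAMPLE_SNAPSHOT = """
{
  "success": true,
  "data": {
    "snapshot": "...",
    "refs": {
      "e1": {"role": "heading", "name": "Example Domain"},
      "e2": {"role": "button", "name": "Submit"},
      "e3": {"role": "textbox", "name": "Email"}
    }
  }
}
"""


def find_ref(snapshot_json: str, role: str, name: str) -> Optional[str]:
    """Return the first ref (as "@eN") matching role and name, or None."""
    payload = json.loads(snapshot_json)
    if not payload.get("success"):
        return None
    refs = payload.get("data", {}).get("refs", {})
    for ref_id, info in refs.items():
        if info.get("role") == role and info.get("name") == name:
            # CLI commands address elements with an "@" prefix, e.g. @e2.
            return "@" + ref_id
    return None


if __name__ == "__main__":
    print(find_ref(SAMPLE_SNAPSHOT, "button", "Submit"))  # prints @e2
```

Matching on both role and name keeps the selection deterministic when several elements share a label, for example a heading and a button both named "Submit".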
Best Practices
- Always use the -i flag: focuses on interactive elements
- Always use --json: easier to parse
- Wait for stability: agent-browser wait --load networkidle
- Save auth state: skip login flows with state save/load
- Use sessions: isolate different browser contexts
- Use --headed for debugging: see what's happening
Example: Search and Extract
agent-browser open https://www.google.com
agent-browser snapshot -i --json
# AI identifies search box @e1
agent-browser fill @e1 "AI agents"
agent-browser press Enter
agent-browser wait --load networkidle
agent-browser snapshot -i --json
# AI identifies result refs
agent-browser get text @e3 --json
agent-browser get attr @e4 "href" --json
Example: Multi-Session Testing
# Admin session
agent-browser --session admin open app.com
agent-browser --session admin state load admin-auth.json
agent-browser --session admin snapshot -i --json
# User session (simultaneous)
agent-browser --session user open app.com
agent-browser --session user state load user-auth.json
agent-browser --session user snapshot -i --json
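The per-session command sequence above repeats the same three steps for each role, so it can be generated rather than typed out. A minimal Python sketch follows; `session_argv`, `session_env`, and `multi_session_plan` are hypothetical helpers that only build argument lists and environments from the `--session` flag and `AGENT_BROWSER_SESSION` variable described earlier.

```python
import os
from typing import Dict, List, Optional


def session_argv(session: str, *args: str) -> List[str]:
    """Prefix a command with --session so it runs in that isolated browser."""
    return ["agent-browser", "--session", session, *args]


def session_env(session: str, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Alternative to the flag: select the session via AGENT_BROWSER_SESSION."""
    env = dict(os.environ if base_env is None else base_env)
    env["AGENT_BROWSER_SESSION"] = session
    return env


def multi_session_plan(state_files: Dict[str, str], url: str) -> List[List[str]]:
    """Build open / state load / snapshot commands for each session name."""
    plan: List[List[str]] = []
    for name, state_file in state_files.items():
        plan.append(session_argv(name, "open", url))
        plan.append(session_argv(name, "state", "load", state_file))
        plan.append(session_argv(name, "snapshot", "-i", "--json"))
    return plan
```

For the example above, `multi_session_plan({"admin": "admin-auth.json", "user": "user-auth.json"}, "app.com")` yields the same six commands in the same order.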
Installation
npm install -g agent-browser
agent-browser install # Download Chromium
agent-browser install --with-deps # Linux: + system deps
Credits
Skill created by Yossi Elkrief (@MaTriXy)
agent-browser CLI by Vercel Labs