请参见下方翻译的SKILL.md内容(由于字符限制,仅提供关键部分翻译,完整内容请参考原始文档)
The browser-use command provides fast, persistent browser automation. A background daemon keeps the browser open across commands, giving ~50ms latency per call.
Prerequisites
browser-use doctor # Verify installation
For setup details, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md
Core Workflow
- Navigate:
browser-use open — launches headless browser and opens page
- Inspect:
browser-use state — returns clickable elements with indices
- Interact: use indices from state (
browser-use click 5, browser-use input 3 "text")
- Verify:
browser-use state or browser-use screenshot to confirm
- Repeat: browser stays open between commands
If a command fails, run browser-use close first to clear any broken session, then retry.
To use the user's existing Chrome (preserves logins/cookies): run browser-use connect first.
To use a cloud browser instead: run browser-use cloud connect first.
After either, commands work the same way.
Browser Modes
browser-use open # Default: headless Chromium (no setup needed)
browser-use --headed open # Visible window (for debugging)
browser-use connect # Connect to user's Chrome (preserves logins/cookies)
browser-use cloud connect # Cloud browser (zero-config, requires API key)
browser-use --profile "Default" open # Real Chrome with specific profile
After connect or cloud connect, all subsequent commands go to that browser — no extra flags needed.
Commands
# Navigation
browser-use open # Navigate to URL
browser-use back # Go back in history
browser-use scroll down # Scroll down (--amount N for pixels)
browser-use scroll up # Scroll up
browser-use tab list # List all tabs
browser-use tab new [url] # Open a new tab (blank or with URL)
browser-use tab switch # Switch to tab by index
browser-use tab close [index...] # Close one or more tabs# Page State — always run state first to get element indices
browser-use state # URL, title, clickable elements with indices
browser-use screenshot [path.png] # Screenshot (base64 if no path, --full for full page)
# Interactions — use indices from state
browser-use click # Click element by index
browser-use click # Click at pixel coordinates
browser-use type "text" # Type into focused element
browser-use input "text" # Click element, then type
browser-use keys "Enter" # Send keyboard keys (also "Control+a", etc.)
browser-use select "option" # Select dropdown option
browser-use upload # Upload file to file input
browser-use hover # Hover over element
browser-use dblclick # Double-click element
browser-use rightclick # Right-click element
# Data Extraction
browser-use eval "js code" # Execute JavaScript, return result
browser-use get title # Page title
browser-use get html [--selector "h1"] # Page HTML (or scoped to selector)
browser-use get text # Element text content
browser-use get value # Input/textarea value
browser-use get attributes # Element attributes
browser-use get bbox # Bounding box (x, y, width, height)
# Wait
browser-use wait selector "css" # Wait for element (--state visible|hidden|attached|detached, --timeout ms)
browser-use wait text "text" # Wait for text to appear
# Cookies
browser-use cookies get [--url ] # Get cookies (optionally filtered)
browser-use cookies set # Set cookie (--domain, --secure, --http-only, --same-site, --expires)
browser-use cookies clear [--url ] # Clear cookies
browser-use cookies export # Export to JSON
browser-use cookies import # Import from JSON
# Session
browser-use close # Close browser and stop daemon
browser-use sessions # List active sessions
browser-use close --all # Close all sessions
For advanced browser control (CDP, device emulation, tab activation), see references/cdp-python.md.
Cloud API
browser-use cloud connect # Provision cloud browser and connect (zero-config)
browser-use cloud login # Save API key (or set BROWSER_USE_API_KEY)
browser-use cloud logout # Remove API key
browser-use cloud v2 GET /browsers # REST passthrough (v2 or v3)
browser-use cloud v2 POST /tasks '{"task":"...","url":"..."}'
browser-use cloud v2 poll # Poll task until done
browser-use cloud v2 --help # Show API endpoints
cloud connect provisions a cloud browser with a persistent profile (auto-created on first use), connects via CDP, and prints a live URL. browser-use close disconnects AND stops the cloud browser. For custom browser settings (proxy, timeout, specific profile), use cloud v2 POST /browsers directly with the desired parameters.
Agent Self-Registration
Only use this if you don't already have an API key (check browser-use doctor to see if api_key is set). If already logged in, skip this entirely.
browser-use cloud signup — get a challenge
- Solve the challenge
browser-use cloud signup --verify — verify and save API key
browser-use cloud signup --claim — generate URL for a human to claim the account
Tunnels
browser-use tunnel # Start Cloudflare tunnel (idempotent)
browser-use tunnel list # Show active tunnels
browser-use tunnel stop # Stop tunnel
browser-use tunnel stop --all # Stop all tunnels
Profile Management
browser-use profile list # List detected browsers and profiles
browser-use profile sync --all # Sync profiles to cloud
browser-use profile update # Download/update profile-use binary
Command Chaining
Commands can be chained with &&. The browser persists via the daemon, so chaining is safe and efficient.
browser-use open https://example.com && browser-use state
browser-use input 5 "user@example.com" && browser-use input 6 "password" && browser-use click 7
Chain when you don't need intermediate output. Run separately when you need to parse state to discover indices first.
Common Workflows
Authenticated Browsing
When a task requires an authenticated site (Gmail, GitHub, internal tools), use Chrome profiles:
browser-use profile list # Check available profiles
# Ask the user which profile to use, then:
browser-use --profile "Default" open https://github.com # Already logged in
Exposing Local Dev Servers
browser-use tunnel 3000 # → https://abc.trycloudflare.com
browser-use open https://abc.trycloudflare.com # Browse the tunnel
Multiple Browsers
For subagent workflows or running multiple browsers in parallel, use --session NAME. Each session gets its own browser. See references/multi-session.md.
Configuration
browser-use config list # Show all config values
browser-use config set cloud_connect_proxy jp # Set a value
browser-use config get cloud_connect_proxy # Get a value
browser-use config unset cloud_connect_timeout # Remove a value
browser-use doctor # Shows config + diagnostics
browser-use setup # Interactive post-install setup
Config stored in ~/.browser-use/config.json.
Global Options
| Option | Description |
|---|
--headed | Show browser window |
--profile [NAME] | Use real Chrome (bare --profile uses "Default") |
--cdp-url | Connect via CDP URL (http:// or ws://) |
--session NAME | Target a named session (default: "default") |
--json | Output as JSON |
--mcp | Run as MCP server via stdin/stdout |
Tips
- Always run
state first to see available elements and their indices
- Use
--headed for debugging to see what the browser is doing
- Sessions persist — browser stays open between commands
- CLI aliases:
bu, browser, and browseruse all work
- If commands fail, run
browser-use close first, then retry
Troubleshooting
- Browser won't start?
browser-use close then browser-use --headed open
- Element not found?
browser-use scroll down then browser-use state
- Run diagnostics:
browser-use doctor
Cleanup
browser-use close # Close browser session
browser-use tunnel stop --all # Stop tunnels (if any)