ko-browser
v0.1.1
Browser automation CLI for AI agents, written in Go. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "log in to a site", "automate browser actions", or any task requiring programmatic web interaction.
Browser Automation with ko-browser
The CLI drives Chrome/Chromium directly over CDP. Install it with go install github.com/libi/ko-browser/cmd/kbr@latest, or build from source. Run kbr install to verify that Chrome is available, or kbr install --with-deps to install it automatically. Existing Chrome, Brave, and Chromium installations are detected automatically.
Installation
# Install kbr binary directly (no CGO, no external dependencies)
go install github.com/libi/ko-browser/cmd/kbr@latest

# Or build from source
git clone https://github.com/libi/ko-browser.git
cd ko-browser
go build -o kbr ./cmd/kbr/
mv kbr /usr/local/bin/

# Verify browser dependency
kbr install

# Auto-install Chrome if missing
kbr install --with-deps
OCR is optional. The default install has zero CGO dependencies. To enable kbr snapshot --ocr, rebuild with -tags=ocr (requires Tesseract):
# Install Tesseract first: brew install tesseract (macOS) / apt install libtesseract-dev (Linux)
CGO_ENABLED=1 go install -tags=ocr github.com/libi/ko-browser/cmd/kbr@latest
Manual browser installation by OS:
# macOS
brew install --cask google-chrome

# Linux (Debian/Ubuntu)
sudo apt-get install -y chromium-browser

# Linux (Alpine)
apk add chromium

# Windows
choco install googlechrome
Or download from: https://www.google.com/chrome/
Core Workflow
Every browser automation follows this pattern:
1. Navigate: kbr open <url>
2. Snapshot: kbr snapshot -i (get element IDs like 1, 2, 3)
3. Interact: Use numeric IDs to click, fill, select
4. Re-snapshot: After navigation or DOM changes, get fresh IDs
kbr open https://example.com/form
kbr snapshot -i
# Output:
# 1: textbox "Email"
# 2: textbox "Password"
# 3: button "Submit"

kbr fill 1 "user@example.com"
kbr fill 2 "password123"
kbr click 3
kbr wait load
kbr snapshot -i  # Check result
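The snapshot step prints one element per line in the shape ID: role "name". As a minimal sketch, assuming that shape holds for real output (optionally prefixed with "# " as in the example above), a small shell helper can look up an ID by role and name instead of hard-coding it. find_id is a hypothetical helper, not part of kbr.

```shell
# find_id ROLE NAME
# Reads snapshot text on stdin and prints the numeric ID of the first line
# shaped like:  ID: role "name"
# Hypothetical helper; the line format is assumed from the example output.
find_id() {
  awk -v role="$1" -v name="$2" '
    {
      line = $0
      sub(/^[# \t]+/, "", line)            # drop a leading "# " or indentation
      want = ": " role " \"" name "\""     # e.g.  : button "Submit"
      pos = index(line, want)
      if (pos > 1) {
        id = substr(line, 1, pos - 1)
        if (id ~ /^[0-9]+$/) { print id; exit }
      }
    }'
}

# Possible usage (requires a running kbr session):
#   id=$(kbr snapshot -i | find_id button "Submit") && kbr click "$id"
```

Matching on the full quoted name avoids picking up an element whose name merely starts with the same text. The helper prints nothing when no line matches, so callers should check for an empty result before interacting.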
Command Chaining
Commands can be chained with && in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.
# Chain open + wait + snapshot in one call
kbr open https://example.com && kbr wait load && kbr snapshot -i

# Chain multiple interactions
kbr fill 1 "user@example.com" && kbr fill 2 "password123" && kbr click 3

# Navigate and capture
kbr open https://example.com && kbr wait load && kbr screenshot page.png
When to chain: Use && when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover IDs, then interact using those IDs).
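The chaining rule can be sketched as a single function: chain the steps whose output is not needed, and run the snapshot on its own so its output can be parsed. click_first_button is a hypothetical name, and the sed pattern assumes snapshot lines look like 3: button "Submit", as in the earlier example output.

```shell
# click_first_button URL
# Hypothetical flow showing where to chain and where to stop and parse.
click_first_button() {
  url=$1

  # No intermediate output is needed here, so chain with &&.
  kbr open "$url" && kbr wait load || return 1

  # The snapshot output must be read before an ID can be chosen,
  # so this command runs on its own and its output is captured.
  snapshot=$(kbr snapshot -i) || return 1

  # Take the ID of the first line describing a button (format assumed).
  button_id=$(printf '%s\n' "$snapshot" \
    | sed -n 's/^\([0-9][0-9]*\): button .*/\1/p' \
    | head -n 1)
  [ -n "$button_id" ] || return 1

  # Interaction and the follow-up wait can be chained again.
  kbr click "$button_id" && kbr wait load
}
```

Because IDs are only valid for the snapshot they came from, the function takes a fresh snapshot after navigating rather than reusing IDs from an earlier page.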
Handling Authentication
When automating a site that requires login, choose the approach that fits:
Option 1: Persistent profile (simplest for recurring tasks)
# First run: log in manually or via automation
kbr --profile ~/.myapp open https://app.example.com/login
# ... fill credentials, submit ...

# All future runs: already authenticated
kbr --profile ~/.myapp open https://app.example.com/dashboard
Option 2: State file (manual save/load)
# After logging in:
kbr state export ./auth.json

# In a future session:
kbr state import ./auth.json
kbr open https://app.example.com/dashboard
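The state-file option can be wrapped so that a saved state is imported only when the file exists. This is a rough sketch built from the two commands above; open_with_state is a hypothetical helper, and it assumes that importing before opening the page is enough to restore the login.

```shell
# open_with_state STATE_FILE URL
# Hypothetical wrapper: import saved state if present, then navigate.
open_with_state() {
  state_file=$1
  url=$2

  if [ -f "$state_file" ]; then
    # Reuse the saved state before navigating.
    kbr state import "$state_file" || return 1
  else
    # Nothing to import: the page opens unauthenticated.
    echo "no state file at $state_file; log in, then run: kbr state export $state_file" >&2
  fi

  kbr open "$url"
}

# Possible usage:
#   open_with_state ./auth.json https://app.example.com/dashboard
```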
Option 3: Auth vault (credentials stored, login by name)
kbr auth save mysite --url https://app.example.com/login --username user --password pass
kbr auth login mysite
auth login navigates to the URL and waits for the login form selectors to appear before filling and clicking.
Option 4: Session name (isolate parallel tasks)
kbr --session myapp open https://app.example.com/login
# ... login flow ...
kbr close  # Session cleaned up

# Use named sessions for parallel automation
kbr --session agent1 open https://site-a.com
kbr --session agent2 open https://site-b.com
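Named sessions can also be launched concurrently from one script. The sketch below assumes that separate kbr sessions are safe to start as background shell jobs, which the text above suggests but does not spell out; open_in_sessions and the agentN naming are illustrative.

```shell
# open_in_sessions URL...
# Opens each URL in its own named session (agent1, agent2, ...) in parallel
# and returns non-zero if any of the open commands failed.
open_in_sessions() {
  n=0
  pids=""
  for url in "$@"; do
    n=$((n + 1))
    kbr --session "agent$n" open "$url" &   # one background job per session
    pids="$pids $!"
  done

  status=0
  for pid in $pids; do
    wait "$pid" || status=1                 # collect each job's exit status
  done
  return "$status"
}

# Possible usage:
#   open_in_sessions https://site-a.com https://site-b.com
```

Waiting on each PID individually, rather than a bare wait, lets the function report a failure from any one session.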
Essential Commands
# Navigation
kbr open <url>                      # Navigate to URL
kbr close                           # Close browser session (alias: stop)
kbr restart                         # Restart browser session
kbr back                            # Go back in history
kbr forward                         # Go forward in history
kbr reload                          # Reload current page

# Snapshot
kbr snapshot -i                     # Interactive elements with IDs (recommended)
kbr snapshot -c                     # Compact mode: omit unnamed wrappers
kbr snapshot --selector "#content"  # Scope to CSS selector
kbr snapshot --depth 3              # Limit tree depth
kbr snapshot --ocr                  # Enable OCR for unnamed images
kbr snapshot --cursor               # Mark [cursor] on focused element

# Interaction (use numeric IDs from snapshot)
kbr click 1                         # Click element
kbr click --new-tab 1               # Click and open in new tab
kbr dblclick 1                      # Double-click element
kbr fill 2 "text"                   # Clear and type text
kbr type 2 "text"                   # Type without clearing (append)
kbr select 1 "option"               # Select dropdown o