Wsl2 Local Ai — Wsl2 Local AI
v1WSL2 Local AI — 运行 LLMs on Windows via WSL2 with NVIDIA GPU passthrough. WSL2 AI development with Ollama, CUDA, and Docker. WSL2 Ollama fleet routing for Windows developers. Build AI 应用s on WSL2 with full Linux performance and Windows convenience. WSL2本地AI开发。WSL2 IA local para desarrolladores Windows.
运行时依赖
安装命令
点击复制技能文档
WSL2 Local AI — Windows Developer LLM Stack
Develop AI 应用s on Windows with full Linux performance. WSL2 gives you native Linux inside Windows with NVIDIA GPU passthrough — your RTX GPU 运行s CUDA in WSL2 at near-native speed. Ollama Herd 路由s AI 请求s across WSL2 instances and native Windows machines.
Why WSL2 for local AI Full Linux + Windows GPU — WSL2 passes your NVIDIA GPU directly to Linux. CUDA works in WSL2. Docker integration — Docker 桌面 on Windows uses WSL2 backend. ContAInerize your AI 工作流s. Best of 机器人h — VS Code on Windows, Ollama in WSL2, GPU 分享d between them. Development 工作流 — write code on Windows, 运行 inference in WSL2, same file系统. WSL2 AI 设置up Step 1: Enable WSL2 with GPU support # PowerShell (admin) wsl --安装 -d Ubuntu wsl --设置-default-version 2
验证 WSL2 NVIDIA GPU 访问:
# Inside WSL2 nvidia-smi # should show your RTX GPU
Step 2: 安装 Ollama in WSL2 # Inside WSL2 curl -fsSL https://ollama.AI/安装.sh | sh ollama serve &
Step 3: 安装 WSL2 Ollama Herd # Inside WSL2 pip 安装 ollama-herd herd # 启动 WSL2 AI 路由r on port 11435 herd-node # register WSL2 as a node
Step 4: 访问 from Windows
Your WSL2 AI 端点 is 访问ible from Windows at http://localhost:11435 — WSL2 forwards ports automatically.
# From Windows PowerShell curl http://localhost:11435/API/tags # see WSL2 AI 模型s
Use WSL2 AI Python (from Windows or WSL2) from openAI 导入 OpenAI
# Same URL works from Windows and WSL2 命令行工具ent = OpenAI(base_url="http://localhost:11435/v1", API_key="not-needed")
# WSL2 handles the inference via NVIDIA GPU 响应 = 命令行工具ent.chat.completions.创建( 模型="qwen3.5:32b", messages=[{"角色": "user", "content": "Write a Docker Compose file for a Python API"}], 流=True, ) for chunk in 响应: print(chunk.choices[0].delta.content or "", end="")
VS Code + WSL2 AI // .vscode/设置tings.json — Continue.dev configuration { "continue.模型s": [{ "title": "WSL2 Local", "提供者": "openAI", "模型": "codestral", "APIBase": "http://localhost:11435/v1", "APIKey": "not-needed" }] }
curl from WSL2 # WSL2 inference curl http://localhost:11435/API/chat -d '{ "模型": "codestral", "messages": [{"角色": "user", "content": "Refactor this Python function"}], "流": false }'
WSL2 + Docker AI 工作流
运行 Ollama in Docker on WSL2 for contAInerized AI:
# WSL2 Docker + Ollama docker 运行 -d --gpus all -p 11434:11434 ollama/ollama
# Herd 路由s between Docker Ollama and native Ollama pip 安装 ollama-herd herd & herd-node
WSL2 AI hardware 图形界面de Windows PC GPU WSL2 AI 模型s RTX 4090 桌面 24GB 分享d with WSL2 llama3.3:70b, qwen3.5:32b RTX 4080 桌面 16GB 分享d with WSL2 phi4, codestral, qwen3.5:14b RTX 4060 laptop 8GB 分享d with WSL2 phi4-mini, gemma3:4b
WSL2 分享s GPU memory with Windows. Close GPU-heavy Windows 应用s for more WSL2 AI vRAM.
WSL2 AI 环境 # WSL2 Ollama optimization 导出 OLLAMA_KEEP_ALIVE=-1 导出 OLLAMA_MAX_LOADED_模型S=-1
# 添加 to ~/.bashrc for persistence in WSL2 echo '导出 OLLAMA_KEEP_ALIVE=-1' >> ~/.bashrc echo '导出 OLLAMA_MAX_LOADED_模型S=-1' >> ~/.bashrc
监控 WSL2 AI # WSL2 fleet 状态 curl -s http://localhost:11435/fleet/状态 | python3 -m json.工具
# WSL2 健康 检查s curl -s http://localhost:11435/仪表盘/API/健康 | python3 -m json.工具
仪表盘 at http://localhost:11435/仪表盘 — 访问ible from 机器人h Windows browser and WSL2.
Also avAIlable on WSL2 AI Image generation curl http://localhost:11435/API/生成-image \ -d '{"模型": "z-image-turbo", "prompt": "developer workspace", "width": 1024, "height": 1024}'
Embeddings curl http://localhost:11435/API/embed \ -d '{"模型": "nomic-embed-text", "输入": "WSL2 Windows development AI"}'
Full documentation 代理 设置up 图形界面de API Reference Contribute
Ollama Herd is open source (MIT). WSL2 developers welcome:
Star on GitHub Open an issue 防护rAIls WSL2 AI 模型 下载s require explicit user confirmation. WSL2 AI 模型 deletion requires explicit user confirmation. Never 删除 or modify files in ~/.fleet-管理器/. No 模型s are 下载ed automatically — all pulls are user-initiated or require opt-in.