📦 Pydantic AI Model Integration — Skill Tool
v1.0.0 — Configure LLM providers in PydanticAI, set up fallback models, stream output, and tune model parameters; intended for model selection and fault tolerance.
Version

- Initial release of pydantic-ai-model-integration.
- Supports configuring LLM providers and model strings for OpenAI, Anthropic, Google, Mistral, Groq, and more.
- Enables model fallback mechanisms for improved reliability.
- Allows detailed model settings, including temperature, max tokens, and timeouts.
- Offers streaming and structured streaming output with Pydantic models.
- Includes usage tracking, usage limits, and provider-specific configuration.
- Supports dynamic and deferred model selection and validation.
Provider Model Strings

Model string format: `provider:model-name`
```python
from pydantic_ai import Agent

# OpenAI
Agent('openai:gpt-4o')
Agent('openai:gpt-4o-mini')
Agent('openai:o1-preview')

# Anthropic
Agent('anthropic:claude-sonnet-4-5')
Agent('anthropic:claude-haiku-4-5')

# Google (API key)
Agent('google-gla:gemini-2.0-flash')
Agent('google-gla:gemini-2.0-pro')

# Google (Vertex AI)
Agent('google-vertex:gemini-2.0-flash')

# Groq
Agent('groq:llama-3.3-70b-versatile')
Agent('groq:mixtral-8x7b-32768')

# Mistral
Agent('mistral:mistral-large-latest')

# Other providers
Agent('cohere:command-r-plus')
Agent('bedrock:anthropic.claude-3-sonnet')
```
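The `provider:model-name` convention above can be checked with a small plain-Python helper; `split_model_string` is a hypothetical name for illustration, not part of PydanticAI.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split a 'provider:model-name' string into its two parts.

    The model name itself may contain dots or further colons
    (e.g. Bedrock model IDs), so split only on the first ':'.
    """
    provider, sep, name = model.partition(':')
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model-name', got {model!r}")
    return provider, name


print(split_model_string('openai:gpt-4o'))                    # ('openai', 'gpt-4o')
print(split_model_string('bedrock:anthropic.claude-3-sonnet'))
```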
Model Settings
```python
from pydantic_ai import Agent
from pydantic_ai.settings import ModelSettings

agent = Agent(
    'openai:gpt-4o',
    model_settings=ModelSettings(
        temperature=0.7,
        max_tokens=1000,
        top_p=0.9,
        timeout=30.0,  # Request timeout in seconds
    ),
)

# Override per-run
result = await agent.run(
    'Generate creative text',
    model_settings=ModelSettings(temperature=1.0),
)
```
Fallback Models
Chain models for resilience:
```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import ModelAPIError
from pydantic_ai.models.fallback import FallbackModel

# Try models in order until one succeeds
fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    'google-gla:gemini-2.0-flash',
)
agent = Agent(fallback)
result = await agent.run('Hello')


# Custom fallback conditions
def should_fallback(error: Exception) -> bool:
    """Only fall back on rate limits or server errors."""
    if isinstance(error, ModelAPIError):
        return error.status_code in (429, 500, 502, 503)
    return False


fallback = FallbackModel(
    'openai:gpt-4o',
    'anthropic:claude-sonnet-4-5',
    fallback_on=should_fallback,
)
```
Streaming Responses
```python
async def stream_response():
    async with agent.run_stream('Tell me a story') as response:
        # Stream text output
        async for chunk in response.stream_output():
            print(chunk, end='', flush=True)
        # Access final result after streaming
        print(f"\nTokens used: {response.usage().total_tokens}")
```
Streaming with Structured Output
```python
from pydantic import BaseModel
from pydantic_ai import Agent


class Story(BaseModel):
    title: str
    content: str
    moral: str


agent = Agent('openai:gpt-4o', output_type=Story)

async with agent.run_stream('Write a fable') as response:
    # For structured output, stream_output yields partially parsed objects
    async for partial in response.stream_output():
        print(partial)  # Partial Story object as parsed so far
    # Final validated result
    story = response.output
```
Dynamic Model Selection
```python
import os

from pydantic_ai import Agent

# Environment-based selection
model = os.getenv('PYDANTIC_AI_MODEL', 'openai:gpt-4o')
agent = Agent(model)

# Runtime model override
result = await agent.run(
    'Hello',
    model='anthropic:claude-sonnet-4-5',  # Override default
)

# Context manager override
with agent.override(model='google-gla:gemini-2.0-flash'):
    result = agent.run_sync('Hello')
```
Deferred Model Checking
Delay model validation for testing:
```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

# Default: validates the model immediately (checks env vars)
agent = Agent('openai:gpt-4o')

# Deferred: validates only on first run
agent = Agent('openai:gpt-4o', defer_model_check=True)

# Useful for testing with override
with agent.override(model=TestModel()):
    result = agent.run_sync('Test')  # No OpenAI key needed
```
Usage Tracking
```python
result = await agent.run('Hello')

# Request usage (last request)
usage = result.usage()
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")

# Full run usage (all requests in the run)
run_usage = result.run_usage()
print(f"Total requests: {run_usage.requests}")
```
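The request-level vs run-level distinction can be illustrated with a plain-Python sketch; `Usage` and `accumulate` here are hypothetical stand-ins for illustration, not PydanticAI classes.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical stand-in for a per-request usage record."""
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def accumulate(requests: list[Usage]) -> dict:
    """Aggregate per-request usage into run-level totals."""
    return {
        'requests': len(requests),
        'input_tokens': sum(u.input_tokens for u in requests),
        'output_tokens': sum(u.output_tokens for u in requests),
        'total_tokens': sum(u.total_tokens for u in requests),
    }


# A run with a tool call typically issues more than one model request
run = [Usage(120, 40), Usage(200, 80)]
print(accumulate(run))
```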
Usage Limits
```python
from pydantic_ai.usage import UsageLimits

# Limit token usage
result = await agent.run(
    'Generate content',
    usage_limits=UsageLimits(
        total_tokens=1000,
        request_tokens=500,
        response_tokens=500,
    ),
)
```
Provider-Specific Features
OpenAI
```python
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    'gpt-4o',
    api_key='your-key',  # Or use the OPENAI_API_KEY env var
    base_url='https://custom-endpoint.com',  # For Azure, proxies
)
```
Anthropic
```python
from pydantic_ai.models.anthropic import AnthropicModel

model = AnthropicModel(
    'claude-sonnet-4-5',
    api_key='your-key',  # Or ANTHROPIC_API_KEY
)
```
Common Model Patterns
| Use Case | Recommendation |
|---|---|
| General purpose | `openai:gpt-4o` or `anthropic:claude-sonnet-4-5` |
| Fast/cheap | `openai:gpt-4o-mini` or `anthropic:claude-haiku-4-5` |
| Long context | `anthropic:claude-sonnet-4-5` (200k) or `google-gla:gemini-2.0-flash` |
| Reasoning | `openai:o1-preview` |
| Cost-sensitive production | `FallbackModel` with a fast model first |
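The table above can be turned into a tiny lookup helper; `recommend_model` and its mapping are hypothetical illustrations, not part of PydanticAI.

```python
# Map use cases to suggested model strings (from the table above)
RECOMMENDATIONS = {
    'general': 'openai:gpt-4o',
    'fast': 'openai:gpt-4o-mini',
    'long_context': 'anthropic:claude-sonnet-4-5',
    'reasoning': 'openai:o1-preview',
}


def recommend_model(use_case: str) -> str:
    """Return a suggested model string for a use case, defaulting to general purpose."""
    return RECOMMENDATIONS.get(use_case, RECOMMENDATIONS['general'])


print(recommend_model('reasoning'))  # openai:o1-preview
print(recommend_model('unknown'))    # falls back to openai:gpt-4o
```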