ABVâs Playground and LLM-as-a-Judge evaluations need to call LLM APIs directly from the platform. Your observability SDK doesnât provide these credentialsâtheyâre separate connections configured in project settings for features that make LLM calls on your behalf.Documentation Index
Fetch the complete documentation index at: https://docs.abv.dev/llms.txt
Use this file to discover all available pages before exploring further.
How LLM Connections Work
LLM connections associate LLM provider API keys with your ABV project:Add LLM connection to project
- Connection name: Friendly identifier (e.g., âOpenAI Productionâ, âClaude for Evalsâ)
- Provider: OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, etc.
- API key: Your providerâs API key for authentication
- Base URL (optional): Custom endpoint for proxies or alternative hosts
- Custom headers (optional): Additional headers for authentication or routing
Select connection in Playground or evaluations
- Select the LLM connection from a dropdown
- Choose the model to use (connections show only supported models for that provider)
- Configure model parameters (temperature, max tokens, etc.)
ABV makes API calls on your behalf
- ABV constructs the API request using the connectionâs credentials
- Calls the LLM providerâs API directly from ABVâs servers
- Returns responses to the Playground UI or evaluation results
Manage and rotate credentials
- Rotate API keys when compromised or for security hygiene
- Update base URLs for proxy configuration changes
- Delete unused connections
- Add connections for new providers
Setting Up LLM Connections
Navigate to project settings
Configure connection details
- OpenAI
- Azure OpenAI
- Anthropic
- Google AI Studio
- Google Vertex AI
- Amazon Bedrock
- Custom (for proxies using supported API schemas)
- Base URL: Override default API endpoint (for proxies, custom deployments, regional endpoints)
- Custom headers: Additional HTTP headers for authentication or routing (e.g., x-api-version, x-organization-id)
- Custom model names: Add models not in ABVâs default list (for new models, fine-tuned models, or proxy-provided models)
Save connection
Supported Providers and Models
ABV supports major LLM providers with extensive model coverage:OpenAI & Azure OpenAI
OpenAI & Azure OpenAI
- o3 series:
o3,o3-2025-04-16 - o4 series:
o4-mini,o4-mini-2025-04-16 - GPT-4.1 series:
gpt-4.1,gpt-4.1-2025-04-14,gpt-4.1-mini-2025-04-14,gpt-4.1-nano-2025-04-14 - GPT-4o series:
gpt-4o,gpt-4o-2024-08-06,gpt-4o-2024-05-13,gpt-4o-mini,gpt-4o-mini-2024-07-18 - o3-mini series:
o3-mini,o3-mini-2025-01-31 - o1 series:
o1-preview,o1-preview-2024-09-12 - GPT-4 Turbo:
gpt-4-turbo-preview,gpt-4-1106-preview,gpt-4-0125-preview - GPT-4:
gpt-4,gpt-4-0613 - GPT-3.5 Turbo:
gpt-3.5-turbo,gpt-3.5-turbo-0125,gpt-3.5-turbo-1106,gpt-3.5-turbo-16k
- OpenAI: API key from OpenAI platform, default base URL
- Azure OpenAI: API key from Azure, custom base URL pointing to Azure endpoint, API version in custom headers
Anthropic
Anthropic
- Claude Opus 4.1:
claude-opus-4-1 - Claude Opus 4.0:
claude-opus-4-0 - Claude Sonnet 4.5:
claude-sonnet-4-5 - Claude Sonnet 4.0:
claude-sonnet-4-0 - Claude Haiku 4.5:
claude-haiku-4-5
Google Vertex AI
Google Vertex AI
- Gemini 2.5:
gemini-2.5-pro-exp-03-25 - Gemini 2.0:
gemini-2.0-pro-exp-02-05,gemini-2.0-flash-001,gemini-2.0-flash-lite-preview-02-05,gemini-2.0-flash-exp - Gemini 1.5:
gemini-1.5-pro,gemini-1.5-flash - Gemini 1.0:
gemini-1.0-pro
- API Key: Service account key JSON from Google Cloud
- Project ID: GCP project ID
- Region: GCP region (e.g., us-central1, europe-west1)
- Custom models: Add additional model names enabled in your GCP account via âCustom model namesâ field
Google AI Studio
Google AI Studio
- Gemini 2.5:
gemini-2.5-pro-exp-03-25 - Gemini 2.0:
gemini-2.0-pro-exp-02-05,gemini-2.0-flash-001,gemini-2.0-flash-lite-preview-02-05,gemini-2.0-flash-exp - Gemini 1.5:
gemini-1.5-pro,gemini-1.5-flash - Gemini 1.0:
gemini-1.0-pro
Amazon Bedrock
Amazon Bedrock
- Anthropic Claude models via Bedrock
- Amazon Titan models
- Cohere models
- Meta Llama models
- Mistral models
- Stability AI models
- AWS Access Key ID and Secret Access Key: IAM credentials
- AWS Region: Region where Bedrock is enabled
- Required IAM Permission:
bedrock:InvokeModel
OpenAI-Compatible Proxies and Providers
OpenAI-Compatible Proxies and Providers
- Groq
- OpenRouter
- Vercel AI Gateway
- LiteLLM
- Hugging Face (OpenAI-compatible endpoints)
- Mistral AI (OpenAI-compatible API)
- Any proxy implementing OpenAIâs API schema
- Provider: Select âOpenAIâ
- Base URL: Set to proxyâs endpoint (e.g.,
https://api.groq.com/openai/v1for Groq) - API Key: Proxy providerâs API key
- Custom model names: Add supported models (e.g.,
llama-3.1-70b-versatilefor Groq) - Custom headers: If proxy requires additional headers
- OpenAI chat completions API format
- Tool calling (for LLM-as-a-Judge evaluations)
Advanced Configuration
Provider-Specific Options
Many LLM providers support parameters beyond standard model configuration (temperature, max_tokens, top_p). Examples include reasoning_effort, service_tier, and response_format.Open model parameters
Find provider options field
- OpenAI: Chat Completions API Reference
- Anthropic: Messages API Reference
- Amazon Bedrock: Provider-specific parameters for Bedrock models
reasoning_effort: minimal on OpenAI o3 model calls.
Use cases:
- Control reasoning depth for o-series models (minimal, medium, high)
- Set service tier for OpenAI (auto, default)
- Configure response format for structured outputs
- Pass custom parameters to fine-tuned or custom models
Connecting via Proxies
Use LLM proxies to route Playground and LLM-as-a-Judge calls through centralized gateways for logging, rate limiting, cost management, or compliance.Configure for proxy
- Provider: Select the provider whose API schema your proxy implements (typically OpenAI for OpenAI-compatible proxies)
- Base URL: Set to your proxyâs endpoint (e.g.,
https://proxy.company.com/v1) - API Key: Your proxyâs authentication token (or pass-through key for the underlying provider)
- Custom headers (if needed): Add headers required by your proxy (e.g.,
x-tenant-id,x-route-to-region) - Custom model names: Add models available through your proxy
Verify tool calling support
- LiteLLM: Unified interface to 100+ LLM providers
- OpenRouter: Access to multiple providers through a single API
- Vercel AI Gateway: Caching, rate limiting, and observability for LLM calls
- Corporate proxies: Centralized logging, compliance, and cost control
- Regional routing: Route requests to geographically appropriate endpoints
Custom Model Names
ABV includes default model lists for supported providers, but providers constantly release new models, organizations use fine-tuned models, and proxies expose custom model identifiers. How to add custom models:Expand advanced options
Add custom model names
- Newly released models not yet in ABVâs default list
- Fine-tuned models from OpenAI, Azure, or other providers
- Custom models deployed on Vertex AI or Bedrock
- Proxy-provided models with non-standard names
Security Best Practices
Use Separate Keys for ABV
Use Separate Keys for ABV
- Easier rotation (disconnect doesnât affect production)
- Usage tracking (identify ABV-specific usage in provider dashboards)
- Blast radius containment (compromised key doesnât expose production systems)
Implement Least-Privilege Access
Implement Least-Privilege Access
bedrock:InvokeModel permission (not model management, provisioned throughput)Google Vertex AI: Service account only needs Vertex AI User role (not Vertex AI Admin)Benefit: Limits damage if keys are compromised.Rotate Keys Regularly
Rotate Keys Regularly
- Every 90 days for production projects
- Immediately if key compromise suspected
- After team member departures (if they had access)
- Generate new API key in provider console
- Update ABV LLM connection with new key
- Verify Playground and evaluations still work
- Revoke old key in provider console
Monitor Usage in Provider Dashboards
Monitor Usage in Provider Dashboards
- Check provider usage dashboards regularly (OpenAI Usage, AWS Cost Explorer, Google Cloud billing)
- Set usage alerts in provider consoles
- Identify unexpected usage spikes (runaway evaluations, excessive Playground testing)
- Use cheaper models for development/testing (gpt-3.5-turbo, gemini-flash, claude-haiku)
- Reserve expensive models (gpt-4o, claude-opus) for critical evaluations
- Set rate limits or spending caps in provider consoles
Restrict Project Access with RBAC
Restrict Project Access with RBAC
- Limit Owner/Admin roles to trusted team members
- Use Viewer or Member roles for users who only need to use existing connections
- Review permissions quarterly
Troubleshooting
Connection Fails or Returns Errors
Connection Fails or Returns Errors
- Invalid API key: Key revoked, expired, or entered incorrectly
- Incorrect base URL: Typo in custom endpoint, wrong region
- Provider service issues: OpenAI, Anthropic, or other provider experiencing outages
- Rate limiting: Exceeded providerâs rate limits for your API key
- Insufficient permissions: API key lacks necessary permissions (e.g., Bedrock key without InvokeModel permission)
- Verify API key is valid in provider console
- Check base URL matches provider documentation
- Review provider status pages for outages
- Check provider usage/rate limit dashboards
- Verify API key permissions (IAM role for Bedrock, service account role for Vertex AI)
Model Not Available in Dropdown
Model Not Available in Dropdown
- Model not in default list: Newly released model or provider-specific model
- Wrong provider selected: Model belongs to different provider than selected connection
- Regional restrictions: Model not available in configured region (Vertex AI, Bedrock)
- Account limitations: Model access not enabled in your provider account
- Add model to âCustom model namesâ field in connection configuration
- Verify connection provider matches modelâs provider
- Check model availability in your configured region
- Enable model access in provider console (Bedrock requires per-model enablement)
LLM-as-a-Judge Evaluations Fail with Tool Calling Errors
LLM-as-a-Judge Evaluations Fail with Tool Calling Errors
- Model doesnât support tool calling: Older models or certain providers lack function calling support
- Proxy doesnât support tool calling: Custom proxy doesnât implement OpenAIâs tool calling format
- Incorrect API format: Base URL points to non-compatible endpoint
- Use models that support tool calling (gpt-4o, gpt-3.5-turbo-1106+, claude-4+, gemini-pro+)
- Verify proxy implements OpenAI tool calling format (test with sample request above)
- For Playground testing only (not evaluations), use any modelâtool calling only required for LLM-as-a-Judge
Unexpected Costs in Provider Billing
Unexpected Costs in Provider Billing
- Extensive Playground testing: Manual testing consuming significant tokens
- Large-scale evaluations: Running evaluations on thousands of examples
- Expensive model selection: Using GPT-4o or Claude Opus for evaluations instead of cheaper alternatives
- Runaway evaluation jobs: Evaluations running longer than expected
- Review provider usage dashboards to identify ABV-specific usage
- Use cheaper models for development (gpt-3.5-turbo, claude-haiku, gemini-flash)
- Monitor running evaluations and cancel if unexpectedly long
- Set provider-side spending limits or rate limits