How to Use This FAQ
This guide is organized by topic to help you quickly find answers:
- Getting Started - Basic concepts, managing prompts, prompt engineering fundamentals
- Configuration & Setup - Retries, timeouts, caching configuration
- Performance & Reliability - Caching strategies, guaranteed availability, performance optimization
- Advanced Features - Version control, A/B testing, metrics tracking
- Integration & Tracing - Linking prompts to traces, measuring performance
Getting Started
How can I manage my prompts with ABV?
ABV provides comprehensive prompt management through the UI, SDKs, and API.

Creating Prompts via the UI:
- Sign in to ABV
- Navigate to the Prompts section
- Click “Create Prompt”
- Enter prompt content with {{variables}}
- Add configuration (model, temperature, etc.)
- Assign labels for deployment

Prompts can also be created, fetched, and re-labeled programmatically through the SDKs; a sketch follows the feature list below.

Key Features:
- Version control with automatic versioning
- Labels for deployment management (production, staging, etc.)
- Config versioning alongside prompts
- Diff view to see changes between versions
- Protected labels for production safety
- Rollback capability with one click or API call
- Variables with {{mustache}} syntax for dynamic content
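A minimal sketch of the SDK workflow, assuming a Python client that exposes create_prompt, get_prompt, a compile helper, and a label-update call (the import path and method names are assumptions; check the ABV SDK reference):

```python
# Hypothetical sketch: assumes an ABV Python client with these method names.
from abv import ABV  # assumed import path

client = ABV()  # assumed to read API keys from the environment

# Create a prompt with {{mustache}} variables, model config, and a label
client.create_prompt(
    name="movie-critic",
    prompt="As a {{criticLevel}} movie critic, rate {{movie}} out of 10.",
    config={"model": "gpt-4o", "temperature": 0.7},
    labels=["staging"],
)

# Fetch the version currently labeled "production" and fill in the variables
prompt = client.get_prompt("movie-critic", label="production")
text = prompt.compile(criticLevel="expert", movie="Dune")

# Promote a specific version by reassigning labels (assumed method name)
client.update_prompt_labels(name="movie-critic", version=2, labels=["production"])
```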
What is prompt engineering?
Prompt engineering is the practice of designing and optimizing text prompts to get better outputs from Large Language Models (LLMs).

Why it matters:
- Better prompts = better LLM output quality
- Can significantly impact accuracy, relevance, and usefulness
- More cost-effective than fine-tuning models
- Faster iteration cycle than model training

Core techniques:

2. Few-Shot Examples: Show the model examples of desired output to establish patterns and format.

3. Role/Persona: Define who the LLM should act as, which influences tone and expertise level.

4. Chain of Thought: Ask the model to think step-by-step to improve reasoning and accuracy.

5. Constraints and Format: Specify output format (JSON, markdown, etc.), set length limits, and define what to avoid.

These techniques are combined in the sketch at the end of this answer.

ABV’s Role in Prompt Engineering:

ABV helps you iterate on prompts systematically:
- Version control to track changes and compare iterations
- A/B testing to compare variants with statistical rigor
- Metrics tracking to measure improvements objectively
- Tracing to see prompts in context with real user interactions
- Team collaboration via UI for cross-functional input
- Quick rollbacks when changes don’t work as expected

Best practices:
- Start simple, then iterate based on results
- Test with diverse inputs representing edge cases
- Measure performance metrics (latency, cost, quality)
- Use version control to track what works
- A/B test significant changes in production
- Document what works and why for team knowledge
- Keep prompts maintainable and readable for future iterations
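As an illustration of the techniques above, here is a small, hypothetical prompt template combining a persona, few-shot examples, a chain-of-thought instruction, and format constraints, using {{mustache}}-style variables (the task and names are invented for the example):

```python
# Illustrative prompt template combining a persona, few-shot examples,
# a chain-of-thought instruction, and format constraints.
role = "You are a senior support engineer triaging customer tickets."

constraints = (
    "Classify the ticket into one of: billing, bug, feature_request, other.\n"
    'Respond with JSON only, no extra text: {"category": "<label>"}'
)

few_shot = (
    "Examples:\n"
    'Ticket: "I was charged twice this month." -> {"category": "billing"}\n'
    'Ticket: "The export button crashes the app." -> {"category": "bug"}'
)

chain_of_thought = "Think step by step about the customer's intent before answering."

# {{ticket_text}} is a mustache-style variable filled in at runtime
TICKET_TRIAGE_PROMPT = "\n\n".join(
    [role, constraints, few_shot, chain_of_thought, 'Ticket: "{{ticket_text}}"']
)
```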
Configuration & Setup
How to configure retries and timeouts when fetching prompts?
ABV prompts are cached client-side by default, so network-related issues are minimized after the first fetch. However, you can configure network behavior for initial requests.

Caching Configuration:

The default cache TTL is 60 seconds. You can customize this in the Python and JavaScript/TypeScript SDKs to reduce network calls (see the sketch after this list).

Guaranteed Availability:

For critical applications requiring 100% availability, use these strategies:
1. Pre-fetch prompts on startup to populate the cache.
2. Provide fallback prompts for when the API is unavailable.

How caching works:
- Cache hit: Prompt returned immediately from memory (no network call)
- Stale cache: Old prompt returned immediately while revalidating in the background (stale-while-revalidate pattern)
- Cache miss: Prompt fetched from the API (ABV uses a Redis cache for low latency, ~15-50ms median)
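A minimal sketch of TTL configuration and a fallback prompt, assuming the Python SDK; get_prompt and cache_ttl_seconds appear elsewhere in these docs, while the import path and the fallback parameter name are assumptions:

```python
# Sketch: custom cache TTL and a fallback prompt, assuming an ABV Python client.
from abv import ABV  # assumed import path

client = ABV()

# Customize the client-side cache TTL (default is 60 seconds)
prompt = client.get_prompt("movie-critic", cache_ttl_seconds=300)

# Provide a fallback used only if the API is unreachable and the cache is empty
prompt = client.get_prompt(
    "movie-critic",
    fallback="As a movie critic, rate {{movie}} out of 10.",  # assumed parameter name
)
```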
Performance & Reliability
How do I cache prompts for better performance?
ABV prompts are automatically cached client-side in the SDKs with intelligent background revalidation, ensuring minimal latency impact.

How Caching Works:
- Cache Hit - Prompt in cache and fresh → returned immediately (0ms network overhead)
- Stale Cache - Prompt in cache but expired → returned immediately, revalidated in background
- Cache Miss - First request → fetched from API (low latency, Redis-backed on ABV side)

Custom Cache Duration:

Both the Python and JavaScript/TypeScript SDKs accept a custom cache TTL when fetching prompts.

Pre-fetching for Zero Latency:

Load prompts during application startup to eliminate runtime latency (see the sketch at the end of this answer).

Fallback for 100% Availability:

Provide fallback prompts so mission-critical flows keep working even if the API is temporarily unreachable.

Performance Benchmarks:

From ABV’s testing (1000 sequential requests):

Without caching (cache_ttl_seconds=0):
- Median latency: ~50ms
- 95th percentile: ~100ms
- 99th percentile: ~150ms

With caching enabled:
- Cached requests: 0ms (instant, in-memory)
- Stale-while-revalidate: 0ms (instant return, background update)

Recommendations by environment:
- Production: Use the default 60s cache or longer (5-10 minutes) for stable prompts
- Development: Disable the cache to see changes immediately
- Critical paths: Pre-fetch prompts on application startup
- High availability: Implement fallback prompts for mission-critical flows
- Staging: Use a moderate cache (30-60s) to balance freshness and performance
- Monitor: Check the ABV status page (status.abv.dev) for API availability

When to adjust the cache TTL:
- Increase TTL: Stable production prompts; reduces API calls and improves performance
- Decrease TTL: Frequently updated prompts that need faster propagation
- Disable (0s): Local development, testing prompt changes in real time
- Pre-fetch: Startup-critical prompts, serverless cold start optimization
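A sketch of startup pre-fetching combined with an environment-dependent TTL, again assuming the Python SDK (get_prompt and cache_ttl_seconds are documented above; the client and prompt names are illustrative):

```python
# Sketch: warm the prompt cache at startup and pick a TTL per environment.
import os

from abv import ABV  # assumed import path

client = ABV()

# 0s in development to see edits immediately; 5 minutes in production
CACHE_TTL = 0 if os.getenv("ENV") == "development" else 300

STARTUP_PROMPTS = ["movie-critic", "support-agent"]  # illustrative prompt names

def warm_prompt_cache() -> None:
    """Call during application startup so the first user request hits the cache."""
    for name in STARTUP_PROMPTS:
        client.get_prompt(name, cache_ttl_seconds=CACHE_TTL)
```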
Advanced Features
How do I version control my prompts?
ABV provides built-in version control for all prompts with automatic versioning and label-based deployment.

Automatic Versioning:

Every time you create or update a prompt, ABV automatically assigns an incrementing version number.

Labels for Deployment:

Use labels to manage which version is deployed to different environments (production, staging, etc.).

Fetching Specific Versions:

Prompts can be fetched by label or by explicit version number (see the sketch at the end of this answer).

Version Comparison:

The ABV UI provides a diff view to compare prompt versions:
- See exactly what changed between versions (text diff)
- Track who made changes and when (audit trail)
- Review config changes alongside prompt changes
- View commit messages explaining why changes were made

Rollback:

Reassign the production label to a previous version via the API, or perform the rollback in the UI with one click.

Protected Labels:

For additional production safety, admins can mark labels such as the production label as “protected”:
- Only admins/owners can modify protected labels
- Prevents accidental changes to production prompts
- Enforces change management process
- Configure in project settings

Best Practices:
- Always use the production label for deployed versions
- Use the staging label for testing before promoting to production
- Use descriptive labels for experiments (e.g., experiment-longer-context, variant-a)
- The latest label is automatically maintained by ABV (always points to the newest version)
- Never delete old versions - keep history for debugging and rollback
- Use commit messages to document why changes were made
- Review diffs before promoting to production to catch unintended changes

Typical Workflow:
1. Develop prompt changes locally (use label="latest" and cache_ttl_seconds=0)
2. Deploy to staging (labels=["staging"])
3. Test in the staging environment
4. Review metrics and validate quality
5. Promote to production by reassigning the production label
6. Monitor production metrics
7. Rollback if issues are detected (reassign production to the previous version)
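A sketch of the fetch-by-label, fetch-by-version, and rollback flow, assuming the Python SDK; label, cache_ttl_seconds, and get_prompt appear in these docs, while the version parameter and the label-update method name are assumptions:

```python
# Sketch of the label/version workflow, assuming an ABV Python client.
from abv import ABV  # assumed import path

client = ABV()

# Fetch by deployment label (typical in production)
prod_prompt = client.get_prompt("movie-critic", label="production")

# Fetch the newest version while developing, bypassing the cache
dev_prompt = client.get_prompt("movie-critic", label="latest", cache_ttl_seconds=0)

# Fetch a specific version explicitly, e.g. when debugging a past incident
old_prompt = client.get_prompt("movie-critic", version=3)  # assumed parameter

# Rollback: reassign the production label to a known-good version
client.update_prompt_labels(  # assumed method name
    name="movie-critic", version=3, labels=["production"]
)
```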
How do I implement A/B testing for prompts?
ABV enables A/B testing by using labels to identify different prompt variants, then randomly selecting between them in your application.

Step 1: Create Prompt Variants

Create multiple versions of the prompt and label them for your test (e.g., variant-a and variant-b).

Step 2: Implement Random Selection

In your application, randomly pick one of the variant labels per request using the Python or JavaScript/TypeScript SDK.
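A minimal sketch of the random selection step, assuming the Python SDK (get_prompt and label are documented above; the variant label names and split are illustrative):

```python
# Sketch: pick a prompt variant per request for an A/B test.
import random

from abv import ABV  # assumed import path

client = ABV()

def get_ab_test_prompt(traffic_split: float = 0.5):
    """Return (variant_label, prompt); traffic_split is the share sent to variant A."""
    variant = "variant-a" if random.random() < traffic_split else "variant-b"
    prompt = client.get_prompt("movie-critic", label=variant)
    return variant, prompt

# 90/10 canary split while the new variant is still unproven
variant, prompt = get_ab_test_prompt(traffic_split=0.9)
```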
Step 3: Analyze Results

Navigate to your prompt in the ABV UI and view the Metrics tab.

Compare Metrics by Variant:
- Response latency (median, p95, p99)
- Token usage (input tokens, output tokens)
- Cost per request
- Quality scores (if you’re scoring responses via evaluations)
- Volume/distribution between variants
Statistical considerations:
- Run tests long enough to gather sufficient data (minimum 100-200 requests per variant)
- Use statistical tests (t-test, Mann-Whitney U) to determine significance
- Consider using staged rollout (90/10 split initially) for safety
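For example, a Mann-Whitney U test on per-request latencies exported for each variant could look like this (the latency samples are illustrative):

```python
# Significance check on per-request latencies collected for each variant.
# Mann-Whitney U makes no normality assumption, which suits latency data.
from scipy.stats import mannwhitneyu

latencies_a = [820, 790, 845, 910, 780, 805]  # ms, variant A (illustrative)
latencies_b = [700, 755, 690, 740, 725, 760]  # ms, variant B (illustrative)

stat, p_value = mannwhitneyu(latencies_a, latencies_b, alternative="two-sided")

if p_value < 0.05:
    print(f"Significant difference between variants (p={p_value:.3f})")
else:
    print(f"No significant difference yet (p={p_value:.3f}); keep collecting data")
```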
Best Practices:
- Start with a canary deployment (90/10 or 95/5) to limit blast radius
- Monitor error rates and user feedback closely during initial rollout
- Use A/B testing for significant changes (major rewrites, different approaches)
- Run tests long enough for statistical significance (don’t stop early)
- Consider user segments (test on subset of users first)
- Have rollback plan ready (can immediately switch back to variant A)
- Track multiple metrics (not just one - latency, cost, quality, user satisfaction)
- Document test hypotheses and results for organizational learning
When A/B testing is a good fit:
- Testing prompt improvements in production with real users
- Validating changes before full rollout
- When evaluation datasets don’t capture real usage patterns
- For consumer apps where some variation is acceptable
- After thorough testing on evaluation datasets (A/B test is final validation)
Integration & Tracing
How do I link prompt management with tracing in ABV?
Linking prompts to traces enables you to track which prompt version was used for each LLM call and analyze performance by prompt version.

Why Link Prompts to Traces:
- See which prompt version was used in each generation
- Filter traces by prompt name or version
- Track metrics aggregated by prompt version
- Compare performance between prompt versions
- Identify which prompts lead to better outcomes (higher user satisfaction, lower cost, etc.)
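In the SDKs, you attach the fetched prompt object when recording the generation. A sketch assuming the Python SDK exposes a generation-style tracing call (the method and parameter names are assumptions):

```python
# Sketch: link the fetched prompt to a generation so the version is tracked.
from abv import ABV  # assumed import path

client = ABV()

prompt = client.get_prompt("movie-critic", label="production")

generation = client.generation(          # assumed tracing API
    name="movie-review",
    model="gpt-4o",
    input=prompt.compile(criticLevel="expert", movie="Dune"),
    prompt=prompt,  # pass the same object returned by get_prompt()
)

# ... call your LLM with the compiled input here ...

generation.end(output="9/10 - a visually stunning adaptation.")
```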
Benefits of Linking:
- Trace filtering: Filter traces by prompt name or version in the ABV UI
- Metrics by version: See latency, cost, and tokens by prompt version
- Performance comparison: Compare metrics between v1 and v2 of a prompt
- Debugging: Identify which prompt version caused issues in production
- A/B testing: Track metrics by variant label for statistical analysis

Important notes:
- If a fallback prompt is used (when the API is unavailable), no prompt link is created
- The prompt link must be set before the generation completes to appear in metrics
- Use the same prompt object returned by get_prompt() to ensure version tracking works
How do I measure prompt performance?
ABV provides comprehensive metrics when you link prompts to traces, enabling performance tracking by prompt version.

Step 1: Link prompts to generations

Pass the fetched prompt object when recording a generation with the Python or JavaScript/TypeScript SDK, as shown in the previous answer.

Step 2: View metrics in the ABV UI

Navigate to your prompt in the ABV UI and click the Metrics tab.

Available Metrics:
- Median generation latency - How long generations take
- Median input tokens - Token count for prompts sent to LLM
- Median output tokens - Token count for LLM responses
- Median generation costs - Cost per generation (based on model pricing)
- Generation count - Total number of generations using this prompt
- Median score values - From evaluations or custom scores
- First and last generation timestamps - When prompt was first/last used
Comparing versions:
- Use the UI to compare metrics across different prompt versions
- A/B test variants to see which performs better
- Track improvements over time as you iterate on prompts
Quality dimensions worth tracking with custom scores:
- Accuracy (for tasks with right/wrong answers)
- Relevance (how well response addresses the query)
- User satisfaction (thumbs up/down, star ratings)
- Hallucination rate (factual correctness)
- Tone appropriateness (for customer-facing apps)
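A sketch of recording one of these dimensions as a custom score, assuming the ABV SDK exposes a score call (the method name and fields are assumptions):

```python
# Sketch: attach a domain-specific quality score to a traced generation.
from abv import ABV  # assumed import path

client = ABV()

client.score(                       # assumed method name
    trace_id="trace-123",           # illustrative id of the traced generation
    name="relevance",               # custom quality dimension
    value=0.9,                      # scoring scale is up to you
    comment="Directly answered the user's question",
)
```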
Best Practices:
- Always link prompts to generations for metrics tracking
- Track multiple metrics (latency, cost, quality) not just one
- Use custom scores for domain-specific quality measures
- Compare versions systematically using A/B tests
- Monitor trends over time to catch regressions
- Set up alerts for anomalies (cost spikes, latency increases)
See also:
- Link Prompts to Traces for integration details
- Evaluations for automated quality scoring
Next Steps
- Get Started with Prompt Management - Complete quickstart guide for creating, versioning, and deploying prompts
- Caching Prompts - Client-side caching implementation and stale-while-revalidate strategy
- Version Control - Deploy and rollback prompts safely using labels and versions
- A/B Testing Prompts - Run statistical A/B tests on prompt variants in production