Understanding the Failure Scenario
Before implementing guaranteed availability, understand when and why prompt fetching can fail.
Normal Operation with Caching
Typical request flow:
- Application starts, first `get_prompt()` call fetches from ABV API
- Prompt is cached locally in SDK (60-second TTL default)
- All subsequent `get_prompt()` calls return from cache (<1ms, zero network)
- After TTL expiry, background revalidation updates cache
- Cycle repeats
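The flow above can be sketched as a small TTL cache with stale-while-revalidate behavior. This is an illustration only, not the SDK's internals: `PromptCache`, its `fetch` callable, the injectable `clock`, and `wait_for_refresh` are hypothetical names chosen for the example.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class PromptCache:
    """Toy TTL cache with stale-while-revalidate (illustrative, not SDK code)."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical network call to the prompt API
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[str, float]] = {}  # name -> (prompt, fetched_at)
        self._refreshers: Dict[str, threading.Thread] = {}

    def get_prompt(self, name: str) -> str:
        with self._lock:
            entry = self._entries.get(name)
        if entry is None:
            # Cold cache: the only path that blocks on the network.
            prompt = self._fetch(name)
            self._store(name, prompt)
            return prompt
        prompt, fetched_at = entry
        if self._clock() - fetched_at >= self._ttl:
            # Expired: serve the stale value now, refresh in the background.
            self._start_refresh(name)
        return prompt

    def _store(self, name: str, prompt: str) -> None:
        with self._lock:
            self._entries[name] = (prompt, self._clock())

    def _start_refresh(self, name: str) -> None:
        with self._lock:
            running = self._refreshers.get(name)
            if running is not None and running.is_alive():
                return  # a refresh for this prompt is already in flight
            thread = threading.Thread(target=self._refresh, args=(name,), daemon=True)
            self._refreshers[name] = thread
            thread.start()

    def _refresh(self, name: str) -> None:
        try:
            self._store(name, self._fetch(name))
        except Exception:
            pass  # fetch failed: keep serving the stale entry

    def wait_for_refresh(self, name: str) -> None:
        """Block until any in-flight refresh finishes (handy in tests)."""
        thread = self._refreshers.get(name)
        if thread is not None:
            thread.join()
```

Only the cold-cache path blocks on the network. Once an entry exists, a failed refresh leaves the stale prompt in place, which is why an outage affects only instances that have never fetched the prompt.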
The Failure Scenario
Prompt fetching fails only when all of these conditions are true simultaneously:
- No cached prompt: Fresh application startup, or fetching a prompt name for the first time
- Network request to ABV fails: After retries (typically 3 attempts)
- No fallback configured: Application didn’t provide a fallback prompt
Situations where all three conditions can coincide:
- Application deployment to new instances during ABV outage
- Kubernetes pod restart during ABV outage
- First use of a new prompt name when ABV is unreachable
When this happens, `get_prompt()` raises an exception and the application must handle the error.
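A minimal sketch of how the three conditions combine, assuming a dict-based cache and a `fetch` callable that raises `ConnectionError` once its own retries are exhausted. `resolve_prompt` and `PromptUnavailableError` are hypothetical names for illustration, not part of the SDK.

```python
from typing import Callable, Dict, Optional


class PromptUnavailableError(RuntimeError):
    """Raised when no cache entry, no network, and no fallback are available."""


def resolve_prompt(
    name: str,
    cache: Dict[str, str],
    fetch: Callable[[str], str],
    fallback: Optional[str] = None,
) -> str:
    # Condition 1 is false: a cached prompt exists, so no network is needed.
    if name in cache:
        return cache[name]
    try:
        # `fetch` is assumed to retry internally before raising.
        cache[name] = fetch(name)
        return cache[name]
    except ConnectionError as exc:
        # Condition 3 is false: a fallback was configured, so use it.
        if fallback is not None:
            return fallback
        # All three conditions hold: the caller must handle this error.
        raise PromptUnavailableError(f"prompt {name!r} is unavailable") from exc
```

Removing any one of the three conditions is enough to avoid the exception, which is what the two options later in this guide do: pre-fetching fills the cache before traffic arrives, and fallbacks supply the third argument.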
Why Most Applications Don't Need Guaranteed Availability
ABV’s high availability:
- Multi-region deployment with automatic failover
- 99.9% uptime SLA
- Public status page with real-time monitoring
- Multiple caching layers (SDK cache, API Redis cache, database fallback)
- Caching eliminates network dependency for most requests
- Stale-while-revalidate ensures zero downtime during cache updates
- Automatic retries with exponential backoff
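To make "retries with exponential backoff" concrete, here is a hedged sketch of the general technique. The attempt count, base delay, and the `retry_with_backoff` helper are assumptions for illustration; the SDK's actual retry parameters may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, doubling the wait after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.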
Option 1: Pre-Fetch Prompts on Startup
Fetch prompts during application initialization and exit if fetching fails.
Python Implementation (Flask)
After installing dependencies and adding the pre-fetch to the application, the behavior is:
- ABV available at startup: Prompts cached, application starts normally
- ABV unavailable at startup: Application exits with error code, orchestration system can retry or alert
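The startup behavior described above might look like the following sketch. It is framework-agnostic and uses only the standard library; `prefetch_prompts` and the `PromptClient` protocol are hypothetical, and the only SDK call assumed is `get_prompt(name)` as named in this guide. In a Flask application the call would run before the server starts accepting requests.

```python
import sys
from typing import Dict, Iterable, Protocol


class PromptClient(Protocol):
    """Anything exposing get_prompt(name); stands in for the SDK client."""

    def get_prompt(self, name: str) -> object: ...


def prefetch_prompts(client: PromptClient, names: Iterable[str]) -> Dict[str, object]:
    """Fetch every required prompt, or exit with code 1 if any fetch fails."""
    prompts: Dict[str, object] = {}
    for name in names:
        try:
            prompts[name] = client.get_prompt(name)
        except Exception as exc:
            # Fail closed: a non-zero exit lets the orchestrator retry or alert.
            print(f"Failed to pre-fetch prompt {name!r}: {exc}", file=sys.stderr)
            sys.exit(1)
    return prompts
```

Because each successful `get_prompt()` call also populates the SDK cache, later calls during request handling return from cache.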
JavaScript/TypeScript Implementation (Express)
The Express version follows the same steps: install dependencies, set environment variables in `.env`, implement the pre-fetch, then run and test the server.
Kubernetes Health Check Integration
Configure Kubernetes to detect pre-fetch failures using the deployment manifest (`deployment.yaml`) and a health endpoint in the application.
Behavior: If pre-fetch fails (app exits with code 1), Kubernetes:
- Detects container exit
- Doesn’t route traffic to failed pod
- Attempts restart with backoff
- Marks the pod as CrashLoopBackOff if restarts keep failing, which your monitoring can alert on
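A health endpoint for a readiness probe could be sketched as a plain WSGI callable, shown here with only the standard library. The `/health` path, the `PREFETCHED` store, and the `REQUIRED_PROMPTS` names are assumptions for illustration; adapt them to whatever your framework and manifest actually use.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical module-level store filled by the startup pre-fetch.
PREFETCHED: Dict[str, str] = {}
# Hypothetical list of prompts the application cannot run without.
REQUIRED_PROMPTS = ("support-agent",)

StartResponse = Callable[[str, List[Tuple[str, str]]], None]


def health_app(environ: dict, start_response: StartResponse) -> List[bytes]:
    """Return 200 when all required prompts are loaded, 503 otherwise."""
    headers = [("Content-Type", "application/json")]
    if environ.get("PATH_INFO") != "/health":
        start_response("404 Not Found", headers)
        return [b'{"error": "not found"}']
    missing = [name for name in REQUIRED_PROMPTS if name not in PREFETCHED]
    status = "200 OK" if not missing else "503 Service Unavailable"
    start_response(status, headers)
    return [json.dumps({"ready": not missing, "missing": missing}).encode()]
```

A probe that receives 503 keeps the pod out of the Service's endpoints, so traffic only reaches instances that finished pre-fetching.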
Option 2: Fallback Prompts
Provide hardcoded fallback prompts when ABV is unreachable.
Python SDK Fallback
Both text prompts and chat prompts accept a fallback. Key properties:
- `prompt.is_fallback` (bool): `True` if fallback prompt is being used
- `prompt.compile(**vars)`: Works identically for ABV and fallback prompts
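The two properties above can be demonstrated offline with a stand-in client. Everything here is a sketch under assumptions: `OfflineClient` and `Prompt` are hypothetical stubs, the `fallback=` keyword and the `{{variable}}` template syntax are guesses at the SDK's conventions, and only `is_fallback` and `compile(**vars)` come from this guide.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prompt:
    """Stub prompt object exposing the two properties listed above."""

    template: str
    is_fallback: bool

    def compile(self, **variables: object) -> str:
        # Replace each {{name}} placeholder (assumed syntax) with its value.
        return re.sub(
            r"\{\{(\w+)\}\}",
            lambda match: str(variables[match.group(1)]),
            self.template,
        )


class OfflineClient:
    """Stand-in for the SDK client while ABV is unreachable."""

    def get_prompt(self, name: str, fallback: Optional[str] = None) -> Prompt:
        if fallback is None:
            raise ConnectionError(f"cannot fetch {name!r}: ABV unreachable")
        return Prompt(template=fallback, is_fallback=True)


client = OfflineClient()
prompt = client.get_prompt("movie-critic", fallback="Review {{movie}} as a critic.")
text = prompt.compile(movie="Dune")
```

Calling code never branches on where the prompt came from; it can read `prompt.is_fallback` purely for logging or metrics.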
JavaScript/TypeScript SDK Fallback
Both text prompts and chat prompts accept a fallback. Key properties:
- `prompt.isFallback` (boolean): `true` if fallback prompt is being used
- `prompt.compile(vars)`: Works identically for ABV and fallback prompts
Fallback Best Practices
Keep fallbacks in sync with production, monitor fallback usage, and understand fallback limitations:
- No metrics tracking: Fallback prompts aren’t linked to traces, so you lose version-specific metrics
- No config: Fallback prompts don’t include `config` field (model parameters, tools, etc.)
- Maintenance burden: Must keep fallbacks updated manually
- Version mismatch risk: Fallback may not match current production prompt
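One way to monitor fallback usage is to count and log every time a fallback is served. This is a minimal sketch: `track_fallback`, the `fallback_hits` counter, and the logger name are hypothetical, and it relies only on the `is_fallback` property described earlier.

```python
import logging
from collections import Counter

logger = logging.getLogger("prompts")
fallback_hits: Counter = Counter()  # prompt name -> times a fallback was served


def track_fallback(name: str, prompt: object) -> bool:
    """Record fallback usage; returns True when the prompt is a fallback."""
    if getattr(prompt, "is_fallback", False):
        fallback_hits[name] += 1
        logger.warning("Serving fallback for prompt %r (ABV unreachable?)", name)
        return True
    return False
```

Exporting `fallback_hits` to your metrics system, or alerting on the warning log line, makes degraded operation visible even though the fallback prompts themselves are not linked to traces.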
Comparing Approaches
Pre-Fetching vs. Fallbacks
| Aspect | Pre-Fetching | Fallbacks |
|---|---|---|
| Startup behavior | Fails if ABV unreachable | Always starts successfully |
| Operational mode | Fail closed (won’t run degraded) | Fail open (runs with fallbacks) |
| Metrics tracking | Full metrics (uses ABV prompts) | No metrics during fallback |
| Maintenance | No additional maintenance | Must keep fallbacks updated |
| Version mismatch risk | None (always uses latest from ABV) | High (fallbacks can diverge) |
| Complexity | Low (one-time fetch) | Medium (manage fallbacks) |
| Best for | Applications that shouldn’t run degraded | Applications requiring 100% uptime |
Do You Really Need Guaranteed Availability?
Questions to ask:
- What’s the impact of a startup failure?
  - If deployment tools retry automatically, brief startup failures are harmless
  - If manual intervention is required, pre-fetching adds risk
- What’s your uptime requirement?
  - 99% (two nines): SDK caching is sufficient
  - 99.9% (three nines): Consider pre-fetching
  - 99.99% (four nines): Consider fallbacks or pre-fetching with multiple retries
- Do you have other single points of failure?
  - Database, auth service, payment gateway all have similar availability
  - If you tolerate those failures, why special-case prompts?
- Can you tolerate fallback degradation?
  - Fallback prompts may have lower quality than ABV-managed prompts
  - If quality is critical, pre-fetching (fail closed) is better
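To put the uptime tiers in concrete terms, the arithmetic below converts an uptime percentage into the downtime it permits per year. It assumes a 365-day year; `allowed_downtime_hours` is a helper written for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours of downtime per year permitted by an uptime target."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

Two nines leaves about 87.6 hours per year, three nines about 8.76 hours, and four nines about 53 minutes, which is why the stricter tiers justify the extra work of pre-fetching or fallbacks.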