How the Gateway Works
Understanding the gateway's architecture helps you use it more effectively. When you make a request through the gateway, here's what happens:

Your request is sent to the gateway
Your client library sends your request to ABV's gateway service. This request includes which provider you want to use (OpenAI, Anthropic, or Gemini), which model you want, and your messages. The request uses a standardized format that's compatible with OpenAI's API structure.
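As a sketch, such a request body might look like the following. The `provider` field and the model name are illustrative assumptions, not ABV's actual schema; the message structure follows the OpenAI chat format described above.

```python
# Hypothetical gateway request payload. The "provider" field and the model
# name are illustrative assumptions; the rest follows the OpenAI-compatible
# chat format.
import json

payload = {
    "provider": "anthropic",    # which upstream provider to route to
    "model": "example-model",   # provider-specific model identifier
    "messages": [
        {"role": "user", "content": "Summarize this document."}
    ],
    "temperature": 0.7,
    "max_tokens": 512,
}

# The client library serializes this and sends it to the gateway service.
body = json.dumps(payload)
```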
Automatic tracing begins
The gateway receives your request and immediately begins tracing it. Before the request even goes to the provider, ABV creates a trace record that captures:
- Input data: Your messages, temperature, max tokens, and other parameters
- Timing information: Request start time
- Contextual metadata: Provider, model, user identifier, and session information
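The captured fields can be pictured as a record like this sketch. The field names are assumptions for illustration, not ABV's actual trace schema.

```python
# Illustrative trace record created before the provider call is made.
# Field names are assumptions, not ABV's actual schema.
import time
import uuid

trace = {
    "trace_id": str(uuid.uuid4()),
    # Input data: messages and sampling parameters
    "input": {
        "messages": [{"role": "user", "content": "Hello"}],
        "temperature": 0.7,
        "max_tokens": 512,
    },
    # Timing information: request start time (epoch seconds)
    "start_time": time.time(),
    # Contextual metadata
    "metadata": {
        "provider": "openai",
        "model": "example-model",
        "user_id": "user-123",
        "session_id": "sess-abc",
    },
}
```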
Request translation
The gateway translates your request into the specific format that your chosen provider expects. If you're calling Anthropic, it converts your OpenAI-formatted request into Anthropic's format. If you're calling Gemini, it performs a different translation. This translation happens transparently - you never see it or need to think about it.
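One concrete example of what such a translation involves: Anthropic's Messages API takes the system prompt as a top-level `system` field rather than as a message with role `system`. A minimal sketch of that single step (a real translator covers many more differences):

```python
# Sketch of one OpenAI-to-Anthropic translation step: lifting system
# messages out of the message list into Anthropic's top-level "system"
# field. Simplified for illustration only.
def to_anthropic(request: dict) -> dict:
    system_text = "\n".join(
        m["content"] for m in request["messages"] if m["role"] == "system"
    )
    translated = {
        "model": request["model"],
        "messages": [m for m in request["messages"] if m["role"] != "system"],
        # Anthropic requires max_tokens; fall back to a default if unset
        "max_tokens": request.get("max_tokens", 1024),
    }
    if system_text:
        translated["system"] = system_text
    return translated
```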
Provider API call
The gateway makes the actual API call to your chosen provider using your provider's credentials (which you've configured in ABV).
Response translation
When the provider responds, the gateway translates that response back into the standardized OpenAI-compatible format. It also extracts key metrics like token usage and latency.
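As an illustration of that metric extraction: providers report token usage under different names. OpenAI responses use `prompt_tokens`, `completion_tokens`, and `total_tokens`, while Anthropic responses use `input_tokens` and `output_tokens`. A normalizer for that one detail might sketch like this:

```python
# Sketch of normalizing provider token usage into the OpenAI-style shape.
# Anthropic reports "input_tokens"/"output_tokens"; OpenAI reports
# "prompt_tokens"/"completion_tokens"/"total_tokens".
def normalize_usage(provider: str, usage: dict) -> dict:
    if provider == "anthropic":
        prompt = usage["input_tokens"]
        completion = usage["output_tokens"]
    else:  # assume the usage block is already OpenAI-shaped
        prompt = usage["prompt_tokens"]
        completion = usage["completion_tokens"]
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }
```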
Trace completion and response
The gateway completes the trace with all the data collected during the request:
- Output data: The model's response, token usage, and any errors that occurred
- Performance metrics: Total request duration and latency breakdown
- Final status: Success or failure indicators
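Continuing the earlier trace sketch, the completion step might look like this (field names remain illustrative assumptions):

```python
# Illustrative trace completion step. Field names are assumptions,
# not ABV's actual schema.
import time

def complete_trace(trace, response, error=None):
    end_time = time.time()
    # Output data: the model's response, usage, and any error
    trace["output"] = {
        "response": response,
        "usage": response.get("usage"),
        "error": error,
    }
    # Performance metrics: total request duration in milliseconds
    trace["duration_ms"] = (end_time - trace["start_time"]) * 1000
    # Final status: success or failure indicator
    trace["status"] = "error" if error else "success"
    return trace
```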
Throughout this entire process, your application only sees the standardized interface. The complexity of dealing with different providers is handled by the gateway, while comprehensive observability happens automatically.
When to Use the Gateway
The gateway is particularly valuable in these scenarios:

Production Applications
When cost control, performance monitoring, and reliability matter, the automatic tracing gives you the visibility you need to optimize and debug. Every request is automatically traced with detailed metrics on tokens, costs, latency, and errors - giving you the data you need to make informed decisions about your AI infrastructure.
Provider Flexibility
Experiment with different providers or models without rewriting your application. The consistent interface makes switching trivial - change one parameter and you're calling a different provider. This is invaluable when evaluating which model works best for your use case, or when you want to A/B test different providers.
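Because the interface is uniform, that one-parameter switch can be sketched directly (provider and model names here are illustrative):

```python
# Switching providers by changing a single field (names illustrative).
# The message format, response shape, and tracing stay identical.
base_request = {
    "messages": [{"role": "user", "content": "Classify this ticket."}],
    "max_tokens": 256,
}

request_a = {**base_request, "provider": "openai", "model": "model-a"}
request_b = {**base_request, "provider": "anthropic", "model": "model-b"}
```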
Avoiding Vendor Lock-in
Concerned about provider lock-in or want redundancy in case a provider has an outage? The gateway makes it easy to route requests to different providers. You can implement fallback logic, distribute load across providers, or seamlessly migrate from one provider to another without code changes.
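Fallback logic on top of the unified interface might be sketched like this, where `call_gateway` is a hypothetical stand-in for the client library's request method:

```python
# Sketch of provider fallback over a unified interface. `call_gateway`
# is a hypothetical stand-in for the client library's request method.
def with_fallback(call_gateway, request, providers):
    last_error = None
    for provider in providers:
        try:
            # Same request, different routing target
            return call_gateway({**request, "provider": provider})
        except Exception as exc:  # a real version would catch specific errors
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```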
Multi-Model Support
Support multiple models or providers in the same application - let users choose their preferred model, route different request types to different models based on complexity, or use different models for different features. All through a single, unified interface.
Simplified Error Handling
Get consistent error handling without learning the quirks of each provider's error types and retry logic. The gateway normalizes these concerns, providing a standardized error interface regardless of which provider you're using.
Cost Optimization
Track and analyze costs across providers in real-time, helping you make informed decisions about which models to use for different workloads. Compare costs between providers, identify expensive requests, and optimize your usage patterns based on actual data.
What the Gateway Doesn't Do
It's equally important to understand what the gateway is not.

The gateway is not an LLM itself
It doesn't train models, host models, or process your requests with its own AI. Every request goes to the provider you specify (OpenAI, Anthropic, or Gemini). The gateway acts purely as a translation and observability layer.
The gateway doesn't add significant latency
It performs efficient request translation and tracing, but the vast majority of your request time is spent waiting for the AI provider to generate tokens. The gateway's overhead is minimal - typically just a few milliseconds for the translation and tracing operations.
The gateway doesn't store your data beyond tracing
Your messages flow through the gateway to the provider and back. The traces store metadata and request/response information for observability, but ABV doesn't use your data to train models or for any purpose beyond providing you visibility into your own requests.
The gateway doesn't replace provider SDKs for advanced features
If you're using advanced provider-specific features that don't have cross-provider equivalents, you may need to call those providers directly. However, for the common case of chat completions (which covers most AI applications), the gateway provides everything you need.