When you build LLM-powered applications, you face several common challenges. The LLM Gateway was designed to solve these specific problems.
Get started with $1 in free credits—new users receive complimentary gateway credits to test real LLM models from OpenAI, Anthropic, and Google without needing your own provider API keys!

How the Gateway Works

Understanding the gateway’s architecture helps you use it more effectively. When you make a request through the gateway, here’s what happens:

Your request is sent to the gateway

Your client library sends your request to ABV’s gateway service. This request includes which provider you want to use (OpenAI, Anthropic, or Gemini), which model you want, and your messages. The request uses a standardized format that’s compatible with OpenAI’s API structure.
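For example, a request through the gateway can be made with a standard OpenAI-compatible client. The sketch below is illustrative only: the gateway base URL, the credential environment variable, and the provider-prefixed model name are assumptions, not ABV's documented values.

```python
# Minimal sketch of a gateway request via an OpenAI-compatible client.
# The base URL, env var name, and "provider/model" string are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.abv.example/v1",  # assumed gateway endpoint
    api_key=os.environ["ABV_API_KEY"],          # assumed credential name
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet",  # provider and model selection (illustrative name)
    messages=[{"role": "user", "content": "Summarize this ticket in one sentence."}],
    temperature=0.2,
    max_tokens=200,
)
print(response.choices[0].message.content)
```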

Automatic tracing begins

The gateway receives your request and immediately begins tracing it. Before the request even goes to the provider, ABV creates a trace record that captures:
  • Input data: Your messages, temperature, max tokens, and other parameters
  • Timing information: Request start time
  • Contextual metadata: Provider, model, user identifier, and session information
This happens automatically without any instrumentation code in your application. You don’t need to manually log anything or set up monitoring infrastructure.
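Conceptually, the trace record created at this point looks something like the sketch below. The field names are illustrative, chosen to mirror the list above, and are not ABV's actual schema.

```python
# Illustrative shape of a trace record at request time; field names are
# assumptions for explanation, not ABV's actual schema.
trace = {
    "input": {
        "messages": [{"role": "user", "content": "Summarize this ticket in one sentence."}],
        "temperature": 0.2,
        "max_tokens": 200,
    },
    "timing": {"start_time": "2025-01-15T12:00:00Z"},
    "metadata": {
        "provider": "anthropic",
        "model": "claude-sonnet",
        "user_id": "user-123",
        "session_id": "session-456",
    },
}
```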

Request translation

The gateway translates your request into the specific format that your chosen provider expects. If you're calling Anthropic, it converts your OpenAI-formatted request into Anthropic's format. If you're calling Gemini, it performs a different translation. This translation happens transparently - you never see it or need to think about it.
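The sketch below illustrates the kind of mapping involved for an Anthropic request. It is a simplified illustration of the idea, not ABV's actual translation code.

```python
# Simplified sketch of translating an OpenAI-style chat request into
# Anthropic's Messages API shape; not ABV's actual implementation.
def openai_to_anthropic(request: dict) -> dict:
    # Anthropic takes the system prompt as a top-level field rather than
    # as a message with role "system".
    system = "\n".join(m["content"] for m in request["messages"] if m["role"] == "system")
    payload = {
        "model": request["model"],
        "messages": [m for m in request["messages"] if m["role"] != "system"],
        "max_tokens": request.get("max_tokens", 1024),  # required by Anthropic's API
    }
    if system:
        payload["system"] = system
    if "temperature" in request:
        payload["temperature"] = request["temperature"]
    return payload
```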

Provider API call

The gateway makes the actual API call to your chosen provider using your provider’s credentials (which you’ve configured in ABV).

Response translation

When the provider responds, the gateway translates that response back into the standardized OpenAI-compatible format. It also extracts key metrics like token usage and latency.
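A simplified illustration of this normalization step for an Anthropic response is shown below; it conveys the shape of the mapping rather than ABV's actual implementation.

```python
# Simplified sketch of normalizing an Anthropic response into the
# OpenAI-compatible shape the gateway returns; not ABV's actual code.
def anthropic_to_openai(resp: dict) -> dict:
    text = "".join(block["text"] for block in resp["content"] if block["type"] == "text")
    usage = resp["usage"]
    return {
        "model": resp["model"],
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": resp.get("stop_reason"),
        }],
        "usage": {
            "prompt_tokens": usage["input_tokens"],
            "completion_tokens": usage["output_tokens"],
            "total_tokens": usage["input_tokens"] + usage["output_tokens"],
        },
    }
```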

Trace completion and response

The gateway completes the trace with all the data collected during the request:
  • Output data: The model’s response, token usage, and any errors that occurred
  • Performance metrics: Total request duration and latency breakdown
  • Final status: Success or failure indicators
The complete trace is immediately available in your ABV dashboard, where you can filter by provider, model, user, time range, or error status. You can see patterns such as which models are slowest, which requests use the most tokens, or where errors are occurring. Finally, the gateway returns the standardized response to your application.
Throughout this entire process, your application only sees the standardized interface. The complexity of dealing with different providers is handled by the gateway, while comprehensive observability happens automatically.

When to Use the Gateway

The gateway is particularly valuable in these scenarios:
When cost control, performance monitoring, and reliability matter, the automatic tracing gives you the visibility you need to optimize and debug. Every request is automatically traced with detailed metrics on tokens, costs, latency, and errors - giving you the data you need to make informed decisions about your AI infrastructure.
Experiment with different providers or models without rewriting your application. The consistent interface makes switching trivial - change one parameter and you’re calling a different provider. This is invaluable when evaluating which model works best for your use case, or when you want to A/B test different providers.
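Reusing the client from the earlier sketch, switching providers is just a different model string. The provider-prefixed names below are illustrative assumptions, not ABV's model catalog.

```python
# Same client, same request shape; only the model string changes.
for model in ("openai/gpt-4o-mini", "anthropic/claude-sonnet", "gemini/gemini-1.5-flash"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Explain vector embeddings in two sentences."}],
    )
    print(model, "->", response.choices[0].message.content[:80])
```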
Concerned about provider lock-in or want redundancy in case a provider has an outage? The gateway makes it easy to route requests to different providers. You can implement fallback logic, distribute load across providers, or seamlessly migrate from one provider to another without code changes.
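A minimal fallback sketch, assuming the OpenAI-compatible client from the earlier example: the fallback logic lives in your application, but because every provider is called the same way through the gateway, it stays small.

```python
# Sketch of simple fallback logic across providers via the gateway.
# Model names are illustrative assumptions.
from openai import APIError

def complete_with_fallback(messages, models=("openai/gpt-4o-mini", "anthropic/claude-sonnet")):
    last_error = None
    for model in models:  # try providers in order of preference
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIError as exc:
            last_error = exc  # provider error or outage: fall through to the next model
    raise last_error
```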
Support multiple models or providers in the same application - let users choose their preferred model, route different request types to different models based on complexity, or use different models for different features. All through a single, unified interface.
Get consistent error handling without learning the quirks of each provider’s error types and retry logic. The gateway normalizes these concerns, providing a standardized error interface regardless of which provider you’re using.
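For example, because the gateway exposes an OpenAI-compatible interface, the same SDK exception types apply regardless of which provider handled the request. This is a sketch assuming the client from the earlier example, not a statement about ABV's specific error mapping.

```python
# Handle errors the same way no matter which provider is behind the request.
from openai import RateLimitError, APIStatusError

try:
    response = client.chat.completions.create(
        model="gemini/gemini-1.5-flash",  # illustrative model name
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    ...  # back off and retry, independent of which provider is rate-limiting
except APIStatusError as exc:
    print("Request failed with status", exc.status_code)
```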
Track and analyze costs across providers in real-time, helping you make informed decisions about which models to use for different workloads. Compare costs between providers, identify expensive requests, and optimize your usage patterns based on actual data.

What the Gateway Doesn’t Do

It’s equally important to understand what the gateway is not.
It doesn’t train models, host models, or process your requests with its own AI. Every request goes to the provider you specify (OpenAI, Anthropic, or Gemini). The gateway acts purely as a translation and observability layer.
It performs efficient request translation and tracing, but the vast majority of your request time is spent waiting for the AI provider to generate tokens. The gateway’s overhead is minimal - typically just a few milliseconds for the translation and tracing operations.
Your messages flow through the gateway to the provider and back. The traces store metadata and request/response information for observability, but ABV doesn’t use your data to train models or for any purpose beyond providing you visibility into your own requests.
If you’re using advanced provider-specific features that don’t have cross-provider equivalents, you may need to call those providers directly. However, for the common case of chat completions (which covers most AI applications), the gateway provides everything you need.

Related Topics