The LLM Gateway supports a comprehensive range of models from OpenAI, Anthropic, and Google Gemini. For the latest official model information and pricing, consult each provider's official documentation.
Don’t see your model? We’re always adding new models based on user needs. Contact our support team to request additional models.

GPT-5 Series

| Model | Input Price | Output Price |
| --- | --- | --- |
| gpt-5 | $0.00125 per 1K tokens | $0.01 per 1K tokens |
| gpt-5-mini | $0.00025 per 1K tokens | $0.002 per 1K tokens |
| gpt-5-nano | $0.00005 per 1K tokens | $0.0004 per 1K tokens |
| gpt-5-pro | ChatGPT Pro subscription ($200/month) | — |

GPT-4 Series

| Model | Input Price | Output Price |
| --- | --- | --- |
| gpt-4 | $0.03 per 1K tokens | $0.06 per 1K tokens |
| gpt-4-turbo | $0.01 per 1K tokens | $0.03 per 1K tokens |
| gpt-4o | $0.0025 per 1K tokens | $0.01 per 1K tokens |
| gpt-4o-mini | $0.00015 per 1K tokens | $0.0006 per 1K tokens |
| gpt-4.1 | $0.002 per 1K tokens | $0.008 per 1K tokens |
| gpt-4.1-mini | $0.0004 per 1K tokens | $0.0016 per 1K tokens |
| gpt-4.1-nano | $0.0001 per 1K tokens | $0.0004 per 1K tokens |

O-Series (Reasoning Models)

| Model | Input Price | Output Price |
| --- | --- | --- |
| o1 | $0.015 per 1K tokens | $0.06 per 1K tokens |
| o1-pro | $0.15 per 1K tokens | $0.60 per 1K tokens |
| o3 | $0.002 per 1K tokens | $0.008 per 1K tokens |
| o3-mini | $0.0011 per 1K tokens | $0.0044 per 1K tokens |
| o4-mini | $0.0011 per 1K tokens | $0.0044 per 1K tokens |

Specialized Models

| Model | Input Price | Output Price |
| --- | --- | --- |
| codex-mini-latest | $0.0015 per 1K tokens | $0.006 per 1K tokens |
| gpt-4o-mini-search-preview | gpt-4o-mini pricing + web search fees | |
| gpt-4o-search-preview | gpt-4o pricing + web search fees | |

Understanding Pricing

Input and output pricing differ because generation requires more computation than processing. The model reads and encodes input tokens in a single parallel pass, but output tokens are generated one at a time, each requiring its own forward pass through the model. Applications with long outputs benefit from models with lower output costs, while applications processing large inputs should weigh input pricing carefully.
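As a concrete illustration, the cost of a single request is just the input and output token counts weighted by their respective per-1K rates. The example below uses the gpt-4o prices from the table above:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Total cost in USD for one request, given per-1K-token prices."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# gpt-4o: $0.0025 per 1K input tokens, $0.01 per 1K output tokens
cost = request_cost(input_tokens=2000, output_tokens=500,
                    input_price_per_1k=0.0025, output_price_per_1k=0.01)
print(f"${cost:.4f}")  # → $0.0100
```

Note how the 500 output tokens cost as much here as the 2,000 input tokens: output rates dominate quickly for generation-heavy workloads.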
Select models based on your specific use case:
  • High-volume, simple tasks: Use the most cost-effective models like gpt-4o-mini, claude-haiku-4-5, or gemini-2.5-flash-lite
  • Complex reasoning: Consider claude-sonnet-4-5, gpt-4o, or O-series models for tasks requiring careful analysis
  • Long outputs: Prioritize models with lower output token costs like Gemini Flash models
  • Large context inputs: Look at input pricing—Gemini models often offer the best value for processing large documents
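To make the trade-off concrete, here is a small sketch that ranks a few of the OpenAI models priced on this page by total cost for a given workload shape (the per-1K rates are taken from the tables above; the workload numbers are illustrative):

```python
# Per-1K-token prices in USD, (input, output), from the tables above
PRICES = {
    "gpt-4o":       (0.0025,  0.01),
    "gpt-4o-mini":  (0.00015, 0.0006),
    "gpt-4.1-nano": (0.0001,  0.0004),
    "o4-mini":      (0.0011,  0.0044),
}

def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request against the given model's price table."""
    inp, out = PRICES[model]
    return (input_tokens / 1000) * inp + (output_tokens / 1000) * out

# A long-output workload: 1K tokens in, 10K tokens out per request
ranked = sorted(PRICES, key=lambda m: workload_cost(m, 1_000, 10_000))
for model in ranked:
    print(f"{model}: ${workload_cost(model, 1_000, 10_000):.4f}")
```

For this output-heavy shape the nano/mini tiers win by more than an order of magnitude; rerun with a large-input, short-output shape and the ranking is driven by input rates instead.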
The gateway automatically tracks token usage and costs for every request. Use the ABV dashboard to:
  • Monitor real-time spending across providers and models
  • Identify expensive requests or usage patterns
  • Compare costs between different providers for the same task
  • Set up alerts when spending exceeds thresholds
This visibility helps you optimize costs while maintaining the quality your application needs.

Next Steps

Ready to start using these models? Here’s where to go next: