This guide shows you how to implement cost tracking in your application using ABV’s SDKs. Choose your preferred language and integration method below.
Before you start: Decide whether to ingest or infer usage data. See the Cost Tracking Reference to understand the differences.
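In short, ingestion means you report the token counts your provider returns, while inference means ABV estimates usage and cost from its model definitions when you don't. A minimal Python sketch of the difference, using the update_current_generation call introduced later in this guide (the abv, response, and messages objects are placeholders here, and the inference behavior is described in the Cost Tracking Reference):

# Ingestion: report the provider's token counts explicitly (exact values)
abv.update_current_generation(
    model="gpt-4o",
    usage_details={
        "input": response.usage.prompt_tokens,
        "output": response.usage.completion_tokens,
    },
)

# Inference: report only model, input, and output; ABV estimates usage and cost
# from its model definitions (only works for models in the definitions list)
abv.update_current_generation(
    model="gpt-4o",
    input=messages,
    output=response.choices[0].message.content,
)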
Python SDK Implementation
Installation
Install the required packages:
pip install abvdev
# Also install your LLM provider SDK
pip install anthropic # for Anthropic
pip install openai # for OpenAI
Method 1: Using the Decorator (Recommended)
The decorator approach is clean and integrates seamlessly with your existing functions.
from abvdev import ABV, observe
import anthropic

# ABV client initialization
abv = ABV(
    api_key="sk-abv-...",
    host="https://app.abv.dev",  # or "https://eu.app.abv.dev" for EU
)

# Anthropic client initialization
anthropic_client = anthropic.Anthropic(api_key="sk-ant-...")

@observe(as_type="generation")
def anthropic_completion(**kwargs):
    # Extract fields from kwargs
    kwargs_clone = kwargs.copy()
    input = kwargs_clone.pop('messages', None)
    model = kwargs_clone.pop('model', None)

    # Set initial generation data
    abv.update_current_generation(
        input=input,
        model=model,
        metadata=kwargs_clone
    )

    # Make the LLM call
    response = anthropic_client.messages.create(**kwargs)

    # Update with usage and cost details
    abv.update_current_generation(
        usage_details={
            "input": response.usage.input_tokens,
            "output": response.usage.output_tokens,
            "cache_read_input_tokens": response.usage.cache_read_input_tokens,
            # "total" is automatically derived if not set
        },
        # Optional: Ingest cost if you calculate it yourself.
        # Otherwise, ABV will infer it from model definitions.
        cost_details={
            "input": 1.0,
            "cache_read_input_tokens": 0.5,
            "output": 1.0,
            # "total" is automatically derived if not set
        }
    )

    return response.content[0].text

@observe()
def main():
    return anthropic_completion(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "Hello, Claude"}
        ]
    )

main()
The same approach works with the OpenAI SDK:

from abvdev import ABV, observe
from openai import OpenAI

# ABV client initialization
abv = ABV(
    api_key="sk-abv-...",
    host="https://app.abv.dev",
)

openai_client = OpenAI(api_key="sk-proj-...")

@observe(as_type="generation")
def openai_completion(**kwargs):
    kwargs_clone = kwargs.copy()
    messages = kwargs_clone.pop('messages', None)
    model = kwargs_clone.pop('model', None)

    abv.update_current_generation(
        input=messages,
        model=model,
        metadata=kwargs_clone
    )

    response = openai_client.chat.completions.create(**kwargs)

    # Use OpenAI-style usage schema (automatically mapped by ABV)
    abv.update_current_generation(
        usage_details={
            "prompt_tokens": response.usage.prompt_tokens,
            "completion_tokens": response.usage.completion_tokens,
            "total_tokens": response.usage.total_tokens,
            "prompt_tokens_details": {
                "cached_tokens": response.usage.prompt_tokens_details.cached_tokens,
                "audio_tokens": response.usage.prompt_tokens_details.audio_tokens,
            },
            "completion_tokens_details": {
                "reasoning_tokens": response.usage.completion_tokens_details.reasoning_tokens,
            },
        }
    )

    return response.choices[0].message.content

@observe()
def main():
    return openai_completion(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello, OpenAI"}]
    )

main()
OpenAI Usage Schema: When using OpenAI-style usage details, ABV automatically maps:
prompt_tokens → input
completion_tokens → output
total_tokens → total
prompt_tokens_details.* → input_*
completion_tokens_details.* → output_*
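For instance, the OpenAI-style payload from the example above is equivalent to ingesting the generic keys directly. The sketch below derives the flattened detail-key names from the mapping rules listed above; they are illustrative assumptions, not output copied from ABV:

# Equivalent generic-schema payload (sketch, derived from the mapping rules above)
abv.update_current_generation(
    usage_details={
        "input": response.usage.prompt_tokens,        # prompt_tokens -> input
        "output": response.usage.completion_tokens,   # completion_tokens -> output
        "total": response.usage.total_tokens,         # total_tokens -> total
        # prompt_tokens_details.* -> input_*
        "input_cached_tokens": response.usage.prompt_tokens_details.cached_tokens,
        # completion_tokens_details.* -> output_*
        "output_reasoning_tokens": response.usage.completion_tokens_details.reasoning_tokens,
    }
)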
Method 2: Manual Context Manager
For more control, use the context manager approach:
from abvdev import ABV
import anthropic

abv = ABV(
    api_key="sk-abv-...",
    host="https://app.abv.dev",
)

anthropic_client = anthropic.Anthropic(api_key="sk-ant-...")

with abv.start_as_current_observation(
    as_type='generation',
    name="anthropic-completion",
    model="claude-haiku-4-5",
    input=[{"role": "user", "content": "Hello, Claude"}]
) as generation:
    # Make the LLM call
    response = anthropic_client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello, Claude"}]
    )

    # Update with response and usage
    generation.update(
        output=response.content[0].text,
        usage_details={
            "input": response.usage.input_tokens,
            "output": response.usage.output_tokens,
            "cache_read_input_tokens": response.usage.cache_read_input_tokens,
        },
        # Optional: Add cost details
        cost_details={
            "input": 1.0,
            "cache_read_input_tokens": 0.5,
            "output": 1.0,
        }
    )
JavaScript/TypeScript SDK Implementation
Installation
Install the required packages:
npm install @abvdev/tracing @abvdev/otel @opentelemetry/sdk-node dotenv openai
Setup
1. Add credentials to .env:
ABV_API_KEY="sk-abv-..."
ABV_BASE_URL="https://app.abv.dev" # or "https://eu.app.abv.dev" for EU
OPENAI_API_KEY="sk-proj-..."
2. Create instrumentation.ts:
import dotenv from "dotenv";
dotenv.config();

import { NodeSDK } from "@opentelemetry/sdk-node";
import { ABVSpanProcessor } from "@abvdev/otel";

const sdk = new NodeSDK({
  spanProcessors: [
    new ABVSpanProcessor({
      apiKey: process.env.ABV_API_KEY,
      baseUrl: process.env.ABV_BASE_URL,
      exportMode: "immediate",
      flushAt: 1,
      flushInterval: 1,
      additionalHeaders: {
        "Content-Type": "application/json",
        "Accept": "application/json",
      },
    }),
  ],
});

sdk.start();
Important: Import instrumentation.ts as the first import in your application to ensure proper initialization.
Method 1: Context Manager (Recommended)
import "./instrumentation" ; // Must be first!
import {
startActiveObservation ,
startObservation ,
} from "@abvdev/tracing" ;
import OpenAI from "openai" ;
import dotenv from "dotenv" ;
dotenv . config ();
const openai = new OpenAI ({
apiKey: process . env . OPENAI_API_KEY ,
});
async function main () {
await startActiveObservation ( "user-request" , async ( span ) => {
span . update ({
input: { query: "What is the capital of France?" },
});
const model = "gpt-4o" ;
const input = [{ role: "user" , content: "What is the capital of France?" }];
// Create a generation observation
const generation = startObservation (
"llm-call" ,
{
model: model ,
input: input ,
},
{ asType: "generation" }
);
// Make the LLM call
const response = await openai . chat . completions . create ({
messages: input ,
model: model ,
});
const llmOutput = response . choices [ 0 ]. message . content ;
// Update with usage details
generation . update ({
usageDetails: {
prompt_tokens: response . usage . prompt_tokens ,
completion_tokens: response . usage . completion_tokens ,
total_tokens: response . usage . total_tokens ,
prompt_tokens_details: {
cached_tokens: response . usage . prompt_tokens_details . cached_tokens ,
audio_tokens: response . usage . prompt_tokens_details . audio_tokens ,
},
completion_tokens_details: {
reasoning_tokens: response . usage . completion_tokens_details . reasoning_tokens ,
},
},
output: { content: llmOutput },
});
generation . end ();
});
}
main ();
Method 2: Observe Wrapper
Wrap existing functions to trace them automatically:
import "./instrumentation" ; // Must be first!
import { observe , updateActiveObservation } from "@abvdev/tracing" ;
import OpenAI from "openai" ;
import dotenv from "dotenv" ;
dotenv . config ();
const openai = new OpenAI ({
apiKey: process . env . OPENAI_API_KEY ,
});
// Existing function
async function fetchData ( source : string ) {
const model = "gpt-4o" ;
const input = [{ role: "user" , content: "What is the capital of France?" }];
const response = await openai . chat . completions . create ({
messages: input ,
model: model ,
});
const llmOutput = response . choices [ 0 ]. message . content ;
// Update the active observation with usage details
updateActiveObservation (
"generation" ,
{
usageDetails: {
prompt_tokens: response . usage . prompt_tokens ,
completion_tokens: response . usage . completion_tokens ,
total_tokens: response . usage . total_tokens ,
prompt_tokens_details: {
cached_tokens: response . usage . prompt_tokens_details . cached_tokens ,
audio_tokens: response . usage . prompt_tokens_details . audio_tokens ,
},
completion_tokens_details: {
reasoning_tokens: response . usage . completion_tokens_details . reasoning_tokens ,
},
},
output: { content: llmOutput },
}
);
return { data: llmOutput };
}
// Wrap the function to trace it
const tracedFetchData = observe ( fetchData , {
name: "fetch-data" ,
asType: "generation" ,
});
async function main () {
const result = await tracedFetchData ( "API" );
}
main ();
Method 3: Manual Span Creation
For maximum control, create spans manually:
import "./instrumentation" ; // Must be first!
import { startObservation } from "@abvdev/tracing" ;
import OpenAI from "openai" ;
import dotenv from "dotenv" ;
dotenv . config ();
const openai = new OpenAI ({
apiKey: process . env . OPENAI_API_KEY ,
});
async function main () {
const span = startObservation ( "manual-observation" , {
input: { query: "What is the capital of France?" },
});
const model = "gpt-4o" ;
const input = [{ role: "user" , content: "What is the capital of France?" }];
const response = await openai . chat . completions . create ({
messages: input ,
model: model ,
});
const llmOutput = response . choices [ 0 ]. message . content ;
// Create a child generation observation
const generation = span . startObservation (
"llm-call" ,
{
model: model ,
input: input ,
},
{ asType: "generation" }
);
// Update with usage details
generation . update ({
usageDetails: {
prompt_tokens: response . usage . prompt_tokens ,
completion_tokens: response . usage . completion_tokens ,
total_tokens: response . usage . total_tokens ,
prompt_tokens_details: {
cached_tokens: response . usage . prompt_tokens_details . cached_tokens ,
audio_tokens: response . usage . prompt_tokens_details . audio_tokens ,
},
completion_tokens_details: {
reasoning_tokens: response . usage . completion_tokens_details . reasoning_tokens ,
},
},
output: { content: llmOutput },
});
generation . end ();
span . update ({ output: "Successfully answered user request." }). end ();
}
main ();
Automatic Usage Capture via Integrations
Many ABV integrations automatically capture usage and cost from LLM responses. If you’re using an integration and usage data isn’t appearing as expected, contact support.
Ingesting Only Usage (Let ABV Calculate Cost)
You don’t need to calculate costs yourself. Simply ingest usage details, and ABV will calculate costs based on model definitions:
# Python example - only usage, no cost
generation.update(
    usage_details={
        "input": response.usage.input_tokens,
        "output": response.usage.output_tokens,
    }
    # No cost_details - ABV will infer cost from model pricing
)
// TypeScript example - only usage, no cost
generation.update({
  usageDetails: {
    prompt_tokens: response.usage.prompt_tokens,
    completion_tokens: response.usage.completion_tokens,
  },
  // No costDetails - ABV will infer cost from model pricing
});
Common Issues
Usage and cost aren't showing up
Check these:
Verify you’re passing usage_details in the update call
Ensure the model name matches a model definition in ABV
For inference: Check if your model is supported in the model definitions list
Look for errors in your application logs
Possible causes:
Using inference instead of ingestion (switch to ingesting actual usage)
Model pricing may have changed (update model definitions)
Custom models need pricing configured
For historical data: Model definitions aren’t applied retroactively
Missing usage for reasoning models
Solution:
Reasoning models (like OpenAI o1) require ingested usage. ABV cannot infer usage because reasoning tokens are generated internally. Always ingest reasoning_tokens from the API response for these models. Learn more about reasoning models.
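A minimal sketch of ingesting reasoning usage with the OpenAI-style schema shown earlier in this guide (assuming the response exposes completion_tokens_details.reasoning_tokens, as current OpenAI reasoning models do):

# Sketch: explicitly ingest reasoning-token usage for a reasoning model
abv.update_current_generation(
    usage_details={
        "prompt_tokens": response.usage.prompt_tokens,
        "completion_tokens": response.usage.completion_tokens,
        "total_tokens": response.usage.total_tokens,
        "completion_tokens_details": {
            # Reasoning tokens are generated internally and cannot be inferred,
            # so they must come from the provider's usage report
            "reasoning_tokens": response.usage.completion_tokens_details.reasoning_tokens,
        },
    }
)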