How Log Levels Work

ABV supports four log levels to categorize observations by severity:
  • DEBUG: Verbose internal details (tool calls, intermediate steps, debugging info)
  • DEFAULT: Standard operations (successful LLM calls, normal workflow steps)
  • WARNING: Degraded performance or unexpected behavior (slow responses, fallbacks, retries)
  • ERROR: Failures (API errors, timeouts, invalid outputs, exceptions)

Understand Log Level Hierarchy

Log levels follow a severity hierarchy from least to most critical:
DEBUG < DEFAULT < WARNING < ERROR
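In practice, "at or above a level" comparisons (used later for dashboard filters and minimum-level sampling) treat each level as a numeric severity. A minimal conceptual sketch in Python, not part of the ABV SDK:
# Conceptual severity mapping; the ABV SDK handles this comparison internally
SEVERITY = {"DEBUG": 0, "DEFAULT": 1, "WARNING": 2, "ERROR": 3}

def at_or_above(level: str, minimum: str) -> bool:
    """True if `level` is at least as severe as `minimum`."""
    return SEVERITY[level] >= SEVERITY[minimum]

at_or_above("WARNING", "DEFAULT")  # True
at_or_above("DEBUG", "WARNING")    # False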
When to use each level:
DEBUG (verbose details)
  • Internal tool executions
  • Intermediate processing steps
  • Variable values for debugging
  • Cache hits/misses
  • Retry attempts before failure
DEFAULT (normal operations)
  • Successful LLM API calls
  • Standard workflow completions
  • User interactions without issues
  • Expected behavior
WARNING (concerning but non-fatal)
  • Slow LLM responses (>5 seconds)
  • Fallback to alternative model/prompt
  • Deprecated feature usage
  • Rate limit warnings (approaching threshold)
  • Validation warnings (non-blocking)
ERROR (failures)
  • LLM API errors (401, 500, timeouts)
  • Exceptions and crashes
  • Invalid outputs (failed parsing, guardrail violations)
  • Data processing failures
  • Critical validation failures

Set Log Levels on Observations

Assign log levels when creating spans or generations, or update them dynamically based on runtime conditions.
Python (set on creation):
from abvdev import ABV

abv = ABV(api_key="sk-abv-...")

# Create span with WARNING level
with abv.start_as_current_observation(
    as_type='span',
    name="risky-operation",
    level="WARNING",
    status_message="Operation may fail with invalid input"
) as span:
    result = process_risky_data(data)
Python (update dynamically):
with abv.start_as_current_observation(
    as_type='generation',
    name="llm-call",
    level="DEFAULT"
) as generation:
    try:
        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": query}]
        )
        generation.update(output=response.choices[0].message.content)

    except Exception as e:
        # Update to ERROR on failure
        generation.update(
            level="ERROR",
            status_message=f"LLM call failed: {str(e)}"
        )
        raise
JavaScript/TypeScript:
import { startObservation } from '@abvdev/tracing';

const span = startObservation('manual-observation', {
  input: { query: 'What is the capital of France?' },
  level: 'WARNING',
  statusMessage: 'This operation is experimental'
});

// Update level dynamically
span.update({
  level: 'ERROR',
  statusMessage: 'Operation failed'
});

span.end();

Add Status Messages for Context

Include a statusMessage alongside the log level to provide human-readable context about why an observation has a particular severity.
Good status messages:
  • "LLM timeout after 30 seconds" (ERROR)
  • "Fallback to gpt-3.5-turbo due to rate limit" (WARNING)
  • "Cache miss, fetching from API" (DEBUG)
  • "Retry 2/3 after transient error" (WARNING)
Bad status messages:
  • "Error" (too generic)
  • "Something went wrong" (not actionable)
  • "" (empty, provides no context)
Example:
with abv.start_as_current_observation(
    as_type='span',
    name="guardrail-check",
    level="DEFAULT"
) as span:
    result = check_for_biased_language(output)

    if result.violations:
        span.update(
            level="WARNING",
            status_message=f"Detected {len(result.violations)} potential bias issues"
        )
    else:
        span.update(
            level="DEFAULT",
            status_message="No guardrail violations detected"
        )

Filter Traces by Log Level in Dashboard

In the ABV Dashboard, filter traces or observations by log level to focus on specific severity levels.
Use cases:
  • Production debugging: Filter to ERROR only to see all failures
  • Performance optimization: Filter to WARNING to find slow or degraded operations
  • Development: Show DEBUG to see full execution details
Dashboard filters:
  • View single trace → Filter observations by level
  • Trace list view → Filter entire traces containing ERROR observations
  • Search queries: level = "ERROR" or level IN ["WARNING", "ERROR"]
This helps you quickly identify issues without scrolling through hundreds of DEBUG observations.

Set Minimum Log Level for Sampling

Configure your SDK to only send observations at or above a certain log level. This reduces ingestion costs while preserving critical error data.
Example: Only log warnings and errors in production
import os

# Set minimum log level based on environment
min_level = "WARNING" if os.getenv("ENV") == "production" else "DEBUG"

abv = ABV(
    api_key="sk-abv-...",
    min_log_level=min_level  # Only send WARNING and ERROR in production
)
Result:
  • Development: All DEBUG, DEFAULT, WARNING, ERROR observations logged
  • Production: Only WARNING and ERROR observations logged (70% cost reduction)
Note: Check your SDK documentation for exact parameter names (min_log_level, log_level, etc.).

Why Use Log Levels?

Production traces can contain hundreds of observations. Log levels let you filter to what matters.
Filter examples:
  • level = "ERROR": See only failed observations
  • level >= "WARNING": See concerning behavior leading to failure
  • level = "ERROR" AND environment = "production": Production failures only
Benefits:
  • Find failures in seconds instead of scrolling through hundreds of successful operations
  • Eliminate noise from DEBUG logs in production
  • Focus on actionable errors
Ingesting every DEBUG observation from production is expensive and unnecessary. Filter to WARNING/ERROR levels to reduce ingestion costs by 70-90% while preserving critical error data.
Example: A production app with 15 observations per trace (3 DEBUG, 10 DEFAULT, 2 ERROR/WARNING) can reduce log volume by 87% by filtering out the DEBUG and DEFAULT levels.
Implementation:
import os

# Environment-based log level
if os.getenv("ENV") == "production":
    min_level = "WARNING"  # Only warnings and errors
elif os.getenv("ENV") == "staging":
    min_level = "DEFAULT"  # Standard operations and above
else:
    min_level = "DEBUG"  # Everything in development

abv = ABV(api_key="sk-abv-...", min_log_level=min_level)
Best practice: Use DEBUG in development and WARNING in production. Errors are always critical, so ERROR observations should be logged in every environment.
Generic alerts ("LLM API returned 500") fire constantly and get ignored. Log level-based alerts are precise and actionable.
Alert strategies:
1. Error rate threshold
  • Alert when ERROR observations exceed 1% of total traces
  • Catches systematic failures (API outage, bad prompt deployment)
2. Absolute error count
  • Alert when >100 ERROR observations in 5 minutes
  • Catches spikes in failures
3. Warning accumulation
  • Alert when WARNING observations exceed 10% of traces
  • Catches degraded performance (slow responses, frequent retries)
4. Zero errors expected
  • Alert on any ERROR in critical workflows (payment processing, compliance tasks)
  • Catches every failure immediately
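For the error rate threshold strategy, a minimal sketch of the counting logic; the observation levels could come from the same webhook events used in the PagerDuty example below, and the window and threshold values are illustrative:
from collections import deque
import time

WINDOW_SECONDS = 300          # 5-minute rolling window
ERROR_RATE_THRESHOLD = 0.01   # alert when >1% of recent observations are ERROR

recent = deque()  # (timestamp, level) pairs

def record_and_check(level: str) -> bool:
    """Record one observation level; return True when the error rate crosses the threshold."""
    now = time.time()
    recent.append((now, level))
    # Drop observations that have aged out of the window
    while recent and recent[0][0] < now - WINDOW_SECONDS:
        recent.popleft()
    errors = sum(1 for _, lvl in recent if lvl == "ERROR")
    return errors / len(recent) > ERROR_RATE_THRESHOLD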
Example: PagerDuty integration
# Monitor ERROR observations via ABV webhook
@app.post("/abv-webhook")
def handle_abv_event(event: ABVEvent):
    if event.observation.level == "ERROR":
        # Trigger PagerDuty alert
        pagerduty.trigger(
            summary=f"LLM error in production: {event.observation.name}",
            severity="error",
            source="abv",
            custom_details={
                "trace_url": event.observation.trace_url,
                "error_message": event.observation.status_message
            }
        )
Benefits:
  • Proactive issue detection (catch errors before users complain)
  • Reduced alert fatigue (only actionable alerts)
  • Faster incident response (trace URL in alert for instant debugging)
Multi-step LLM workflows (RAG, agents, chains) generate dozens of observations. Log levels let you control verbosity dynamically.
Use case: Agent workflow with tool calls
Full trace (DEBUG enabled):
[DEFAULT] User query received: "What's the weather in Paris?"
[DEBUG] Parsing query intent
[DEBUG] Intent identified: weather_lookup
[DEBUG] Searching for weather tool
[DEBUG] Tool found: get_weather(location)
[DEFAULT] Calling tool: get_weather(location="Paris")
[DEBUG] API request to weather service
[DEBUG] Weather API response: 200 OK
[DEFAULT] Tool result: 72°F, sunny
[DEBUG] Formatting response
[DEFAULT] LLM generating final response
[DEFAULT] Response generated: "It's 72°F and sunny in Paris."
Production trace (WARNING+ only):
[WARNING] Weather API slow response (3.2s)
Error case (ERROR level):
[ERROR] Weather API timeout after 5 seconds
[WARNING] Fallback to cached weather data (2 hours old)
[DEFAULT] Response generated with cached data
Benefits:
  • Development: See every step for debugging
  • Production: Only see issues (warnings, errors)
  • Selective detail: Enable DEBUG for specific users or traces when investigating issues
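For selective detail, one simple pattern is to gate verbose DEBUG spans on a per-user allowlist. The allowlist, user_id, query, and parse_intent below are hypothetical and only illustrate the idea; abv is the client from the earlier examples:
# Hypothetical allowlist of users under active investigation
DEBUG_USER_IDS = {"user-123"}

def debug_enabled(user_id: str) -> bool:
    """Return True when verbose DEBUG spans should be emitted for this user."""
    return user_id in DEBUG_USER_IDS

# Only create the verbose DEBUG span when investigating this user
if debug_enabled(user_id):
    with abv.start_as_current_observation(
        as_type='span', name="parse-intent", level="DEBUG"
    ) as span:
        intent = parse_intent(query)
else:
    intent = parse_intent(query)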
Not all issues are outright failures. Slow responses, fallbacks, and retries indicate degraded performance that should be monitored.
Warning-worthy scenarios:
1. Latency degradation
import time

start = time.time()
response = llm.generate(query)
latency = time.time() - start

if latency > 5.0:
    span.update(
        level="WARNING",
        status_message=f"Slow LLM response: {latency:.2f}s"
    )
2. Fallback behavior
try:
    result = primary_model.generate(query)
except RateLimitError:
    span.update(
        level="WARNING",
        status_message="Rate limit hit, falling back to secondary model"
    )
    result = fallback_model.generate(query)
3. Retry patterns
for attempt in range(3):
    try:
        result = llm.generate(query)
        break
    except TransientError as e:
        if attempt < 2:
            span.update(
                level="WARNING",
                status_message=f"Retry {attempt + 1}/3 after error: {str(e)}"
            )
        else:
            span.update(
                level="ERROR",
                status_message="All retries exhausted"
            )
            raise
4. Guardrail violations
if guardrail_check.violations:
    span.update(
        level="WARNING",
        status_message=f"Guardrail triggered: {guardrail_check.violations}"
    )
    # Still return response but log the warning
Dashboard analysis:
  • Query for WARNING observations over time
  • Identify trends: Are latency warnings increasing?
  • Correlate warnings with deployments or traffic spikes

Implementation Guide

Set log levels when using the @observe() decorator to automatically trace functions.
Setup:
pip install abvdev
Update log level dynamically:
from abvdev import ABV, observe

abv = ABV(api_key="sk-abv-...", host="https://app.abv.dev")

@observe()
def process_document(document):
    # Start at DEFAULT level
    try:
        result = complex_processing(document)

        # Update to WARNING if processing takes too long
        if result.processing_time > 10:
            abv.update_current_span(
                level="WARNING",
                status_message=f"Slow processing: {result.processing_time}s"
            )

        return result

    except Exception as e:
        # Update to ERROR on failure
        abv.update_current_span(
            level="ERROR",
            status_message=f"Processing failed: {str(e)}"
        )
        raise

process_document(my_document)
Set level on creation:
@observe(level="DEBUG", status_message="Debugging this function")
def debug_function():
    # This function's span starts at DEBUG level
    pass
Set log levels when creating spans or generations manually with context managers.
Set on creation:
from abvdev import ABV

abv = ABV(api_key="sk-abv-...")

# Create span with WARNING level
with abv.start_as_current_observation(
    as_type='span',
    name="experimental-feature",
    level="WARNING",
    status_message="This feature is experimental and may fail"
) as span:
    result = try_experimental_logic()
Update during execution:
with abv.start_as_current_observation(
    as_type='generation',
    name="llm-call",
    model="gpt-4",
    level="DEFAULT"
) as generation:
    try:
        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": query}]
        )

        generation.update(output=response.choices[0].message.content)

    except TimeoutError:
        generation.update(
            level="ERROR",
            status_message="LLM timeout after 30 seconds"
        )
        raise
    except RateLimitError:
        generation.update(
            level="WARNING",
            status_message="Rate limit hit, retrying with exponential backoff"
        )
        # Retry logic here
Update without direct span reference:
with abv.start_as_current_observation(as_type='span', name="workflow"):
    # Some processing
    validation_result = validate_input(data)

    if not validation_result.is_valid:
        # Update the current span
        abv.update_current_span(
            level="WARNING",
            status_message=f"Validation warnings: {validation_result.warnings}"
        )
Set and update log levels in JavaScript/TypeScript using the @abvdev/tracing package.
Setup:
npm install @abvdev/tracing @abvdev/otel @opentelemetry/sdk-node
Update during execution:
import './instrumentation';
import { startActiveObservation, updateActiveObservation } from '@abvdev/tracing';

async function main() {
  await startActiveObservation('process-request', async (span) => {
    const query = 'What is the capital of France?';
    span.update({
      input: { query }
    });

    try {
      const result = await processQuery(query);

      if (result.latency > 5000) {
        // Update to WARNING if slow
        updateActiveObservation('span', {
          level: 'WARNING',
          statusMessage: `Slow response: ${result.latency}ms`
        });
      }

    } catch (error) {
      // Update to ERROR on failure
      updateActiveObservation('span', {
        level: 'ERROR',
        statusMessage: `Processing failed: ${error.message}`
      });
      throw error;
    }
  });
}

main();
Wrap existing functions with automatic tracing and log level updates.
Example:
import './instrumentation';
import { observe, updateActiveObservation } from '@abvdev/tracing';

// Original function
async function fetchData(source: string) {
  try {
    const response = await fetch(source);

    if (response.status !== 200) {
      updateActiveObservation('span', {
        level: 'WARNING',
        statusMessage: `Non-200 status: ${response.status}`
      });
    }

    return await response.json();

  } catch (error) {
    updateActiveObservation('span', {
      level: 'ERROR',
      statusMessage: `Fetch failed: ${error.message}`
    });
    throw error;
  }
}

// Wrap with observe
const tracedFetchData = observe(fetchData, {
  name: 'fetch-data-operation'
});

// Use traced version
async function main() {
  const result = await tracedFetchData('https://api.example.com/data');
}

main();
Create spans manually and set log levels explicitly.
Example:
import './instrumentation';
import { startObservation } from '@abvdev/tracing';

const span = startObservation('manual-operation', {
  input: { query: 'Process this data' },
  level: 'WARNING',
  statusMessage: 'This operation is in beta'
});

try {
  const result = processData(input);

  // Update to DEFAULT on success
  span.update({
    level: 'DEFAULT',
    statusMessage: 'Operation completed successfully',
    output: result
  });

} catch (error) {
  // Update to ERROR on failure
  span.update({
    level: 'ERROR',
    statusMessage: `Operation failed: ${error.message}`
  });
} finally {
  span.end();
}

Next Steps