How Tags Work

1. Add tags during execution

Use the ABV SDK to attach one or more string tags to a trace. Tags can be added when creating a trace or updated later during execution.
abv.update_current_trace(tags=["rag", "beta", "gpt-4o"])
2. Tags appear in the dashboard

All tags attached to a trace are visible in the ABV UI. You’ll see them as clickable labels on each trace, making it easy to identify categories at a glance.
3. Filter traces by tags

Click any tag in the UI to filter your trace list. The filter shows only traces that include that specific tag, reducing noise and focusing your analysis.
4. Combine tags for precise filtering

Use multiple tag filters simultaneously to narrow down exactly what you need. For example, filter by both production and error to see only production errors.
5. Use tags in analytics and exports

Tags are included in all exports and available for grouping in custom dashboards. Use them to segment performance metrics, cost analysis, or error rates by any dimension you choose.
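For example, here is a minimal sketch of tag-based segmentation on an exported trace file, assuming a CSV export with a JSON-encoded tags column and a numeric cost column (your actual export schema and filename may differ):
import json
import pandas as pd

traces = pd.read_csv("traces_export.csv")  # hypothetical export filename
traces["tags"] = traces["tags"].apply(json.loads)

# One row per (trace, tag) pair, then aggregate cost by tag
by_tag = traces.explode("tags").groupby("tags")["cost"].agg(["count", "mean", "sum"])
print(by_tag.sort_values("sum", ascending=False))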

Why Use Tags?

A/B testing a new prompt? Tag each variant to measure quality, latency, and cost separately.
# Assumes a trace context from @observe(); get_experiment_variant is a
# placeholder for your own assignment logic
variant = get_experiment_variant(user_id)
abv.update_current_trace(tags=[f"prompt:{variant}"])
Filter by tag to compare metrics side-by-side and make data-driven rollout decisions. Combine with metadata for richer analysis: use tags for simple categories (prompt:v1) and metadata for detailed attributes.
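A sketch of attaching both in a single call (the metadata field names here are illustrative; update_current_trace accepts both arguments, as the combined example under Best Practices shows):
from datetime import datetime, timezone

abv.update_current_trace(
    tags=[f"prompt:{variant}"],  # simple category for filtering
    metadata={
        "variant_id": variant,  # detailed attributes for analysis
        "assigned_at": datetime.now(timezone.utc).isoformat(),
    },
)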
Tag traces with your application version to isolate errors by deployment.
import os
APP_VERSION = os.getenv("APP_VERSION", "unknown")

@observe()
def handle_request(data):
    abv.update_current_trace(tags=[f"version:{APP_VERSION}"])
    # ... your logic ...
Filter by version:2.3.0 to see only new deployment errors and compare error rates across versions. Set version tags automatically via environment variables for complete deployment visibility.
Separate dev, staging, and production traffic for clearer debugging.
import os

ENVIRONMENT = os.getenv("ENVIRONMENT", "dev")

@observe()
def process_data(input_data):
    abv.update_current_trace(tags=[ENVIRONMENT, "data-pipeline"])
    # ... processing ...
Filter by production for real user traffic or staging for pre-release validation. For formal separation with access controls, see Environments.
Tag traces by LLM technique (RAG, few-shot, chain-of-thought) to analyze performance and cost.
@observe()
def answer_question(question, context_docs):
    abv.update_current_trace(tags=["rag", "documentation"])
    # ... RAG implementation ...

@observe()
def classify_intent(user_input):
    abv.update_current_trace(tags=["few-shot", "classification"])
    # ... few-shot classification ...
Compare metrics to discover, for example, that RAG costs 3x more than few-shot or that chain-of-thought has higher latency but better accuracy. Optimize technique selection based on cost and performance data.
Categorize errors for effective triage: rate limits, validation failures, or unexpected errors.
# RateLimitError and ValidationError stand in for your LLM client's exception types
@observe()
def call_llm(prompt):
    try:
        response = llm.complete(prompt)
        return response
    except RateLimitError:
        abv.update_current_trace(tags=["error", "rate-limit"])
        raise
    except ValidationError:
        abv.update_current_trace(tags=["error", "validation"])
        raise
    except Exception:
        abv.update_current_trace(tags=["error", "unknown"])
        raise
Filter by error type to identify quota issues, bad input, or unexpected failures. Set up alerts on specific error tags to get notified only for critical issues.
Tag traces with user cohorts to measure adoption and performance across customer segments.
@observe()
def process_request(user_id):
    user = get_user(user_id)
    cohort_tags = [f"tier:{user.tier}", f"region:{user.region}"]
    abv.update_current_trace(tags=cohort_tags)
    # ... processing ...
Filter by tier:premium for paying customers or compare latency across regions. Segment cost analysis by customer tier. Avoid PII in tags: use cohort identifiers (tier:premium), not personal info (user:[email protected]).

Implementation Guide

The simplest approach for functions already decorated with @observe():
from abvdev import observe, ABV

# Initialize ABV client
abv = ABV(
    api_key="sk-abv-...",
    host="https://app.abv.dev"  # or "https://eu.app.abv.dev" for EU
)

@observe()
def process_document(doc_id, use_rag=True):
    # Add tags to categorize this trace
    tags = ["document-processing"]
    if use_rag:
        tags.append("rag")

    abv.update_current_trace(tags=tags)

    # ... your processing logic ...
    return result

# Call the function - tags are automatically attached
process_document("doc-123", use_rag=True)
When to use: For functions already using the @observe() decorator. Minimal code changes required.
Installation:
pip install abvdev
See Python SDK docs for complete reference.
For the JS/TS SDK, install the packages:
npm install @abvdev/tracing @abvdev/otel @opentelemetry/sdk-node dotenv
Add credentials to .env:
ABV_API_KEY="sk-abv-..."
ABV_BASE_URL="https://app.abv.dev"  # or "https://eu.app.abv.dev" for EU
Create instrumentation.ts:
import dotenv from "dotenv";
dotenv.config();

import { NodeSDK } from "@opentelemetry/sdk-node";
import { ABVSpanProcessor } from "@abvdev/otel";

const sdk = new NodeSDK({
  spanProcessors: [
    new ABVSpanProcessor({
      apiKey: process.env.ABV_API_KEY,
      baseUrl: process.env.ABV_BASE_URL,
      exportMode: "immediate",
      flushAt: 1,
      flushInterval: 1,
    })
  ],
});

sdk.start();
See JS/TS SDK docs for complete reference.

Best Practices

Use lowercase, hyphenated strings for reliability and readability.
Good examples:
  • rag, production, few-shot
  • error, beta, v2.1.0
  • document-processing, cache-hit
Avoid:
  • Spaces: "rate limit error" → use rate-limit-error
  • Mixed case: ProductionEnv → use production
  • Special characters: user@premium → use user-premium
Why it matters: Consistent formatting makes tags easier to filter, prevents duplicate categories, and ensures reliable UI behavior.
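One way to enforce these conventions automatically is a small normalization helper; this is a sketch, not part of the SDK:
import re

def normalize_tag(raw: str) -> str:
    """Lowercase a tag and replace spaces/special characters with hyphens."""
    tag = raw.strip().lower()
    tag = re.sub(r"[^a-z0-9:.-]+", "-", tag)  # keep ':' for namespaces, '.' for versions
    return re.sub(r"-{2,}", "-", tag).strip("-")

normalize_tag("rate limit error")  # -> "rate-limit-error"
normalize_tag("user@premium")      # -> "user-premium"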
Define and document tag patterns for your team before scaling.
Option 1: Namespaced tags
tags = [
    "env:production",
    "version:2.1.0",
    "model:gpt-4o",
    "tier:premium"
]
Option 2: Simple tags
tags = [
    "production",
    "v2.1.0",
    "gpt4o",
    "premium"
]
Create a tag dictionary: Document your conventions in your team wiki or codebase:
# tags.py - Team tag conventions
ENVIRONMENTS = ["dev", "staging", "production"]
TECHNIQUES = ["rag", "few-shot", "chain-of-thought"]
ERROR_TYPES = ["rate-limit", "validation", "timeout"]
Why it matters: Prevents tag proliferation (prod vs production vs prd), ensures team-wide consistency, and makes onboarding easier.
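To catch drift at the source, you can validate tags against these conventions before attaching them; a sketch building on the tags.py module above:
import warnings

KNOWN_TAGS = set(ENVIRONMENTS + TECHNIQUES + ERROR_TYPES)

def checked_tags(tags):
    for tag in tags:
        # Namespaced tags like "version:2.1.0" carry arbitrary values;
        # only validate the flat categories defined in tags.py
        if ":" not in tag and tag not in KNOWN_TAGS:
            warnings.warn(f"Unknown tag {tag!r}: add it to tags.py or fix the typo")
    return tags

abv.update_current_trace(tags=checked_tags(["production", "rag"]))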
Tags are designed for categorization, not sensitive data. They appear in UI filters, exports, and analytics dashboards.
❌ Don’t use:
# Bad - contains PII
tags = [
    "user:[email protected]",
    "customer:acme-corp",
    "account:123-45-6789"
]
✅ Instead use:
# Good - anonymous identifiers
tags = [
    "tier:premium",
    "region:us-east",
    "cohort:enterprise-2024"
]

# Use metadata for identifiers with proper access controls
import hashlib

metadata = {
    "user_id": "usr_abc123",  # Internal ID, not email
    "tenant_id": "tenant_xyz",
    # Python's builtin hash() is salted per process; use a stable hash
    "account_hash": hashlib.sha256(account_number.encode()).hexdigest()
}
Why it matters: Tags are visible to all project members and appear in exported data. PII in tags creates compliance risks and potential data leaks. Use metadata with appropriate access controls for sensitive identifiers.
Use both features together for maximum flexibility.
Tags: Simple categories for filtering
tags = ["production", "rag", "error"]
Use when you need to:
  • Filter traces quickly in the UI
  • Create alerts on specific categories
  • Group metrics by common dimensions
Metadata: Detailed attributes for analysis
metadata = {
    "model_version": "gpt-4o-2024-08-06",
    "prompt_tokens": 1523,
    "chunks_retrieved": 5,
    "cache_hit": True,
    "retrieval_latency_ms": 245
}
Use when you need to:
  • Store structured data for custom queries
  • Track numeric metrics (tokens, latency, cost)
  • Include detailed context for debugging
Example combining both:
@observe()
def process_rag_query(query: str, user_tier: str):
    # Tags for filtering and categorization
    abv.update_current_trace(
        tags=["rag", "production", f"tier:{user_tier}"],
        metadata={
            "query_length": len(query),
            "retrieval_method": "vector-search",
            "embedding_model": "text-embedding-3-large",
            "chunks_count": 5
        }
    )
Why it matters: Tags give you fast filtering. Metadata gives you deep analysis. Together they provide both speed and depth.
Add tags based on execution paths, not just static configuration:
import time

@observe()
def process_request(user_input: str):
    tags = ["api-request"]

    # Add tags based on validation
    if not is_valid_input(user_input):
        tags.append("validation-error")

    # Add tags based on execution path
    if needs_rag(user_input):
        tags.append("rag")
        tags.append("retrieval-heavy")
    else:
        tags.append("direct-completion")

    # Add tags based on performance
    start = time.time()
    result = execute(user_input)
    duration = time.time() - start

    if duration > 5.0:
        tags.append("slow-response")

    abv.update_current_trace(tags=tags)
    return result
Why it matters: Dynamic tagging captures what actually happened during execution, making it easier to identify patterns, debug issues, and optimize performance.
Keep tag lists focused and meaningful.
Recommended: 3-7 tags per trace.
Good - focused and actionable:
tags = ["production", "rag", "gpt-4o", "premium-tier"]
Too many - loses signal in noise:
tags = [
    "production", "api", "v2", "rag", "gpt-4o",
    "cached", "premium", "us-east", "web",
    "document", "summarization", "high-priority"
]  # 12 tags - too many!
Strategy: Ask “Will I actually filter by this?” If not, put it in metadata instead.
Why it matters: Too many tags make the UI cluttered and filtering less effective. Focus on tags you’ll actually use for filtering, grouping, or alerting.
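As a concrete refactor of the overloaded list above, keep only the filterable categories as tags and demote the descriptive attributes to metadata (a sketch):
abv.update_current_trace(
    tags=["production", "rag", "gpt-4o", "premium"],  # dimensions you filter or alert on
    metadata={
        "api_version": "v2",
        "cache_hit": True,
        "region": "us-east",
        "channel": "web",
        "task": "summarization",
        "priority": "high",
    },
)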

Tags vs Metadata vs Environments

Choosing the right feature for your use case:
Feature | Best For | Example Use Cases
Tags | Simple categorization, filtering, experiments | rag, production, beta, v2.1.0, few-shot
Metadata | Structured data, detailed attributes, analytics | {"tenant_id": "acme", "user_tier": "premium", "prompt_tokens": 1523}
Environments | Formal separation with access controls | Development, Staging, Production projects
When to use multiple features together:
  • Tag with environment (production) AND use dedicated ABV Environments for formal separation
  • Tag with experiment variant (prompt:v2) AND include detailed metadata ({"variant_id": "abc123", "assignment_ts": "2024-01-15T10:30:00Z"})
  • Tag with feature (rag) AND include metadata about the RAG implementation ({"chunks": 5, "embedding_model": "text-embedding-ada-002"})