How Guardrails Work
Send content for validation
This could be a user message before it reaches your LLM, or LLM-generated content before it reaches your users.
Guardrail analyzes the content
This can take up to 3 seconds.
Receive validation result
The guardrail returns a status (pass, fail, or unsure), confidence score, and reason for the decision.
Make decisions based on results
Based on the result, you decide whether to allow the content, flag it for human review, or regenerate with different parameters.
Monitor in your dashboard
Every validation automatically creates a complete observation in ABV, capturing the input, result, configuration, timing, and token usage.
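The flow above can be sketched end to end. The `GuardrailResult` shape below mirrors the status, confidence, and reason fields described above, and `handle` is a hypothetical decision helper for step four, not part of the SDK:

```python
from dataclasses import dataclass

# Hypothetical result shape -- mirrors the status / confidence / reason
# fields a guardrail returns; the SDK's actual types may differ.
@dataclass
class GuardrailResult:
    status: str       # "pass", "fail", or "unsure"
    confidence: float
    reason: str

def handle(result: GuardrailResult) -> str:
    # Decide what to do with the content based on the validation result.
    if result.status == "pass":
        return "allow"       # deliver the content as-is
    if result.status == "unsure" or result.confidence < 0.8:
        return "review"      # flag for human review
    return "regenerate"      # confident failure: retry with different parameters

example = GuardrailResult(status="fail", confidence=0.95, reason="toxic language")
print(handle(example))  # -> regenerate
```

The 0.8 confidence threshold is illustrative; tuning it against real traffic is exactly what the dashboard data in step five is for.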
Available Guardrails
Guardrails come in two categories based on how they work. LLM-Powered Guardrails can understand context and nuance, but take about 1-3 seconds per check:
Toxic Language
Detects hate speech, threats, insults, and harmful content that could damage your community or poison your LLM's context. Use this when you need to understand intent and tone, not just keywords. Learn more →
Biased Language
Identifies stereotypes, discriminatory assumptions, and coded language that could harm your brand or violate compliance requirements. Essential for HR content, job postings, and customer-facing communications. Learn more →
Rule-based guardrails run deterministic checks without calling an LLM:
Contains String
Validates the presence or absence of specific text, perfect for ensuring legal disclaimers appear in LLM outputs or for blocking sensitive information like passwords and credit card numbers in user inputs. Learn more →
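The core of such a check can be sketched in plain Python; this is an illustration of the logic, not the guardrail's actual implementation:

```python
def contains_string(text: str, targets: list[str], require: bool = True) -> bool:
    # require=True: pass only if every target appears (e.g. a legal disclaimer).
    # require=False: pass only if no target appears (e.g. "password").
    hits = [t for t in targets if t.lower() in text.lower()]
    return len(hits) == len(targets) if require else not hits

llm_output = "Results may vary. This is not financial advice."
print(contains_string(llm_output, ["not financial advice"]))  # True: disclaimer present
print(contains_string("my password is hunter2", ["password"], require=False))  # False: blocked term found
```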
Valid JSON
Ensures your LLM's structured outputs are properly formatted and match your expected schema, preventing application errors from malformed data. Learn more →
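A minimal sketch of the two things such a check validates, parseability and schema shape, using only the standard library (the guardrail's real schema support is likely richer):

```python
import json

def valid_json(raw: str, required_keys: set[str]) -> tuple[bool, str]:
    # First check the output parses at all, then that it matches the
    # expected top-level schema (all required keys present).
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, f"malformed JSON: {e.msg}"
    if not isinstance(data, dict):
        return False, "expected a JSON object"
    missing = required_keys - data.keys()
    if missing:
        return False, f"missing keys: {sorted(missing)}"
    return True, "ok"

ok, reason = valid_json('{"sentiment": "positive", "score": 0.9}', {"sentiment", "score"})
print(ok, reason)  # True ok
```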
Integrations
Guardrails integrate with ABV's core features:
Automatic Observability
Every guardrail check creates an observation in your traces. Monitor patterns, analyze confidence distributions, tune sensitivity settings based on real data.
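Tuning sensitivity usually starts with a look at how check confidences are distributed. A rough sketch over exported observation data; the record fields here are illustrative, not the trace export's actual schema:

```python
from collections import Counter

# Illustrative observation records as they might be exported from traces.
observations = [
    {"guardrail": "toxic_language", "status": "fail", "confidence": 0.62},
    {"guardrail": "toxic_language", "status": "fail", "confidence": 0.97},
    {"guardrail": "toxic_language", "status": "pass", "confidence": 0.99},
]

# Bucket failure confidences to one decimal place: a pile-up of
# low-confidence fails suggests the sensitivity setting is too aggressive.
buckets = Counter(
    round(o["confidence"], 1)
    for o in observations
    if o["status"] == "fail"
)
print(sorted(buckets.items()))  # [(0.6, 1), (1.0, 1)]
```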
SDK Integration
Guardrails ship with both the Python SDK and the JS/TS SDK. Native async support, type safety, and framework integration are included.
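Async support matters because LLM-powered checks each take seconds; running them concurrently hides that latency. A sketch of the pattern, where `GuardrailsClient` and its `check` method are hypothetical stand-ins rather than the SDK's actual names:

```python
import asyncio

class GuardrailsClient:
    # Hypothetical stand-in for the SDK client; real class and
    # method names may differ.
    async def check(self, guardrail: str, content: str) -> dict:
        await asyncio.sleep(0)  # placeholder for the network round trip
        return {"status": "pass", "confidence": 0.99, "reason": "no issues found"}

async def main() -> None:
    client = GuardrailsClient()
    # Run several checks concurrently instead of paying 1-3 s each in sequence.
    results = await asyncio.gather(
        client.check("toxic_language", "Hello!"),
        client.check("biased_language", "Hello!"),
    )
    print([r["status"] for r in results])  # ['pass', 'pass']

asyncio.run(main())
```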
Evaluation Tracking
Track guardrail performance using Evaluations. Measure false positive rates, analyze failure patterns, and optimize your validation strategy using the wizard.
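Measuring a false positive rate only requires ground-truth labels alongside guardrail verdicts; a minimal sketch with made-up data:

```python
# Each pair is (guardrail_verdict, actually_harmful): "fail" means the
# guardrail blocked the content, True means a human judged it harmful.
labeled = [
    ("fail", True), ("fail", False), ("pass", False),
    ("pass", False), ("fail", True), ("pass", True),
]

# False positive rate = FP / (FP + TN): of the genuinely harmless
# content, how much did the guardrail wrongly block?
false_positives = sum(1 for v, harmful in labeled if v == "fail" and not harmful)
true_negatives = sum(1 for v, harmful in labeled if v == "pass" and not harmful)
fp_rate = false_positives / (false_positives + true_negatives)
print(f"false positive rate: {fp_rate:.0%}")  # false positive rate: 33%
```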
Metrics Dashboard
Monitor guardrail status in the Metrics Dashboard. See failure rates by type, track costs, analyze latency, and identify optimization opportunities.
Next Steps
Ready to implement guardrails in your application?
Quickstart
Get up and running with your first guardrail in under 5 minutes
Core Concepts
Understand sensitivity levels, confidence scores, and validation strategies
Best Practices
Learn optimal patterns for layering and combining guardrails
Security Guardrails
Explore security-focused guardrails for protecting against attacks