Guardrails automatically check your LLM’s inputs and outputs to keep them safe, compliant, and high-quality. Think of them as safety checks that run before content reaches your users or your language model.

How Guardrails Work

Send content for validation

This could be a user message before it reaches your LLM, or LLM-generated content before it reaches your users.

Guardrail analyzes the content

LLM-powered guardrails typically take 1–3 seconds per check; rule-based guardrails return in under 10 milliseconds.

Receive validation result

The guardrail returns a status (pass, fail, or unsure), a confidence score, and a reason for the decision.

Make decisions based on results

Based on the result, you decide whether to allow the content, flag it for human review, or regenerate with different parameters.

Monitor in your dashboard

Every validation automatically creates a complete observation in ABV, capturing the input, result, configuration, timing, and token usage.
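The five steps above can be sketched in Python. The `GuardrailResult` shape mirrors the fields described in step 3 (status, confidence, reason), but the function names, the keyword rule inside `validate`, and the 0.8 confidence threshold are illustrative assumptions, not the actual ABV SDK surface:

```python
from dataclasses import dataclass

@dataclass
class GuardrailResult:
    status: str        # "pass", "fail", or "unsure"
    confidence: float  # 0.0 - 1.0
    reason: str

def validate(content: str) -> GuardrailResult:
    # Stand-in for the real guardrail call (step 1 + 2);
    # here a trivial keyword rule replaces the actual analysis.
    if "password" in content.lower():
        return GuardrailResult("fail", 0.99, "contains sensitive keyword")
    return GuardrailResult("pass", 0.95, "no issues detected")

def decide(result: GuardrailResult) -> str:
    # Step 4: allow the content, flag it for human review,
    # or regenerate with different parameters.
    if result.status == "pass":
        return "allow"
    if result.status == "unsure" or result.confidence < 0.8:
        return "human_review"
    return "regenerate"

print(decide(validate("my password is hunter2")))  # "regenerate"
```

In a real integration, `validate` would be the SDK call and each invocation would also emit the observation described in step 5.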

Available Guardrails

Guardrails come in two categories based on how they work. LLM-Powered Guardrails can understand context and nuance, but take about 1–3 seconds per check:
Detects hate speech, threats, insults, and harmful content that could damage your community or poison your LLM’s context. Use this when you need to understand intent and tone, not just keywords. Learn more →
Identifies stereotypes, discriminatory assumptions, and coded language that could harm your brand or violate compliance requirements. Essential for HR content, job postings, and customer-facing communications. Learn more →
Rule-Based Guardrails use deterministic logic for instant results in under 10 milliseconds:
Validates the presence or absence of specific text—perfect for ensuring legal disclaimers appear in LLM outputs or blocking sensitive information like passwords and credit card numbers in user inputs. Learn more →
Ensures your LLM’s structured outputs are properly formatted and match your expected schema, preventing application errors caused by malformed data. Learn more →
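To show why the rule-based category is so fast, here is a minimal sketch of a contains-style check as deterministic string matching—no model call involved. The disclaimer text and the naive card-number regex are made up for illustration and are not the actual ABV rules:

```python
import re

# Hypothetical rule configuration: one required phrase, one blocked pattern.
REQUIRED = "This is not financial advice."
BLOCKED = re.compile(r"\b(?:\d[ -]?){13,16}\b")  # naive credit-card-like digit run

def contains_check(output: str) -> dict:
    # Pure string logic: deterministic, and fast enough to run inline.
    if REQUIRED not in output:
        return {"status": "fail", "reason": "missing required disclaimer"}
    if BLOCKED.search(output):
        return {"status": "fail", "reason": "contains card-number-like digits"}
    return {"status": "pass", "reason": "all rules satisfied"}
```

Because there is no LLM in the loop, the result is identical for identical inputs, which is what makes this class of guardrail suitable for latency-sensitive paths.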

Integrations

Guardrails integrate with ABV’s core features:
Every guardrail check creates an observation in your traces. Monitor patterns, analyze confidence distributions, and tune sensitivity settings based on real data.
Guardrails ship with both the Python SDK and the JS/TS SDK, with native async support, type safety, and framework integrations included.
Track guardrail performance using Evaluations. Measure false positive rates, analyze failure patterns, and optimize your validation strategy using the wizard.
Monitor guardrail status in the Metrics Dashboard. See failure rates by type, track costs, analyze latency, and identify optimization opportunities.

Next Steps

Ready to implement guardrails in your application?