This guide will have you running guardrails in under five minutes. You’ll learn how to install the SDK, validate your first piece of content, and understand the result.
Let’s check if a user message contains toxic content. This is one of the most common uses of guardrails since it protects both your LLM from poisoned context and your users from harmful responses.
TypeScript/JavaScript
import { ABVClient } from "@abvdev/client";

// Initialize the client with your API key
const abv = new ABVClient({
  apiKey: process.env.ABV_API_KEY,
});

// Check a user message for toxic content
const result = await abv.guardrails.toxicLanguage.validate(
  "I really disagree with your approach to this problem.",
  { sensitivity: "medium" }
);

// The result tells you if the content passed or failed
console.log("Status:", result.status);         // "pass", "fail", or "unsure"
console.log("Confidence:", result.confidence); // 0.0 to 1.0
console.log("Reason:", result.reason);         // Explanation of the decision
Python

from abvdev import ABV

# Initialize the client with your API key
abv = ABV(api_key="your_api_key_here")

# Check a user message for toxic content
result = abv.guardrails.toxic_language.validate(
    "I really disagree with your approach to this problem.",
    {"sensitivity": "medium"}
)

# The result tells you if the content passed or failed
print("Status:", result["status"])          # "pass", "fail", or "unsure"
print("Confidence:", result["confidence"])  # 0.0 to 1.0
print("Reason:", result["reason"])          # Explanation of the decision
The message in this example expresses disagreement but does so professionally, so it should pass with high confidence. Try changing the message to something more hostile and see how the result changes.
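For example, here is a rough sketch of that experiment in TypeScript, reusing the abv client and result from the TypeScript example above (the hostile message is just an illustration, and the actual status and confidence values will depend on the guardrail's judgment):

// Re-check with a deliberately hostile message and compare the two results
const hostileResult = await abv.guardrails.toxicLanguage.validate(
  "You're an idiot and your approach is garbage.",
  { sensitivity: "medium" }
);

// The polite disagreement should pass; the hostile message should lean toward "fail"
console.log("Polite:", result.status, result.confidence);
console.log("Hostile:", hostileResult.status, hostileResult.confidence);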
Your code called the toxic language guardrail with the message text and a sensitivity setting of “medium”. This sensitivity level catches clear violations while allowing professional disagreement.
The guardrail analyzed the content
The guardrail used an LLM to understand the context, tone, and intent of the message. Unlike keyword filters, it understands that “I disagree” is different from “You’re an idiot.”
You received a structured result
The result contains three key pieces: status (pass/fail/unsure), confidence (0.0-1.0), and reason (human-readable explanation). This structure is consistent across all guardrails.
ABV automatically created an observation
Without any additional code, the guardrail check was logged to your ABV dashboard. You can now view the input, output, confidence score, and timing information.
You can make decisions based on the result
Your application can now decide what to do: allow the content if it passed, block it if it failed, or flag it for human review if the guardrail was unsure.
You’ll typically use the status field to make decisions in your code:
TypeScript/JavaScript
if (result.status === "pass") {
  // Content is safe, continue processing
  await processUserMessage(message);
} else if (result.status === "fail") {
  // Content violated guidelines, block it
  return { error: "Your message violates our guidelines." };
} else {
  // Ambiguous case - you choose how to handle
  // Conservative: treat as fail
  // Permissive: treat as pass
  // Balanced: flag for human review
  await flagForReview(message, result);
}
Python

if result["status"] == "pass":
    # Content is safe, continue processing
    process_user_message(message)
elif result["status"] == "fail":
    # Content violated guidelines, block it
    return {"error": "Your message violates our guidelines."}
else:
    # Ambiguous case - you choose how to handle
    # Conservative: treat as fail
    # Permissive: treat as pass
    # Balanced: flag for human review
    flag_for_review(message, result)
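If you want tighter control over the ambiguous case, you can also fold the confidence field into the decision. A minimal TypeScript sketch, assuming the result shape shown earlier (the 0.75 threshold is an arbitrary example, not a recommended value):

// Auto-block only when the guardrail fails with high confidence;
// everything ambiguous goes to human review
if (result.status === "pass") {
  await processUserMessage(message);
} else if (result.status === "fail" && result.confidence >= 0.75) {
  return { error: "Your message violates our guidelines." };
} else {
  // Low-confidence failures and "unsure" results get a second opinion
  await flagForReview(message, result);
}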
Guardrails work equally well for checking LLM-generated content before you show it to users. This helps maintain brand safety and compliance:
TypeScript/JavaScript
// Generate a response from your LLM
const llmResponse = await generateResponse(userPrompt);

// Check if the LLM's response contains biased language
const validation = await abv.guardrails.biasedLanguage.validate(
  llmResponse,
  { sensitivity: "high" }
);

// Only show the response if it passes
if (validation.status === "pass") {
  return llmResponse;
} else {
  // Either regenerate or use a fallback message
  return "I apologize, I need to reconsider my response.";
}
Python

# Generate a response from your LLM
llm_response = generate_response(user_prompt)

# Check if the LLM's response contains biased language
validation = abv.guardrails.biased_language.validate(
    llm_response,
    {"sensitivity": "high"}
)

# Only show the response if it passes
if validation["status"] == "pass":
    return llm_response
else:
    # Either regenerate or use a fallback message
    return "I apologize, I need to reconsider my response."
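If you would rather regenerate than fall back immediately, one possible pattern is a bounded retry loop. A TypeScript sketch, reusing the placeholder generateResponse helper from the example above (the retry count and fallback text are arbitrary):

// Try up to three generations and return the first one the guardrail passes
for (let attempt = 0; attempt < 3; attempt++) {
  const candidate = await generateResponse(userPrompt);
  const check = await abv.guardrails.biasedLanguage.validate(
    candidate,
    { sensitivity: "high" }
  );
  if (check.status === "pass") {
    return candidate;
  }
}

// No attempt passed, so use the safe fallback message
return "I apologize, I need to reconsider my response.";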
If your LLM generates JSON, you can validate both the format and schema in one step:
TypeScript/JavaScript
// Your LLM generated this response
const llmOutput = await generateStructuredResponse(prompt);

// Validate it matches your expected schema
const validation = await abv.guardrails.validJson.validate(
  llmOutput,
  {
    schema: {
      name: "string",
      age: "number",
      email: "string",
    },
  }
);

if (validation.status === "pass") {
  // Safe to parse and use
  const data = JSON.parse(llmOutput);
  await saveToDatabase(data);
} else {
  console.error("Invalid format:", validation.reason);
  // Retry with a more explicit prompt
}
Python

import json

# Your LLM generated this response
llm_output = generate_structured_response(prompt)

# Validate it matches your expected schema
validation = abv.guardrails.valid_json.validate(
    llm_output,
    {
        "schema": {
            "name": "string",
            "age": "number",
            "email": "string"
        }
    }
)

if validation["status"] == "pass":
    # Safe to parse and use
    data = json.loads(llm_output)
    save_to_database(data)
else:
    print("Invalid format:", validation["reason"])
    # Retry with a more explicit prompt
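One way to handle that retry is to feed the guardrail's reason back to the model and ask it to correct itself. A TypeScript sketch, assuming the placeholder generateStructuredResponse helper accepts an arbitrary prompt string (the corrective wording is only an example):

if (validation.status !== "pass") {
  // Ask the LLM to fix its own output, using the guardrail's explanation
  const retryPrompt =
    `${prompt}\n\nYour previous answer was rejected: ${validation.reason}. ` +
    "Respond again with only valid JSON matching the required schema.";
  const retryOutput = await generateStructuredResponse(retryPrompt);

  // Re-validate before trusting the corrected output
  const retryValidation = await abv.guardrails.validJson.validate(retryOutput, {
    schema: { name: "string", age: "number", email: "string" },
  });
  if (retryValidation.status === "pass") {
    await saveToDatabase(JSON.parse(retryOutput));
  }
}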
You’ll often want to check content against multiple criteria. Run guardrails in parallel to minimize latency:
TypeScript/JavaScript
// Check multiple things at once
const [toxicCheck, biasCheck, formatCheck] = await Promise.all([
  abv.guardrails.toxicLanguage.validate(content),
  abv.guardrails.biasedLanguage.validate(content),
  abv.guardrails.validJson.validate(content),
]);

// All must pass for content to be approved
const allPassed = [toxicCheck, biasCheck, formatCheck].every(
  (result) => result.status === "pass"
);
Python

import asyncio

# Check multiple things at once
toxic_check, bias_check, format_check = await asyncio.gather(
    abv.guardrails.toxic_language.validate_async(content),
    abv.guardrails.biased_language.validate_async(content),
    abv.guardrails.valid_json.validate_async(content)
)

# All must pass for content to be approved
all_passed = all(
    result["status"] == "pass"
    for result in [toxic_check, bias_check, format_check]
)
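When some checks fail, it is usually worth knowing which ones and why, for logging or for the error you return. A short TypeScript follow-up to the parallel example above (the shape of the failure report is up to you):

// Pair each result with a label so failures are easy to report
const checks = [
  { name: "toxic_language", result: toxicCheck },
  { name: "biased_language", result: biasCheck },
  { name: "valid_json", result: formatCheck },
];

// Collect the status and reason for anything that did not pass
const failures = checks
  .filter(({ result }) => result.status !== "pass")
  .map(({ name, result }) => ({ name, status: result.status, reason: result.reason }));

if (!allPassed) {
  console.warn("Guardrail failures:", failures);
}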