LLMs frequently generate structured outputs, but they sometimes make syntax errors that break your application. The valid JSON guardrail solves this challenge through instant validation with optional schema checking, preventing parsing errors and ensuring your LLM outputs match your expected data structure.
When your LLM generates JSON, it might include trailing commas, forget closing brackets, use single quotes instead of double quotes, or make other syntax errors that cause JSON.parse() to throw exceptions and crash your application.
Valid JSON validates syntax instantly (under 10ms) and costs nothing since it runs locally. Catch malformed JSON before it reaches your parsing logic, allowing you to regenerate or handle errors gracefully.

How Valid JSON Works

Understanding the validation process helps you configure schemas effectively and handle errors appropriately:

Content submission with optional schema

You send text to the valid JSON guardrail, optionally including a schema specification and strict mode setting. The schema maps field names to type strings (string, number, boolean, object, array). Without a schema, the guardrail only validates syntax.

Syntax validation

The guardrail first attempts to parse the text as JSON. This happens locally without LLM calls, making it instant (under 10ms) and free. If parsing fails due to syntax errors (missing brackets, trailing commas, invalid escape sequences), validation fails immediately.
This step catches common LLM errors: using single quotes instead of double quotes, including trailing commas after the last array element, forgetting closing braces, or adding explanatory text before or after the JSON.
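To see why these outputs fail, you can reproduce the parse step locally. The sketch below is only an illustration built on the standard JSON.parse, not the guardrail's own implementation, and the helper name isParseableJson is hypothetical.

```typescript
// Hypothetical helper: reports whether text parses as JSON.
// It mirrors the syntax step described above; it is not the guardrail itself.
function isParseableJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// One well-formed sample followed by the error classes listed above.
const samples = [
  { label: "valid", text: '{"name": "John", "age": 25}' },
  { label: "trailing comma", text: '{"name": "John", "age": 25,}' },
  { label: "single quotes", text: "{'name': 'John'}" },
  { label: "missing closing brace", text: '{"name": "John", "age": 25' },
  { label: "surrounding text", text: 'Here is the JSON: {"name": "John"}' },
];

samples.forEach((sample) => {
  console.log(sample.label, isParseableJson(sample.text));
});
```

Only the first sample parses. Each of the others makes JSON.parse throw, which is the crash the guardrail lets you avoid.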

Schema validation (if configured)

If you provided a schema, the guardrail checks that required fields exist and have correct types. It validates that name is a string, age is a number, and all specified fields are present.
Non-strict mode (default): Passes if all required fields exist with correct types, even if extra fields are present.
Strict mode: Passes only if the JSON exactly matches the schema with no extra fields.
The guardrail validates types at the top level but doesn’t recursively validate the contents of nested objects or arrays; it only checks that they are objects or arrays as specified.
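The rules above can be approximated locally to make the difference between the two modes concrete. This is a minimal sketch under assumptions, not ABV's implementation: the names checkTopLevel and classify, the reason strings, the treatment of null as its own type, and the rejection of non-object top-level values are all illustrative choices.

```typescript
type FieldType = "string" | "number" | "boolean" | "object" | "array";
type Schema = Record<string, FieldType>;

interface CheckResult {
  status: "pass" | "fail";
  reason?: string; // illustrative wording, not the guardrail's actual messages
}

// Own-property check, so inherited keys such as "toString" are not counted.
const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Classify a parsed JSON value. Assumption: null matches none of the five types.
function classify(value: unknown): FieldType | "null" {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value as FieldType; // parsed JSON yields string, number, boolean, or object here
}

function checkTopLevel(text: string, schema: Schema, strictMode = false): CheckResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return { status: "fail", reason: "not parseable as JSON" };
  }
  if (classify(parsed) !== "object") {
    return { status: "fail", reason: "top-level value is not an object" };
  }
  const obj = parsed as Record<string, unknown>;

  // Every schema field must exist with the declared top-level type.
  for (const field of Object.keys(schema)) {
    if (!hasOwn(obj, field)) return { status: "fail", reason: `missing field: ${field}` };
    if (classify(obj[field]) !== schema[field]) {
      return { status: "fail", reason: `wrong type for field: ${field}` };
    }
  }

  // Strict mode additionally rejects fields that are not in the schema.
  if (strictMode) {
    const extra = Object.keys(obj).filter((key) => !hasOwn(schema, key));
    if (extra.length > 0) return { status: "fail", reason: `extra fields: ${extra.join(", ")}` };
  }
  return { status: "pass" };
}
```

With the schema { name: "string", age: "number" }, the text {"name": "John", "age": 25, "phone": "555"} passes this sketch in non-strict mode and fails in strict mode because of the extra phone field.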

Deterministic result

The guardrail returns pass or fail. Since this is rule-based validation using parsing and type checking, there’s no “unsure” status. Confidence is always 1.0 because the validation is deterministic.
The reason field explains syntax errors or schema mismatches (missing fields, wrong types, extra fields in strict mode). Log this internally for debugging, but never expose detailed reasons to end users.
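One way to keep detailed reasons out of user-facing responses is a small mapping layer. The sketch below assumes a result shape with the status, reason, and confidence fields described above. The interface name GuardrailResult, the helper toUserResponse, and the message text are hypothetical.

```typescript
// Assumed result shape, based on the fields described above.
interface GuardrailResult {
  status: "pass" | "fail";
  reason?: string;
  confidence: number;
}

// Hypothetical handler: writes the detailed reason to an internal log
// and returns only a generic message for the end user.
function toUserResponse(
  result: GuardrailResult,
  log: (message: string) => void = console.error
): { ok: boolean; message: string } {
  if (result.status === "pass") {
    return { ok: true, message: "OK" };
  }
  log(`valid JSON guardrail failed: ${result.reason ?? "no reason provided"}`);
  return { ok: false, message: "We could not process that response. Please try again." };
}
```

Passing the logger as a parameter lets you route the reason to whatever internal logging you already use.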

Automatic observability

Every validation automatically creates an observation in ABV capturing the input, result, configuration (schema and strict mode), and performance metrics. This helps you identify patterns in LLM errors, tune your prompts to reduce failures, and monitor validation effectiveness.

When to Use This Guardrail

Valid JSON is essential in specific scenarios where structured LLM outputs matter:
When extracting structured data from unstructured text (parsing resumes, invoices, or documents), you need the LLM’s output as parseable JSON with specific fields. Syntax errors break your extraction pipeline. Validate that the LLM’s extraction output is valid JSON with the fields you need (name, email, phone, address) before attempting to parse and store it in your database.
LLM-powered function calling requires the LLM to generate JSON specifying which function to call and what arguments to provide. Malformed JSON or missing required fields breaks the function execution. Validate that function call JSON has the correct structure (function name as string, arguments as object) before attempting to execute. Strict mode ensures no unexpected fields are injected.
When chaining multiple LLM calls where one LLM’s output feeds into another’s input, structured JSON ensures reliable communication. Schema validation guarantees each step in the pipeline receives the data structure it expects. Validate outputs between LLM calls to catch errors early in the pipeline rather than propagating malformed data through multiple steps.
When your LLM generates JSON to send to external APIs, those APIs expect exact formats. Extra fields might be rejected, missing fields cause errors, and type mismatches break integrations. Use strict mode to ensure the LLM’s output exactly matches the API’s expected schema before making the API call, preventing integration failures.
LLMs can generate configuration files, database schemas, or structured settings. These files must be syntactically valid JSON and contain specific required fields to work correctly. Validate that generated configurations are parseable and complete before saving them or using them to configure systems.

Understanding Validation Modes

Valid JSON operates in three modes depending on whether you provide a schema and enable strict mode:

Basic JSON Validation (No Schema)

Behavior: Validates only syntax by checking whether the text can be parsed as valid JSON.
Use cases:
  • Storing LLM outputs in JSON columns
  • General-purpose JSON storage where structure varies
  • Quick syntax checking before more detailed validation
// Only validate syntax
await abv.guardrails.validJson.validate(llmOutput);

// Passes if parseable as JSON, regardless of content structure

Schema Validation (Non-Strict Mode)

Behavior: Validates that required fields exist with correct types, but allows extra fields not in the schema.
Use cases:
  • Working with LLMs that add helpful context beyond requirements
  • Flexible schemas where additional information is acceptable
  • Gradual schema evolution without breaking existing outputs
// Validate schema, allow extra fields (default)
await abv.guardrails.validJson.validate(llmOutput, {
  schema: {
    name: "string",
    age: "number",
    email: "string"
  }
});

// Passes if name, age, email exist with correct types
// Also passes if additional fields like "phone" are present

Schema Validation (Strict Mode)

Behavior: Validates that JSON exactly matches the schema with no extra fields.
Use cases:
  • Security-sensitive contexts where unexpected data poses risks
  • External API integration requiring exact formats
  • Enforcing discipline in LLM outputs
// Strict schema validation - reject extra fields
await abv.guardrails.validJson.validate(llmOutput, {
  strictMode: true,
  schema: {
    function: "string",
    arguments: "object"
  }
});

// Passes only if EXACTLY function and arguments exist
// Fails if any additional fields are present

Schema Definition

Schemas map field names to type strings. Available types:
string: Validates that the field contains a string value. Matches any text, including empty strings. Example: "name": "string" matches {"name": "John"} but fails on {"name": 123}
number: Validates that the field contains a numeric value (integer or float). Does not match strings containing numbers. Example: "age": "number" matches {"age": 25} but fails on {"age": "25"}
boolean: Validates that the field contains true or false. Does not match strings like “true” or “false”. Example: "active": "boolean" matches {"active": true} but fails on {"active": "true"}
object: Validates that the field contains a JSON object (nested structure). Does not recursively validate the object’s contents. Example: "user": "object" matches {"user": {"name": "John"}} but doesn’t validate what’s inside user
array: Validates that the field contains a JSON array. Does not validate array element types or contents. Example: "tags": "array" matches {"tags": ["a", "b"]} but doesn’t validate array contents

Complex Schema Example

// Schema for structured extraction
await abv.guardrails.validJson.validate(llmOutput, {
  schema: {
    person: "object",      // Nested person data
    skills: "array",       // List of skills
    yearsExperience: "number",
    currentlyEmployed: "boolean",
    email: "string"
  }
});

// Validates types but not nested contents
// person must be an object, but any object passes
// skills must be an array, but any array passes
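Because nested contents are not checked, you may want a second, local check on the parsed value once the guardrail has passed. The sketch below is one possible approach using TypeScript type guards. The Person shape and its name and city fields, along with the helper names, are hypothetical and should be adapted to your data.

```typescript
// Hypothetical shape for the nested "person" object; adjust to your own data.
interface Person {
  name: string;
  city: string;
}

// Type guard for the object contents that the guardrail does not look inside.
function isPerson(value: unknown): value is Person {
  if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
  const record = value as Record<string, unknown>;
  return typeof record.name === "string" && typeof record.city === "string";
}

// Checks that every element of an array is a string.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Run on the parsed output after the guardrail has passed.
function hasValidNestedContents(parsed: { person: unknown; skills: unknown }): boolean {
  return isPerson(parsed.person) && isStringArray(parsed.skills);
}
```

Call hasValidNestedContents(JSON.parse(llmOutput)) after a passing validation, and treat a false result the same way you treat a failed validation.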

Common Validation Failures

Understanding why validation fails helps you write better LLM prompts:
Syntax errors
Common causes:
  • Trailing commas: {"name": "John", "age": 25,} (comma after 25)
  • Missing closing brackets: {"name": "John", "age": 25
  • Single quotes instead of double quotes: {'name': 'John'}
  • Unescaped quotes inside strings: {"message": "He said "hello""}
Fix: Make your prompt more explicit about JSON syntax requirements. Include examples of correct JSON formatting.
Type mismatches
Common causes:
  • Numbers as strings: {"age": "25"} when schema expects "age": "number"
  • Strings as numbers: {"name": 123} when schema expects "name": "string"
  • Wrong boolean format: {"active": "true"} when schema expects "active": "boolean"
Fix: Specify types explicitly in your prompt: “age as a number, not a string” or “active as a boolean (true or false), not a string”
Missing fields
Common causes:
  • LLM didn’t include all fields from your schema
  • Field name typos: “userName” vs “username”
  • LLM understood the field as optional when it’s required
Fix: Explicitly list all required fields in your prompt with examples showing the exact field names you need
Extra fields (strict mode)
Common causes:
  • LLM included helpful additional information
  • LLM added explanation or metadata fields
  • LLM followed patterns from training data that include extra context
Fix: Either disable strict mode to allow extra fields, or be very explicit in your prompt: “Return ONLY these exact fields with no additional fields”

Implementation Patterns

Structured Information Extraction

Extract structured data from unstructured text with validation:
async function extractContactInfo(text: string): Promise<any | null> {
  // Ask LLM to extract structured information
  const llmOutput = await callLLM(
    `Extract contact information from this text: ${text}\n\n` +
    `Return a JSON object with: name (string), email (string), phone (string)`
  );

  // Validate the output matches expected schema
  const validation = await abv.guardrails.validJson.validate(llmOutput, {
    schema: {
      name: "string",
      email: "string",
      phone: "string"
    }
  });

  if (validation.status === "pass") {
    return JSON.parse(llmOutput);
  }

  // Validation failed - could retry with more explicit prompt
  console.error("JSON validation failed:", validation.reason);
  return null;
}

Function Calling with Strict Validation

Validate that function calls use the exact schema with no extra fields:
async function validateFunctionCall(llmOutput: string): Promise<{ valid: boolean; call?: any }> {
  // Expect: {"function": "function_name", "arguments": {"param": "value"}}

  const validation = await abv.guardrails.validJson.validate(llmOutput, {
    strictMode: true,  // Don't allow extra fields
    schema: {
      function: "string",
      arguments: "object"
    }
  });

  if (validation.status === "pass") {
    const call = JSON.parse(llmOutput);
    return { valid: true, call };
  }

  return { valid: false };
}

Retry Logic for Failed Validation

Regenerate with more explicit instructions when validation fails:
async function getValidJsonWithRetry(
  prompt: string,
  schema: Record<string, string>,
  maxRetries: number = 3
): Promise<any | null> {
  let currentPrompt = prompt;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const output = await callLLM(currentPrompt);
    const validation = await abv.guardrails.validJson.validate(output, { schema });

    if (validation.status === "pass") {
      return JSON.parse(output);
    }

    // Make prompt more explicit for retry
    currentPrompt = prompt +
      `\n\nIMPORTANT: Respond ONLY with valid JSON. ` +
      `Do not include explanatory text before or after the JSON. ` +
      `The JSON must match this structure: ${JSON.stringify(schema)}`;
  }

  console.error("Failed to get valid JSON after", maxRetries, "attempts");
  return null;
}

Prompt Engineering for Valid JSON

Getting consistently valid JSON from LLMs requires effective prompting:
Don’t just ask for information; specify the exact JSON format.
Bad: “Extract the name and age”
Good: “Return a JSON object with two fields: name as a string and age as a number”
The more explicit you are, the more likely the LLM complies correctly.
LLMs excel at pattern matching. Include a template:
Respond in this exact format:
{"name": "string", "age": number, "email": "string"}
Showing the structure dramatically improves success rates.
LLMs often add helpful explanations around the JSON. Stop this:
Add to prompt: “Respond ONLY with the JSON object. No text before or after it. No explanations.”
This prevents common failures where explanatory text breaks JSON parsing.
Prevent type mismatch errors:
Add to prompt: “age as a number, not a string. active as a boolean (true or false), not the strings ‘true’ or ‘false’”
Explicit type requirements reduce the most common schema validation failures.
Tell the LLM what to do when information is missing:
Add to prompt: “If any field is unknown, use null for that field. Do not omit fields.”
This prevents missing field errors when information isn’t available.
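These tips can be combined into a reusable instruction block generated from the same schema you pass to the guardrail. The helper below is a hypothetical sketch, not part of ABV. The name buildJsonInstructions and the exact wording of the generated lines are assumptions.

```typescript
// Hypothetical helper: turns a schema into explicit prompt instructions
// covering exact fields, explicit types, no surrounding text, and null for unknowns.
function buildJsonInstructions(schema: Record<string, string>): string {
  const fieldLines = Object.keys(schema).map(
    (field) => `- ${field}: must be a JSON ${schema[field]}`
  );
  return [
    "Respond ONLY with a JSON object. No text before or after it. No explanations.",
    "Include exactly these fields:",
    ...fieldLines,
    "If any field is unknown, use null for that field. Do not omit fields.",
  ].join("\n");
}

// Example: append the instructions to a task-specific prompt.
const prompt =
  "Extract the person's details from the text below.\n\n" +
  buildJsonInstructions({ name: "string", age: "number", active: "boolean" });
console.log(prompt);
```

Generating the instructions from the schema keeps the prompt and the validation in sync when fields change. How a null value is treated for a typed field is not covered here, so test that case against your own schema.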

When to Use Strict Mode

Strict mode has specific use cases where it adds value.
Use strict mode when:
  • Integrating with external APIs expecting exact formats
  • Security-critical contexts where unexpected fields pose injection risks
  • Enforcing output discipline to improve LLM consistency
  • Preventing data leakage through extra fields
Don’t use strict mode when:
  • Iterating on schemas that might evolve
  • LLM often provides helpful additional context you want to preserve
  • Flexibility in outputs is acceptable as long as required fields exist
Non-strict mode (default) works well for most applications—it ensures required fields exist while allowing LLMs to be more verbose.

Combining with Other Guardrails

Valid JSON often serves as the first validation in a pipeline:
Sequential validation: Validate JSON structure first (instant), then validate content with other guardrails. For example, after confirming valid JSON, check text fields for toxic or biased language.
With contains string: After JSON validation passes, use contains string to verify that specific required strings exist in text fields.
Before expensive checks: Always validate JSON syntax before running expensive LLM-powered content checks. There is no point analyzing content if you can’t parse the structure.
In function calling chains: Validate each function call’s JSON structure before attempting execution, preventing runtime errors from malformed inputs.
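The ordering idea can be sketched as a small runner that stops at the first failing check. Everything here is illustrative: StepResult, Check, runInOrder, and jsonCheck are hypothetical names, and jsonCheck is a local stand-in that only calls JSON.parse rather than the actual guardrail.

```typescript
interface StepResult {
  status: "pass" | "fail";
  reason?: string;
}
type Check = (text: string) => Promise<StepResult>;

// Runs checks in order and stops at the first failure,
// so cheap checks placed first gate the expensive ones.
async function runInOrder(text: string, checks: Check[]): Promise<StepResult> {
  for (const check of checks) {
    const result = await check(text);
    if (result.status !== "pass") return result;
  }
  return { status: "pass" };
}

// Local stand-in for the valid JSON guardrail (parse only).
const jsonCheck: Check = async (text) => {
  try {
    JSON.parse(text);
    return { status: "pass" };
  } catch {
    return { status: "fail", reason: "invalid JSON" };
  }
};
```

Place the JSON check first in the array and the LLM-powered checks after it. When the JSON check fails, the later checks are never called.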

Next Steps