
Output Validation Pipeline

How do you ensure an LLM's output is actually usable? The model might return malformed JSON, miss required fields, or produce technically valid responses that don't meet your quality bar.

Agent Actions addresses this with a multi-layer validation system. Think of it like airport security: each layer catches different problems, and outputs must pass all checks to proceed.

Validation Layers

LLM outputs pass through three validation layers. Let's walk through what each layer catches:

| Layer | Purpose | Mechanism |
| --- | --- | --- |
| 1. JSON | Structural integrity | JSON repair + reprompt |
| 2. Schema | Type/field validation | Schema constraints + reprompt |
| 3. Guard | Semantic validation | Condition expressions |

Notice how problems caught early (JSON repair) avoid expensive retries. Guards run last because they evaluate semantic conditions that require valid, schema-conforming data.

Layer 1: JSON Validation

Ensures the LLM returns valid JSON.

Automatic JSON Repair

Before reprompting, Agent Actions attempts to fix common JSON issues:

  • Missing closing brackets/braces
  • Trailing commas
  • Unquoted strings
  • Invalid escape sequences
reprompt:
  json_repair: true  # Default: enabled
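Conceptually, the repair step resembles the following sketch. This is a hypothetical illustration of what a JSON repair layer might do (`repair_json` is an invented name), not the actual Agent Actions implementation:

```python
import json
import re

def repair_json(text: str):
    """Best-effort repair of common LLM JSON mistakes (illustrative sketch)."""
    # Remove trailing commas before a closing brace/bracket
    text = re.sub(r",\s*([}\]])", r"\1", text)

    # Track unclosed braces/brackets, ignoring those inside strings
    stack, in_str, prev = [], False, ""
    for ch in text:
        if ch == '"' and prev != "\\":
            in_str = not in_str
        elif not in_str:
            if ch in "{[":
                stack.append("}" if ch == "{" else "]")
            elif ch in "}]" and stack:
                stack.pop()
        prev = ch

    # Append the missing closers in the right order
    return json.loads(text + "".join(reversed(stack)))
```

Only if repair still produces invalid JSON does the pipeline fall back to reprompting, which costs an extra LLM call.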

Reprompt on JSON Failure

If repair fails, the LLM is reprompted with the error:

- name: extract_data
  schema: my_schema
  reprompt:
    max_attempts: 3
    json_repair: true
    use_llm_critique: false
    on_exhausted: return_last
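The retry loop behind this configuration can be sketched as follows. This is a hypothetical illustration (`call_with_reprompt`, `call_llm`, and `validate` are invented names), not the actual runner:

```python
def call_with_reprompt(call_llm, validate, max_attempts=3,
                       on_exhausted="return_last"):
    """Retry an LLM call, feeding validation errors back into the prompt.

    call_llm(error) returns raw output (error is None on the first attempt);
    validate(text) returns (ok, parsed_or_error).
    """
    last, error = None, None
    for _ in range(max_attempts):
        last = call_llm(error)
        ok, result = validate(last)
        if ok:
            return result
        error = result  # the next prompt includes this validation error
    if on_exhausted == "raise":
        raise ValueError(f"still invalid after {max_attempts} attempts: {error}")
    return last  # return_last: hand back the final raw response
```

Note how `on_exhausted` decides what happens when the attempt budget runs out: raise an error, or return the last (still invalid) response for downstream handling.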

Layer 2: Schema Validation

Validates output structure, types, and constraints.

Structural Validation

Required Fields - Reject if missing:

# schema/my_schema.yml
type: object
properties:
  title:
    type: string
  content:
    type: string
required:
  - title
  - content  # Both must be present

Type Checking - Reject wrong types:

properties:
  score:
    type: integer  # Rejects "85" (string) or 85.5 (float)
  tags:
    type: array  # Rejects "tag1, tag2" (string)

Value Constraints

Enums - Reject values not in list:

properties:
  status:
    type: string
    enum:
      - approved
      - rejected
      - pending
# Rejects: "maybe", "APPROVED", "Approved"

Numeric Ranges - Reject out-of-range values:

properties:
  score:
    type: number
    minimum: 0
    maximum: 100
    # Rejects: -5, 101, 150

  confidence:
    type: number
    exclusiveMinimum: 0
    exclusiveMaximum: 1
    # Rejects: 0, 1 (must be strictly between, not equal)
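These numeric keywords follow JSON Schema semantics. A minimal pure-Python sketch of the inclusive vs. exclusive boundary checks (`check_number` is an illustrative helper, not part of Agent Actions):

```python
def check_number(value, minimum=None, maximum=None,
                 exclusive_minimum=None, exclusive_maximum=None):
    """True if value satisfies JSON-Schema-style numeric bounds."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    if exclusive_minimum is not None and value <= exclusive_minimum:
        return False
    if exclusive_maximum is not None and value >= exclusive_maximum:
        return False
    return True

# score: 0-100 inclusive, so the endpoints pass
assert check_number(100, minimum=0, maximum=100)
assert not check_number(101, minimum=0, maximum=100)

# confidence: strictly between 0 and 1, so the endpoints fail
assert not check_number(0, exclusive_minimum=0, exclusive_maximum=1)
assert check_number(0.5, exclusive_minimum=0, exclusive_maximum=1)
```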

String Constraints - Reject by length/pattern:

properties:
  summary:
    type: string
    minLength: 10
    maxLength: 500
    # Rejects: "Short" (< 10 chars)

  email:
    type: string
    pattern: "^[a-zA-Z0-9+_.-]+@[a-zA-Z0-9.-]+$"
    # Rejects: "not-an-email"

Array Constraints - Reject by count:

properties:
  items:
    type: array
    minItems: 1
    maxItems: 10
    # Rejects: [] (empty) or arrays with 11+ items

Reprompt on Schema Failure

When schema validation fails, reprompting retries with error context:

- name: generate_analysis
  schema: analysis_schema
  reprompt:
    max_attempts: 4
    json_repair: true
    use_llm_critique: true
    critique_after_attempt: 2
    on_exhausted: return_last

The retry prompt includes:

  • Original response that failed
  • Specific validation errors
  • Field/constraint that failed

Layer 3: Guard Validation

Here's where it gets interesting: schema validation catches structural problems, but what about semantic ones? A score of 25 is a valid integer, but maybe you only want to process high-quality content with scores above 85.

Guards validate semantic and business logic after schema passes:

Filter Unwanted Values

- name: score_quality
  schema: quality_score
  # Schema ensures score is a number 0-100

- name: generate_final
  dependencies: score_quality  # Input source
  guard:
    condition: 'score >= 85'  # Semantic: only high quality
    on_false: filter

Reject Specific Content

# Filter out responses with unwanted status
- name: next_action
  guard:
    condition: 'status != "invalid"'
    on_false: filter

# Filter based on category
- name: process_technical
  guard:
    condition: 'category IN ["technical", "implementation"]'
    on_false: filter

Skip vs Filter

Consider what happens when a guard fails. You have two choices, and they have very different implications:

| Action | Use Case |
| --- | --- |
| filter | Remove the record entirely from the agentic workflow |
| skip | Skip this action, but continue processing the record |

# Filter: record stops here
guard:
  condition: 'quality >= 50'
  on_false: filter

# Skip: record continues without this action
guard:
  condition: 'needs_enhancement == true'
  on_false: skip
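To make the two outcomes concrete, here is a hypothetical sketch of how a runner might apply a guard to a record (`apply_guard` and its return shape are invented for illustration):

```python
def apply_guard(record: dict, condition, on_false: str):
    """Evaluate a guard against a record.

    Returns (keep_record, run_action) to show the filter/skip semantics;
    the real runner's internals may differ.
    """
    if condition(record):
        return True, True        # guard passed: keep record, run action
    if on_false == "filter":
        return False, False      # filter: record leaves the workflow
    return True, False           # skip: record continues, action skipped

# Filter: a low-quality record stops here
assert apply_guard({"quality": 40}, lambda r: r["quality"] >= 50,
                   "filter") == (False, False)

# Skip: the record continues without this action
assert apply_guard({"needs_enhancement": False},
                   lambda r: r["needs_enhancement"], "skip") == (True, False)
```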

Combining All Layers

Now let's see how these layers work together in real agentic workflows.

Pattern: Quality Gate Pipeline

actions:
  # Step 1: Extract with schema validation + reprompt
  - name: extract_facts
    prompt: $prompts.extract_facts
    schema: candidate_facts_list  # Layer 2: type/structure
    reprompt:
      max_attempts: 4
      json_repair: true
      use_llm_critique: true
      critique_after_attempt: 2
      on_exhausted: return_last

  # Step 2: Filter empty results (Layer 3)
  - name: validate_facts
    dependencies: extract_facts  # Input source
    guard:
      condition: 'candidate_facts_list != []'
      on_false: filter

  # Step 3: Score quality with schema
  - name: score_quality
    dependencies: validate_facts  # Input source
    schema: quality_score  # Ensures score is 0-100
    reprompt:
      max_attempts: 3
      json_repair: true
      use_llm_critique: false
      on_exhausted: return_last

  # Step 4: Filter low quality (Layer 3)
  - name: generate_output
    dependencies: score_quality  # Input source
    guard:
      condition: 'score >= 85'
      on_false: filter

Pattern: Two-Stage LLM Validation

Use an LLM action to validate another LLM's output:

actions:
  # Generate content
  - name: generate_content
    prompt: $prompts.generate
    schema: content_schema
    reprompt:
      max_attempts: 4
      json_repair: true
      use_llm_critique: true
      critique_after_attempt: 2
      on_exhausted: return_last

  # LLM validates the content
  - name: validate_content
    dependencies: generate_content  # Input source
    prompt: |
      Review this content and determine if it meets quality standards:
      {{ generate_content.content }}

      Return: {"is_valid": true/false, "reason": "..."}
    schema:
      is_valid: boolean
      reason: string

  # Guard on validation result
  - name: publish_content
    dependencies: validate_content  # Input source
    guard:
      condition: 'is_valid == true'
      on_false: filter
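The control flow of this pattern can be sketched in a few lines of Python. This is a hypothetical illustration (`two_stage`, `generate`, and `validate_llm` are invented names standing in for the two LLM actions and the guard):

```python
import json

def two_stage(generate, validate_llm):
    """One model generates, a second judges, a guard keeps or filters.

    generate() returns content; validate_llm(content) returns a JSON string
    shaped like '{"is_valid": true, "reason": "..."}'.
    """
    content = generate()
    verdict = json.loads(validate_llm(content))
    # A guard on is_valid then keeps the record or filters it (None here)
    return content if verdict["is_valid"] else None

assert two_stage(lambda: "draft",
                 lambda c: '{"is_valid": true, "reason": "ok"}') == "draft"
assert two_stage(lambda: "draft",
                 lambda c: '{"is_valid": false, "reason": "vague"}') is None
```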

Pattern: Enum + Guard for Categories

# Schema enforces valid categories
# schema/classification.yml
properties:
  category:
    type: string
    enum:
      - technical
      - conceptual
      - procedural
      - invalid

---
# Workflow guards against unwanted category
- name: classify_content
  schema: classification
  reprompt:
    max_attempts: 3
    json_repair: true
    use_llm_critique: false
    on_exhausted: return_last

- name: process_valid
  dependencies: classify_content  # Input source
  guard:
    condition: 'category != "invalid"'  # Filter "invalid" category
    on_false: filter

Pattern: Numeric Threshold with Reprompt

Force the LLM to return acceptable scores:

# Schema with constraints
# schema/score_schema.yml
properties:
  confidence_score:
    type: number
    minimum: 0
    maximum: 100
    description: "Confidence from 0-100. Must be >= 70 for high confidence."

---
# Action with reprompt
- name: assess_confidence
  prompt: |
    Assess confidence in this analysis.
    Return a score from 0-100.
    Scores below 70 indicate low confidence.
  schema: score_schema
  reprompt:
    max_attempts: 5
    json_repair: true
    use_llm_critique: true
    use_self_reflection: true
    critique_after_attempt: 1
    on_exhausted: return_last

# Guard for business threshold
- name: proceed_if_confident
  dependencies: assess_confidence  # Input source
  guard:
    condition: 'confidence_score >= 70'
    on_false: filter

Custom Validation with Tool Actions

What if your validation logic is too complex for schema constraints or guard expressions? For example, checking against a blocklist or calling an external API. In those cases, use a tool action:

actions:
  - name: generate_content
    schema: content_schema
    reprompt:
      max_attempts: 3
      json_repair: true
      use_llm_critique: false
      on_exhausted: return_last

  - name: custom_validate
    kind: tool
    impl: validate_content
    dependencies: generate_content  # Input source

  - name: next_step
    dependencies: custom_validate  # Input source
    guard:
      condition: 'validation_passed == true'
      on_false: filter

# tools/validate_content.py
@udf_tool()
def validate_content(content: dict) -> dict:
    """Custom validation logic."""
    issues = []

    # Check for prohibited words
    prohibited = ["todo", "placeholder", "tbd"]
    text = content.get("text", "").lower()
    for word in prohibited:
        if word in text:
            issues.append(f"Contains prohibited word: {word}")

    # Check minimum quality
    if len(content.get("summary", "")) < 50:
        issues.append("Summary too short")

    return {
        "validation_passed": len(issues) == 0,
        "issues": issues,
    }

Validation Decision Matrix

| Want to Reject | Use | Example |
| --- | --- | --- |
| Invalid JSON | reprompt: { json_repair: true } | Malformed response |
| Wrong type | Schema type | String instead of number |
| Missing field | Schema required | No "title" field |
| Wrong value | Schema enum | "maybe" not in ["yes", "no"] |
| Out of range | Schema minimum/maximum | Score of 150 (max 100) |
| Too short/long | Schema minLength/maxLength | Summary < 10 chars |
| Empty array | Guard != [] | No facts extracted |
| Low score | Guard >= threshold | Quality < 85 |
| Wrong category | Guard != value | Category == "invalid" |
| Complex logic | Tool action | Custom business rules |

Reprompt vs Guard

You might wonder: when should I use reprompting, and when should I use a guard? The key distinction is whether the LLM can fix the problem.

| Aspect | Reprompt | Guard |
| --- | --- | --- |
| When | Before accepting output | After output is accepted |
| Purpose | Fix LLM mistakes | Filter valid but unwanted output |
| Action | Retry the LLM call | Skip action or remove record |
| Cost | Additional LLM calls | No additional cost |
| Use for | Structural issues | Semantic filtering |

Use reprompt when:

  • Output is malformed (JSON errors)
  • Schema validation fails
  • LLM can fix the issue with guidance

Use guard when:

  • Output is valid but doesn't meet criteria
  • Filtering based on values (scores, categories)
  • Business logic decisions

The limitation here: reprompting costs API tokens. Guards are free. If you're filtering on a value the LLM produced correctly, use a guard—don't reprompt hoping for a different answer.

Best Practices

1. Layer Your Validation

# Layers 1 & 2: Schema + Reprompt
- name: extract
  schema: extraction_schema
  reprompt:
    max_attempts: 4
    json_repair: true
    use_llm_critique: true
    critique_after_attempt: 2
    on_exhausted: return_last

# Layer 3: Guard for quality
- name: process
  guard:
    condition: 'quality >= threshold'
    on_false: filter

2. Use Enums for Constrained Values

# Good: LLM must choose from a list
status:
  type: string
  enum: ["approved", "rejected", "pending"]

# Avoid: free-form string with a guard
# (the LLM might return "Approved", "APPROVED", etc.)
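Enum matching is exact and case-sensitive, which is precisely why constraining values at the schema level (where reprompting can correct the LLM) beats filtering free-form strings after the fact. A quick illustration of the matching rule (`enum_ok` is an invented helper):

```python
ALLOWED = ["approved", "rejected", "pending"]

def enum_ok(value: str) -> bool:
    """JSON-Schema-style enum check: exact, case-sensitive membership."""
    return value in ALLOWED

assert enum_ok("approved")
assert not enum_ok("Approved")  # case mismatch is rejected
assert not enum_ok("maybe")     # not in the list at all
```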

3. Provide Schema Descriptions

properties:
  score:
    type: integer
    minimum: 0
    maximum: 100
    description: "Quality score. 85+ is high quality, below 50 is rejected."

4. Set Reasonable Reprompt Limits

# Simple schema: fewer attempts, no LLM critique
reprompt:
  max_attempts: 3
  json_repair: true
  use_llm_critique: false
  on_exhausted: return_last

# Complex schema: more attempts, full critique
reprompt:
  max_attempts: 5
  json_repair: true
  use_llm_critique: true
  use_self_reflection: true
  critique_after_attempt: 1
  on_exhausted: raise

5. Guard Early to Save Cost

This is important: guards prevent downstream work. Place guards as early as possible in your agentic workflow to avoid wasting API calls on records that will be filtered anyway.

# Filter early, before expensive actions
- name: extract  # Cheap

- name: validate  # Cheap
  guard:
    condition: 'facts != []'
    on_false: filter

- name: expensive_llm_call  # Only runs on valid data
  dependencies: validate  # Input source

Debugging Validation Failures

Check Schema Validation Errors

agac run -a workflow --log-level DEBUG

Look for:

SchemaValidationError: Required field 'title' missing
SchemaValidationError: Value 'invalid' not in enum

Check Guard Evaluation

agac run -a workflow --log-level DEBUG

Look for:

Guard condition 'score >= 85' evaluated to False
Record filtered by guard on action 'generate_output'

Validate Schema Syntax

agac schema --validate schema/my_schema.yml

Analyze Schema Structure

agac schema -a workflow