Evaluate how precisely AI outputs follow explicit prompt and system instructions using DeepRails Guardrail Metrics.
Instruction Adherence measures how closely an AI-generated response follows the explicit instructions defined in the prompt and system directives.
01
Low Adherence
The response ignores or contradicts instructions
High Adherence
The response follows all instructions precisely
A high Instruction Adherence score (close to 1) indicates the response matches the requested format, tone, and content constraints. A low score suggests the model deviated from its assigned behavior or produced off-script content. The final score is continuous, ranging from 0 to 1, and can be formatted as a float or boolean, depending on user needs.
DeepRails performs a structured, multi-step audit to determine how well a model’s response aligns with its given instructions. Each instruction is analyzed independently for compliance.
1
Instruction Extraction
The system identifies and enumerates every explicit instruction from the prompt and system message. Compound directives are split into discrete, evaluable elements.
2
Response Alignment
The AI output is decomposed into segments corresponding to each instruction. These segments are analyzed for fidelity in content, tone, structure, or format.
3
Adherence Judgment
For each instruction, a binary verdict is assigned:
Y if the instruction is fully followed
N if it is partially followed, misinterpreted, or ignored
Each verdict is accompanied by a confidence rating: Low, Medium, High, or Certain.
4
Score Aggregation
All judgments are consolidated into a final Instruction Adherence score between 0 and 1. This reflects the degree to which the model respected the full set of instructions.
The result is a single, interpretable score that helps assess whether a model output is generated in the manner it was explicitly requested.
Evaluate each explicit instruction in isolation to avoid partial compliance slipping through.
Track Instruction Types
Categorize instructions (format, tone, scope, etc.) to detect which kinds of directives are most frequently ignored.
Use Adherence as a Gate
Prevent low-adherence completions from reaching users in production, especially when structure or compliance is critical.
Design Robust Prompts
Write instructions that are structured, unambiguous, and directive (e.g., “Respond only in bullet points” or “Return valid JSON”).
Instruction Adherence ensures not only that a model gives the right information, but that it gives it in the right way. This guardrail is essential for structured outputs, enterprise use cases, and task compliance.