Instruction Adherence measures how closely a model response follows the instructions defined in the user and/or system prompts.
01
Low Adherence
The response ignores or contradicts instructionsHigh Adherence
The response follows all instructions preciselyUnderstanding Instruction Adherence
How Instruction Adherence Differs from Other Metrics
Instruction Adherence: Measures whether the response followed how it was supposed to answer—structure, tone, content constraints, formatting, etc.
Context Adherence: Measures whether the response reflects what was in the provided context (e.g., source documents).
Correctness: Measures whether the information in the response is factually accurate, regardless of whether it followed instructions or context.
Evaluation Process
DeepRails performs a Multimodal Partitioned Evaluation of every model output to assess the extent to which it follows all prompt instructions. A few core pieces of logic ensure that the evaluation is as thorough and accurate as possible.Instruction Identification and Extraction
Instruction Identification and Extraction
The model input is separated into explicit, atomic instructions. The most important claims are intelligently selected for evaluation to ensure the evaluation is completed in a timely manner.
Response Segmentation
Response Segmentation
The model output is decomposed into segments each relating to an identified instruction.
Confidence-based Adherence Judgment
Confidence-based Adherence Judgment
Each segment analyzed and determined to either follow or not follow its instruction. For each binary verdict, a confidence rating is given as well.
Aggregate Scoring
Aggregate Scoring
All claim judgments are weighted by their confidence rating and consolidated into a final instruction adherence score between 0 and 1.
Addressing Low Instruction Adherence Scores
Improving Instruction Adherence
Refine prompts: Reword unclear or ambiguous instructions to be more direct, structured, and constraint-based.
Compare model variants: Some models are significantly more instruction-aligned than others. Use Adherence metrics to validate before selecting the model used in deployment.
Best Practices
Track Instruction Types
Categorize instructions (format, tone, scope, etc.) to make it harder for model’s to miss them.
Design Robust Prompts
Write instructions that are structured, unambiguous, and directive (e.g., “Respond only in bullet points” or “Return valid JSON”).
While the other metrics ensure the model gives all the right information, Instruction Adherence ensures that it gives it in the right way. This guardrail is essential for structured outputs, enterprise use cases, and task compliance.
