Run Modes control how DeepRails executes evaluations across Monitor and Defend. Every run uses two different LLMs in parallel to reduce bias and improve accuracy. The Run Mode you select determines which models are used, from compact, cost-efficient models to advanced reasoning models, so you can balance accuracy, cost, and latency for your workflow.
Why Run Modes Matter
Not every task requires the same evaluation depth. A simple summarization prompt can be tested cost-effectively with smaller models, while multi-step reasoning (math generation, chained steps, or multi-task prompts) benefits from reasoning-capable models. DeepRails’ Run Modes let you tune this balance.
- Always two models in parallel: Every evaluation uses two distinct LLMs to generate scores, avoiding single-model bias.
- Reasoning vs. non-reasoning models: For complex prompts, modes that include reasoning models yield better accuracy and interpretability.
- Available everywhere: Run Modes function the same across the Monitor and Defend APIs on all plans.
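
As a rough illustration of the "two models in parallel" idea, the sketch below runs two independent scorers concurrently and reports both scores. It is not DeepRails' implementation: the function `evaluate_with_two_models`, the stand-in scorers, and the choice to combine scores with a simple mean are all assumptions made for this example, since the text above does not say how the two scores are combined.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable

# A scorer takes a model output and returns a score in [0, 1].
Scorer = Callable[[str], float]


def evaluate_with_two_models(output: str, scorer_a: Scorer, scorer_b: Scorer) -> dict:
    """Score one output with two independent scorers running in parallel.

    Hypothetical helper: combining by mean is an assumption for illustration,
    not a documented DeepRails behavior.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(scorer_a, output)
        future_b = pool.submit(scorer_b, output)
        score_a, score_b = future_a.result(), future_b.result()
    return {
        "scores": (score_a, score_b),
        "combined": mean([score_a, score_b]),
        # A large gap suggests one scorer may be biased on this input.
        "disagreement": abs(score_a - score_b),
    }


# Stand-in scorers; in practice each would call a different LLM.
def lenient_scorer(output: str) -> float:
    return 1.0


def strict_scorer(output: str) -> float:
    return 0.5


result = evaluate_with_two_models(
    "The capital of France is Paris.", lenient_scorer, strict_scorer
)
print(result["combined"])      # 0.75
print(result["disagreement"])  # 0.5
```

Keeping both raw scores alongside the combined value makes disagreement visible, which is the practical benefit of not relying on a single model.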
The Five Run Modes
Precision Max Codex
Uses two reasoning models with Codex-optimized deep analysis. Ultimate accuracy for highly complex, code-based workflows that require deep thought and analysis on every task.
Precision Max
Uses two reasoning models in parallel for maximum depth. Best for complex, multi-step workflows and mission-critical use cases where accuracy outweighs cost or latency.
Precision Codex
Provides high accuracy with code-optimized analysis. Recommended for code-based workflows that benefit from specialized code analysis.
Precision (default)
High accuracy analysis. Recommended for complex prompts that benefit from reasoning. Default for all workflows.
Fast
Maximum speed for high-volume processing. Uses compact models to generate signals at scale, with less precision than other modes.
Choosing whether to use reasoning models is often part of the prompt engineering process. If your task involves multi-step logic, mathematics, or complex instructions, use Precision or Precision Max. If your task requires complex code analysis, use Precision Codex or Precision Max Codex.
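
The guidance above can be sketched as a small selection helper. This is a minimal illustration, not part of the DeepRails SDK: the `Task` fields, the `choose_run_mode` function, and the lowercase mode identifiers are hypothetical, because this page names the modes but not how they are spelled in API requests.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the five modes; check the API reference
# for the actual spellings before using them in a request.
PRECISION_MAX_CODEX = "precision_max_codex"
PRECISION_MAX = "precision_max"
PRECISION_CODEX = "precision_codex"
PRECISION = "precision"  # documented default
FAST = "fast"


@dataclass(frozen=True)
class Task:
    """Coarse traits of an evaluation workload (illustrative only)."""
    involves_code: bool = False
    mission_critical: bool = False
    high_volume_low_stakes: bool = False


def choose_run_mode(task: Task) -> str:
    """Map task traits to a run mode, following the guidance in the text."""
    # Bulk screening and triage favor speed, unless the work is mission-critical.
    if task.high_volume_low_stakes and not task.mission_critical:
        return FAST
    # Code-heavy work uses the Codex variants.
    if task.involves_code:
        return PRECISION_MAX_CODEX if task.mission_critical else PRECISION_CODEX
    # Everything else: the default, or Precision Max when accuracy outweighs cost.
    return PRECISION_MAX if task.mission_critical else PRECISION


print(choose_run_mode(Task()))                                           # precision
print(choose_run_mode(Task(involves_code=True)))                         # precision_codex
print(choose_run_mode(Task(involves_code=True, mission_critical=True)))  # precision_max_codex
print(choose_run_mode(Task(high_volume_low_stakes=True)))                # fast
```

In this sketch a mission-critical flag always overrides the high-volume shortcut, reflecting the idea that accuracy should win when the stakes are high.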
Choosing the Right Run Mode
| Name | Description | When to Use | Example Use Case |
|---|---|---|---|
| Precision Max Codex | Ultimate accuracy with Codex-optimized deep analysis (highest accuracy, lowest speed). | Complex code creation or refactoring tasks, debugging large systems, or similar software development tasks requiring the highest accuracy. | Final security and correctness review of AI-generated code before merging to production. |
| Precision Max | Two reasoning models in parallel; maximum accuracy and detail (very high cost and latency). | Mission-critical evaluations, final QA sweeps, regulated or safety-sensitive domains. | Compliance evaluation on a healthcare agent before production. |
| Precision Codex | High accuracy with code-optimized analysis; specialized for code-based workflows. | Code reviews, software development tasks, and technical documentation requiring code analysis. | Evaluating AI-generated code snippets for correctness and best practices. |
| Precision (default) | High accuracy analysis; strong reasoning coverage with balanced cost/latency. | Most workflows: complex prompts with logic/calculations or multi-step reasoning. Default for all workflows. | Monitoring daily regressions in a legal research bot. |
| Fast | Maximum speed for high-volume processing; compact models for fastest evaluation. | Large batch screening, early exploration, low-stakes triage. | Screening 10,000 code-gen outputs to flag potential safety risks. |
