Why Monitor Exists
Blind spots in production GenAI cost teams time, money, and trust. Monitor closes that gap. It evaluates live traffic with the same research-backed Guardrail Metrics used across DeepRails, correlates quality with operational signals (latency, tokens, cost, volume), and highlights trends and regressions so you can fix issues fast and improve with confidence.

Key Definitions
- Guardrail Metrics: DeepRails’ General-Purpose Guardrail Metrics (GPMs) for correctness, completeness, adherence (instruction, context, ground truth), and comprehensive safety. Custom Guardrail Metrics are supported on SME & Enterprise plans.
- Monitor: A read-only evaluation pipeline for a specific LLM use case or surface. A monitor receives your input/output pairs and model metadata, scores them with selected guardrails, and exposes real-time metrics, trends, and drill-downs. It does not remediate outputs (use Defend for correction).
- Nametag (Segmentation): An optional label you attach to events (e.g., “staging”, “release-2025-09”, “feature-x”) to slice charts, compare cohorts, and run A/B or pre/post analyses.
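The definitions above map onto a simple event shape: an input/output pair, model metadata, and an optional nametag. The sketch below is illustrative only; the field names are assumptions, not the official DeepRails event schema.

```python
# Illustrative monitor event (field names are assumptions, not the
# official DeepRails schema).
event = {
    "monitor_id": "mon_checkout_assistant",  # which monitor receives this event
    "input": "What is your refund policy?",  # the prompt sent to the model
    "output": "Refunds are available within 30 days of purchase.",
    "model": "gpt-4o",                       # model metadata, used for filtering
    "nametag": "release-2025-09",            # optional segmentation label
}

# The nametag is optional; omitting it simply leaves the event unsegmented.
print(sorted(event.keys()))
```

Attaching the same nametag to every event from a given release or experiment arm is what makes the cohort comparisons described below possible.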
How Monitor Works
Monitor operates through a simple lifecycle:
1. Define a Monitor
2. Stream Events
3. Evaluate at Ingest
4. Analyze Trends
5. Investigate & Act
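The five steps above can be sketched end to end in a few lines. Everything here is a toy stand-in for illustration: the class, method names, and the length-based scorer are assumptions, not the DeepRails SDK or its Guardrail Metrics.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class ToyMonitor:
    """Toy stand-in for a monitor: stores events and scores them at ingest."""
    name: str
    events: list = field(default_factory=list)

    def stream(self, inp, out, nametag=None):
        # "Evaluate at Ingest": score the pair as soon as it arrives.
        # A real guardrail metric would go here; this toy scorer just
        # rewards non-empty, reasonably long outputs.
        score = min(len(out) / 50.0, 1.0) if out else 0.0
        self.events.append({"input": inp, "output": out,
                            "nametag": nametag, "score": score})

    def trend(self):
        # "Analyze Trends": average score across all evaluated events.
        return mean(e["score"] for e in self.events)

    def low_scoring(self, threshold=0.5):
        # "Investigate & Act": drill into events below a quality bar.
        return [e for e in self.events if e["score"] < threshold]

# Define a Monitor, stream events, then inspect the results.
mon = ToyMonitor("checkout-assistant")
mon.stream("Refund policy?", "Refunds are available within 30 days.",
           nametag="release-2025-09")
mon.stream("Shipping time?", "", nametag="release-2025-09")  # empty output scores 0.0

print(round(mon.trend(), 2))
print(len(mon.low_scoring()))
```

The point of the sketch is the shape of the loop, not the scorer: monitors evaluate at ingest, so trends and drill-downs are available as soon as traffic flows.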
Console Walkthrough
The Monitor Console brings observability to life across three tabs: Monitor Metrics, Monitor Data, and Manage Monitors.

Monitor Metrics
The Monitor Metrics tab shows real-time operational and quality performance for a selected monitor: request volume, failure rate, latency, tokens, and cost—plus guardrail score distributions that reveal drift and regressions at a glance.
Operational and economic signals (top) with guardrail score distributions (bottom) help you spot regressions, latency spikes, and quality drift in real time.
Monitor Data
The Monitor Data tab lists every evaluated event for deep inspection. Filter by monitor, metrics, status, model, date range, or nametag; search by run ID or prompt; and open any row to view full details.
Event table with flexible filters for monitor, metrics, status, model, and time window. Use nametags to compare releases and cohorts.

Event details show per-metric scores and rationales, status, processing time, and the original input/output for precise debugging.
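Conceptually, the filters in this tab amount to predicate matching over event records. A minimal sketch of that idea follows; the field names and `filter_events` helper are assumptions for illustration, not the console's actual query model.

```python
from datetime import date

# Sample evaluated events (field names are illustrative assumptions).
events = [
    {"run_id": "run_001", "model": "gpt-4o", "nametag": "release-2025-09",
     "status": "pass", "date": date(2025, 9, 10), "score": 0.91},
    {"run_id": "run_002", "model": "gpt-4o", "nametag": "release-2025-08",
     "status": "fail", "date": date(2025, 8, 28), "score": 0.42},
    {"run_id": "run_003", "model": "claude-3-5-sonnet", "nametag": "release-2025-09",
     "status": "pass", "date": date(2025, 9, 11), "score": 0.88},
]

def filter_events(events, **criteria):
    """Keep events whose fields equal every given criterion (AND semantics)."""
    return [e for e in events
            if all(e.get(k) == v for k, v in criteria.items())]

# Slice by model and nametag, the same way the console filters the table.
cohort = filter_events(events, model="gpt-4o", nametag="release-2025-09")
print([e["run_id"] for e in cohort])  # → ['run_001']
```

Combining a nametag filter with a date range is the typical pre/post pattern: tag a release, then compare its cohort's scores against the previous tag's.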
Manage Monitors
The Manage Monitors tab is where you create and maintain monitors across environments and surfaces. See when each monitor last received traffic, how many outputs it has evaluated, and more.
Create a new monitor with a name and description, and manage existing monitors with last-used status and total outputs evaluated.