To create a monitor event, provide a `model_input` dictionary (containing at least a `system_prompt` or `user_prompt` field), a `model_output` string to be evaluated, and a `guardrail_metrics` array specifying which metrics to evaluate against. Optionally, include the `model_used`, the selected `run_mode`, and a human-readable name tag.

Run modes determine the models that power evaluations:

- `precision_plus` - Maximum accuracy using the most advanced models
- `precision` - High accuracy with optimized performance
- `smart` - Balanced speed and accuracy (default)
- `economy` - Fastest evaluation at lowest cost

Available guardrail metrics include `correctness`, `completeness`, `instruction_adherence`, `context_adherence`, `ground_truth_adherence`, and `comprehensive_safety`.

When you create a monitor event, you’ll receive an event ID. Use this ID to track the event’s progress and retrieve the evaluation results.
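For example, an event could be created with a POST request along these lines. This is a minimal sketch: the base URL, endpoint path, and response field names are illustrative assumptions, not guaranteed parts of the API.

```python
import os

import requests

API_TOKEN = os.environ["API_TOKEN"]   # your auth token
MONITOR_ID = "your-monitor-id"        # the monitor this event belongs to

# Hypothetical base URL and path; check the API reference for the real ones.
url = f"https://api.example.com/v1/monitors/{MONITOR_ID}/events"

payload = {
    "model_input": {
        "system_prompt": "You are a helpful assistant.",
        "user_prompt": "Summarize this contract in two sentences.",
    },
    "model_output": "The contract grants a two-year software license...",
    "guardrail_metrics": ["correctness", "instruction_adherence"],
    "model_used": "gpt-4o",  # optional
    "run_mode": "smart",     # optional; "smart" is the default
}

response = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
response.raise_for_status()

# The response includes an event ID; the exact field name is an assumption here.
event_id = response.json()["event_id"]
```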
Authorizations
Bearer authentication header of the form `Bearer <token>`, where `<token>` is your auth token.
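As a minimal sketch in Python, assuming the token is stored in an `API_TOKEN` environment variable:

```python
import os

# Standard bearer-token header; pass this with every request.
headers = {"Authorization": f"Bearer {os.environ['API_TOKEN']}"}
```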
Path Parameters
The ID of the monitor associated with this event.
Body
A dictionary of inputs sent to the LLM to generate output. The dictionary must contain at least a `user_prompt` or `system_prompt` field. For the `ground_truth_adherence` guardrail metric, a `ground_truth` field should also be provided.
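As an illustration, a `model_input` that supports the `ground_truth_adherence` metric might look like the following sketch (all values invented):

```python
model_input = {
    "system_prompt": "You are a concise factual assistant.",
    "user_prompt": "In what year did Apollo 11 land on the Moon?",
    # Required when "ground_truth_adherence" is among the guardrail metrics.
    "ground_truth": "Apollo 11 landed on the Moon in 1969.",
}
```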
Output generated by the LLM to be evaluated.
An array of guardrail metrics that the model input and output pair will be evaluated on. For non-enterprise users, these will be limited to `correctness`, `completeness`, `instruction_adherence`, `context_adherence`, `ground_truth_adherence`, and/or `comprehensive_safety`.
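For instance, a request might select a subset of these metrics; note that `ground_truth_adherence` pairs with a `ground_truth` field in `model_input`, as described above:

```python
guardrail_metrics = [
    "correctness",
    "completeness",
    "ground_truth_adherence",  # requires model_input["ground_truth"]
]
```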
Model ID used to generate the output, like `gpt-4o` or `o3`.
Run mode for the monitor event. The run mode allows the user to optimize for speed, accuracy, and cost by determining which models are used to evaluate the event. Available run modes include `precision_plus`, `precision`, `smart`, and `economy`. Defaults to `smart`.
Available options: `precision_plus`, `precision`, `smart`, `economy`
An optional, user-defined tag for the event.
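Because event creation returns an event ID, results can be fetched once evaluation finishes. The polling sketch below assumes a hypothetical status endpoint and response shape:

```python
import time

import requests


def wait_for_results(event_id: str, api_token: str, timeout_s: float = 60.0) -> dict:
    """Poll a hypothetical event endpoint until the evaluation completes."""
    url = f"https://api.example.com/v1/events/{event_id}"  # assumed path
    headers = {"Authorization": f"Bearer {api_token}"}
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        event = requests.get(url, headers=headers).json()
        if event.get("status") == "completed":  # assumed status field
            return event["results"]             # assumed results field
        time.sleep(2)  # back off between polls
    raise TimeoutError(f"event {event_id} did not complete within {timeout_s}s")
```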