GET /defend/{workflow_id}/events/{event_id}
Retrieve an Event's Details
from deeprails import DeepRails

# Replace with your DeepRails API key.
DEEPRAILS_API_KEY = "YOUR_API_KEY"

client = DeepRails(
    api_key=DEEPRAILS_API_KEY,
)

# Fetch the details of a single event in a workflow.
event_response = client.defend.retrieve_event(
    workflow_id="wkfl_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    event_id="evt_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)
print(event_response)
{
  "workflow_id": "wkfl_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "event_id": "evt_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "status": "Completed",
  "improvement_action": "fixit",
  "threshold_type": "custom",
  "filtered": false,
  "improvement_tool_status": null,
  "improved_model_output": null,
  "evaluation_result": {
    "correctness": {
      "score": 1,
      "rationale": "The response correctly and concisely summarizes core features of franchise agreements: it gives an accurate definitional statement, correctly identifies the parties and their roles, and lists the common commercial provisions (fees/royalties, term/renewal, and dispute-resolution mechanisms) with appropriately cautious language (e.g., 'often' and 'likely to see').",
      "threshold": 0.9
    },
    "completeness": {
      "score": 0.9166666666666667,
      "rationale": "The AI response is accurate, well-organized, and directly addresses both parts of the user's request (it correctly requests the file and gives a clear overview of franchise agreements). The primary shortcoming is depth: the reply is a solid preliminary checklist but does not provide clause-level detail, jurisdiction-specific guidance, sample language, numerical norms (e.g., typical royalty percentages or fee ranges by industry), or prioritized negotiation strategies that would be expected for a high-standard explanatory review.",
      "threshold": 0.8
    }
  },
  "evaluation_history": [
    {
      "attempt": "Initial Evaluation",
      "evaluation_status": "completed",
      "guardrail_metrics": [
        "correctness",
        "completeness"
      ],
      "run_mode": "smart",
      "model_input": {
        "system_prompt": "You are a helpful assistant.",
        "user_prompt": "Hello, how are you?"
      },
      "model_output": "Hello, how are you?",
      "nametag": "Test Event",
      "progress": 100,
      "error_message": null,
      "evaluation_result": {
        "correctness": {
          "score": 1,
          "rationale": "The response correctly and concisely summarizes core features of franchise agreements.",
          "threshold": 0.9
        },
        "completeness": {
          "score": 0.9166666666666667,
          "rationale": "The AI response is accurate, well-organized, and directly addresses both parts of the user's request (it correctly requests the file and gives a clear overview of franchise agreements). The primary shortcoming is depth: the reply is a solid preliminary checklist but does not provide clause-level detail, jurisdiction-specific guidance, sample language, numerical norms (e.g., typical royalty percentages or fee ranges by industry), or prioritized negotiation strategies that would be expected for a high-standard explanatory review.",
          "threshold": 0.8
        }
      },
      "evaluation_total_cost": 0.01,
      "created_at": "2025-01-15T10:30:00Z",
      "modified_at": "2025-01-15T10:30:00Z"
    }
  ],
  "automatic_hallucination_tolerance_levels": null,
  "custom_hallucination_threshold_values": {
    "correctness": 0.9,
    "completeness": 0.8
  },
  "capabilities": [
    {
      "capability": "web_search"
    }
  ],
  "files": [
    {
      "file_name": "example.pdf",
      "file_id": "file_xxxxxxxx",
      "file_size": 1024
    }
  ]
}
Workflow events can include a series of improvement attempts and corresponding evaluations when the initial evaluation fails one or more metrics. Poll this endpoint to retrieve those details along with the completion status, whether the output was filtered, the improved_model_output if remediation was needed, and more.
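The polling described above can be sketched as follows. This is a minimal sketch, not part of the DeepRails SDK: wait_for_completion, its parameters, and the fetch_event callable are hypothetical names introduced for illustration, and the sketch assumes the response can be read like a dictionary with the status field documented on this page.

```python
import time
from typing import Any, Callable, Mapping


def wait_for_completion(
    fetch_event: Callable[[], Mapping[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Call fetch_event repeatedly until the event status is "Completed".

    fetch_event is a hypothetical zero-argument callable that returns the
    event details as a dictionary-like object.
    """
    deadline = time.monotonic() + timeout
    while True:
        event = fetch_event()
        if event["status"] == "Completed":
            return event
        if time.monotonic() >= deadline:
            raise TimeoutError("event did not complete before the timeout")
        # Wait before asking again so the endpoint is not hammered.
        sleep(poll_interval)
```

With the SDK, fetch_event could wrap the client.defend.retrieve_event call shown in the request sample, converting the result to a dictionary first if the SDK returns a model object.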

The filtered field is true when the event fails one or more metrics on the most recent evaluation, and false when every metric scored above its threshold. If filtered is true, an improvement attempt begins immediately after the initial evaluation concludes.
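To see which metrics would cause an event to be filtered, you can compare each score with its threshold in evaluation_result. The helper below is a hypothetical sketch, not a DeepRails function. It treats a score strictly below the threshold as a failure; how the service handles a score exactly equal to the threshold is an assumption here.

```python
from typing import Any, List, Mapping


def failing_metrics(evaluation_result: Mapping[str, Mapping[str, Any]]) -> List[str]:
    """Return the names of metrics whose score is below its threshold.

    Assumption: a score equal to the threshold counts as passing.
    """
    return [
        name
        for name, detail in evaluation_result.items()
        if detail["score"] < detail["threshold"]
    ]


# Scores and thresholds taken from the sample response on this page.
sample = {
    "correctness": {"score": 1, "threshold": 0.9},
    "completeness": {"score": 0.9166666666666667, "threshold": 0.8},
}
print(failing_metrics(sample))  # prints []
```

An empty list corresponds to the filtered value of false in the sample response.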

The evaluation_history field is an array containing the details of each evaluation performed for the event. Use it to track the event's progress and see how DeepRails improved your model output over time.
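As an illustration of reading evaluation_history, the sketch below produces one summary line per evaluation. summarize_history is a hypothetical helper, and it assumes the event is available as a dictionary whose entries carry the attempt, evaluation_status, and evaluation_result keys shown in the sample response.

```python
from typing import Any, List, Mapping


def summarize_history(event: Mapping[str, Any]) -> List[str]:
    """Build one human-readable line per entry in evaluation_history."""
    lines = []
    for entry in event.get("evaluation_history") or []:
        # evaluation_result may be missing or null while an evaluation is running.
        results = entry.get("evaluation_result") or {}
        scores = ", ".join(
            f"{metric}={detail['score']:.2f}" for metric, detail in results.items()
        )
        lines.append(f"{entry['attempt']} [{entry['evaluation_status']}]: {scores}")
    return lines
```

Calling print("\n".join(summarize_history(event))) after each poll gives a compact view of how the scores change across attempts.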

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

workflow_id
string
required

The ID of the workflow associated with the event.

event_id
string
required

The ID of the requested workflow event.
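If you call the endpoint without the SDK, the Bearer header and the two path parameters above combine into a single GET request. The sketch below only builds the request object and does not send it. The base URL is a placeholder, because this page does not state the API host, and build_event_request is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

# Placeholder only: the real API host is not given on this page.
BASE_URL = "https://api.example.com"


def build_event_request(
    api_key: str,
    workflow_id: str,
    event_id: str,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a GET request for a workflow event."""
    # Percent-encode the IDs so they are safe to place in the URL path.
    workflow = urllib.parse.quote(workflow_id, safe="")
    event = urllib.parse.quote(event_id, safe="")
    url = f"{base_url}/defend/{workflow}/events/{event}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Passing the returned object to urllib.request.urlopen would send the request once the placeholder host is replaced with the real one.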

Response

Workflow event details retrieved successfully

workflow_id
string
required

Workflow ID associated with the event.

Example:

"wkfl_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

event_id
string
required

A unique workflow event ID.

Example:

"evt_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

status
enum<string>
required

Status of the event.

Available options:
In Progress,
Completed
Example:

"Completed"

improvement_action
enum<string>
required

Type of improvement action used to improve the event.

Available options:
regen,
fixit,
do_nothing
Example:

"fixit"

threshold_type
enum<string>
required

Type of thresholds used to evaluate the event.

Available options:
custom,
automatic
Example:

"custom"

filtered
boolean
required

Whether the event was filtered and requires improvement.

Example:

false

improvement_tool_status
enum<string>
required

Status of the improvement tool used to improve the event.

Available options:
improved,
failed on max retries,
improvement_required,
null
Example:

null

improved_model_output
string
required

Improved model output after the improvement tool was applied and each metric passed evaluation.

Example:

null

evaluation_result
object
required

Evaluation result consisting of average scores and rationales for each of the evaluated guardrail metrics.

Example:
{
  "correctness": {
    "score": 1,
    "rationale": "The response correctly and concisely summarizes core features of franchise agreements: it gives an accurate definitional statement, correctly identifies the parties and their roles, and lists the common commercial provisions (fees/royalties, term/renewal, and dispute-resolution mechanisms) with appropriately cautious language (e.g., 'often' and 'likely to see').",
    "threshold": 0.9
  },
  "completeness": {
    "score": 0.9166666666666667,
    "rationale": "The AI response is accurate, well-organized, and directly addresses both parts of the user's request (it correctly requests the file and gives a clear overview of franchise agreements). The primary shortcoming is depth: the reply is a solid preliminary checklist but does not provide clause-level detail, jurisdiction-specific guidance, sample language, numerical norms (e.g., typical royalty percentages or fee ranges by industry), or prioritized negotiation strategies that would be expected for a high-standard explanatory review.",
    "threshold": 0.8
  }
}

evaluation_history
object[]
required

History of evaluations for the event.

automatic_hallucination_tolerance_levels
object

Mapping of guardrail metric names to tolerance values. Values are strings (low, medium, high) representing automatic tolerance levels.

Example:

null

custom_hallucination_threshold_values
object

Mapping of guardrail metric names to threshold values. Values are floating point numbers (0.0-1.0) representing custom thresholds.

Example:
{ "correctness": 0.9, "completeness": 0.8 }
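The two threshold mappings can be read with a small helper like the one below. This is a sketch under assumptions: active_thresholds is a hypothetical name, and the rule that threshold_type decides which mapping is populated is inferred from the sample response (custom type, custom values present, automatic levels null) and is not stated explicitly on this page.

```python
from typing import Any, Dict, Mapping

ALLOWED_LEVELS = {"low", "medium", "high"}


def active_thresholds(event: Mapping[str, Any]) -> Dict[str, Any]:
    """Return the threshold mapping that matches the event's threshold_type."""
    if event["threshold_type"] == "custom":
        values = dict(event.get("custom_hallucination_threshold_values") or {})
        for metric, value in values.items():
            # Custom thresholds are documented as floats in the range 0.0-1.0.
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"threshold for {metric} is out of range: {value}")
        return values
    levels = dict(event.get("automatic_hallucination_tolerance_levels") or {})
    for metric, level in levels.items():
        # Automatic tolerance levels are documented as low, medium, or high.
        if level not in ALLOWED_LEVELS:
            raise ValueError(f"unknown tolerance level for {metric}: {level}")
    return levels
```

For the sample response on this page, the helper returns the custom mapping with correctness at 0.9 and completeness at 0.8.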
capabilities
object[]

Extended AI capabilities available to the event, if any. Can be web_search and/or file_search.

files
object[]

List of files available to the event, if any. Will only be present if file_search is enabled.