Skip to main content
GET
/
agent-eval
/
runs
/
{id}
/
evaluation
Get run evaluation
curl --request GET \
  --url https://api.goyappr.com/agent-eval/runs/{id}/evaluation \
  --header 'Authorization: Bearer <token>'
{
  "score": 50,
  "pass_fail": true,
  "results": [
    {
      "assertion": {
        "weight": 1,
        "description": "<string>",
        "pattern": "<string>",
        "tool_name": "<string>",
        "args_match": {},
        "node_id": "<string>",
        "rubric": "<string>"
      },
      "passed": true,
      "weight": 123,
      "reason": "<string>"
    }
  ]
}

Documentation Index

Fetch the complete documentation index at: https://docs.goyappr.com/llms.txt

Use this file to discover all available pages before exploring further.

Returns just the assertion roll-up — equivalent to GET /agent-eval/runs/:id but trimmed to the evaluation field. Useful for CI checks where you only care about the score. The scoring fields aren’t final until the worker finishes both the conversation AND the assertion pass. Poll the parent GET /agent-eval/runs/:id until queue_status === "done", then call this endpoint. If you hit this endpoint while queue_status is still pending or claimed, score / pass_fail / results will be null.

Response shape

{
  "score": 87.5,
  "pass_fail": true,
  "results": [
    {
      "assertion": { "type": "must_say", "pattern": "Tuesday at 3pm", "weight": 1 },
      "passed": true,
      "weight": 1,
      "reason": "matched at turn 7"
    },
    {
      "assertion": { "type": "must_call_tool", "tool_name": "bookAppointment", "weight": 2 },
      "passed": true,
      "weight": 2,
      "reason": "fired at turn 8"
    }
  ]
}

Authorizations

Authorization
string
header
required

Your Yappr API key (e.g. ypr_live_...). Generate one in the dashboard under Settings → API Keys.

Path Parameters

id
string<uuid>
required

Response

Evaluation result

Roll-up of assertion results that's also stored on eval_runs.evaluation.

score
number
required

Weighted percentage of assertions that passed.

Required range: 0 <= x <= 100
pass_fail
boolean
required

True iff score >= case.pass_threshold.

results
object[]
required