Trajectory evaluation

AI & LLM Testing

// Definition

Evaluating an agent on the sequence of steps it took, not just the final outcome. End-to-end evaluation ("did the agent eventually complete the task?") misses a large class of failures: agents that arrived at the right answer via the wrong tool, that took ten steps when two would have done, that corrupted state mid-flow but recovered, or that retried their way past a permission boundary they should not have crossed. Trajectory evaluation scores the steps themselves: were the tool-call arguments correct, was state propagated cleanly, did the agent refuse when it should have refused. Research from 2023 onward reports that agents pass 20–40 percent more end-to-end evaluations than trajectory evaluations; the gap is what single-shot, outcome-only scoring hides.
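As a rough illustration of the difference between the two scores, here is a minimal Python sketch. Everything in it is hypothetical and not taken from any particular evaluation framework: the `Step` and `Expectation` types, the tool names (`lookup_order`, `issue_refund`, `admin_override`), and the three checks (argument correctness, permission boundary, step budget) are assumptions chosen only to mirror the failure modes described above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One tool call in the agent's trajectory (hypothetical shape)."""
    tool: str
    args: dict = field(default_factory=dict)


@dataclass
class Expectation:
    """What a correct trajectory should look like (assumed for this sketch)."""
    required_calls: list[Step]   # calls that must appear, in this order
    forbidden_tools: set[str]    # tools the agent must never call
    max_steps: int               # step budget for the task


def evaluate_trajectory(steps: list[Step], final_ok: bool, exp: Expectation) -> dict:
    # Required calls must occur in order, with exactly matching arguments.
    # Sharing one iterator across the checks enforces the ordering.
    remaining = iter(steps)
    args_ok = all(any(s == req for s in remaining) for req in exp.required_calls)
    # No call may touch a forbidden tool, even if the run later succeeds.
    permission_ok = all(s.tool not in exp.forbidden_tools for s in steps)
    # The agent must stay within its step budget.
    efficient = len(steps) <= exp.max_steps
    return {
        "end_to_end": final_ok,
        "args_ok": args_ok,
        "permission_ok": permission_ok,
        "efficient": efficient,
        "trajectory": final_ok and args_ok and permission_ok and efficient,
    }


# Hypothetical run: the refund is issued in the end, but only after the
# agent calls a tool it is not permitted to use and exceeds its budget.
exp = Expectation(
    required_calls=[
        Step("lookup_order", {"order_id": "A1"}),
        Step("issue_refund", {"order_id": "A1"}),
    ],
    forbidden_tools={"admin_override"},
    max_steps=3,
)
run = [
    Step("lookup_order", {"order_id": "A1"}),
    Step("issue_refund", {"order_id": "A1"}),
    Step("admin_override", {"order_id": "A1"}),
    Step("issue_refund", {"order_id": "A1"}),
]
report = evaluate_trajectory(run, final_ok=True, exp=exp)
print(report)
```

In this run the end-to-end score passes while the trajectory score fails on both the permission check and the step budget, which is the kind of gap that outcome-only scoring would not surface. A real evaluator would likely score steps more leniently than exact argument equality, for example by comparing only the fields that matter for each tool.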

// Related terms