Skill Detail

Grade agent trajectories and tool-use decisions with AgentEvals

Score whether an agent took a sensible intermediate path, called tools correctly, and reached the outcome without relying only on final-answer checks.

Code Quality & ReviewCustom Agents

Code Quality & Review Custom Agents Security Reviewed

⭐ 550 GitHub stars ⬇ 251k/wk npm

INSTALL WITH ANY AGENT

npx skills add agentskillexchange/skills --skill grade-agent-trajectories-and-tool-use-decisions-with-agentevals Copy

Works best when you want a reusable capability, not another fragile one-off prompt.

View source

At a glance

Tools required

Python or TypeScript runtime, agent run outputs or trajectories, optional LLM judge provider

Install & setup

pip install agentevals or npm install agentevals @langchain/core, then pass captured agent trajectories into the provided evaluators.

Author

LangChain

Publisher

Open Source Project

Last updated

Apr 19, 2026

Quick brief

Use AgentEvals when you need to judge the path an agent took, not just whether the final answer looked good. The upstream package is specifically about evaluating agent trajectories, including message sequences, tool calls, graph paths, and LLM-as-judge scoring.

How it works

What this skill actually does

Invoke this instead of a general observability stack or broad eval product when the immediate job is trajectory grading inside tests or evaluation suites. The scope boundary is tight: AgentEvals evaluates agent steps and tool-use paths. It is not a general framework, hosted platform, or catch-all agent builder listing.

Best fit

When to reach for it

Best when the job fits Code Quality & Review.
Works naturally with Custom Agents setups.
Requires Python or TypeScript runtime, agent run outputs or trajectories, optional….
Installation is straightforward: pip install agentevals or npm install agentevals @langchain/core, then pass captured agent trajectories into the provided…

Trust & provenance

Why this listing is credible

Trust status: Security Reviewed.
550 GitHub stars on the linked upstream source.
251k/week npm downloads recorded.
Last updated Apr 19, 2026.

View source ↗