
AgentBoard

Roadmap

Fine-grained progress-rate metrics over partially solved agent tasks.

On the roadmap

AgentBoard is catalogued but not yet runnable, so it has no usage docs: we do not document what does not run. The fact sheet below is sourced from the paper. The protocols its runner will implement are already stable.

Paper: AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents
Citation: Ma et al., 2024, arXiv:2401.13178
License: Apache-2.0
Homepage: https://hkust-nlp.github.io/agentboard
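
To make the "progress rate" idea concrete, here is a minimal sketch of a subgoal-based progress rate, written as a simplified reading of the metric described in the paper. It is not code from AgentBoard or agi-eval. The names progress_rate, State, the subgoal checks, and the toy trajectory are all assumptions made for illustration; the real benchmark uses environment-specific matching functions.

```python
from typing import Callable, Sequence

# Hypothetical state representation: a dict of facts about the environment.
State = dict


def progress_rate(
    trajectory: Sequence[State],
    subgoals: Sequence[Callable[[State], bool]],
) -> float:
    """Return the highest fraction of subgoals satisfied by any single state.

    Assumption: progress is the running maximum over the trajectory, so
    later regressions do not lower the score.
    """
    if not subgoals:
        raise ValueError("at least one subgoal is required")
    best = 0.0
    for state in trajectory:
        achieved = sum(1 for check in subgoals if check(state))
        best = max(best, achieved / len(subgoals))
    return best


# Toy task with three subgoals (illustrative only).
subgoals = [
    lambda s: s.get("has_key", False),
    lambda s: s.get("door_open", False),
    lambda s: s.get("at_exit", False),
]

# The agent picks up the key, opens the door, then the door closes again.
trajectory = [
    {},
    {"has_key": True},
    {"has_key": True, "door_open": True},
    {"has_key": True},
]

rate = progress_rate(trajectory, subgoals)
print(f"progress rate: {rate:.2f}")
```

In this toy run the task is never completed, so a binary success metric would report failure, while the progress rate still credits the two subgoals the agent reached.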

How an eval goes live

Implement an EvalRunner against the stable protocols.
Bundle a small real-schema sample so it runs offline.
Point the catalog entry's runner at the class.
Ship its docs in the same change; this is required to flip the entry live.
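
The steps above can be sketched roughly as follows. This page does not show the actual agi-eval protocols, so everything here is a hypothetical stand-in defined locally: the EvalRunner protocol shape, the method names load_samples and score, the sample fields, and the CATALOG dictionary are all assumptions, not the library's interface.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Protocol


class EvalRunner(Protocol):
    """Hypothetical stand-in for a runner protocol (not the real agi-eval one)."""

    name: str

    def load_samples(self) -> Iterable[dict]: ...

    def score(self, sample: dict, output: str) -> float: ...


# Step 2: a tiny bundled sample so the eval runs offline.
# Field names are placeholders, not the real AgentBoard schema.
BUNDLED_SAMPLE = (
    '[{"task_id": "demo-1", "goal": "open the door", "expected": "door_open"}]'
)


# Step 1: a class that satisfies the hypothetical protocol.
@dataclass
class AgentBoardRunner:
    name: str = "agentboard"

    def load_samples(self) -> Iterable[dict]:
        return json.loads(BUNDLED_SAMPLE)

    def score(self, sample: dict, output: str) -> float:
        # Toy scoring: full credit if the expected marker appears in the output.
        return 1.0 if sample["expected"] in output else 0.0


# Step 3: point the catalog entry's runner at the class (hypothetical shape).
CATALOG = {"agentboard": {"status": "roadmap", "runner": None}}
CATALOG["agentboard"].update(status="live", runner=AgentBoardRunner)

# Smoke-check that the entry resolves to a working runner.
runner = CATALOG["agentboard"]["runner"]()
scores = [runner.score(s, "door_open") for s in runner.load_samples()]
print(scores)
```

Step 4, shipping the docs in the same change, has no code counterpart in this sketch.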

pip install agi-eval