API-Bank
Roadmap
Plan-and-call evaluation over a graded pool of tool APIs.
Status
This eval is catalogued and on the roadmap. The protocols are stable, so implementing it amounts to writing an EvalRunner and adding a catalog entry.