AGI·EVALS
Catalog

5 evals — Code

This listing reflects the same catalog/evals.yaml that the CLI reads. Live means the eval runs end-to-end today; Building and Roadmap entries show what is coming, and contributions to them are welcome.

Eval            Status
HumanEval+      Live
LiveCodeBench   Live
BigCodeBench    Roadmap
RepoBench       Roadmap
SWE-Lancer      Roadmap
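
As a rough illustration of how the status column might be used, the sketch below mirrors the listing above in memory and separates the evals that run today from those still planned. The actual schema of catalog/evals.yaml is not shown on this page, so the `EvalEntry` type, its field names, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalEntry:
    """Hypothetical record for one catalog row; not the real YAML schema."""
    name: str
    status: str  # one of "Live", "Building", "Roadmap"


# Entries copied from the Code listing on this page.
CODE_EVALS = [
    EvalEntry("HumanEval+", "Live"),
    EvalEntry("LiveCodeBench", "Live"),
    EvalEntry("BigCodeBench", "Roadmap"),
    EvalEntry("RepoBench", "Roadmap"),
    EvalEntry("SWE-Lancer", "Roadmap"),
]


def runnable(entries):
    """Names of evals that run end-to-end today (status is Live)."""
    return [e.name for e in entries if e.status == "Live"]


def upcoming(entries):
    """Names of evals that are still Building or on the Roadmap."""
    return [e.name for e in entries if e.status in ("Building", "Roadmap")]


if __name__ == "__main__":
    print("Runnable:", runnable(CODE_EVALS))
    print("Upcoming:", upcoming(CODE_EVALS))
```

Keeping the status as a plain string keeps the sketch close to what a YAML file would likely hold; a real loader would need to follow whatever keys the catalog file actually defines.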