Catalog
2 evals — Code
This listing comes from the same catalog/evals.yaml that the CLI reads. Live means the eval runs end-to-end today; building and roadmap entries show what is coming, and contributions to them are welcome.
| Eval | Category | Paper | License | Status |
|---|---|---|---|---|
| HumanEval+ | Code | | | Live |
| LiveCodeBench | Code | | | Live |
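
As a rough illustration of how the status column might be used, the sketch below filters the entries down to those that run today. The schema of catalog/evals.yaml is not shown here, so the field names (`eval`, `category`, `status`), the lowercase status strings, and the helpers `runnable_evals` and `status_counts` are all hypothetical; the two entries simply mirror the rows in the table above.

```python
from collections import Counter

# Hypothetical in-memory mirror of the table above. The real schema of
# catalog/evals.yaml is not shown, so these field names are assumptions.
CATALOG = [
    {"eval": "HumanEval+", "category": "Code", "status": "live"},
    {"eval": "LiveCodeBench", "category": "Code", "status": "live"},
]

# The three status values named in the catalog description.
KNOWN_STATUSES = {"live", "building", "roadmap"}


def runnable_evals(entries):
    """Return the names of entries whose status is 'live'."""
    return [e["eval"] for e in entries if e["status"] == "live"]


def status_counts(entries):
    """Count entries per status, rejecting any status outside the known set."""
    for e in entries:
        if e["status"] not in KNOWN_STATUSES:
            raise ValueError(f"unknown status {e['status']!r} for {e['eval']}")
    return Counter(e["status"] for e in entries)


if __name__ == "__main__":
    print(runnable_evals(CATALOG))
    print(dict(status_counts(CATALOG)))
```

Validating the status against a fixed set is one possible design choice: it would surface a typo such as `"lve"` immediately instead of silently hiding an eval from the live list.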