FrontierMath
Unpublished research-level math problems vetted by professional mathematicians.
On the roadmap
FrontierMath is catalogued but not yet runnable, so it has no usage docs: we do not document what does not run. The fact sheet below is sourced from the paper; the protocols its runner will implement are stable today.
- Paper: FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning
- Citation: Glazer et al., 2024, arXiv:2411.04872
- License: Proprietary (Epoch AI)
- Homepage: https://epoch.ai/frontiermath
How an eval goes live
- Implement an EvalRunner against the stable protocols.
- Bundle a small real-schema sample so it runs offline.
- Point the catalog entry's runner at the class.
- Ship its docs in the same change; this is required to flip the entry live.
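
The steps above can be sketched roughly as follows. This is an illustration only, not the library's API: the page does not show the real protocol, so the `EvalRunner` shape, the `Sample` fields, the `CATALOG` dictionary, and the `run` helper are all assumptions made up for this sketch. The bundled problems are toy arithmetic, not FrontierMath items and not a real-schema sample.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Protocol


@dataclass(frozen=True)
class Sample:
    """One bundled problem. Field names are hypothetical."""
    problem: str
    answer: str


class EvalRunner(Protocol):
    """Stand-in for the stable protocol; the real interface may differ."""

    def samples(self) -> Iterable[Sample]: ...

    def score(self, sample: Sample, prediction: str) -> bool: ...


class ToyMathRunner:
    """Hypothetical runner whose sample is bundled in code, so it runs offline."""

    _BUNDLED = (
        Sample(problem="2 + 2", answer="4"),
        Sample(problem="3 * 5", answer="15"),
    )

    def samples(self) -> Iterable[Sample]:
        return self._BUNDLED

    def score(self, sample: Sample, prediction: str) -> bool:
        # Exact-match scoring after trimming whitespace.
        return prediction.strip() == sample.answer


# Hypothetical catalog entry whose runner points at the class.
CATALOG = {"toy-math": {"runner": ToyMathRunner, "live": True}}


def run(name: str, predict: Callable[[str], str]) -> float:
    """Score `predict` on the bundled sample and return the accuracy."""
    runner: EvalRunner = CATALOG[name]["runner"]()
    results = [runner.score(s, predict(s.problem)) for s in runner.samples()]
    return sum(results) / len(results)
```

Calling `run("toy-math", predict)` with any function that maps a problem string to an answer string returns the fraction of bundled problems answered correctly. Keeping the sample inside the runner is one way to satisfy the offline requirement; a real runner would more likely ship a small data file alongside the class.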
pip install agi-evals