
Elo Rating

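A rating system adapted from chess for ranking AI models through head-to-head comparisons. Users are shown responses from two models and vote for the one they prefer; the winner's Elo rating rises and the loser's falls, with a larger change when the result is an upset. The result is a single score per model that reflects its quality relative to the other models, aggregated across a population of evaluators.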
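As an illustration of how a single vote could move two ratings, here is a minimal sketch of the classic chess Elo update. The function names, the starting rating of 1000, and the K-factor of 32 are assumptions chosen for the example, not values taken from any particular leaderboard, and real leaderboards may compute ratings differently.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k (assumed 32 here) controls how far a single vote moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Hypothetical models, both starting from an assumed rating of 1000.
ratings = {"model_x": 1000.0, "model_y": 1000.0}

# One vote in which model_x's response was preferred.
ratings["model_x"], ratings["model_y"] = update_elo(
    ratings["model_x"], ratings["model_y"], score_a=1.0
)
# From equal ratings with k=32, this moves the pair to 1016.0 and 984.0.
```

Because the change scales with the gap between the actual and expected outcome, a win over a much higher-rated model moves both ratings more than a win over an evenly matched one.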

Related terms

Leaderboard, Benchmark, Human Evaluation