How ReLE Redefines Chinese LLM Evaluation and Reveals Capability Anisotropy
The ReLE framework introduces a dynamic, variance‑aware evaluation system that diagnoses capability anisotropy across 304 Chinese large language models, exposing ranking instability, commercial‑vs‑open‑source gaps, and format barriers while cutting evaluation cost by 70%.
