Tagged articles

PertEval

Alibaba Cloud Big Data AI Platform
Dec 12, 2024 · Artificial Intelligence

How PertEval Reveals the Real Knowledge Limits of Large Language Models

At NeurIPS 2024, Alibaba Cloud's PAI team presented PertEval, a Spotlight paper that uses knowledge-invariant perturbations to expose the true knowledge capacity of LLMs and critiques static benchmarks as over-optimistic. The team also showcased responsible AI solutions and platform demos for enterprise use.

Alibaba Cloud · Evaluation · NeurIPS 2024