Baobao Algorithm Notes
May 13, 2024 · Artificial Intelligence

How to Detect Test Set Leakage in Black‑Box Language Models

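This ICLR 2024 paper introduces a black-box method for detecting test-set leakage in large language models. The method compares the log-probability a model assigns to a test set in its original order with the log-probabilities it assigns to shuffled orders; a model that has never seen the data should show no preference for the original order. The paper also proposes a scalable sharded likelihood test and demonstrates the approach on several open-source models, revealing a potential leak in Mistral-7B.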
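The comparison of original and shuffled orders can be sketched as a simple permutation test. This is a minimal illustration, not the paper's implementation: it shows only the plain shuffled-order comparison, not the sharded variant. The function name `shuffled_order_p_value`, its parameters, and the two toy scorers are hypothetical; in practice `log_prob` would be assumed to concatenate the examples in the given order and sum the token log-probabilities returned by the model under test.

```python
import random
from typing import Callable, List, Sequence


def shuffled_order_p_value(
    examples: Sequence[str],
    log_prob: Callable[[List[str]], float],
    num_shuffles: int = 99,
    seed: int = 0,
) -> float:
    """Permutation p-value for 'the original order scores unusually high'.

    `log_prob` is a hypothetical callable that returns the model's total
    log-probability for the examples concatenated in the given order.
    """
    rng = random.Random(seed)
    canonical_score = log_prob(list(examples))

    # Count shuffles that score at least as high as the original order.
    at_least_as_high = 0
    for _ in range(num_shuffles):
        shuffled = list(examples)
        rng.shuffle(shuffled)
        if log_prob(shuffled) >= canonical_score:
            at_least_as_high += 1

    # The +1 terms count the original order itself as one permutation.
    return (at_least_as_high + 1) / (num_shuffles + 1)


# Toy data and toy scorers, standing in for a real benchmark and model.
EXAMPLES = [f"question {i}" for i in range(20)]


def clean_scorer(ordered: List[str]) -> float:
    # Order-independent score: behaves like a model that never saw the data.
    return -float(len(ordered))


def memorizing_scorer(ordered: List[str]) -> float:
    # Rewards adjacent pairs that appear in the original order, mimicking
    # a model that memorized the benchmark file as published.
    position = {example: i for i, example in enumerate(EXAMPLES)}
    kept_pairs = sum(
        1
        for a, b in zip(ordered, ordered[1:])
        if position[b] == position[a] + 1
    )
    return -float(len(ordered)) + kept_pairs


if __name__ == "__main__":
    print("clean p-value:", shuffled_order_p_value(EXAMPLES, clean_scorer))
    print("memorizing p-value:", shuffled_order_p_value(EXAMPLES, memorizing_scorer))
```

A small p-value means the original order is scored higher than almost every shuffle, which is the signal the method treats as evidence of leakage. The number of shuffles sets the smallest p-value the test can report, so 99 shuffles cannot go below 0.01.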

LLM evaluation · language model security · shuffled likelihood test