DeepHub IMBA
Mar 6, 2026 · Artificial Intelligence

New March 2026 Paper Exposes Fraudulent Third‑Party APIs for Large Language Models

A recent arXiv study audited 17 popular shadow APIs used in 187 papers and found performance gaps of up to 47.21% versus the official models. For example, Gemini-2.5-flash's accuracy on MedQA drops from 83.82% to about 37%, which highlights the serious reliability and safety risks of unofficial LLM services.

AI safety · large language models · model verification
3 min read
DeepHub IMBA
Mar 6, 2026 · Artificial Intelligence

Shadow APIs vs Official LLMs: Up to 47% Performance Gap Revealed in New Study

A recent arXiv paper audits 17 widely used shadow APIs and finds that their performance can trail the official large language model APIs by as much as 47.21%. In one case, accuracy on the MedQA benchmark drops from 83.82% to around 37%, raising serious reliability concerns.

AI safety · large language models · model verification
3 min read