New March 2026 Paper Exposes Fraudulent Third‑Party APIs for Large Language Models
A recent arXiv study audited 17 popular shadow APIs used in 187 papers, finding up to a 47.21% performance gap versus official models—e.g., Gemini‑2.5‑flash’s accuracy drops from 83.82% to about 37% on MedQA—highlighting serious reliability and safety risks of unofficial LLM services.
