Large Language Models GPT-4.5 and LLaMa-3.1-405B Pass Standard Turing Test in UCSD Study
A UC San Diego study found that GPT-4.5 was judged to be human 73% of the time and LLaMa-3.1-405B 56% of the time, indicating that both large language models can pass a standard three-party Turing test. The article details the study's methodology, results, and analysis of judge behavior.