Recent headlines have claimed that an AI chatbot has officially passed the Turing test, marking what some see as a major milestone in artificial intelligence. These reports are based on a preprint study conducted by researchers Cameron Jones and Benjamin Bergen at the University of California, San Diego. Their study found that OpenAI’s GPT-4.5 was judged to be human more than 70% of the time during a controlled experiment—suggesting it has reached a new level of conversational realism.
The experiment, which has not yet undergone peer review, tested four conversational systems: the classic rule-based chatbot ELIZA, included as a baseline, and three large language models (LLMs), GPT-4o, LLaMa-3.1-405B, and GPT-4.5. A total of 284 participants took part, alternating between the roles of interrogator and witness. Each interrogator held simultaneous text-based conversations with two witnesses, one human and one AI, via a split-screen interface for five minutes. At the end of each session, the interrogator judged which of the two was human.
