There doesn't seem to be a super-rigorous definition of the Turing Test, but I don't think it's reasonable to require it to fool an expert whose life depends on the correct choice. It already seems to be decently able to fool a person of average intelligence who has a basic knowledge of LLMs.
I agree that we don't really have AGI yet, but I'd hope we can come up with a better definition of what it is than "we'll know it when we see it". I think it is a legitimate point that we've moved the goalposts some.
The real answer is that once LLMs passed a "casual" application of the Turing test, it just made us realize that the "casual Turing test" is not particularly interesting. It turns out to be too easy to ape human behavior over short time frames for it to be a good indicator of human-like intelligence.
Now, you could argue that this right here is the aforementioned moving of the goalposts. After all, we're deciding that the casual Turing test wasn't interesting precisely after having seen that LLMs could pass it.
However, in my view, the Turing test _always_ implied the "rigorous" Turing test, and it's only now that we're actually flirting with passing it that it had to be clarified what counts as a true Turing test. As I see it, the Turing test can still be salvaged as a criterion for general intelligence, but only if you allow it to be a no-holds-barred, life-depends-on-it test to exhaustion. This would involve allowing arbitrarily long questioning periods, for instance. I think this is more in the spirit of the original formulation, because the whole idea is to pit a machine against all of human intelligence, proving it has a similar arsenal of adaptability at its disposal. If it only has to passingly fool a human for brief periods, well... I'm afraid that just doesn't prove much. All sorts of stuff briefly fools humans. What requires intelligence is to consistently anticipate and adapt to all lines of questioning in a sustained manner until the human runs out of ideas for how to differentiate.
ELIZA fooled plenty of people (both originally and in the study you just linked), but I still wouldn't say ELIZA passed or passes the Turing test in general. It just shows that occasionally or even frequently fooling people is not a sufficient proxy for general intelligence. Of course there isn't a standardized definition, but one thing I would personally include in a "strict" Turing test is that the human interrogee ought to be incentivized to cooperate and to make their humanity as clear as possible. And the interrogator should similarly be incentivized to find the right answer.
Turing gave a pretty rigorous definition of the Turing Test IMO. Well, as rigorous as something that is inherently "anecdotal" can be, which is part of the philosophical point of the Turing Test.
First off, the Turing test has a rigorous definition. Secondly, it has been debunked for almost half a century at this point by Searle’s Chinese room thought experiment. Thirdly, intelligence itself is a scientifically fraught term with ever-changing meaning as we discover more and more “intelligent” behavior in nature (by animals, plants, and more). And to make matters worse, general intelligence is even more fraught, as the term was used almost exclusively for racist pseudo-science, as a way to operationally define a metric which would prove white supremacy.
Artificial General Intelligence will exist when the grifters who profit from it claim it exists. The meaning of it will shift to benefit certain entrepreneurs. It will never actually be a useful term in science nor philosophy.
> Secondly, it has been debunked for almost half a century at this point by Searle’s Chinese room thought experiment.
Searle's thought experiment is stupid and debunked nothing. What neuron, cell, or atom of your brain understands English? That's right. You can't answer that any more than you can answer the subject of Searle's proposition, ergo the brain is a Chinese room. If you conclude that you understand English, then the Chinese room understands Chinese.
> Searle’s response to the Systems Reply is simple: in principle, he could internalize the entire system, memorizing all the instructions and the database, and doing all the calculations in his head. He could then leave the room and wander outdoors, perhaps even conversing in Chinese. But he still would have no way to attach “any meaning to the formal symbols”. The man would now be the entire system, yet he still would not understand Chinese. For example, he would not know the meaning of the Chinese word for hamburger. He still cannot get semantics from syntax.
> The man would now be the entire system, yet he still would not understand Chinese.
Really, here the only issue is Searle's inability to grasp the concept that the process is what does the understanding, not the person (or machine, or neurons) that performs it.
Says who? I had already found this study, published almost a year ago, saying that they do: https://arxiv.org/abs/2503.23674