In the previous broadcast of Voice of Russia's '.RU', our daily Runet review, we touched on the widely reported 'victory' of artificial intelligence over the weekend. A computer program pretending to be Eugene Goostman, a 13-year-old Ukrainian boy, has supposedly passed the Turing test: it convinced one third of the judges that it was not a computer, but a living being.
Well, there are a number of problems with this. First of all, the whole concept of the Turing test is quite ambiguous, as I mentioned in the previous broadcast. And here lies its inherent flaw: the machine doesn't need to do anything but convince the judge they're talking to a human – but there's much more to intelligence than talking, right?
But let's take the latest example. The test was held in London and organised by the University of Reading. The computer tricked the judges (well, one third of them) into believing they were communicating with a thirteen-year-old boy. From Ukraine. Think about it. How many intelligent conversations have you held with thirteen-year-olds online? I don't think most such conversations are particularly eloquent, deep or at all meaningful. And since the test was conducted in English, with the boy supposedly from another country, we have yet another simplification: any mistakes or misunderstandings can be chalked up to a lack of language skills. And then there's the human factor – while a deep analysis of the speech would probably reveal tell-tale signs of typical artificial behaviour, a human judge can simply dismiss anything odd about the conversation because of the aforementioned factors, young age and possibly weak second-language skills. So not only was the 'artificial intelligence' a chatbot created specifically to mimic human conversation, it also didn't have to do a very good job of it, thanks to these limitations by design.
But wait, there's more! Back in 2011 another chatbot scored higher marks in a similar test. Perhaps you're familiar with Cleverbot – located at cleverbot.com, it's arguably one of the smartest chatbots out there, capable of holding seemingly meaningful conversations and learning from each one. Well, a special version of Cleverbot, running on extra-powerful machines, took a formal Turing test at the Techniche 2011 festival in Guwahati, India. Thirty volunteers each had a 4-minute conversation with an unknown counterpart – either a human or a computer, in this case Cleverbot, with a fifty-fifty chance. Both the participants, acting as the judges of the Turing test, and the audience of over a thousand then rated the responses, trying to determine whether the unseen counterpart was human. Guess what: Cleverbot was voted 59.3 per cent human. Interestingly, the human chatters were correctly identified only 63.3 per cent of the time. But, again, this was only a chatbot, created specifically to mimic human conversation – it cannot do anything else and cannot act of its own volition.
Everything that needs to be evaluated generally has a certain test designed to provide a framework for that evaluation. There's no getting around that. Humans adapt, and in order to perform better at these evaluations, they generally take the path of least resistance – you can see it everywhere. There are engine-displacement brackets for cars in racing, customs and taxation, and engineers build the largest engines that fit just under those ceilings. Boxers do the same thing: they bulk up to the maximum weight for their category, and they even dehydrate themselves before the weigh-in. Students prepare for exams not by studying the subject thoroughly, but by studying likely exam questions, leaving gaps in their knowledge. As you can see, the same idea applies here – a chatbot is built to pass the test, not to be intelligent; it is not true AI, and never will be.