The Minwa supercomputer was tasked with scanning and sorting a large database of images, and did so with 95.42% accuracy, Baidu reports.
This beat both Google's system, which scored 95.2%, and Microsoft's, at 95.06%.
All three systems have, over recent months, outperformed humans, who typically achieve about 95% accuracy at this type of task.
"I am very excited about all the progress in computer vision that the whole community has made," Baidu’s chief scientist Andrew Ng told the Wall Street Journal about the rapid advances in this technology. "Computers can understand images so much better and do so many things that they couldn’t do just a year ago."
The ImageNet database supplied more than 1 million images for Minwa to sift through and sort into more than 1,000 different pre-defined categories: for example, sorting images of animals or food by species or type.
This type of artificial intelligence, known as "deep learning," is hot in Silicon Valley circles right now. Google, for one, has acquired several AI companies, including DeepMind, a UK-based company that has, among other things, created systems that learn how to play different kinds of computer games.
The language recognition system that Google developed using this kind of deep learning led to big improvements in Android's voice recognition capabilities.
Baidu, too, is working on a larger supercomputer that will process 14,000 hours of English and Chinese-language audio in order to improve voice recognition systems. If the search giant fulfills its goal of building a supercomputer that can perform 7 quadrillion calculations per second, that would earn it a spot on the list of the world's top 10 most powerful supercomputers.
As tech giants like Google, Microsoft and Facebook build ever larger supercomputers in search of ever higher AI achievements, the image recognition test that ImageNet provides is starting to be seen as "passe as a benchmark," noted AI researcher Yann LeCun told the Journal.
"People are focusing on much larger data sets and more challenging tasks that involve object recognition, such as object detection and localization."