Researchers at Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) are advocating for a redesign of the tests used to evaluate artificial intelligence (AI) capabilities. They note that on several established benchmarks, AI systems now perform at or above the level of an average human. HAI’s annual AI Index report tracks successive generations of AI models and their capabilities.
The latest roughly 500-page edition, published in April 2023, highlights significant advances in AI capabilities, particularly in image classification, language understanding, and commonsense reasoning about visual scenes. According to the report, the best AI models now match or exceed human baselines on these tasks.
One notable example is OpenAI’s best model, which solved 84.3 percent of the problems on a competition-level math benchmark, a dramatic improvement over the top score of 6.9 percent recorded in 2021. Despite these advances, significant limitations remain, such as the tendency of large language models to confidently produce false statements, known as “hallucinations.”
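The percentage scores cited above come from pass/fail grading against reference answers: each problem counts as correct or incorrect, and the score is the fraction answered correctly. A minimal sketch of that scoring (using made-up answers, not data from any real benchmark) might look like:

```python
# Sketch of how a benchmark accuracy figure like "84.3 percent" is
# typically computed: each problem is graded pass/fail against a
# reference answer, and accuracy is correct answers over total.
# The answers below are hypothetical, for illustration only.

def benchmark_accuracy(model_answers, reference_answers):
    """Return the fraction of problems the model answered correctly."""
    correct = sum(
        1 for got, want in zip(model_answers, reference_answers) if got == want
    )
    return correct / len(reference_answers)

# Hypothetical run: the model gets 3 of 4 problems right.
model = ["4", "9", "12", "7"]
reference = ["4", "9", "12", "8"]
print(f"{benchmark_accuracy(model, reference):.1%}")  # → 75.0%
```

Real evaluations add complications this sketch omits, such as normalizing equivalent answer formats before comparison, but the headline number is still a ratio of this form.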
To address these limitations and to compare AI capabilities with human skills more rigorously, the Stanford researchers are working to design new benchmarks. The arrival of new models such as GPT-5 is expected to shape the development of these tests and to shed further light on both the progress and the open challenges in the field.
Overall, HAI’s findings underscore the need to keep improving evaluation methods so that AI capabilities can be assessed accurately, and so that the areas where human skills still surpass artificial intelligence can be clearly identified.