While learning high-level math is not easy, teaching math concepts can often be just as difficult. That is why many teachers turn to ChatGPT for help. According to a recent Forbes article, 51 percent of teachers surveyed said they use ChatGPT to help teach, with 10 percent using it daily. ChatGPT can help deliver technical information in more basic terms, but it doesn’t always provide the right solution, especially for high-level math.
An international team of researchers tested what the software could handle by giving the generative AI program challenging graduate-level math questions. While ChatGPT failed a significant number of them, its correct answers suggest that it could be useful for mathematics researchers and teachers as a kind of specialized search engine.
Illustrating the mathematical muscles of ChatGPT
The media has described ChatGPT’s mathematical intelligence as either impressive or incompetent. “Only the extremes are emphasized,” explains Frieder Simon, a University of Oxford PhD candidate and lead author of the study. For example, ChatGPT aced Psychology Today’s Verbal-Linguistic Intelligence IQ Test, scoring 147 points, but failed miserably on Accounting Today’s CPA exam. “There is a middle [road] for some use cases; ChatGPT performs very well [for some students and educators], but for others, not so much,” added Simon.
On tests at the high school and undergraduate level, ChatGPT performed well, ranking in the 89th percentile on the SAT math test. It even received a B on computer scientist Scott Aaronson’s quantum computing final exam.
But different tests may be required to reveal the limitations of ChatGPT’s capabilities. “One thing the media has focused on is ChatGPT’s ability to pass a variety of popular standardized tests,” said Leah Henrickson, a professor of digital media at the University of Leeds. “These are tests that students spend literally years preparing for. We are often led to believe that these tests evaluate our intelligence, but more often than not, they evaluate our ability to remember facts. ChatGPT can pass these tests because it can remember the facts it acquired in its training.”
Simon and his research team developed a unique set of high-level math questions to determine whether ChatGPT also has testing and problem-solving skills. “[Previous studies looked at] whether the output is correct or incorrect,” added Simon. “And we want to go beyond this and implement a much better method where we can determine how ChatGPT fails, when it fails, and how it fails.” To create a more thorough testing system, the researchers compiled prompts from several fields into a larger problem set they called GHOSTS.