Improved AI Models Do Worse at Identifying Prime Numbers

(p. A2) . . . new research released this week reveals a fundamental challenge of developing artificial intelligence: ChatGPT has become worse at performing certain basic math operations.

The researchers at Stanford University and the University of California, Berkeley said the deterioration is an example of a phenomenon known to AI developers as drift, where attempts to improve one part of the enormously complex AI models make other parts of the models perform worse.

“Changing it in one direction can worsen it in other directions,” said James Zou, a Stanford professor who is affiliated with the school’s AI lab and is one of the authors of the new research. “It makes it very challenging to consistently improve.”

. . .

The goal of the team of researchers, consisting of Lingjiao Chen, a computer-science Ph.D. student at Stanford, along with Zou and Berkeley’s Matei Zaharia, is to systematically and repeatedly see how the models perform over time at a range of tasks.

Thus far, they have tested two versions of ChatGPT: version 3.5, available free online to anyone, and version 4.0, available via a premium subscription.

The results aren’t entirely promising. They gave the chatbot a basic task: identify whether a particular number is a prime number. This is the sort of math problem that is complicated for people but simple for computers.

Is 17,077 prime? Is 17,947 prime? Unless you are a savant you can’t work this out in your head, but it is easy for computers to evaluate. A computer can just brute force the problem—try dividing by two, three, five, etc., and see if anything works.
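The brute-force check described in the passage above can be sketched in a few lines of Python. This is only an illustration of trial division, not the procedure the researchers or ChatGPT used; the function name is_prime is a hypothetical choice made for this example.

```python
import math


def is_prime(n: int) -> bool:
    """Trial division: try dividing by 2, 3, 5, ... up to the square root of n."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Any composite n has a divisor no larger than its square root,
    # so only odd candidates up to isqrt(n) need to be tried.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True


if __name__ == "__main__":
    # The two numbers asked about in the passage.
    for candidate in (17077, 17947):
        print(candidate, is_prime(candidate))
```

For the two numbers in the passage, trial division finds no divisor of 17,077, so it is prime, while 17,947 factors as 131 × 137. Because the loop stops at the square root, a five-digit number needs fewer than 70 trial divisions.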

To track performance, the researchers fed ChatGPT 1,000 different numbers. In March, the premium GPT-4 correctly identified whether 84% of the numbers were prime or not. (Pretty mediocre performance for a computer, frankly.) By June its success rate had dropped to 51%.
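A success rate like the ones quoted above is just the share of numbers for which the chatbot's yes/no answer matches the true answer. The sketch below shows one way such a score could be computed. The helper names and the sample answers are hypothetical, made up for illustration; they are not the researchers' data or code.

```python
import math


def is_prime(n: int) -> bool:
    """Ground truth by trial division (a simple stand-in for any reliable test)."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))


def success_rate(numbers: list[int], model_answers: list[bool]) -> float:
    """Fraction of numbers where the model's 'is it prime?' answer is correct."""
    if len(numbers) != len(model_answers):
        raise ValueError("each number needs exactly one answer")
    correct = sum(
        1 for n, answer in zip(numbers, model_answers) if answer == is_prime(n)
    )
    return correct / len(numbers)


if __name__ == "__main__":
    # Hypothetical example: a model that answers "prime" for every number.
    numbers = [2, 3, 4, 5]
    answers = [True, True, True, True]
    print(f"{success_rate(numbers, answers):.0%}")
```

Note that a model giving the same answer for every number scores exactly the share of numbers for which that answer happens to be right, so a success rate is only meaningful alongside the mix of primes and non-primes in the test set.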

. . .

The phenomenon of unpredictable drift is known to researchers who study machine learning and AI, Zou said. “We had the suspicion it could happen here, but we were very surprised at how fast the drift is happening.”

For the full commentary, see:

Josh Zumbrun. “THE NUMBERS; AI Surprise: It’s Unlearning Basic Math.” The Wall Street Journal (Saturday, Aug. 5, 2023): A2.

(Note: ellipses added.)

(Note: the online version of the commentary has the date August 4, 2023, and has the title “THE NUMBERS; Why ChatGPT Is Getting Dumber at Basic Math.”)
