DeepMind Mastered “Go” Only After It Was Told the Score

(p. C3) To function well outside controlled settings, robots must be able to approximate such human capacities as social intelligence and hand-eye coordination. But how to distill them into code?
“It turns out those things are really hard,” said Cynthia Breazeal, a roboticist at the Massachusetts Institute of Technology’s Media Lab.
. . .
Even today’s state-of-the-art AI has serious practical limits. In a recent paper, for example, researchers at MIT described how their AI software misidentified a 3-D printed turtle as a rifle after the team subtly altered the reptile’s coloring and lighting. The experiment showed how easily AI can be fooled and raised safety concerns over its use in real-world applications such as self-driving cars and facial-recognition software.
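Attacks of this kind work by nudging the input in whatever direction most increases the classifier's error. As a rough, hypothetical illustration of that core idea (a toy linear classifier and a fast-gradient-sign perturbation, not the MIT team's 3-D rendering pipeline; every name and number below is invented), consider:

```python
# Toy demonstration of an adversarial perturbation on a linear classifier.
# Invented example; not the code or the model from the paper cited below.
import numpy as np

rng = np.random.default_rng(0)
d = 100_000                        # number of input features ("pixels")

# Pretend these are trained weights; the bias is chosen so the clean
# input is confidently classified as class 0 ("turtle," say).
w = rng.normal(size=d) / np.sqrt(d)
x_clean = rng.normal(size=d)       # the clean input
b = -(w @ x_clean) - 4.0

def predict(x):
    """Probability the classifier assigns to class 1 ("rifle," say)."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

print("clean input:    ", predict(x_clean))   # ~0.02, classified "turtle"

# Fast-gradient-sign step: shift every feature a small amount in the
# direction that raises the class-1 score. For this linear model that
# direction is simply sign(w). The change per feature is tiny compared
# with the input, but accumulated across 100,000 features it flips the
# prediction.
epsilon = 0.05
x_adv = x_clean + epsilon * np.sign(w)

print("perturbed input:", predict(x_adv))     # ~0.99, classified "rifle"
```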
Current systems also aren’t great at applying what they have learned to new situations. A recent paper by the AI startup Vicarious showed that a proficient Atari-playing AI lost its prowess when researchers moved around familiar features of the game.
. . .
Google’s DeepMind subsidiary used a technique known as reinforcement learning to build software that has repeatedly beaten the best human players at Go. While learning the classic Chinese game, the machine got positive feedback for making moves that increased the area it walled off from its opponent. Its quest for a higher score spurred the AI to develop territory-taking tactics until it mastered the game.
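To make the score-chasing idea concrete, here is a minimal, hypothetical sketch of reinforcement learning (tabular Q-learning) on an invented "claim territory" toy game, where the reward is simply the change in the score. It illustrates the feedback loop, not DeepMind's actual system, which pairs deep neural networks with tree search:

```python
# Minimal tabular Q-learning on a made-up "claim territory" game.
# The reward is the change in score, as in the Go description above.
import random

N_CELLS = 5          # the board: a row of cells the agent can claim
ACTIONS = [0, 1]     # 0 = pass, 1 = claim the next open cell
EPISODES = 2000
alpha, gamma, eps = 0.5, 0.9, 0.1   # learning rate, discount, exploration rate

# State = number of cells claimed so far; Q[state][action] = expected score.
Q = {s: [0.0, 0.0] for s in range(N_CELLS + 1)}

def step(state, action):
    """Return (next_state, reward): +1 point for claiming a cell, 0 for passing."""
    if action == 1 and state < N_CELLS:
        return state + 1, 1.0
    return state, 0.0

for _ in range(EPISODES):
    state = 0
    for _turn in range(N_CELLS):                 # fixed-length episode
        if random.random() < eps:                # occasionally explore
            action = random.choice(ACTIONS)
        else:                                    # otherwise act greedily
            action = max(ACTIONS, key=lambda a: Q[state][a])
        nxt, reward = step(state, action)
        # Q-learning update: move the estimate toward reward + best future value.
        Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt

# The learned policy claims territory in every state, because that is
# the only behavior the reward function ever reinforced.
print({s: ("claim" if Q[s][1] > Q[s][0] else "pass") for s in range(N_CELLS)})
```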
The problem is that “the real world doesn’t have a score,” said Brown University roboticist Stefanie Tellex. Engineers need to code so-called “reward functions” into AI programs: mathematical ways of telling a machine it has acted correctly. Beyond the finite scenario of a game, amid the complexity of real-life interactions, it’s difficult to determine what results to reinforce. How, and how often, should engineers reward machines to guide them to perform a certain task? “The reward signal is so important to making these algorithms work,” Dr. Tellex added.
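As a rough illustration of what “coding a reward function” means, and of why it is easy in a game but ambiguous in the real world, compare the following sketches (the fetch-a-cup task, the function names, and the weights are all invented for this example):

```python
# In a board game the environment supplies the score, so the reward is unambiguous.
def reward_game(score_before: int, score_after: int) -> float:
    return float(score_after - score_before)

# Hypothetical robot task: fetch a cup. Option A rewards only the final
# outcome. Clean, but the robot gets no feedback until it happens to succeed.
def reward_sparse(cup_delivered: bool) -> float:
    return 1.0 if cup_delivered else 0.0

# Option B: a dense, hand-shaped reward. Easier to learn from, but every
# constant is a judgment call, and a poor choice can reinforce the wrong
# behavior (e.g., hovering near the cup without ever picking it up).
def reward_shaped(dist_to_cup_m: float, cup_delivered: bool, spilled: bool) -> float:
    reward = -0.1 * dist_to_cup_m     # encourage getting closer
    if cup_delivered:
        reward += 10.0                # large bonus for success
    if spilled:
        reward -= 5.0                 # penalty for making a mess
    return reward
```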
. . .
If a robot needs thousands of examples to learn, “it’s not clear that’s particularly useful,” said Ingmar Posner, the deputy director of the Oxford Robotics Institute in the U.K. “You want that machine to pick up pretty quickly what it’s meant to do.”

For the full commentary, see:
Daniela Hernandez. “Can Robots Learn to Improvise?” The Wall Street Journal (Sat., Dec. 16, 2017): C3.
(Note: ellipses added.)
(Note: the online version of the commentary has the date Dec. 15, 2017.)

The paper by the researchers at Vicarious is:
Kansky, Ken, Tom Silver, David A. Mely, Mohamed Eldawy, Miguel Lázaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, and Dileep George. “Schema Networks: Zero-Shot Transfer with a Generative Causal Model of Intuitive Physics.” Manuscript, 2017.

The paper mentioned above by the researchers at MIT is:
Athalye, Anish, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. “Synthesizing Robust Adversarial Examples.” Working paper, Oct. 30, 2017.
