Latest “So-Called Reasoning Systems” Hallucinate MORE Than Earlier A.I. Systems

Since more sophisticated “reasoning” A.I. systems are increasingly inaccurate on the facts, it is unlikely that such systems will threaten any job where job performance depends on getting the facts right. Wouldn’t that include most jobs? The article quoted below suggests it would most clearly include jobs working with “court documents, medical information or sensitive business data.”

(p. B1) The newest and most powerful technologies — so-called reasoning systems from companies like OpenAI, Google and the Chinese start-up DeepSeek — are generating more errors, not fewer. As their math skills have notably improved, their handle on facts has gotten shakier. It is not entirely clear why.

Today’s A.I. bots are based on complex mathematical systems that learn their skills by analyzing enormous amounts of digital data. They do not — and cannot — decide what (p. B6) is true and what is false. Sometimes, they just make stuff up, a phenomenon some A.I. researchers call hallucinations. On one test, the hallucination rates of newer A.I. systems were as high as 79 percent.

. . .

The A.I. bots tied to search engines like Google and Bing sometimes generate search results that are laughably wrong. If you ask them for a good marathon on the West Coast, they might suggest a race in Philadelphia. If they tell you the number of households in Illinois, they might cite a source that does not include that information.

Those hallucinations may not be a big problem for many people, but it is a serious issue for anyone using the technology with court documents, medical information or sensitive business data.

“You spend a lot of time trying to figure out which responses are factual and which aren’t,” said Pratik Verma, co-founder and chief executive of Okahu, a company that helps businesses navigate the hallucination problem. “Not dealing with these errors properly basically eliminates the value of A.I. systems, which are supposed to automate tasks for you.”

. . .

For more than two years, companies like OpenAI and Google steadily improved their A.I. systems and reduced the frequency of these errors. But with the use of new reasoning systems, errors are rising. The latest OpenAI systems hallucinate at a higher rate than the company’s previous system, according to the company’s own tests.

The company found that o3 — its most powerful system — hallucinated 33 percent of the time when running its PersonQA benchmark test, which involves answering questions about public figures. That is more than twice the hallucination rate of OpenAI’s previous reasoning system, called o1. The new o4-mini hallucinated at an even higher rate: 48 percent.

When running another test called SimpleQA, which asks more general questions, the hallucination rates for o3 and o4-mini were 51 percent and 79 percent. The previous system, o1, hallucinated 44 percent of the time.

. . .

For years, companies like OpenAI relied on a simple concept: The more internet data they fed into their A.I. systems, the better those systems would perform. But they used up just about all the English text on the internet, which meant they needed a new way of improving their chatbots.

So these companies are leaning more heavily on a technique that scientists call reinforcement learning. With this process, a system can learn behavior through trial and error. It is working well in certain areas, like math and computer programming. But it is falling short in other areas.

For the full story see:

Cade Metz and Karen Weise. “A.I. Hallucinations Are Getting Worse.” The New York Times (Fri., May 9, 2025): B1 & B6.

(Note: ellipses added.)

(Note: the online version of the story was updated May 6, 2025, and has the title “A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse.”)

A.I. Only “Knows” What Has Been Published or Posted

A.I. “learns” by scouring language that has been published or posted. If outdated or never-true “facts” are posted on the web, A.I. may regurgitate them. It takes human eyes to check whether there really is a picnic table in a park.

(p. B1) Last week, I asked Google to help me plan my daughter’s birthday party by finding a park in Oakland, Calif., with picnic tables. The site generated a list of parks nearby, so I went to scout two of them out — only to find there were, in fact, no tables.

“I was just there,” I typed to Google. “I didn’t see wooden tables.”

Google acknowledged the mistake and produced another list, which again included one of the parks with no tables.

I repeated this experiment by asking Google to find an affordable carwash nearby. Google listed a service for $25, but when I arrived, a carwash cost $65.

I also asked Google to find a grocery store where I could buy an exotic pepper paste. Its list included a nearby Whole Foods, which didn’t carry the item.

For the full commentary see:

Brian X. Chen. “Underneath a New Way to Search, A Web of Wins and Imperfections.” The New York Times (Tues., June 3, 2025): B1 & B4.

(Note: the online version of the commentary has the date May 29, 2025, and has the title “Google Introduced a New Way to Use Search. Proceed With Caution.”)

A.I. Hastens Search for Antibiotic Peptides in Extinct Species

In an earlier entry I commented on the use of A.I. to seek antibodies by George Church’s startup Lila. Now it appears that César de la Fuente is employing a similar approach. In both cases A.I. is being used to more efficiently do repetitive well-structured tasks. This is not the highest creative level of human intelligence, but it can free time for humans to exercise the highest level of human intelligence.

(p. A3) Buried in the DNA of the long extinct woolly mammoth is a compound that scientists hope will one day yield a lifesaving antibiotic.

In experiments, mammuthusin, as the compound is called, has eradicated superbugs—bacteria that are resistant to today’s antibiotics and cause infections that are hard to treat—says César de la Fuente, the bioengineer who helped discover the molecule.

. . .

To help combat superbugs, doctors say we need new antibiotics with novel chemical structures or mechanisms of action. But only a handful of such drugs has entered the market over the past several decades.

De la Fuente is banking on artificial intelligence to help end this dry spell. He and his collaborators have built deep-learning algorithms to comb through enormous genetic databases to find peptides, or protein fragments, that have antibacterial properties. They have used this method to analyze animal venoms, the human microbiome and archaea, an underexplored group of microorganisms. They have also mined the genetic codes from fossils of long-extinct animals and humans, including Neanderthals and Denisovans. “This deep-learning model has opened a window into the past,” de la Fuente says.

. . .

When the algorithms identify a new peptide with antibiotic potential, de la Fuente and his team use robots to manufacture the compound in their lab and then test it in mice infected with bacteria. So far, a few hundred peptides made in de la Fuente’s lab have safely and effectively cured sick mice.

One of them was mammuthusin, identified in the genetic code of Mammuthus primigenius, a species of mammoth that last roamed the Earth about 4,000 years ago. The researchers discovered the peptide after mining a National Center for Biotechnology Information database of DNA sequencing data obtained from the fossils of extinct animals. In experiments, mammuthusin was as potent as polymyxin B, an antibiotic often used as a last resort for serious infections, according to a paper published in the journal Nature in June [2024]. The mammoth peptide effectively eradicated a type of bacterium that the World Health Organization has designated a critical pathogen because of its resistance to many common antibiotics.

For the full story, see:

Dominique Mosbergen. “Search for New Antibiotics Turns Back Time.” The Wall Street Journal (Weds., May 28, 2025): A3.

(Note: ellipses, and bracketed year, added.)

(Note: the online version of the story has the date May 24, 2025, and has the title “A Search for New Antibiotics in Ancient DNA.” In the original of both the online and print versions, Mammuthus primigenius appeared in italics.)

The academic article published in Nature Biomedical Engineering in June 2024, and mentioned above, is:

Wan, Fangping, Marcelo D. T. Torres, Jacqueline Peng, and Cesar de la Fuente-Nunez. “Deep-Learning-Enabled Antibiotic Discovery through Molecular De-Extinction.” Nature Biomedical Engineering 8, no. 7 (July 2024): 854-71.

My Email Response to George Church on A.I. and Longevity

On May 17 I ran an entry commenting on George Church’s over-optimism about the use of A.I. to replicate the scientific method, and expressed wistful disappointment that Church’s longevity project had not advanced as quickly as 60 Minutes implied it would in 2019.

On May 20, Church sent me a cordial email disputing some of what I wrote in my entry. I responded to him on May 22, and asked him if he would mind if I ran his email and my response on my blog. He never responded to that request, so I will not reproduce his email here. But I see no harm in my including below the links he sent me. And then I will follow with my email response to him.

Here are the links that Church thought I should ponder:

2024 pmc.ncbi.nlm.nih.gov/articles/PMC10909732 (see Fig 1b)
2022 rejuvenatebio.com/animal-health-pipeline
2022 rejuvenatebio.com/pipeline
2023 biorxiv.org/content/10.1101/2023.11.13.566787v1.full

Here is my email response to Church:

Dear Prof. Church,

Thank you for taking the time to read and respond to my blog post. I appreciate the links you sent. The first link gives us the good news of progress toward increasing the lifespan of mice and in reducing their frailty, which could be interpreted as one part of reversing their aging. The fourth link also gives good news of the proof-of-concept of a new factor at the cell level that may be able to rejuvenate cells without the cancer of the Yamanaka factors.

But on 60 Minutes in 2019 you said age reversal was already “available to mice.” And you said the “veterinary product might be a couple years away and then that takes another 10 years to get through the human clinical trials.” That is not exactly a promise, but it does sound like a hopeful prediction. And I will admit that the timing matters to me. If your 60 Minutes prediction was right there’s a good chance I might live to see it; if it takes twice that long, I almost certainly will not.

In re-reading my post, I see a couple of revisions I would make. I would add that I wish you well in what you are trying to do, and strongly and sincerely hope that you succeed (whether through A.I. or by other means). And I would add that I believe Elon Musk said that being overly optimistic is one way that great innovators push themselves toward great goals.

I appreciate your “fact checking” offer. I have a comment apropos that. You say that “The Lohr article doesn’t say ‘feeding’ or ‘literature’.” Here is the relevant exact quote from the Lohr article:

Lila has taken a science-focused approach to training its generative A.I., feeding it research papers, documented experiments and data from its fast-growing life science and materials science lab. That, the Lila team believes, will give the technology both depth in science and wide-ranging abilities, mirroring the way chatbots can write poetry and computer code.

So the Lohr article does say “feeding.” It doesn’t say “literature,” but does say “research papers” which I take to be the same thing. I appreciate that Lila also is collecting new data. But is it some generative intelligence in Lila that is identifying the new data to seek or is it George Church and his team?

I agree that A.I. can help crank through possibilities that have already been defined. I am dubious that A.I. can come up with the possibilities as well as George Church and his team can. It may seem harmless that A.I. is being over-hyped. But as an economist it is my job to notice that funding is scarce, so funding spent on A.I. is funding not spent on other inputs to science.

I fear that I may come across as a privileged spectator complaining about the bloodied combatant in the arena. But a big part of my research is aimed at reducing the regulations that burden medical entrepreneurs. For instance, I am working on a paper supporting Milton Friedman’s suggestion that the F.D.A. should just regulate for safety and stop regulating for efficacy. Without Phase 3, more can be tried, more quickly and more cheaply.

If you are willing, I would like to paste your response (or an edited version if you prefer) at the end of my original post. Let me know if that is OK.

Thanks!

Art

One Third of Near-Death Multiple Myeloma Patients Are Cured by a New CAR-T Immunotherapy

Many consider immunotherapy to be the most promising current approach to curing cancer. One way to implement immunotherapy is to develop CAR-T cells. But there apparently are many ways to develop a CAR-T cell and which, if any, will work is a matter of trial-and-error.

It seems overly cautious for regulators to require that the most innovative and promising therapies first be tried on the patients nearest to death, and so least likely to respond. Why not allow patients at earlier stages to volunteer to try the new therapies earlier? They would be taking a bigger risk, but also would have the possibility of a bigger benefit. They would avoid the suffering from current treatments that are known to have major side effects and to extend life only for short periods of time; and they would gain a shot at a real long-term cure.

(p. A18) A group of 97 patients had longstanding multiple myeloma, a common blood cancer that doctors consider incurable, and faced a certain, and extremely painful, death within about a year.

They had gone through a series of treatments, each of which controlled their disease for a while. But then it came back, as it always does. They reached the stage where they had no more options and were facing hospice.

They all got immunotherapy, in a study that was a last-ditch effort.

A third responded so well that they got what seems to be an astonishing reprieve. The immunotherapy developed by Legend Biotech, a company founded in China, seems to have made their cancer disappear. And after five years, it still has not returned in those patients — a result never before seen in this disease.

These results, in patients whose situation had seemed hopeless, has led some battle-worn American oncologists to dare to say the words “potential cure.”

. . .

The new study, reported Tuesday [June 3, 2025] at the annual conference of the American Society of Clinical Oncology and published in The Journal of Clinical Oncology, was funded by Johnson & Johnson, which has an exclusive licensing agreement with Legend Biotech.

. . .

The Legend immunotherapy is a type known as CAR-T. It is delivered as an infusion of the patient’s own white blood cells that have been removed and engineered to attack the cancer. The treatment has revolutionized prospects for patients with other types of blood cancer, like leukemia.

Making CAR-T cells, though, is an art, with so many possible variables that it can be hard to hit on one that works.

. . .

The . . . study took on a . . . challenge — helping patients at the end of the line after years of treatments. Their immune systems were worn down. They were, as oncologists said, “heavily pretreated.” So even though CAR-T is designed to spur their immune systems to fight their cancer, it was not clear their immune systems were up to it.

Oncologists say that even though most patients did not clear their cancer, having a third who did was remarkable.

To see what the expected life span would be for these patients without the immunotherapy, Johnson & Johnson looked at data from patients in a registry who were like the ones in its study — they had failed every treatment. They lived about a year.

. . .

. . ., the hope is that perhaps by giving it earlier in the course of the disease, it could cure patients early on.

Johnson & Johnson is now testing that idea.

Dr. Kenneth Anderson, a myeloma expert at Dana-Farber Cancer Institute who was not involved with the study, said that if the treatment is used as a first-line treatment, “cure is now our realistic expectation.”

For the full story, see:

Gina Kolata. “From No Hope to Potential Cure for Deadly Blood Cancer, Study Shows.” The New York Times (Thurs., June 5, 2025): A18.

(Note: ellipses, and bracketed date, added.)

(Note: the online version of the story was updated June 5, 2025, and has the title “From No Hope to a Potential Cure for a Deadly Blood Cancer.”)

The academic article on the new cure is:

Jagannath, Sundar, Thomas G. Martin, Yi Lin, Adam D. Cohen, Noopur Raje, Myo Htut, Abhinav Deol, Mounzer Agha, Jesus G. Berdeja, Alexander M. Lesokhin, Jessica J. Liegel, Adriana Rossi, Alex Lieberman-Cribbin, Saad Z. Usmani, Binod Dhakal, Samir Parekh, Hui Li, Feng Wang, Rocio Montes de Oca, Vicki Plaks, Huabin Sun, Arnob Banerjee, Jordan M. Schecter, Nikoletta Lendvai, Deepu Madduri, Tamar Lengil, Jieqing Zhu, Mythili Koneru, Muhammad Akram, Nitin Patel, Octavio Costa Filho, Andrzej J. Jakubowiak, and Peter M. Voorhees. “Long-Term (≥5-Year) Remission and Survival after Treatment with Ciltacabtagene Autoleucel in Cartitude-1 Patients with Relapsed/Refractory Multiple Myeloma.” Journal of Clinical Oncology https://doi.org/10.1200/JCO-25-0076.

Electricity May Be a Pellet in the “Magic Buckshot” Against Cancer

In a recent entry I claimed that the cure for many diseases may not be Paul Ehrlich’s one “magic bullet” but may instead be “magic buckshot.” A recent article in The Wall Street Journal suggests that one pellet in the magic buckshot against cancer is electricity. As proof of concept, the article claims that after surgery, radiation, and chemotherapy for a glioblastoma brain cancer, adding electrodes to the skull that deliver low-intensity electricity to the brain will add a median of 4.9 months to the patient’s lifespan.

The Wall Street Journal article mentioned above is:

Brianna Abbott. “Next Hope in Treating Cancer: Electricity.” The Wall Street Journal (Tues., May 20, 2025): A10.

(Note: the online version of the article has the date May 16, 2025, and has the title “The Next Frontier to Treat Cancer: Electricity.”)

The Wall Street Journal article links to the following article in JAMA:

Stupp, Roger, Sophie Taillibert, Andrew Kanner, William Read, David M. Steinberg, Benoit Lhermitte, Steven Toms, Ahmed Idbaih, Manmeet S. Ahluwalia, Karen Fink, Francesco Di Meco, Frank Lieberman, Jay-Jiguang Zhu, Giuseppe Stragliotto, David D. Tran, Steven Brem, Andreas F. Hottinger, Eilon D. Kirson, Gitit Lavy-Shahaf, Uri Weinberg, Chae-Yong Kim, Sun-Ha Paek, Garth Nicholas, Jordi Bruna, Hal Hirte, Michael Weller, Yoram Palti, Monika E. Hegi, and Zvi Ram. “Effect of Tumor-Treating Fields Plus Maintenance Temozolomide Vs Maintenance Temozolomide Alone on Survival in Patients with Glioblastoma: A Randomized Clinical Trial.” JAMA 318, no. 23 (Dec. 19, 2017): 2306-16.

During Covid-19 “Bureaucratic Authorities Erred in Pretending . . . Certainty”

(p. A13) Adam Kucharski, a professor of epidemiology at the London School of Hygiene & Tropical Medicine, takes the reader on a fascinating tour of the history of what has counted as proof.

. . .

What should we do, . . ., when a mathematical proof of truth is unavailable, but we must nonetheless act?

This leads us to a discussion of probability and statistics, and of pioneers such as William Gosset, a brewer at Guinness who figured out how to quantify random errors in experiments, and Janet Lane-Claypon, an English scientist who first thought to investigate confounding factors while analyzing children’s health. Some innovations, though, have hardened into unhelpful dogma. The scientific notion of “statistical significance” relies, Mr. Kucharski explains, on a wholly arbitrary cutoff, which incentivizes researchers to massage their data. Such issues, he says, can be hard for scientists, let alone the laity, to understand.

Mr. Kucharski speaks from experience, since he was one of the experts first called upon by the British government for advice on the Covid-19 pandemic. He explains brilliantly the fragmentary and confusing nature of the data then available, and the provisional conclusions they led to. As a public face of this effort, Mr. Kucharski was bombarded daily with abusive and threatening messages from angry citizens who simply didn’t believe what they were being told.

The lesson Mr. Kucharski draws isn’t that he and his colleagues were right (though they largely were), but that bureaucratic authorities erred in pretending there was certainty when all that was possible at the time was messy and provisional. Notoriously, in March 2020 the World Health Organization tweeted “FACT: #COVID19 is NOT airborne.” (As it turns out, it was, and it is.) The author regrets, too, that politicians claimed to be “following the science,” because science can never tell you what you should do.

For the full review see:

Steven Poole. “Bookshelf; Finding Truth In Numbers.” The Wall Street Journal (Friday, June 6, 2025): A13.

(Note: ellipses added.)

(Note: the online version of the review has the date June 5, 2025, and has the title “Bookshelf; ‘Proof’: Finding Truth in Numbers.”)

The book under review is:

Kucharski, Adam. Proof: The Art and Science of Certainty. New York: Basic Books, 2025.

Dream of the Magic Buckshot

Paul Ehrlich in the early 1900s sought a “magic bullet” for each disease (including cancer). Given the alternatives at the time, when he found Salvarsan it could be considered a magic bullet against syphilis. Today Dr. Dale Bredesen has replaced “magic” with “silver” and “bullet” with “buckshot.” In his effort to cure Alzheimer’s, “Bredesen believed in firing a ‘silver buckshot’ (a reference to the sprayed pellets that come out of shotgun shells) by modifying 36 factors simultaneously” (Gellman, p. 18).

I have not investigated Dr. Bredesen’s “cure” for Alzheimer’s and express no opinion on it. I will express the opinion that I do not like the arrogantly dismissive tone of Lindsay Gellman’s The New York Times article, bowing to “experts,” but calling Bredesen “Mr.” instead of the “Dr.” he has earned.

And I do like the idea that sometimes what is effective against a disease (including cancer) is not a single drug or therapy, but several taken together. For example, multi-drug cocktails have been effectively used against HIV, childhood leukemia, and Hodgkin’s lymphoma.

Ehrlich’s big dream was to find a magic bullet for each disease. But maybe it is mostly more promising to dream of the magic buckshot.

[N.B., the “Paul Ehrlich” I refer to is not the contemporary environmental alarmist “Paul Ehrlich” who famously lost his bet with the heroic heretic Julian Simon.]

The NYT article quoted above:

Lindsay Gellman. “Lifestyle to Reverse Alzheimer’s Carries High Costs and, Many Say, False Hope.” The New York Times, First Section (Sun., May 25, 2025): 1 & 18.

(Note: the online version of the NYT article was updated May 31, 2025, and has the title “An Expensive Alzheimer’s Lifestyle Plan Offers False Hope, Experts Say.”)

How Did Ed Smylie and His Team Create the Kludge That Saved the Crew of Apollo 13?

Gary Klein in Seeing What Others Don’t analyzed cases of innovation, and sought their sources. One source he came up with was necessity. His compelling example was the firefighter Wag Dodge who, with maybe 60 seconds until he would be engulfed in flame, lit a match to the grass around him, and then lay down in the still-hot embers. The roaring fire bypassed the patch he pre-burned, and his life was saved. The story is well-told in Norman Maclean’s Young Men and Fire.

Pondering more cases of necessity might be useful to help us understand, and encourage, future innovation. One candidate might be the kludge that Ed Smylie and his engineers put together to save the Apollo 13 crew from suffocating after an oxygen tank exploded and crippled their command module.

Necessity may be part of it, but cannot be the whole story. Humanity needed to fly for thousands of years, but it took Wilbur Wright to make it happen. (This point is made in Kevin Ashton’s fine and fun How to Fly a Horse.)

I have ordered the book co-authored by Lovell, and mentioned in a passage quoted below, in case it contains insight on how the Apollo 13 kludge was devised.

(p. B11) Ed Smylie, the NASA official who led a team of engineers that cobbled together an apparatus made of cardboard, plastic bags and duct tape that saved the Apollo 13 crew in 1970 after an explosion crippled the spacecraft as it sped toward the moon, died on April 21 [2025] in Crossville, Tenn. He was 95.

. . .

Soft-spoken, with an accent that revealed his Mississippi upbringing, Mr. Smylie was relaxing at home in Houston on the evening of April 13 when Mr. Lovell radioed mission control with his famous (and frequently misquoted) line: “Uh, Houston, we’ve had a problem.”

An oxygen tank had exploded, crippling the spacecraft’s command module.

Mr. Smylie, . . ., saw the news on television and called the crew systems office, according to the 1994 book “Lost Moon,” by Mr. Lovell and the journalist Jeffrey Kluger. The desk operator said the astronauts were retreating to the lunar excursion module, which was supposed to shuttle two crew members to the moon.

“I’m coming in,” Mr. Smylie said.

Mr. Smylie knew there was a problem with this plan: The lunar module was equipped to safely handle air flow for only two astronauts. Three humans would generate lethal levels of carbon dioxide.

To survive, the astronauts would somehow need to refresh the canisters of lithium hydroxide that would absorb the poisonous gases in the lunar excursion module. There were extra canisters in the command module, but they were square; the lunar module ones were round.

“You can’t put a square peg in a round hole, and that’s what we had,” Mr. Smylie said in the documentary “XIII” (2021).

He and about 60 other engineers had less than two days to invent a solution using materials already onboard the spacecraft.

. . .

In reality, the engineers printed a supply list of the equipment that was onboard. Their ingenious solution: an adapter made of two lithium hydroxide canisters from the command module, plastic bags used for garments, cardboard from the cover of the flight plan, a spacesuit hose and a roll of gray duct tape.

“If you’re a Southern boy, if it moves and it’s not supposed to, you use duct tape,” Mr. Smylie said in the documentary. “That’s where we were. We had duct tape, and we had to tape it in a way that we could hook the environmental control system hose to the command module canister.”

Mission control commanders provided step-by-step instructions to the astronauts for locating materials and building the adapter.

. . .

The adapter worked. The astronauts were able to breathe safely in the lunar module for two days as they awaited the appropriate trajectory to fly the hobbled command module home.

. . .

Mr. Smylie always played down his ingenuity and his role in saving the Apollo 13 crew.

“It was pretty straightforward, even though we got a lot of publicity for it and Nixon even mentioned our names,” he said in the oral history. “I said a mechanical engineering sophomore in college could have come up with it.”

For the full obituary, see:

Michael S. Rosenwald. “Ed Smylie Dies at 95; His Team of Engineers Saved Apollo 13 Crew.” The New York Times (Tuesday, May 20, 2025): B11.

(Note: ellipses, and bracketed year, added.)

(Note: the online version of the obituary was updated May 18, 2025, and has the title “Ed Smylie, Who Saved the Apollo 13 Crew With Duct Tape, Dies at 95.”)

Klein’s book that I praise in my introductory comments is:

Klein, Gary A. Seeing What Others Don’t: The Remarkable Ways We Gain Insights. Philadelphia, PA: PublicAffairs, 2013.

Maclean’s book that I praise in my introductory comments is:

Maclean, Norman. Young Men and Fire. New ed. Chicago: University of Chicago Press, 2017.

Ashton’s book that I praise in my introductory comments is:

Ashton, Kevin. How to Fly a Horse: The Secret History of Creation, Invention, and Discovery. New York: Doubleday, 2015.

The book co-authored by Lovell and mentioned above is:

Lovell, Jim, and Jeffrey Kluger. Lost Moon: The Perilous Voyage of Apollo 13. Boston, MA: Houghton Mifflin, 1994.

The Chicago School of Economics Was Once Uniquely Focused on Real World Problems

The Chicago School of Economics, most associated with Milton Friedman and George Stigler, saw itself as different from all the other top graduate programs in economics. At Chicago, the priority was solving applied problems, and only as much mathematics and theory should be used as was necessary to solve them. The other schools prioritized mathematical puzzle-solving and mathematical rigor and sophistication.

For those who might suspect Chicago was full of itself, the non-Chicago economists Arjo Klamer and David Colander dispelled the suspicion in their The Making of an Economist. After thorough interviewing and surveying of graduate students at the five or six top graduate programs, they concluded that graduate students at all but Chicago were cynically discouraged to realize that they were being trained to solve mathematical puzzles, while only those at Chicago still felt that they were being trained to matter in the real world.

I noticed that a recent obituary for the economist Stanley Fischer quotes Fischer as stating some diplomatic confirmation of the Klamer and Colander conclusion:

After earning his Ph.D. at M.I.T. in 1969, Mr. Fischer moved to the University of Chicago as a postdoctoral researcher and assistant professor. “At M.I.T. you did the mathematical work,” he told The New York Times in 1998, “and at Chicago you asked the question of how this applies to the real world” (Hagerty 2025, p. A17).

Alas, I fear that what was once true is true no longer. I fear that if Klamer and Colander were to repeat their study today, they would find that Chicago has joined the other top programs in prioritizing mathematical puzzle-solving and mathematical rigor and sophistication.

The obituary of Stanley Fischer, quoted above, is:

James R. Hagerty. “Stanley Fischer, 81, Economist Who Helped Defuse Crises, Dies.” The New York Times (Mon., June 2, 2025): A17.

(Note: the online version of the obituary was updated June 10, 2025, and has the title “Stanley Fischer, Who Helped Defuse Financial Crises, Dies at 81.”)

The Klamer and Colander book mentioned above is:

Klamer, Arjo, and David Colander. The Making of an Economist. Boulder, CO: Westview Press, 1990.

“Gold Standard” RCT Studies Do Not Always Agree on Broad Issues

Randomized double-blind clinical trials (RCTs) are usually labeled the “gold standard” of medical evidence. But any given clinical trial can be done in an infinite number of ways. The duration of the RCT can vary. The eligibility requirements can vary. The definition of the placebo or comparison treatment can vary.

So on the broad issue of whether red meat is good for the heart, an RCT that compares the heart effects of red meat versus the heart effects of chicken can yield different results than an RCT that compares the heart effects of red meat versus the heart effects of a plant-based diet.

Both RCTs might be competently done, involving no dishonesty or fraud.

We tend to overgeneralize the results of an RCT, for instance saying “red meat is heart healthy,” or “red meat is not heart healthy,” whereas all we are justified in saying is “red meat is as heart healthy as chicken” and “red meat is less heart healthy than a plant-based diet.”

Since RCTs are expensive and time-consuming, physicians and patients will often have to choose between treatments for which no RCT has been done in which the researchers made the choices most relevant to the patient’s situation.

And in an environment where RCT costs are high and funding is scarce, are researchers to be condemned if, among the myriad ways of setting up the RCT, they choose the ways most likely to yield the results that will appeal to their funder?

The article quoted below, in passages I did not quote, assumes this is only an issue with industry-funded research. But government funding review panels also have preferred outcomes. For example, Charles Piller in Doctored has recently documented that government funders have been more likely to fund RCTs that support the amyloid hypothesis of the cause of Alzheimer’s.

So is there hope for those who want to take effective action against dire disease? Yes, we can recognize that not all sound actionable evidence comes from RCTs. We can stop mandating Phase 3 trials, so that a more diverse assortment of plausible therapies can be explored. We can encourage diverse, decentralized funding sources.

(p. D6) In a review published last week in the American Journal of Clinical Nutrition, scientists came to a concerning conclusion. Red meat appeared healthier in studies that were funded by the red meat industry.

. . .

Past research funded by the sugar industry, for instance, has downplayed the relationship between sugar and health conditions like obesity and heart disease. And studies funded by the alcohol industry have suggested that moderate drinking could be part of a healthy diet.

Miguel López Moreno, a researcher at Francisco de Vitoria University in Spain who led the new analysis, said in an email that he wanted to know if similar issues were happening with the research on unprocessed red meat.

. . .

Dr. Moreno and his colleagues found that the trials with funding from the red meat industry were nearly four times as likely to report favorable or neutral cardiovascular results after eating unprocessed red meat when compared with the studies with no such links.

. . .

These differing results may have stemmed from how the studies were set up in the first place, Dr. Tobias wrote in an editorial for the American Journal of Clinical Nutrition that accompanied the new study.

Individual nutrition studies can be good at showing how the health effects of certain foods compare with those of other specific foods. But to demonstrate whether a particular food, or food group like red meat, is good or bad for health in general, scientists must look at the results from many different studies that compare it to all possible food groups and diets.

The new review showed that, on the whole, the industry-funded red meat studies neglected to compare red meat to the full range of foods people might eat — including food we know to be good for the heart like whole grains or plant-based protein sources such as tofu, nuts or legumes. Instead, many of the studies compared unprocessed red meat to other types of animal protein like chicken or fish, or to carbohydrates like bagels, pasta or rice.

The independently funded studies, on the other hand, compared red meat to “the full spectrum” of different diets — including other types of meat, whole grains and heart-healthy plant foods like soy products, nuts and beans — Dr. Tobias said. This more comprehensive look offers a fuller picture of red meat’s risks or benefits, she said.

. . .

A spokeswoman for the National Cattlemen’s Beef Association said in an email that “beef farmers and ranchers support gold standard scientific research,” and that both animal and plant sources of protein can be part of a heart-healthy diet.

For the full story see:

Caroline Hopkins Legaspi. “Eyes on the Outcomes Of Red Meat Research.” The New York Times (Tues., May 27, 2025): D6.

(Note: ellipses added.)

(Note: the online version of the story has the date May 20, 2025, and has the title “Is Red Meat Bad for Your Heart? It May Depend on Who Funded the Study.”)

The academic article co-authored by López-Moreno and mentioned above is:

López-Moreno, Miguel, Ujué Fresán, Carlos Marchena-Giráldez, Gabriele Bertotti, and Alberto Roldán-Ruiz. “Industry Study Sponsorship and Conflicts of Interest on the Effect of Unprocessed Red Meat on Cardiovascular Disease Risk: A Systematic Review of Clinical Trials.” The American Journal of Clinical Nutrition 121, no. 6 (June 2025): 1246-57.

Some other articles discussing cases where industry is alleged to have funded biased research are:

Anahad O’Connor. “Sugar Backers Paid to Shift Blame to Fat.” The New York Times (Tues., Sept. 13, 2016): A1 & ?.

(Note: the online version of the story has the date Sept. 12, 2016, and has the title “How the Sugar Industry Shifted Blame to Fat.”)

Alice Callahan. “Is Fake Meat Superior to the Real Thing?” The New York Times (Tues., Feb. 18, 2025): D7.

(Note: the online version of the story has the date Feb. 17, 2025, and has the title “Is Fake Meat Better for You Than Real Meat?”)

Roni Caryn Rabin. “U.S. Wooed Alcohol Industry for a Drinking Study.” The New York Times, First Section (Sun., March 18, 2018): 1 & ??.

(Note: the online version of the story has the date March 17, 2018, and has the title “Federal Agency Courted Alcohol Industry to Fund Study on Benefits of Moderate Drinking.”)