After consultant oncologist Dr. David Watkins publicly voiced concerns about the safety of a well-funded startup’s triage chatbot, the company shot back, calling him a “troll on Twitter.”
Prior to his Tuesday appearance on BBC Newsnight, Watkins had spent three years raising concerns about Babylon Health’s chatbot under the Twitter handle @DrMurphy11. In particular, he had zeroed in on the startup’s results for cardiac chest pain.
One of his examples: For a fictitious 67-year-old smoker reporting central chest pain, the app suggested a likely diagnosis of gastritis or a sickle cell crisis.
“Sadly, some of the fundamental Chatbot flaws that were initially raised in 2017 are still evident today,” Watkins wrote in a message.
Dozens of other startups are using AI to triage symptoms, but Babylon Health has reached a position of special prominence. The company struck a partnership with Britain’s National Health Service to offer telehealth appointments, and is also working with the NHS to build an app to triage non-emergency medical help. It also raised a whopping $550 million last year, led by Saudi Arabia’s sovereign wealth fund.
The company was founded in 2013 by Ali Parsa, a former investment banker with Goldman Sachs and founder of hospital management company Circle Health. Babylon Health once claimed its algorithm performed better than doctors on the Royal College of General Practitioners’ official exam — a test that practicing doctors in the U.K. must pass. But RCGP dismissed the company’s claims as dubious, saying the exam prep materials the company used to test its algorithm didn’t represent the full range of the test.
Keith Grimes, Babylon Health’s Clinical Innovation Director, defended the company’s chatbot in a head-to-head skirmish with Watkins on BBC Newsnight.
“It’s not giving out fake or false information. It’s giving out safe information to patients to allow them to seek help in the right timing,” he said.
In a statement on Monday, Babylon Health published data on the various searches Watkins had run on its chatbot. The company said a panel of clinicians had investigated the 100 tests he raised concerns about, correcting 20 errors in the company’s AI. Babylon did not confirm which errors it had fixed.
The company also challenged Watkins to be “part of an open, independent analysis of your AI testing.” Notably, Babylon has itself not published any peer-reviewed studies of its triage system. Instead, the company pointed to the NHS’s validation of its technology as safe, and to the fact that no patients had reported harm from using its service.
Most symptom-checking chatbots face limited regulatory oversight. Babylon is approved as a Class 1 medical device, a category reserved for low-risk devices such as canes or contact lenses. As such, the company’s chatbot can provide medical advice, but cannot make a diagnosis.