Notes from Deep Medicine
Misdiagnosis in the United States is disconcertingly common. A review of three very large studies concluded that there are about 12 million significant misdiagnoses a year. These mistakes result from numerous factors, including failing to order the right test, misinterpreting a test that was performed, not establishing a proper differential diagnosis, and missing an abnormal finding. … The situation in the United States is worse than that because misdiagnosis leads to mistreatment. … Shockingly, up to one-third of medical operations performed are unnecessary. … To help bring this point home, for every one hundred Medicare recipients age sixty-five or older, each year there are more than fifty CT scans, fifty ultrasounds, fifteen MRIs, and ten PET scans. It’s estimated that 30 to 50 percent of the 80 million CT scans in the United States are unnecessary.
While it was a coup to get the medical societies to fess up about their top five (and often top ten) misused procedures, there was ultimately little to show for the effort. Subsequent evaluation from a national sample showed that the top seven low-value procedures were still being used regularly and unnecessarily. Two primary factors seem to account for this failure. The first reason, called the therapeutic illusion by Dr. David Casarett of the University of Pennsylvania, was the established fact that, overall, individual physicians overestimate the benefits of what they themselves do. Physicians typically succumb to confirmation bias - because they already believe that the procedures and tests they order will have the desired benefit, they continue to believe it after the procedures are done, even when there is no objective evidence to be found. The second reason was the lack of any mechanism to effect change in physicians’ behavior. Although Choosing Wisely partnered with Consumer Reports to disseminate the lists in print and online, there was little public awareness of the long list of recommendations, so there was no grassroots, patient-driven demand for better, smarter testing. Furthermore, the ABIMF had no ability to track which doctors order what procedures and why, so there was no means to reward physicians for ordering fewer unnecessary procedures, nor one to penalize physicians for performing more.
In 2017, the RightCare Alliance … made a second attempt at reform. It published a series of major papers in the Lancet that quantified unnecessary procedures in a number of countries. The United States was the worst offender, at as much as 60 percent. Again, medical imaging for conditions such as back pain was at the top of the list. Much like Choosing Wisely was intended to shape physician behavior, the RightCare Alliance hoped its extensive data would be incorporated into future medical practice. There are no data to suggest that has happened.
Part of this growth is fueled by a shared belief among both patients and physicians that medications, and in particular very expensive ones, will have remarkable efficacy. When doctors prescribe any medication, they have a cognitive bias that it will work. Patients, too, believe the medicine will work. … Overall, 75 percent of patients receiving these leading medications do not have the desired or expected benefit. With several of these drugs having sales of more than $10 billion per year (such as Humira, Enbrel, Remicade), you can quickly get a sense of the magnitude of waste incurred.
Despite that poor showing, in recent years mobile apps for checking symptoms, such as Ada, Your.MD, and Babylon, have proliferated. They incorporate components of artificial intelligence, but they haven’t yet been shown to simulate the accuracy of diagnoses made by doctors (which we should not necessarily regard as the gold standard). These start-up companies are beginning to incorporate information beyond lists of symptoms, asking a series of questions about matters such as the patient’s health history. The hope is that the back-and-forth questions will narrow the differential and promote accuracy. One such app, Buoy Health, draws upon more than 18,000 clinical publications, descriptions of 1,700 medical conditions, and data provided by more than 5 million patients.
Harari, in Homo Deus, projected that “twentieth century medicine aimed to heal the sick, but twenty-first century medicine is increasingly aiming to upgrade the healthy.” … It’s a triple whammy for the lower socioeconomic class because the biases in AI frequently adversely affect them, they are most vulnerable to job loss, and access to AI medical tools may be much harder to come by.
Today, lab scores are also considered against population-based scales, relying on a dumbed-down method of ascertaining whether a given metric is in a “normal” range. This approach reflects the medical community’s fixation on the average patient, who does not exist. For example, there’s no consideration of ancestry and ethnic specificity for lab tests, when we know that key results - such as hemoglobin A1C, which is used to monitor diabetes, or serum creatinine, which is used to monitor kidney function - are very different for people of African ancestry than for those of European ancestry. Moreover, plenty of information is hidden inside the so-called normal range. Take a male patient whose hemoglobin has steadily declined over the past five years, from 15.9 to 13.2 g/dl. Both the starting and end points are within the normal range, so such a change would never get flagged by lab reports, and most busy doctors wouldn’t connect the dots by looking back over an extended period. But such a decrement could be an early sign of a disease process in the individual, whether it be hidden bleeding or cancer. … Instead of CDSS, I’d call it AIMS, for augmented individualized medical support.
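My own aside, not from the book: the hemoglobin example can be sketched as a check against the patient's own baseline rather than the population range alone. This is only a rough illustration. The reference range of 13.0 to 17.5 g/dl, the 10 percent drop threshold, and the names Reading and flag_in_range_decline are all assumptions made up for the sketch, not clinical guidance. The only data points used are the two given in the passage.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    year: int     # years since the first measurement
    value: float  # hemoglobin in g/dl


def flag_in_range_decline(readings, low, high, max_relative_drop=0.10):
    """Return True when every reading sits inside [low, high] but the drop
    from the earliest to the latest reading exceeds max_relative_drop.

    low, high and max_relative_drop are illustrative assumptions.
    """
    if len(readings) < 2:
        return False
    ordered = sorted(readings, key=lambda r: r.year)
    all_in_range = all(low <= r.value <= high for r in ordered)
    first, last = ordered[0].value, ordered[-1].value
    relative_drop = (first - last) / first
    return all_in_range and relative_drop > max_relative_drop


# The two values from the passage: 15.9 g/dl falling to 13.2 g/dl over five years.
readings = [Reading(0, 15.9), Reading(5, 13.2)]

# Assumed reference range; a range-only check would pass both values.
flagged = flag_in_range_decline(readings, low=13.0, high=17.5)
print(flagged)
```

A range-only check sees two normal values, while the baseline comparison picks up a decline of roughly 17 percent, which is the kind of signal the passage says gets lost.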
Perhaps less pernicious but still worrisome is reliance on “wellness” programs, which most medium to large employers in the United States have, despite the fact that, overall, they have not been validated to promote health outcomes. … One way such programs could be improved, however, is through the use of virtual medical coaches, which could gather and make use of far more granular and deeper information about each individual.