Dozens of hospitals across the country are using an artificial intelligence system created by Epic, the big electronic health record vendor, to predict which Covid-19 patients will become critically ill, even as many are struggling to validate the tool’s effectiveness on those with the new disease.
The rapid uptake of Epic’s deterioration index is a sign of the challenges imposed by the pandemic: Normally hospitals would take time to test the tool on hundreds of patients, refine the algorithm underlying it, and then adjust care practices to implement it in their clinics.
Covid-19 is not giving them that luxury. They need to be able to intervene to prevent patients from going downhill, or at least make sure a ventilator is available when they do. Because it is a new illness, doctors don’t have enough experience to determine who is at highest risk, so they are turning to AI for help — and in some cases cramming a validation process that often takes months or years into a couple weeks.
“Nobody has amassed the numbers to do a statistically valid” test of the AI, said Mark Pierce, a physician and chief medical informatics officer at Parkview Health, a nine-hospital health system in Indiana and Ohio that is using Epic’s tool. “But in times like this that are unprecedented in U.S. health care, you really do the best you can with the numbers you have, and err on the side of patient care.”
Epic’s index uses machine learning, a type of artificial intelligence, to give clinicians a snapshot of the risks facing each patient. But hospitals are reaching different conclusions about how to apply the tool, which crunches data on patients’ vital signs, lab results, and nursing assessments to assign a 0 to 100 score, with a higher score indicating an elevated risk of deterioration. It was already used by hundreds of hospitals before the outbreak to monitor hospitalized patients, and is now being applied to those with Covid-19.
At Parkview, doctors analyzed data on nearly 100 cases and found that 75% of hospitalized patients who received a score in a middle zone between 38 and 55 were eventually transferred to the intensive care unit. In the absence of a more precise measure, clinicians are using that zone to help determine who needs closer monitoring and whether a patient in an outlying facility needs to be transferred to a larger hospital with an ICU.
Meanwhile, the University of Michigan, which has seen a larger volume of patients due to a cluster of cases in that state, found in an evaluation of 200 patients that the deterioration index is most helpful for those who scored on the margins of the scale.
For 13% of patients whose scores remained on the low end during the first 48 hours of hospitalization, the health system determined they were unlikely to experience a life-threatening event and that physicians could consider moving them to a field hospital for lower-risk patients, according to data reported in a non-peer-reviewed preprint. On the opposite end of the spectrum, it found 17% of patients who scored on the higher end of the scale were much more likely to need ICU care and should be closely monitored.
Clinicians in the Michigan health system have been using the score thresholds established by the research to monitor the condition of patients during rounds and in a command center designed to help manage their care. But clinicians are also considering other factors, such as physical exams, to determine how they should be treated.
“This is not going to replace clinical judgement,” said Karandeep Singh, a physician and health informaticist at the University of Michigan who participated in the evaluation of Epic’s AI tool. “But it’s the best thing we’ve got right now to help make decisions.”
Stanford University has also been testing the deterioration index on Covid-19 patients, but a physician in charge of the work said the health system has not seen enough patients to fully evaluate its performance. “If we do experience a future surge, we hope that the foundation we have built with this work can be quickly adapted,” said Ron Li, a clinical informaticist at Stanford.
Executives at Epic said the AI tool, which has been rolled out to monitor hospitalized patients over the past two years, is already being used to support care of Covid-19 patients in dozens of hospitals across the United States. They include Parkview, Confluence Health in Washington state, and ProMedica, a health system that operates in Ohio and Michigan.
“Our approach as Covid was ramping up over the last eight weeks has been to evaluate — does it look very similar to (other respiratory illnesses) from a machine learning perspective and can we pick up that rapid deterioration?” said Seth Hain, a data scientist and senior vice president of research and development at Epic. “What we found is yes, and the result has been that organizations are rapidly using this model in that context.”
Some hospitals that had already adopted the index are simply applying it to Covid-19 patients, while others are seeking to validate its ability to accurately assess patients with the new disease. It remains unclear how the use of the tool is affecting patient outcomes, or whether its scores accurately predict how Covid-19 patients are faring in hospitals. The AI system was initially designed to predict deterioration of hospitalized patients facing a wide array of illnesses. Epic trained and tested the index on more than 100,000 patient encounters at three hospital systems between 2012 and 2016, and found that it could accurately characterize the risks facing patients.
When the coronavirus began spreading in the United States, health systems raced to repurpose existing AI models to help keep tabs on patients and manage the supply of beds, ventilators and other equipment in their hospitals. Researchers have tried to develop AI models from scratch to focus on the unique effects of Covid-19, but many of those tools have struggled with bias and accuracy issues, according to a review published in the BMJ.
The biggest question hospitals face in implementing predictive AI tools, whether to help manage Covid-19 or advanced kidney disease, is how to act on the risk score it provides. Can clinicians take actions that will prevent the deterioration from happening? If not, does it give them enough warning to respond effectively?
In the case of Covid-19, the latter question is the most relevant, because researchers have not yet identified any effective treatments to counteract the effects of the illness. Instead, they are left to deliver supportive care, including mechanical ventilation if patients are no longer able to breathe on their own.
Knowing ahead of time whether mechanical ventilation might be necessary is helpful, because doctors can ensure that an ICU bed and a ventilator or other breathing assistance is available.
Singh, the informaticist at the University of Michigan, said the most difficult part about making predictions based on Epic’s system, which calculates a score every 15 minutes, is that patients’ ratings tend to bounce up and down in a sawtooth pattern. A change in heart rate could cause the score to suddenly rise or fall. He said his research team found that it was often difficult to detect, or act on, trends in the data.
“Because the score fluctuates from 70 to 30 to 40, we felt like it’s hard to use it that way,” he said. “A patient who’s high risk right now might be low risk in 15 minutes.”
In some cases, he said, patients bounced around in the middle zone for days but then suddenly needed to go to the ICU. In others, a patient with a similar trajectory of scores could be managed effectively without need for intensive care.
But Singh said that in about 20% of patients it was possible to identify threshold scores that could indicate whether a patient was likely to decline or recover. In the case of patients likely to decline, the researchers found that the system could give them up to 40 hours of warning before a life-threatening event would occur.
“That’s significant lead time to help intervene for a very small percentage of patients,” he said. As to whether the system is saving lives, or improving care in comparison to standard nursing practices, Singh said the answers will have to wait for another day. “You would need a trial to validate that question,” he said. “The question of whether this is saving lives is unanswerable right now.”
Update: This story has been updated with new data published in a preprint by researchers at the University of Michigan.