Machine Learning Turns Up COVID Surprise
A hospital visit can be boiled down to an initial ailment and an outcome. But health records tell a different story, full of doctors’ notes and patient histories, vital signs and test results, potentially spanning weeks of a stay. In health studies, all of that data is multiplied by hundreds of patients. It’s no wonder, then, that as AI data processing techniques grow increasingly sophisticated, doctors are treating health as an AI and Big Data problem.
In one recent effort, researchers at Northwestern University applied machine learning to electronic health records to produce a more granular, day-to-day analysis of pneumonia in an intensive care unit (ICU), where patients received breathing assistance from mechanical ventilators. The analysis, published 27 April in the Journal of Clinical Investigation, clusters patient days by machine learning and suggests that long-term respiratory failure and the risk of secondary infection are much more common in COVID-19 patients than cytokine storms, the subject of many early COVID fears.
“Most methods that approach data analysis in the ICU look at data from patients when they’re admitted, then outcomes at some distant time point,” said Benjamin D. Singer, a study co-author at Northwestern University. “Everything in the middle is a black box.”
The hope is that AI applied to daily ICU patient status data can yield new clinical findings beyond the COVID-19 case study.
The day-wise approach to the data led researchers to two related findings: secondary respiratory infections are a common threat to ICU patients, including those with COVID-19; and COVID-19 is strongly associated with respiratory failure, which can be interpreted as an unexpected lack of evidence for cytokine storms in COVID-19 patients. If patients had an inflammatory cytokine response, an eventual shift to multiple-organ failure might be expected, and the researchers did not find one. Reported rates vary, but cytokine storms have been considered a dangerous possibility in severe COVID-19 cases since the earliest days of the pandemic.
Some 35 percent of patients were diagnosed with a secondary infection, also known as ventilator-associated pneumonia (VAP), at some point during their ICU stay. More than 57 percent of COVID-19 patients developed VAP, compared to 25 percent of non-COVID patients. Multiple VAP episodes were reported for almost 20 percent of COVID-19 patients.
Catherine Gao, an instructor of medicine at Northwestern University and one of the study’s co-authors, said the machine learning algorithms they used helped the researchers “see clear patterns emerge that made clinical sense.” The team dubbed their day-focused machine learning approach CarpeDiem, after the Latin phrase meaning “seize the day.”
CarpeDiem was built using the Jupyter Notebook platform, and the team has made both the code and de-identified data available. The data set included 44 different clinical parameters for each patient day, and the clustering approach returned 14 groups with different signatures of six types of organ dysfunction: respiratory, ventilator instability, inflammatory, renal, neurologic and shock.
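To make the idea of clustering patient days concrete, here is a minimal sketch in Python. It is not the CarpeDiem code. The record layout, the two parameter names, the toy values, and the use of a bare-bones k-means are all assumptions made for illustration. The study used 44 parameters and arrived at 14 clusters, and its actual clustering method is not described here.

```python
import math
from collections import defaultdict

# Hypothetical toy records: (patient_id, icu_day, parameter, scaled_value).
# Two made-up parameters stand in for the study's 44.
records = [
    ("A", 1, "resp", 0.9), ("A", 1, "renal", 0.1),
    ("A", 2, "resp", 0.8), ("A", 2, "renal", 0.2),
    ("B", 1, "resp", 0.1), ("B", 1, "renal", 0.9),
    ("B", 2, "resp", 0.2), ("B", 2, "renal", 0.8),
]

def build_patient_days(records, parameters):
    """Pivot long-format records into one feature vector per (patient, day)."""
    table = defaultdict(dict)
    for patient_id, day, parameter, value in records:
        table[(patient_id, day)][parameter] = value
    keys = sorted(table)
    # Missing values default to 0.0 here; a real analysis would need a
    # considered imputation strategy.
    rows = [[table[key].get(p, 0.0) for p in parameters] for key in keys]
    return keys, rows

def kmeans(rows, k, iterations=20):
    """Plain k-means, deterministically seeded with the first k rows."""
    centroids = [list(row) for row in rows[:k]]
    labels = []
    for _ in range(iterations):
        # Assign every patient day to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        # Move each centroid to the mean of the days assigned to it.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

keys, rows = build_patient_days(records, ["resp", "renal"])
labels = kmeans(rows, k=2)
for key, label in zip(keys, labels):
    print(key, "-> cluster", label)
```

The unit of analysis is the point of the sketch: each row is one day of one patient's stay, so a single patient can move between clusters as their condition changes.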
“The field has focused on the idea that we can look at early data and see if that predicts how [patients] are going to do days, weeks, or months later,” said Singer. The hope, he said, is that research using daily ICU patient status rather than just a few time points can tell investigators—and the AI and machine learning algorithms they use—more about the efficacy of different treatments or responses to changes in a patient’s condition. One future research direction would be to examine the momentum of illness, Singer said.
The technique the researchers developed (which they called the “patient-day approach”) might catch other changes in clinical states with less time between data points, said Sayon Dutta, an emergency physician at Massachusetts General Hospital who helps develop predictive models for clinical practice using machine learning and was not involved in the study. Hourly data could present its own problems to a clustering approach, he said, making patterns difficult to recognize. “I think splitting the day up into 8-hour chunks instead might be a good compromise of granularity and dimensionality,” he said.
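Dutta's suggestion of 8-hour chunks amounts to a change in how measurements are binned before clustering. The sketch below shows one way that binning could work. It assumes chunks are counted from the time of ICU admission, and the function name, parameter names, and timestamps are hypothetical rather than taken from the study.

```python
from datetime import datetime

def chunk_index(timestamp, admitted, hours_per_chunk=8):
    """Return the zero-based chunk a measurement falls in, counted from admission."""
    elapsed_hours = (timestamp - admitted).total_seconds() / 3600
    return int(elapsed_hours // hours_per_chunk)

# Hypothetical admission time and measurement times for one patient.
admitted = datetime(2020, 4, 1, 6, 0)
measurements = [
    datetime(2020, 4, 1, 7, 30),   # 1.5 hours after admission
    datetime(2020, 4, 1, 15, 0),   # 9 hours after admission
    datetime(2020, 4, 2, 6, 0),    # exactly 24 hours after admission
]

for taken_at in measurements:
    print(taken_at, "-> chunk", chunk_index(taken_at, admitted))
```

Setting hours_per_chunk to 24 gives a daily binning and setting it to 1 gives the hourly one. Smaller chunks mean more rows per patient and, usually, more missing values per row, which is the trade-off Dutta describes.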
Calls to incorporate new techniques to analyze the large amounts of ICU health data pre-date the COVID-19 pandemic. Machine learning or computational approaches more broadly could be used in the ICU in a variety of ways, not just in observational studies. Possible applications could use daily health records, as well as real-time data recorded by healthcare devices, or involve designing responsive machines that incorporate a range of available information.
Overall mortality was around 40 percent both among patients who developed a secondary infection and among those who did not. But among study patients with one diagnosed case of VAP, if their secondary pneumonia was not successfully treated within 14 days, 76.5 percent eventually died or were sent to hospice care. The rate was 17.6 percent among those whose secondary pneumonia was considered cured. Both groups included roughly 50 patients.
Singer stresses that the risk of secondary pneumonia is typically a necessary one. “The ventilator is absolutely life-saving in these instances. It’s up to us to figure out how to best manage complications that arise from it,” he said. “You have to be alive to experience a complication.”