The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for the coronavirus disease 2019 (COVID-19) pandemic, has cost the lives of millions of people worldwide and remains a major global threat. The pandemic put considerable pressure on hospitals and health centers, which had limited resources to meet the growing demand from patients infected with this new virus.
In these circumstances, clinical decision support systems based on predictive analysis may be useful in managing the emergency. For example, early identification of the COVID-19 patients most likely to suffer critical illness or death may help provide adequate care and optimize the use of limited resources.
Study: Early detection of outcomes in patients with COVID-19. Image credit: sasirin pamai / Shutterstock.com
Clinical features of COVID-19
COVID-19 is a respiratory disease with a wide range of clinical presentations and severities. While some infected patients may be asymptomatic or present with mild symptoms, others may develop acute respiratory distress syndrome (ARDS), which is sometimes followed by various complications, including renal, cardiac, gastrointestinal, thrombotic, and neurological effects.
Several clinical studies have helped to characterize COVID-19 in different cohorts of patients by identifying risk factors and comorbidities of the disease, as well as evaluating the efficacy of different therapeutic approaches that are being implemented worldwide. Although several studies are being conducted, the detailed mechanism of the disease is not yet fully understood. Therefore, risk assessment at the time of hospitalization is difficult to perform, especially in patients with various risk factors.
There are several vaccines available against COVID-19; however, many researchers believe that most people in low-income countries are unlikely to be vaccinated against COVID-19 until at least the end of 2022. In addition, the emergence of SARS-CoV-2 variants has threatened the effectiveness of these vaccines. The best treatment against COVID-19 has therefore yet to be established.
Predictive technologies and models based on artificial intelligence can be used to help assess risk. These include the automatic analysis of lung radiographs and computed tomography (CT) scans to aid in the diagnostic and prognostic processes of COVID-19. Machine learning has also been used to distinguish between COVID-19 and other forms of pneumonia. Prognostic models based on clinical data are also used; however, they require broader study, as most are not yet mature enough.
A new Nature study aimed to predict clinical outcomes based on clinical data. The study had two main objectives. The first was to determine clinical variables that could predict the final clinical outcomes and therefore be useful in clinical decision making. The second was to build predictive models that could identify critical patients during hospitalization.
About the study
The current study included two data sets of patients admitted to three different units of Pisa University Hospital in Italy during the first and second waves of the COVID-19 pandemic. The data were curated manually and also drawn from electronic records obtained from the three units.
The data set contained more than 125 variables, of which six predictive clinical variables were selected using a hybrid filter/wrapper feature selection method based on a genetic algorithm. The six variables selected were troponin levels, age, blood urea nitrogen (BUN) levels, P/F ratio, presence of myalgia, and presence of chronic obstructive pulmonary disease (COPD).
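To give a feel for how a genetic algorithm can search over subsets of variables, here is a minimal sketch. It is not the study's method: it shows only a generic genetic search over binary feature masks, and the fitness function, the "useful" feature indices, and all parameter values are hypothetical stand-ins chosen for illustration. In a real wrapper approach, the fitness would typically be a cross-validated model score.

```python
import random


def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    # Each individual is a tuple of 0/1 flags, one per candidate variable.
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each flag with a small probability.
            child = tuple((bit ^ 1) if rng.random() < mutation_rate else bit
                          for bit in child)
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)  # remember the best mask seen
    return best


# Hypothetical toy fitness: reward features 0, 2 and 5, penalize larger subsets.
USEFUL = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in USEFUL)
    return hits - 0.1 * sum(mask)


best_mask = genetic_feature_selection(n_features=10, fitness=toy_fitness)
selected = [i for i, bit in enumerate(best_mask) if bit]
print(selected)
```

The size penalty in the toy fitness plays the role of a preference for small variable sets, which is one plausible reason a search over more than 125 variables could end with only six.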
These variables were used to construct predictive models based on logistic regression (LR), decision trees (DT), random forests (RF), naive Bayes (NB), and support vector machines (SVM). The performance of the predictive models was evaluated based on their accuracy and F1 score. Cross-validation was performed on all patients in whom these variables were measured.
The study's method was also compared with a standard feature selection procedure based on recursive feature elimination (RFE). Finally, all clinical variables were clustered to provide a better view of the selected biomarkers.
The results of the study indicated that the six variables selected by the genetic algorithm were strongly supported by the medical literature on COVID-19. The P/F ratio, for example, is an important clinical variable that aids clinical decision-making about patients' external ventilation and oxygenation requirements.
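The two evaluation metrics mentioned above can be computed directly from a model's predictions. The sketch below assumes a binary outcome coded as 1 (critical) and 0 (non-critical); the labels and predictions are invented for illustration and are not taken from the study.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 score) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, f1


# Made-up example: 8 patients, 1 = critical outcome, 0 = non-critical.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

Accuracy alone can look good when critical cases are rare, which is one reason to report the F1 score alongside it.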
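For readers unfamiliar with the term, the P/F ratio is the arterial oxygen partial pressure (PaO2, in mmHg) divided by the fraction of inspired oxygen (FiO2, expressed as a fraction between 0.21 and 1.0). A minimal sketch of the calculation follows; the function name and the example values are illustrative and not taken from the study.

```python
def pf_ratio(pao2_mmhg, fio2_fraction):
    """PaO2 / FiO2, with FiO2 given as a fraction (room air is about 0.21)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    return pao2_mmhg / fio2_fraction


# Illustrative values: PaO2 of 100 mmHg while breathing 50% oxygen.
example_pf = pf_ratio(100, 0.5)
print(example_pf)
```

Lower values indicate that the lungs are transferring oxygen to the blood less effectively for a given level of oxygen support.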
Older age is also an important risk factor that the algorithm has selected correctly. COPD is also an important variable that helps identify patients with chronic lung disease.
The troponin variable helps to recognize the link between cardiovascular disease and COVID-19, while the BUN variable helps to identify any pre-existing chronic kidney disease. The last variable, myalgia, suggests that, depending on the case, the disease may affect different target organs.
The standard feature selection procedure was run with two different patient coverage thresholds, 90% and 75%. For the 90% threshold, the main variables, P/F ratio and age, were the same as those selected by the genetic algorithm. Troponin and BUN levels were not selected at this threshold because they did not meet the requirement of being measured in 90% of patients.
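A coverage threshold of this kind can be read as a simple filter: a variable is eligible only if it was measured in at least a given share of patients. The sketch below assumes that reading; the patient records are invented, and `None` marks a measurement that was not taken.

```python
def variables_meeting_coverage(records, threshold):
    """Return the variables measured in at least `threshold` of the records."""
    n = len(records)
    names = sorted({name for record in records for name in record})
    kept = []
    for name in names:
        measured = sum(1 for record in records if record.get(name) is not None)
        if measured / n >= threshold:
            kept.append(name)
    return kept


# Made-up records: troponin is missing for one of four patients (75% coverage).
patients = [
    {"age": 71, "pf_ratio": 250, "troponin": 12.0},
    {"age": 64, "pf_ratio": 310, "troponin": None},
    {"age": 80, "pf_ratio": 180, "troponin": 30.5},
    {"age": 55, "pf_ratio": 290, "troponin": 8.1},
]
print(variables_meeting_coverage(patients, 0.90))
print(variables_meeting_coverage(patients, 0.75))
```

In this toy data, troponin is excluded at the stricter threshold and becomes eligible at the looser one, mirroring the trade-off between keeping more patients and keeping more variables.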
The results also indicated that four of the five predictive models included in the study provided a classification accuracy of more than 85% during the first wave of the pandemic. In addition, the best result was shown by LR followed by RF, DT and SVM. During the second wave, only the DT showed an accuracy level of 85%, while the others showed much lower accuracy levels.
Many clinical variables were found to be missing for first-wave datasets. Imputing the missing values allowed the authors to test the performance of the predictive models as well as the validity of the selected variables.
For the selected variables, no difference was observed between the actual and imputed values. For the predictive models, LR and RF showed a slight reduction in accuracy, while SVM and NB showed a slight increase.
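Missing values can be filled in many ways, and the article does not say which method the authors used. As one simple possibility, the sketch below replaces each missing value with the median of the observed values; the choice of median imputation and the example BUN values are assumptions made only for illustration.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Made-up BUN values (mg/dL) with two missing measurements.
bun = [18.0, None, 25.0, 40.0, None]
bun_filled = impute_median(bun)
print(bun_filled)
```

Comparing model results with and without such filled-in values is one way to check how sensitive a model is to missing data.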
Among the five predictive models used in the study, only the DT and LR models were considered interpretable. The LR model could be interpreted by studying regression coefficients, while the DT model provides a visual description of the data set that is based on clinical variables. The DT model also helps to easily classify a new patient.
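Interpreting a logistic regression model through its coefficients usually means converting each coefficient into an odds ratio by exponentiating it. The sketch below shows that conversion and the resulting predicted probability; the variable names, coefficients, and intercept are invented for illustration and are not the study's fitted values.

```python
import math


def odds_ratios(coefficients):
    """Convert logistic regression coefficients into odds ratios."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


def predicted_probability(intercept, coefficients, patient):
    """Probability of the positive outcome under a logistic model."""
    z = intercept + sum(beta * patient[name]
                        for name, beta in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


# Made-up coefficients: each extra decade of age doubles the odds,
# while the second variable has no effect (odds ratio of 1).
coefs = {"age_decades_over_50": math.log(2.0), "copd": 0.0}
ratios = odds_ratios(coefs)
prob = predicted_probability(0.0, coefs, {"age_decades_over_50": 0, "copd": 1})
print(ratios, prob)
```

An odds ratio above 1 means the variable is associated with higher odds of the outcome, below 1 with lower odds, and equal to 1 with no association in the model.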
Finally, clustering the variables showed that the six selected variables belonged to five different clusters, which also helped identify other diseases that present with common clinical symptoms. This could help determine the link between these diseases and COVID-19.
Although the researchers made considerable efforts to follow appropriate external validation practices, one of the main limitations is that the study included data from a single hospital. Therefore, the current models should be applied to larger data sets from different locations so that they can assist in clinical decision making in a time of emergency.