1. Introduction

The outbreak of coronavirus disease 2019 (COVID-19), a disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), began in early December 2019 [1,2]. As of 2 September 2020, more than 25 million individuals around the world had been confirmed as COVID-19 patients, with an overall mortality rate of more than 3.3% [3]. Among these patients, some developed pneumonia and rapidly progressed to severe acute respiratory distress syndrome (ARDS), with a very poor prognosis and even higher mortality [4,5]. In addition to pneumonia and ARDS, SARS-CoV-2 can damage other organs and systems, causing complications such as large-vessel strokes [6]. In a retrospective cohort study from China, 26% of hospitalized patients required intensive care unit (ICU) care [7]. By 22 April 2020, among 5700 COVID-19 patients in New York who were discharged or died, the overall death rate was 9.7% and 24.5% [8]. Almost all critically ill patients in Italy required respiratory support, and nearly nine in ten of these critically ill patients needed endotracheal intubation [9]. Despite all these efforts, mortality remained high [7–9]. When caring for COVID-19 patients, particularly the critically ill, healthcare providers face a deluge of lab results for an increasing number of hospitalized patients. It is arduous to identify the most important information for decision-making, especially in urgent or emergent situations. Therefore, it is imperative to identify risk factors and parameters in order to build an accurate prognostic model for early intervention and management.

Artificial intelligence (AI) technologies have proven remarkably effective in the medical domain, with a performance exceeding that of humans, especially for many image classification tasks [10–12]. Several AI-based studies have shown promising results in addressing the challenges of controlling and predicting COVID-19 spread and death toll [13–20]. Interpretable AI-based models (e.g., tree models) can enhance the confidence of medical professionals by helping them understand machine decisions. Inspired by the interpretability properties of decision trees, our previous work [19] successfully identified three laboratory features from common blood tests that can accurately predict the mortality of patients with COVID-19. It has been demonstrated that particular laboratory features, including lymphopenia, lactate dehydrogenase (LDH), inflammatory markers (e.g., C-reactive protein (CRP) and ferritin), D-dimer (> 1 μg·mL⁻¹), prothrombin time (PT), troponin, and creatine phosphokinase (CPK), are associated with poor outcomes [7,21,22]. Older age has also been shown to be associated with increased mortality [8,23–25].

The identification of risk and prognostic factors for mortality is crucial for predicting patients’ outcomes at an early stage in order to support clinical decision-making [7,26]. In this study, we built an AI model that can generate real-time risk scores and help to identify patients with a higher risk of mortality before they become critically ill, allowing prompt early intervention. In addition, our scores allow clinicians to monitor disease progression and adjust therapies accordingly.

2. Materials and methods

2.1. Study design and support

This study was approved by the Tongji Hospital Ethics Committee. Two separate cohorts of COVID-19 patients were used for model development and validation. The electronic medical records of 1479 COVID-19 cases admitted to Tongji Hospital, Wuhan, China, from 10 January to 8 March 2020 were used to train the model. Furthermore, the electronic medical records of 141 inpatients from Jinyintan Hospital, Wuhan, China, from 29 December 2019 to 28 March 2020, and of 432 inpatients from The Third People’s Hospital of Shenzhen, Shenzhen, China, from 11 January to 12 April 2020, were used to validate the model. Epidemiological, demographic, clinical, laboratory, medications, nursing record, and outcome data were extracted from the electronic medical records. Data monitoring and recording were performed in the same way for both cohorts. The clinical outcomes were followed up to 8 March 2020, as shown in Table 1.

Table 1 Clinical features of the studied patients.

Q1 and Q3 are the first and third quartiles.

The diagnoses of the COVID-19 patients were based on the following diagnostic criteria from the National Health Commission of the People’s Republic of China [27]: ① SARS-CoV-2 nucleic acid positive in respiratory or blood samples detected by reverse transcription polymerase chain reaction (RT-PCR); and ② high homology between virus sequence detected in respiratory or blood samples and the known sequence of SARS-CoV-2.

2.2. Development of an AI-based risk score system

A logistic regression (LR) classifier was trained to predict the outcome from three predictors: the concentration of LDH (CLDH), the concentration of high-sensitivity CRP (hs-CRP; Chs-CRP), and the proportion of lymphocytes (Plymphocyte), which were chosen in a previous study [19]. These factors have frequently been observed as key risk factors for COVID-19 patients [28–30]. All measurements collected within ten days of a patient’s definite outcome were used for model training. The output classes were defined as the outcome of the patient (either death or survival) after ICU time. The LR model assigns hospitalized patients to a risk group (low, intermediate, or high) according to the level of their risk score: In the development and validation cohorts, 0–30 was defined as low risk, 30–50 as intermediate risk, and 50–100 as high risk.
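The scoring and grouping described above can be sketched as follows. This is a minimal illustration, not the fitted model: the weights and intercept are hypothetical placeholders (the fitted coefficients are not given in this section), the scaling of the sigmoid output to a 0–100 score is an assumption consistent with the score range quoted above, and the handling of the shared boundaries at 30 and 50 is likewise an assumption.

```python
import math

# Hypothetical coefficients, for illustration only. The signs follow the
# text (higher LDH and hs-CRP and a lower lymphocyte proportion indicate
# higher risk); the magnitudes are NOT the fitted values.
ILLUSTRATIVE_WEIGHTS = {"ldh": 0.01, "hs_crp": 0.01, "lymphocyte_pct": -0.1}
ILLUSTRATIVE_INTERCEPT = -2.0


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def risk_score(ldh: float, hs_crp: float, lymphocyte_pct: float,
               weights=ILLUSTRATIVE_WEIGHTS,
               intercept: float = ILLUSTRATIVE_INTERCEPT) -> float:
    """Return a score in (0, 100): 100 * sigmoid(linear combination).

    The factor of 100 is an assumption made to match the 0-100 range.
    """
    z = (intercept
         + weights["ldh"] * ldh
         + weights["hs_crp"] * hs_crp
         + weights["lymphocyte_pct"] * lymphocyte_pct)
    return 100.0 * sigmoid(z)


def risk_group(score: float) -> str:
    """Map a 0-100 score to a risk group.

    Assumption: a score of exactly 30 counts as intermediate and a score
    of exactly 50 counts as high; the text does not say which side the
    boundaries belong to.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must lie in [0, 100]")
    if score < 30.0:
        return "low"
    if score < 50.0:
        return "intermediate"
    return "high"
```

Separating `risk_score` from `risk_group` keeps the continuous score available for monitoring while still allowing a coarse three-way triage.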

2.3. Performance assessment and comparison

Our score system was benchmarked against several state-of-the-art models developed using other machine learning approaches, and standard metrics were used to quantify the performance of different models. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve on one specific day was used to evaluate model effectiveness. Furthermore, the associated cumulative AUC score [31] was introduced as a time-dependent measure to evaluate the risk of mortality for individual patients, computed backward in time from the day of discharge or death. The performance of our system was also compared with those of other standard scores, namely qSOFA (confusion, respiratory rate, blood pressure); CURB 65 (confusion, urea nitrogen, respiratory rate, blood pressure, age ≥ 65 years); and CRB 65 (confusion, respiratory rate, blood pressure, age ≥ 65 years), in both the development and validation cohorts [32–36].
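To make the backward-in-time evaluation concrete, the sketch below computes an AUC per day before the outcome and a pooled variant. The `roc_auc` function uses the standard rank interpretation of the AUC. The cumulative rule used here (pooling every sample taken at most d days before the outcome) is an assumption for illustration; the precise definition is the one given in [31], which this section does not reproduce. The function names and the tuple layout of the samples are also hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen deceased patient's
    score exceeds a randomly chosen survivor's score (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_days_to_outcome(samples, max_days):
    """samples: iterable of (days_to_outcome, score, died) tuples.

    Returns two dicts keyed by day d:
      per_day[d]    -- AUC over samples taken exactly d days before outcome
      cumulative[d] -- AUC over samples taken at most d days before outcome
                       (assumed pooling rule, see the lead-in)
    A value is None when one of the two classes is missing.
    """
    samples = list(samples)

    def safe_auc(subset):
        try:
            return roc_auc([s for _, s, _ in subset],
                           [y for _, _, y in subset])
        except ValueError:
            return None

    per_day, cumulative = {}, {}
    for d in range(max_days + 1):
        per_day[d] = safe_auc([x for x in samples if x[0] == d])
        cumulative[d] = safe_auc([x for x in samples if x[0] <= d])
    return per_day, cumulative
```

The pairwise loop is quadratic in the number of samples, which is fine for cohorts of this size; a rank-based implementation would scale better.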

3. Results

3.1. Patients’ characteristics and outcomes

A total of 1479 COVID-19 patients were eligible for this study, and their relevant clinical information was collected and analyzed. Clinical characteristics, epidemiological history, symptom onset, outcomes, and the results of lab tests were all included (Table 1). The median age was 62, and 49.1% of the sample were female. The majority of patients (71.9%) were local residents of Wuhan. In addition, 8.3% of the 1479 patients belonged to familial clusters and 3.9% had a history of close contact. Notably, 21.6% of patients had no known history of close contact or exposure, indicating the existence of other untraced transmission routes. The COVID-19 patients exhibited variable clinical symptoms: 72.5% of patients manifested fever, followed by respiratory symptoms such as cough (35.7%), shortness of breath (9.5%), and fatigue (5.5%). Gastrointestinal and neurological symptoms were also reported. Patients could present with more than one symptom at a time. Tongji Hospital and Jinyintan Hospital received a large number of severe and critical patients and, as a consequence, the mortality rates at these hospitals were high in the earlier stage, at 17.4% and 58.1%, respectively. In contrast, The Third People’s Hospital of Shenzhen had only four deaths out of a total of 432 patients. Hence, this paper mainly focuses on Tongji Hospital and Jinyintan Hospital. Figures for The Third People’s Hospital of Shenzhen are mostly provided in Appendix A.

3.2. Model development and performance

The risk of mortality for individual patients was predicted with the following simple and explainable LR model, as described in Section 2.2:

where CLDH, Chs-CRP, and Plymphocyte are input predictors of the LR model, r is the risk score, and σ is the sigmoid function; that is,

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

To simplify the use of the model in a clinical setting, Tables S1–S5 in Appendix A include lookup tables for quickly computing the risk score and the probability of death for a patient. Because different patients had different admission dates and varying lengths of stay, the predictive performance was evaluated backward in time; that is, as a function of the number of days between the blood sample and the eventual outcome (i.e., death or discharge). The model’s predictive performance is illustrated in Fig. 1 and Fig. S1 in Appendix A. Up to 20 d in advance of the outcome, the model achieved a cumulative AUC of more than 95% for Tongji Hospital, more than 90% for Jinyintan Hospital, and more than 98% for The Third People’s Hospital of Shenzhen.

Fig. 1. The performance of the proposed model (AUC score and cumulative AUC score) as a function of the number of days until the outcome for all patients in (a) the development cohort (Tongji Hospital) and (b) external validation cohort 1 (Jinyintan Hospital).

Fig. 2 and Fig. S2 in Appendix A plot the distributions of the scores for surviving and deceased patients and the probabilities of death, using measurements taken within ten days of patients’ outcome. The risk score clearly separates the blood samples of surviving and deceased patients in all datasets, including both external validation datasets that were not used in model development. From a particular blood sample, a physician can easily calculate the probability of death; the higher the score, the higher this probability and risk for a patient.
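One way to check that a score tracks the observed probability of death is to bin blood samples by score and compute the fraction of samples from deceased patients in each bin. The sketch below illustrates that idea only; the bin width of 10 and the function name are hypothetical choices, and this is not necessarily how the empirical probability in Fig. 2 was computed.

```python
def empirical_death_rate_by_bin(scores, died, bin_width=10.0):
    """Group 0-100 risk scores into bins and return, per non-empty bin,
    (number of samples, fraction of samples from deceased patients).

    Keys are (lower, upper) bin edges; the last bin includes a score of 100.
    """
    if len(scores) != len(died):
        raise ValueError("scores and died must have the same length")
    n_bins = max(1, int(100 // bin_width))
    counts = {}  # bin index -> [n_samples, n_deaths]
    for s, d in zip(scores, died):
        if not 0.0 <= s <= 100.0:
            raise ValueError("scores must lie in [0, 100]")
        idx = min(int(s // bin_width), n_bins - 1)
        n, k = counts.get(idx, (0, 0))
        counts[idx] = (n + 1, k + (1 if d else 0))
    result = {}
    for idx in sorted(counts):
        n, k = counts[idx]
        lower = idx * bin_width
        upper = 100.0 if idx == n_bins - 1 else lower + bin_width
        result[(lower, upper)] = (n, k / n)
    return result
```

Comparing these per-bin fractions with the model probability at each bin's midpoint gives a simple calibration check; bins with few samples should be read with caution.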

Fig. 2. Distributions of scores for surviving and deceased patients for (a) Tongji Hospital and (b) Jinyintan Hospital from blood samples taken within ten days of patients’ outcome. Probability of death as a function of the risk score for (c) Tongji Hospital and (d) Jinyintan Hospital. The model (red curve) almost perfectly follows the probability of death (blue) calculated directly from the data.

3.3. Validation of the risk score

Next, the risk score can be used to categorize patients into different risk groups upon admission, as shown in the Kaplan–Meier survival curve (Fig. 3). The Kaplan–Meier curve depicts the probability that patients with COVID-19 who were alive at admission survive to a given time after admission. We applied the risk scores of patients at admission and classified the patients into three groups according to their scores: a low-risk group (65.6%), an intermediate-risk group (5.9%), and a high-risk group (28.5%). In the development cohort, the 30-day mortality rates for the low-, intermediate-, and high-risk groups were 1.8%, 12.5%, and 53.7%, respectively, showing a significant difference in mortality rate. The 30-day mortality rates for the low-, intermediate-, and high-risk groups in external validation cohort 2 are shown in Fig. S3 in Appendix A. These results demonstrate that the risk score can be used to predict the mortality of individual patients as early as at admission.
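For readers unfamiliar with the Kaplan–Meier estimator, the sketch below shows the standard product-limit calculation on follow-up times measured from admission. It is a generic illustration with hypothetical function and argument names, not the code used to produce Fig. 3; in practice it would be applied separately to each risk group.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: follow-up time from admission for each patient (e.g., days)
    events:    1 if death was observed at that time, 0 if the patient was
               censored (e.g., discharged or still hospitalized)
    Returns a list of (time, survival_probability) at each distinct
    observed death time, in increasing order of time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    death_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in death_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve
```

Censored patients contribute to the at-risk count up to their last follow-up time but never trigger a drop in the curve, which is what distinguishes this estimate from a naive fraction of deaths.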

Fig. 3. Kaplan–Meier survival curve for (a) the development cohort and (b) external validation cohort 1. In external validation cohort 1, 23.4% of the patients were in the low-risk group, 9.9% were in the intermediate-risk group, and 66.7% were in the high-risk group.

3.4. Comparison with other standard scores

The score from the proposed model was compared with other widely used scores reported previously, such as qSOFA, CURB 65, and CRB 65, in both the development and external validation cohorts. Computing all of these scores requires a minimum set of measurements at admission, which was available for 829 patients in the development cohort. As shown in Fig. 4, the AUCs for our model, CRB 65, CURB 65, and qSOFA were 0.9551, 0.7393, 0.8130, and 0.7480, respectively. The ROC and AUC for the external validation dataset are shown in Figs. S4 and S5 in Appendix A. The proposed score system therefore outperforms the standard score systems in predicting the outcome of patients with COVID-19.
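For reference, the three comparison scores are simple point counts over bedside measurements. The sketch below uses the commonly published cut-offs for each score; these thresholds are not stated in this article, so they should be checked against the original definitions [32–36] before use. Function and parameter names are hypothetical.

```python
def qsofa(*, confusion: bool, respiratory_rate: float,
          systolic_bp: float) -> int:
    """qSOFA, 0-3 points: altered mentation, respiratory rate >= 22/min,
    systolic blood pressure <= 100 mmHg (commonly published cut-offs)."""
    return (int(confusion)
            + int(respiratory_rate >= 22)
            + int(systolic_bp <= 100))


def crb65(*, confusion: bool, respiratory_rate: float, systolic_bp: float,
          diastolic_bp: float, age: float) -> int:
    """CRB 65, 0-4 points: confusion, respiratory rate >= 30/min,
    systolic BP < 90 mmHg or diastolic BP <= 60 mmHg, age >= 65 years."""
    return (int(confusion)
            + int(respiratory_rate >= 30)
            + int(systolic_bp < 90 or diastolic_bp <= 60)
            + int(age >= 65))


def curb65(*, confusion: bool, urea_mmol_per_l: float,
           respiratory_rate: float, systolic_bp: float,
           diastolic_bp: float, age: float) -> int:
    """CURB 65, 0-5 points: CRB 65 plus one point for urea > 7 mmol/L."""
    return (crb65(confusion=confusion, respiratory_rate=respiratory_rate,
                  systolic_bp=systolic_bp, diastolic_bp=diastolic_bp,
                  age=age)
            + int(urea_mmol_per_l > 7.0))
```

All three scores depend on vital signs and a clinical assessment of confusion, which is why they can only be computed for patients with those measurements recorded at admission.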

Fig. 4. A comparative analysis of the ROC of different scoring systems for the 829 patients from the development cohort who had available measurements at admission (minimal requirement for different scores) shows that the proposed model has a larger AUC than the other models reported previously.

4. Conclusions

The proportion of critical or fatal cases is quite high among hospitalized COVID-19 patients [8,37]. Although the mortality rate is only 1.4%–2.3% based on large-scale epidemiological studies [5], about one in three to four hospitalized patients have been admitted to the ICU [4,8,37,38], and 71.0%–97.3% of the critically ill patients eventually needed respiratory support [8,37–39], while 15.0% of the ICU patients required extracorporeal membrane oxygenation (ECMO) [4]. Despite many interventions, including respiratory support, different medication regimens, and even ECMO, the case-fatality rate for these critically ill patients has remained very high [4,8,37–39]. Retrospective studies have suggested that the onset of dyspnea was relatively late (median 6.5 d after symptom onset), but the progression to ARDS could be swift thereafter (median 2.5 d after onset of dyspnea) among patients who developed critical illness [38–40]. In addition, the high mortality rate in Wuhan during the early stage, and the mortality rates in some other areas around the world, exceeded the capacity of local medical resources. These findings suggest that it is essential to promptly identify patients who are likely to have a poor prognosis and a higher risk of becoming critically ill.

Although COVID-19 is a multifaceted disease with uncertainty surrounding effective treatments and wide variation in clinical course and prognosis, multiple laboratory features, including lymphopenia, LDH, inflammatory markers, D-dimer, PT, troponin, and CPK, are associated with poor prognosis [28–30]. Our study demonstrates that the risk of death among patients with COVID-19 is predictable using a risk score computed from only three predictors: CLDH, Chs-CRP, and Plymphocyte. As shown in Fig. S6 in Appendix A, these three predictors provided a good separation between surviving and deceased patients, in blood samples taken within ten days of patients’ outcome. Front-line clinicians can monitor the disease progression of a patient by applying the proposed risk score to available blood samples. This will allow clinicians to monitor and screen out high-risk patients in real time and as laboratory data become available. Overall, the model serves as an accurate indicator for early detection and intervention to reduce the mortality rate, and can potentially be used to monitor the progression of the disease in order to allow healthcare providers to effectively review and adjust clinical management.

The significance of our work is five-fold. First, the model may identify high-risk patients early enough to provide them with alternative therapies such as using appropriate respiratory support and other treatments as soon as possible. Second, the model provides a continuous probability of death instead of classifying risks based on thresholds, as in previous studies. Thresholds are useful on extreme values, but can be misleading when risk scores are near the thresholds. Instead, probabilities of outcomes provide a level of confidence in the prediction. Third, this model provides a simple formula to precisely and quickly quantify the risk of death from just three features of a blood sample. Fourth, the three key features can be conveniently collected at any hospital, even in areas where healthcare resources are limited. The features are objective and quantitative, and therefore avoid any bias of subjective clinical judgments. The global outbreak of COVID-19 has led to a shortage of medical resources in many countries and regions—especially a lack of respiratory specialists and ICU specialists. Our algorithm can be used as a simple tool for non-specialist physicians to classify the severity of the disease in the early stage for high-risk patients. Last but not least, our research has been constructed using transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) [41] guidance with internal and external validation datasets from multiple centers, and the validation of our model has been confirmed by two cohorts of patients from different hospitals.

There are, however, several limitations of the model. First, the patients in the development cohort were from Tongji Hospital, and most were severe or critical. Hence, the cohort may not accurately represent patients with asymptomatic, mild, or moderate cases of COVID-19, and the samples could have selection bias. Second, we did not model the effects of different therapies since treatments were not controlled and varied from patient to patient. Finally, this study provides evidence that the risk score could help clinicians to determine early intervention for patients with COVID-19 in three Chinese hospitals. Further investigation and validation are required, involving other hospitals and countries. In particular, it is possible that different hospitals may have distinct laboratory, therapy, and discharge protocols, and that these may affect blood samples and thereby affect the interpretation of the risk score. Another limitation is that the patients in this study are predominantly from the mainland of China during the early stage. The demographic coverage with respect to worldwide patients is still lacking, due to the limited geographical diversity. However, the proposed model can be a baseline risk-prediction model that updates dynamically when new samples are available from other countries.

In conclusion, a simple prognostic risk score system was developed based on an LR classifier to predict the risk of mortality for COVID-19 patients, and was validated with independent cohorts from multiple centers. This risk score system may help healthcare providers to promptly identify patients with poor prognosis and initiate appropriate intervention early on in order to improve the prognosis.

Acknowledgments

This study was supported by the Special Fund for Novel Coronavirus Pneumonia from the Department of Science and Technology of Hubei Province (2020FCA035) and the Fundamental Research Funds for the Central Universities, Huazhong University of Science and Technology (2020kfyXGYJ023).

Ye Yuan and Li Yan conceived the study; Li Yan and Qiang Zhong collected data; Ye Yuan, Chuan Sun, and Xiuchuan Tang discovered the model; Chenyu Sun, Li Yan, Hai-Tao Zhang, Yang Xiao, Laurent Mombaerts, Hui Xu, Ioannis Ch. Paschalidis, Jorge Goncalves, and Ye Yuan drafted the manuscript; all authors provided critical review of the manuscript and approved the final draft for publication. Data and code availability: The code implementation is available from Ye Yuan.

Compliance with ethics guidelines

Ye Yuan, Chuan Sun, Xiuchuan Tang, Cheng Cheng, Laurent Mombaerts, Maolin Wang, Tao Hu, Chenyu Sun, Yuqi Guo, Xiuting Li, Hui Xu, Tongxin Ren, Yang Xiao, Yaru Xiao, Hongling Zhu, Honghan Wu, Kezhi Li, Chuming Chen, Yingxia Liu, Zhichao Liang, Zhiguo Cao, Hai-Tao Zhang, Ioannis Ch. Paschalidis, Quanying Liu, Jorge Goncalves, Qiang Zhong, and Li Yan declare that they have no conflict of interest or financial conflicts to disclose.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.eng.2020.10.013.