Development and validation of an online model to predict critical COVID-19 with immune-inflammatory parameters

Abstract

Background

Immune and inflammatory dysfunction has been reported to underpin critical coronavirus disease 2019 (COVID-19). We aimed to develop a machine learning model that accurately predicts critical COVID-19 using immune-inflammatory features at admission.

Methods

We retrospectively collected 2076 consecutive COVID-19 patients with definite outcomes (discharge or death) between January 27, 2020 and March 30, 2020 from two hospitals in China. Critical illness was defined as admission to intensive care unit, receiving invasive ventilation, or death. Least Absolute Shrinkage and Selection Operator (LASSO) was applied for feature selection. Five machine learning algorithms, including Logistic Regression (LR), Support Vector Machine (SVM), Gradient Boosted Decision Tree (GBDT), K-Nearest Neighbor (KNN), and Neural Network (NN) were built in a training dataset, and assessed in an internal validation dataset and an external validation dataset.

Results

Six features (procalcitonin, [T + B + NK cell] count, interleukin 6, C reactive protein, interleukin 2 receptor, T-helper lymphocyte/T-suppressor lymphocyte) were finally used for model development. The five models displayed varying but consistently promising predictive performance. Notably, the ensemble model, SPMCIIP (severity prediction model for COVID-19 by immune-inflammatory parameters), derived from three contributing algorithms (SVM, GBDT, and NN), achieved the best performance in identifying patients with critical COVID-19, with an area under the curve (AUC) of 0.991 (95% confidence interval [CI] 0.979–1.000) in the internal validation cohort and 0.999 (95% CI 0.998–1.000) in the external validation cohort. SPMCIIP could accurately and expeditiously predict the occurrence of critical COVID-19 approximately 20 days in advance.

Conclusions

The online prediction model SPMCIIP is expected to facilitate intensive monitoring and early intervention for COVID-19 patients at high risk of critical illness.

Trial registration

This study was retrospectively registered in the Chinese Clinical Trial Registry (ChiCTR2000032161).

Background

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1], has hit the world hard. Up to September 20, 2020, there had been 30,675,675 confirmed cases and 954,417 deaths worldwide [2]. The reported mortality in critically ill COVID-19 patients was approximately 40%, in contrast to 2.3% for patients overall [3]. Moreover, most patients with critical COVID-19 had relatively mild symptoms prior to physiological deterioration. Therefore, early identification of critically ill patients is crucial for the management of COVID-19.

The immune/inflammatory response to SARS-CoV-2 infection is believed to play an essential role in the progression of COVID-19, though it is not fully understood [4]. Inflammatory markers, such as C reactive protein (CRP), procalcitonin (PCT), and ferritin, were markedly elevated in critically ill COVID-19 patients [5, 6]. Cytokines have an immunomodulating function, and an uncontrolled cytokine storm is responsible for multiorgan dysfunction and poor outcomes of COVID-19 [7]. Because both innate and adaptive immune compartments contribute to it, the cytokine storm in COVID-19 has drawn wide attention [8, 9]. As expected, differences in multiple cytokines and immune cells between critically ill and non-critically ill patients have been observed in clinical practice [4]. In addition, early seroconversion and high titers of serologic IgM and IgG antibodies against SARS-CoV-2 were linked with attenuated clinical symptoms [10].

The immune response to SARS-CoV-2 infection is a complex process that has not yet been fully elucidated. The many indicators involved may confuse clinicians in patient management. The clinical and imaging features of patients with different disease severity have been compared [11, 12], and several prediction models have been established. Their prognostic factors mainly included demographic, imaging, and clinical features [13]. Immune-inflammatory parameters have displayed promising prognostic implications but have not yet been used to predict critical COVID-19. Traditional methods are poorly suited to complex parameters, while machine learning, a sub-discipline of artificial intelligence, may be helpful [14].

The objective of this study is to develop and validate a machine learning model that accurately predicts the occurrence of critical illness in patients with COVID-19 based on immune-inflammatory parameters.

Methods

Study design and participants

We conducted a retrospective study that included 2451 consecutive COVID-19 patients with definite outcomes (discharge or death) between January 27, 2020, and March 30, 2020, from the Sino-French New City Campus and the Optical Valley Campus of Tongji Hospital in China. Patients were diagnosed according to the 7th edition of the Diagnosis and Treatment Protocol of COVID-19 by the National Health Commission of the People’s Republic of China [15]. Exclusion criteria were (1) patients under 18 years old or pregnant; (2) patients transferred from Fangcang hospitals for social distancing; (3) patients who died within 24 h of admission, and patients re-hospitalized or discharged for special reasons such as dialysis. The excluded patients comprised 80 cases without a matched diagnosis, 216 cases from Fangcang shelter hospitals, 37 cases who died within 24 h, and 42 cases under 18 years old.

After 375 patients were excluded, 2076 patients were included in this study and divided into three groups. Specifically, patients from the Sino-French New City Campus of Tongji Hospital were randomly divided in equal halves into the training cohort and the internal validation cohort. Patients from the Optical Valley Campus of Tongji Hospital were used as an external validation cohort. Critical illness was defined as admission to the intensive care unit, undergoing invasive ventilation, or death [16]. A total of 36 raw immune/inflammatory parameters (natural killer [NK] cell count, NK cell percent, T-helper + T-suppressor lymphocyte [Th + Ts] percent, Th/Ts, Th + Ts count, [T + B + NK] count, [T + B + NK] percent, Th count, Th percent, Ts count, Ts percent, total B count, total B percent, total T lymphocyte-T-helper lymphocyte-T-suppressor lymphocyte [total T-THS], total T count, total T percent, complement 3 [C3], complement 4 [C4], ferritin, lymphocyte [LYM] count, lymphocyte [LYM] percent, C reactive protein, procalcitonin, interferon-γ [IFN-γ], tumor necrosis factor α [TNF-α], interleukin-1β [IL-1β], interleukin-2 receptor [IL-2R], interleukin-4 [IL-4], interleukin-6 [IL-6], interleukin-8 [IL-8], interleukin-10 [IL-10], immunoglobulin A [IgA], immunoglobulin G [IgG], immunoglobulin M [IgM], SARS-CoV-2 specific antibody IgM [C-IgM], and SARS-CoV-2 specific antibody IgG [C-IgG]) were collected from electronic medical records at admission. These features were collected using the same pre-designed data collection table across cohorts. Trained researchers entered and double-checked the data independently.

Data preprocessing

The medical records contained missing entries (Additional file 2a). To address this, we first filtered out patients with 30% or more of their features missing, leaving 222 patients in the Sino-French New City Campus of Tongji Hospital and 228 patients in the Optical Valley Campus of Tongji Hospital (Additional file 2b). We then discarded the immune-inflammatory parameters that were missing in 30% or more of the remaining patients, leaving 28 qualified features (Additional file 2c). We used the missForest algorithm [17] to estimate the remaining missing entries (Additional file 3). The rationale for choosing 30% as the exclusion threshold was to include as many patients as possible while keeping the imputation robust. Stekhoven and Bühlmann demonstrated that missForest performs well on data with up to 30% missing values [17].
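The two-stage filter described above can be sketched in a few lines. This is only an illustration in plain Python (the study itself used R); the row layout, the function names `missing_fraction` and `filter_by_missingness`, and the toy values are assumptions made for the example, and the imputation step with missForest is not reproduced here.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]  # feature name -> value, None when missing
CUTOFF = 0.30  # exclusion threshold stated in the text


def missing_fraction(values: List[Optional[float]]) -> float:
    """Fraction of entries that are missing; an empty list counts as fully missing."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def filter_by_missingness(
    rows: List[Row], features: List[str], cutoff: float = CUTOFF
) -> Tuple[List[Row], List[str]]:
    """Drop patients first, then features, whose missing fraction is >= cutoff."""
    kept_rows = [
        r for r in rows if missing_fraction([r[f] for f in features]) < cutoff
    ]
    kept_features = [
        f for f in features
        if missing_fraction([r[f] for r in kept_rows]) < cutoff
    ]
    return kept_rows, kept_features


# Hypothetical toy records, not study data.
features = ["CRP", "PCT", "IL6", "IgA"]
toy_rows = [
    {"CRP": 12.0, "PCT": 0.10, "IL6": None, "IgA": None},  # 2 of 4 missing: dropped
    {"CRP": 80.0, "PCT": 0.90, "IL6": 45.0, "IgA": None},  # 1 of 4 missing: kept
    {"CRP": 5.0, "PCT": 0.05, "IL6": 3.0, "IgA": 2.1},     # complete: kept
    {"CRP": 30.0, "PCT": 0.20, "IL6": 10.0, "IgA": None},  # 1 of 4 missing: kept
]
kept_rows, kept_features = filter_by_missingness(toy_rows, features)
print(len(kept_rows), kept_features)
```

Filtering patients before features matters: the per-feature missing fraction is computed only over the patients that survive the first pass, which is the order the text describes.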

Feature selection

Following previous studies [16, 18], we first applied LASSO (Least Absolute Shrinkage and Selection Operator) logistic regression to identify the most predictive variables. LASSO uses an L1 penalty that shrinks the coefficients of weak features to zero during fitting [19]. We regarded features with zero coefficients as redundant and included only features with non-zero coefficients in model training.
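For reference, the L1-penalized logistic regression objective can be written as below. This is the standard form given in the glmnet documentation cited as [19], not a formula taken from the paper; the symbols ($N$ patients, $p$ candidate features, binary outcome $y_i$, feature vector $x_i$, intercept $\beta_0$, coefficients $\beta$, penalty strength $\lambda$) are notation assumed for this illustration.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}
\; -\frac{1}{N}\sum_{i=1}^{N}\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
 - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

The first term is the average negative log-likelihood of the logistic model and the second is the L1 penalty. Increasing $\lambda$ drives more coefficients $\beta_j$ exactly to zero, which is why the non-zero coefficients at the cross-validated $\lambda$ can be read as the selected features.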

Model development

We fitted the selected features into five machine learning models, including Logistic Regression (LR), Support Vector Machine (SVM), Gradient Boosted Decision Tree (GBDT), K-Nearest Neighbor (KNN), and Neural Network (NN), to predict the critical illness status of patients with COVID-19. We chose these five models because they are classic, representative, widely used in the field of electronic health record (EHR) prediction, and sensitive to different data modalities. For instance, GBDT is based on decision trees, where features are used only to split nodes, so it is not sensitive to the scale and distribution of features: scaling does not affect the result of a split [20]. Scaling is therefore not required for GBDT input data. KNN relies on distances between samples, and LR, SVM, and NN fit their weights with gradient-based optimization; these models are sensitive to feature scale, so standardizing the data is needed to eliminate differences between features and speed up model convergence [21]. Patients with a predicted probability greater than or equal to 0.5 are considered high risk, otherwise low risk. To build the ensemble model, we tried different combinations of baseline models and found that the combination of SVM, GBDT, and NN with respective voting weights of 0.3, 0.5, and 0.2 achieved the highest AUC. The R library “caret” was used for model training and prediction with tenfold cross-validation. LR, SVM, GBDT, KNN, and NN were called with the methods “glm,” “svmLinearWeights,” “gbm,” “knn,” and “avNNet,” respectively, with default settings. Data were scaled and centered before training and testing.
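A minimal sketch of how the weighted ensemble and the 0.5 cut-off could be combined is given below. It assumes that "weighted voting" means a weighted average of the three base-model probabilities, which the text does not spell out; the function names and the example probabilities are hypothetical, and the code is Python rather than the R used in the study.

```python
from typing import Dict

# Voting weights reported in the text for the three base models.
WEIGHTS: Dict[str, float] = {"svm": 0.3, "gbdt": 0.5, "nn": 0.2}
THRESHOLD = 0.5  # probability >= 0.5 is labelled high risk


def ensemble_probability(base_probs: Dict[str, float]) -> float:
    """Weighted average of base-model probabilities (assumed reading of 'weighted voting')."""
    return sum(WEIGHTS[name] * base_probs[name] for name in WEIGHTS)


def risk_group(probability: float) -> str:
    """Map a predicted probability to the risk label used in the text."""
    return "high" if probability >= THRESHOLD else "low"


# Hypothetical base-model outputs for one patient.
example = {"svm": 0.40, "gbdt": 0.70, "nn": 0.20}
p = ensemble_probability(example)  # 0.3*0.40 + 0.5*0.70 + 0.2*0.20
print(round(p, 2), risk_group(p))
```

In this example two of the three base models fall below 0.5, yet the ensemble labels the patient high risk because GBDT carries half of the total weight.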

Statistical analysis

All statistical analyses were performed with R (version 3.6.2). The receiver operating characteristic (ROC) curve and area under the curve (AUC) analyses were conducted with the R “pROC” package. The calibration curve was depicted with the R “rms” package. Accuracy (ACC), sensitivity (SE), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), Cohen’s kappa coefficient (Kappa), F1 score, and Brier score were calculated with the R “caret,” “epiR,” and “rms” packages. Kaplan-Meier plots with log-rank tests were produced with the R “survival” and “survminer” packages. Model importance was calculated using the R package “caret.” The correlation between the selected features and critical illness status was calculated with Spearman correlation. The significance of the difference between the median values of the critical illness and non-critical illness groups was assessed by the Asymptotic Two-Sample Brown-Mood Median Test using the R “coin” package. P values less than 0.05 were considered statistically significant. Univariate and multivariate Cox regression were conducted with the R “survival” package. Ninety-five percent confidence intervals (CIs) are reported where appropriate.
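The classification metrics listed above all follow from the 2×2 confusion matrix, and the Brier score from the predicted probabilities. The sketch below shows the standard definitions in plain Python; it is an illustration, not the authors' R code, and the function names and toy labels are assumptions.

```python
import math
from typing import Dict, Sequence


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Standard metrics for binary labels, with 1 = critical illness."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    acc = _ratio(tp + tn, n)
    # Agreement expected by chance, used by Cohen's kappa.
    p_chance = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    return {
        "ACC": acc,
        "SE": _ratio(tp, tp + fn),
        "SP": _ratio(tn, tn + fp),
        "PPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
        "F1": _ratio(2 * tp, 2 * tp + fp + fn),
        "Kappa": _ratio(acc - p_chance, 1 - p_chance),
    }


def brier_score(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Hypothetical labels: 3 critical and 7 non-critical patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

With a critical-illness prevalence of roughly 11% to 14%, as in these cohorts, accuracy alone can look high even for a weak model, which is one reason to report kappa, F1, and the Brier score alongside it.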

Results

Baseline characteristics of patients

After data preprocessing, a total of 450 patients were included in model development and validation, with 111 patients in the training cohort, 111 patients in the internal validation cohort, and 228 patients in the external validation cohort. Median age in the training cohort, internal validation cohort, and external validation cohort was 62 (54.5–72) years, 64 (52–70.5) years, and 63 (50–70) years, respectively. Common comorbidities, such as hypertension and diabetes, and major symptoms of COVID-19, including fever, cough, dyspnea, sputum, and fatigue, were similar among the three cohorts. The number of critically ill patients in the three cohorts was 12 (10.81%), 16 (14.41%), and 25 (10.96%), respectively. Detailed demographic and essential clinical characteristics are listed in Table 1.

Table 1 Baseline characteristics of individuals by cohorts

Features included in models

After feature filtering, 28 features were left for feature selection, including NK cell count, NK cell percent, (Th + Ts) percent, Th/Ts, (Th + Ts) count, (T + B + NK) count, (T + B + NK) percent, Th count, Th percent, Ts count, Ts percent, total B count, total B percent, total T-THS, total T count, total T percent, LYM count, LYM percent, CRP, PCT, TNF-α, IL-1β, IL-2R, IL-6, IL-10, IL-8, C-IgG, and C-IgM (Fig. 1a). Missing feature values were then imputed with the random forest-based missForest algorithm. LASSO logistic regression identified six features (Th/Ts, CRP, PCT, IL-2R, IL-6, [T + B + NK] count) with the most predictive performance for model development. Among these features, (T + B + NK) count was negatively associated with critical illness (coefficient –0.0023), while the other five features, Th/Ts (0.1534), CRP (0.0145), PCT (0.0137), IL-2R (4e−04), and IL-6 (1e−04), were positively associated with critical illness (Fig. 1b).

Fig. 1

Feature selection by LASSO. a LASSO variable trace profiles of the 6 features. The vertical dashed line shows the best lambda value (0.025) chosen by tenfold cross-validation. b Features with a zero coefficient (colored gray) at lambda = 0.025 were considered less crucial to the patient’s critical illness status and were removed by LASSO logistic regression analysis. Features with a positive coefficient (colored red) are regarded as high risk with respect to critical illness. LASSO, least absolute shrinkage and selection operator. CRP, C reactive protein. PCT, procalcitonin. IL-2R, interleukin 2 receptor. IL-6, interleukin 6. T + B + NK, T lymphocyte and B lymphocyte and natural killer cells. Th/Ts, T-helper/T-suppressor lymphocyte

As shown in Fig. 2a, we conducted a Spearman correlation analysis between the six features and critical illness status, and the results were consistent with those of the LASSO analysis. The five unfavorable prognostic features identified by LASSO were positively correlated with critical illness to varying degrees. The top-weighted features IL-6 (R = 0.55), PCT (R = 0.55), CRP (R = 0.52), IL-2R (R = 0.45), and Th/Ts (R = 0.23) were consistent with previously reported risk factors closely associated with poor outcome of COVID-19 [4–6, 22, 23].
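Spearman correlation is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates this for one continuous feature against a binary outcome; it is plain Python with hypothetical values (the study used R), and the helper names are assumptions.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical IL-6 values and critical illness status (1 = critical).
il6 = [3.0, 5.0, 40.0, 80.0, 7.0, 120.0]
critical = [0, 0, 1, 1, 0, 1]
rho = spearman(il6, critical)
print(round(rho, 3))
```

Because the outcome is binary, all patients in the same outcome group tie on rank, so the coefficient cannot reach 1 even when the two groups are perfectly separated, as they are in this toy example.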

Fig. 2

Statistical analysis of the six features selected by LASSO. a Spearman correlation of critical illness status and features. The wider the chord, the stronger the positive correlation. b Density plot of each feature across patients with different critical illness status. The vertical dashed line marks the feature median value; the interquartile range is also annotated. The significance test is the Asymptotic Two-Sample Brown-Mood Median Test. CRP, C reactive protein. PCT, procalcitonin. IL-2R, interleukin 2 receptor. IL-6, interleukin 6. T + B + NK, T lymphocyte and B lymphocyte and natural killer cells. Th/Ts, T-helper/T-suppressor lymphocyte

Significant differences (p < 0.05) in the six features between critically ill and non-critically ill patients with COVID-19 are shown in Fig. 2b. The values of Th/Ts ratio, IL-2R, CRP, IL-6, and PCT were significantly higher in critically ill patients than in the non-critically ill group, while (T + B + NK) count was lower in critically ill patients (Additional file 1).

Model performance

In general, all five models (LR, SVM, GBDT, KNN, and NN) showed varying but promising critical illness risk prediction performance across cohorts. The AUC was 0.965 with LR, 0.962 with SVM, 0.956 with GBDT, 0.964 with KNN, and 0.964 with NN for the internal validation cohort (Fig. 3a). The AUC was 0.998 with LR, 0.999 with SVM, 0.998 with GBDT, 0.978 with KNN, and 0.999 with NN for the external validation cohort (Fig. 3b). Among them, the ensemble model SPMCIIP (severity prediction model for COVID-19 by immune-inflammatory parameters) derived from three algorithms (SVM, GBDT, and NN) achieved the best predictive performance. Relative importance of features included in SPMCIIP and its baseline models is shown in Additional file 4.
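The AUC values reported here can be read as the probability that a randomly chosen critically ill patient receives a higher predicted score than a randomly chosen non-critically ill patient, with ties counting one half. The sketch below computes the AUC directly from that definition; it is an illustrative Python snippet with hypothetical scores, not the pROC routine used in the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (critical, non-critical) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one patient in each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # critical patient scored higher: correct ordering
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for 2 critical and 3 non-critical patients.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
value = auc(labels, scores)
print(round(value, 3))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; rank-based formulas give the same value more efficiently on larger data.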

Fig. 3

Performance evaluation on the validation datasets. a, b ROC curve and AUC of SVM, LR, GBDT, KNN, and NN in the internal validation cohort and external validation cohort, respectively. c, d KM curve of the low-risk and high-risk subgroups predicted by the SVM model in the internal validation cohort and external validation cohort, respectively. The light red or blue areas refer to the 95% confidence interval. The p value is computed by log-rank test. Hazard ratio (HR) and its 95% confidence interval are obtained with a univariate Cox model. SVM, support vector machine. LR, logistic regression. GBDT, gradient boosted decision tree. KNN, k-nearest neighbor. NN, neural network. HR, hazard ratio

For the internal validation cohort, SPMCIIP achieved an AUC of 0.991 (95% CI 0.979–1.000) in identifying patients at high risk of developing critical illness, with an accuracy of 96.4% (95% CI 91.0%–99.0%). For the external validation cohort, SPMCIIP demonstrated an AUC of 0.999 (95% CI 0.998–1.000) and an accuracy of 99.1% (95% CI 96.9%–99.9%). The calibration curve of SPMCIIP in the two validation cohorts is depicted in Additional file 5. SPMCIIP also displayed the lowest Brier score, 0.025 for the internal validation cohort and 0.007 for the external validation cohort. All other metrics and the performance of the baseline models are listed in Table 2.

Table 2 Performance metrics for mortality risk prediction of models in cohorts

Taking critical illness as the endpoint and the time from admission to the occurrence of critical COVID-19 or discharge as the follow-up time, Kaplan-Meier analysis further confirmed the strong risk stratification ability of SPMCIIP. SPMCIIP robustly separated high-risk patients from low-risk patients with p < 0.0001 in both the internal and external validation cohorts. The univariate Cox analysis also demonstrated a strong positive association between the SPMCIIP-predicted risk subgroup and the observed critical illness outcome for the internal (HR, 74.6, 95% CI 16.81–331.10) and external (HR, 17,301.52, 95% CI 0.15–1,938,070,573.08) validation cohorts, respectively (Fig. 3c, d).
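To make the Kaplan-Meier analysis concrete, the sketch below implements the product-limit estimator in plain Python. It assumes that progression to critical illness is the event and that patients discharged without critical illness are censored at discharge, which is one reading of the setup described above; the function name and the toy follow-up times are hypothetical, and the study itself used the R “survival” package.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, event-free probability) at each event time.

    events[i] is 1 if patient i became critically ill at times[i],
    and 0 if the patient was censored (for example, discharged) at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up in days for five patients.
days = [2, 3, 3, 5, 8]
became_critical = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(days, became_critical):
    print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, so the estimate uses the follow-up of discharged patients up to the day they leave the hospital.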

We also developed an online calculator that returns the risk of developing critical COVID-19 when the parameter values are entered directly (https://spmciip.deepomics.org/). After clinicians fill in the online form with the corresponding features, SPMCIIP returns a personalized probability and risk group of critical illness. An example of the online prediction system is shown in Fig. 4.

Fig. 4

Illustration of the online prediction model—SPMCIIP

Discussion

In this study, we developed and validated an ensemble machine learning model based on immune-inflammatory parameters to predict the risk of critical COVID-19. We conducted and reported this multicenter retrospective study following appropriate standards [24]. Importantly, SPMCIIP displayed an AUC exceeding 0.99 in predicting critical COVID-19 in both the internal and external validation cohorts. With rapid risk stratification of patients’ prognosis, clinicians can strengthen the management of patients at high risk of critical illness, which helps curb mortality and allocate medical resources rationally.

The six features involved in SPMCIIP have been shown to correlate with critical illness in COVID-19 patients. Severity of COVID-19 depends on both the viral infection and the host response, and critical COVID-19 is a distinct clinical and immune sepsis subphenotype [25]. Innate immune hyperactivation and adaptive immune dysregulation after SARS-CoV-2 infection are considered to play important roles in the development of severe COVID-19 [26]. The vast release of cytokines in response to the viral infection can result in a cytokine storm and symptoms of sepsis. Uncontrolled inflammation inflicts multi-organ damage, leads to multi-organ failure including acute respiratory distress syndrome, and finally results in poor prognosis of COVID-19 [8, 27, 28]. An increase in inflammatory factors and cytokines has been observed, especially in critically ill patients. A high level of IL-6 was reported early on to be correlated with SARS-CoV-2 viral load in the blood of critically ill COVID-19 patients [29]. IL-6 can end the activation of normal T cells, which may be one reason for lymphopenia; it also has a robust proinflammatory function and induces a variety of acute-phase proteins, such as CRP. It has even been reported that the immune dysregulation in COVID-19 is driven by IL-6 [30]. With a median incubation time of 5–7 days, and an additional 3–4 days from hospitalization to the requirement of mechanical ventilation or admission to the ICU [31], this subacute pattern of progression in COVID-19 patients raises the possibility of immunosuppression, due both to T cell depletion and to exhaustion after over-activation [26, 32]. Consistent with this, CD4+ T, CD8+ T, and NK cell counts were observed to be lower in patients with severe disease [33]. Correspondingly, single-cell sequencing of peripheral blood mononuclear cells revealed that the expression of multiple genes related to the apoptosis pathway was upregulated in T, B, and NK cell subsets of COVID-19 patients compared with healthy people [34]. Lymphopenia, especially the depletion of T cells, may relate to apoptosis following overactive inflammatory responses. Further, CD4+ T cell and NK cell cytopenia are recognized as characteristics of infection by SARS-CoV-2 [30]. In addition, procalcitonin is correlated with an increased probability of bacterial pathogens [35], and several studies have demonstrated that procalcitonin is higher in critically ill COVID-19 patients [6, 36, 37]. This finding suggests bacterial co-infection in critically ill patients. More precisely, according to a recent meta-analysis, the prevalence of bacterial co-infection in critically ill COVID-19 patients in the ICU (14%, 95% CI 5–26%) is higher than that in hospitalized COVID-19 patients (7%, 95% CI 3–12%) [38].

Though the process of COVID-19 infection has not been fully clarified, the driving role of immune dysfunction in critical COVID-19 is becoming more evident, which motivated us to leverage immunological features in predicting critical illness. Machine learning can help clinicians predict the health trajectory of patients and aid preventative efforts to improve outcomes [39]. Moreover, machine learning models make predictions in an automated manner, unaffected by human fatigue, geographic barriers, or time restrictions. Therefore, a machine learning model based on immune-inflammatory parameters could offer great opportunities for accurate prediction of critical COVID-19 when medical resources are scarce and COVID-19 infections surge.

Importantly, SPMCIIP can predict the risk of progressing to critical COVID-19 nearly 20 days in advance. Because the impact of cytokine release syndrome caused by SARS-CoV-2 infection has been increasingly revealed, and understanding of the use of corticosteroids and other anti-inflammatory drugs continues to grow [40, 41], early identification of patients at high risk of critical illness could facilitate timely intervention in compliance with guidelines and reduce the occurrence of cytokine storm-derived multiorgan failure and other refractory states.

The merits of SPMCIIP include its excellent performance in predicting critical COVID-19. Many machine learning models for prognosis prediction of COVID-19 have been built based on imaging and clinical features [16, 42], but few have yielded an AUC as high as 0.99 for predicting critical COVID-19. Where medical resources are limited, such as in clinics and small hospitals, a prediction model with easily determined parameters is appropriate; once resources permit and the six features included in the model can be measured, we recommend using the online model SPMCIIP. The predictive advantage of SPMCIIP may be attributable to the algorithms we adopted, which cover most types of classification models in machine learning and can deal with complex data. More importantly, the predictive superiority of SPMCIIP is owed to the immune and inflammatory features used for model development. While myriad risk factors associated with the occurrence of critical COVID-19 have been unveiled, it is gradually being recognized that the interplay between immunity and inflammation is the predominant factor affecting the outcome of COVID-19 [43–45]. Our results further demonstrate the heterogeneity of the immune response in COVID-19 patients and its important prognostic value, as delineated previously [46, 47]. The predictive strength of SPMCIIP could stem from the detailed feature information of the included patients, though the number of eligible patients is relatively limited (450/2076).

Our research has some limitations. First, the patients included in this study are primarily locals of Wuhan, China. Validation of SPMCIIP in other regions and ethnicities would provide more solid evidence. Second, this is a retrospective study. Our models should be independently validated in large-scale prospective cohorts before their contribution to improved survival can be established.

Conclusions

In this multicenter retrospective study, we developed and validated an online model, SPMCIIP, which included six immune and inflammatory parameters and could accurately predict the critical illness risk of COVID-19 patients, thus triaging patients for appropriate treatment and optimizing the use of medical resources.

Availability of data and materials

The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.

Abbreviations

ACC: Accuracy
AUC: Area under the curve
C3: Complement 3
C4: Complement 4
CI: Confidence interval
C-IgG: SARS-CoV-2 specific antibody IgG
C-IgM: SARS-CoV-2 specific antibody IgM
COVID-19: Coronavirus disease 2019
GBDT: Gradient Boosted Decision Tree
CRP: C reactive protein
IFN-γ: Interferon-γ
IgA: Immunoglobulin A
IgG: Immunoglobulin G
IgM: Immunoglobulin M
IL-10: Interleukin 10
IL-1β: Interleukin 1β
IL-2R: Interleukin 2 receptor
IL-4: Interleukin 4
IL-6: Interleukin 6
IL-8: Interleukin 8
KNN: K-Nearest Neighbor
LASSO: Least Absolute Shrinkage and Selection Operator
LR: Logistic Regression
LYM: Lymphocyte
NK cells: Natural killer cells
NN: Neural Network
NPV: Negative predictive value
PCT: Procalcitonin
PPV: Positive predictive value
ROC: Receiver operating characteristics
SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2
SE: Sensitivity
SN: Sensitivity
SP: Specificity
SPMCIIP: A severity prediction model for COVID-19 by immune-inflammatory parameters
SVM: Support Vector Machine
Th: T-helper lymphocyte
Ts: T-suppressor lymphocyte
TNF-α: Tumor necrosis factor α

References

  1. 1.

    of the International CS. The species severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Nat Microbiol. 2020;5(4):536–44.

    Article  Google Scholar 

  2. 2.

    Weekly epidemiological update coronavirus disease 2019 (COVID-19) 26 Dec 2020. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200921-weekly-epi-update-6.pdf?sfvrsn=d9cf9496_6 Accessed 26 Dec 2020.

  3. 3.

    Cummings MJ, Baldwin MR, Abrams D, Jacobson SD, Meyer BJ, Balough EM, et al. Epidemiology, clinical course, and outcomes of critically ill adults with COVID-19 in New York City: a prospective cohort study. Lancet. 2020;395(10239):1763–70.

    CAS  Article  Google Scholar 

  4. 4.

    Akbari H, Tabrizi R, Lankarani KB, Aria H, Vakili S, Asadian F, et al. The role of cytokine profile and lymphocyte subsets in the severity of coronavirus disease 2019 (COVID-19): a systematic review and meta-analysis. Life Sci. 2020;258:118167.

    CAS  Article  Google Scholar 

  5. 5.

    Chen G, Wu D, Guo W, Cao Y, Huang D, Wang H, et al. Clinical and immunological features of severe and moderate coronavirus disease 2019. J Clin Invest. 2020;130(5):2620–9.

    CAS  Article  Google Scholar 

  6. 6.

    Zhu Z, Cai T, Fan L, Lou K, Hua X, Huang Z, et al. Clinical value of immune-inflammatory parameters to assess the severity of coronavirus disease 2019. Int J Infect Dis. 2020;95:332–9.

    CAS  Article  Google Scholar 

  7. 7.

    Chousterman BG, Swirski FK, Weber GF. Cytokine storm and sepsis disease pathogenesis. Seminars Immunopathol. 2017;39(5):517–28.

    CAS  Article  Google Scholar 

  8. 8.

    Mehta P, McAuley DF, Brown M, Sanchez E, Tattersall RS, Manson JJ. COVID-19: consider cytokine storm syndromes and immunosuppression. Lancet. 2020;395(10229):1033–4.

    CAS  Article  Google Scholar 

  9. 9.

    Moore JB, June CH. Cytokine release syndrome in severe COVID-19. Science. 2020;368(6490):473–4.

    CAS  Article  Google Scholar 

  10. 10.

    Kong WH, Zhao R, Zhou JB, Wang F, Kong DG, Sun JB, et al. Serologic response to SARS-CoV-2 in COVID-19 patients with different severity. Virol Sin. 2020;35(6):752–7.

  11. 11.

    Guan WJ, Ni ZY, Hu Y, Liang WH, Ou CQ, He JX, et al. Clinical characteristics of coronavirus disease 2019 in China. New Engl J Med. 2020;382(18):1708–20.

    CAS  Article  Google Scholar 

  12. 12.

    Petrilli CM, Jones SA, Yang J, Rajagopalan H, O’Donnell L, Chernyak Y, et al. Factors associated with hospital admission and critical illness among 5279 people with coronavirus disease 2019 in New York City: prospective cohort study. BMJ. 2020;369:m1966.

    Article  Google Scholar 

  13. 13.

    Wynants L, Van Calster B, Collins GS, Riley RD, Heinze G, Schuit E, et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. BMJ. 2020;369:m1328.

    Article  Google Scholar 

  14. 14.

    Panch T, Szolovits P, Atun R. Artificial intelligence, machine learning and health systems. J Glob Health. 2018;8(2):020303.

    Article  Google Scholar 

  15. 15.

    Diagnosis and treatment program of 2019 novel coronavirus pneumonia (Trial Version 7). http://www.gov.cn/zhengce/zhengceku/2020-03/04/content_5486705.htm Accessed 26 Dec 2020.

  16. 16.

    Liang W, Liang H, Ou L, Chen B, Chen A, Li C, et al. Development and validation of a clinical risk score to predict the occurrence of critical illness in hospitalized patients with COVID-19. JAMA Int Med. 2020;180(8):1081–9.

    CAS  Article  Google Scholar 

  17. 17.

    Stekhoven DJ, Bühlmann P. MissForest--non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28(1):112–8.

    CAS  Article  Google Scholar 

  18. 18.

    Fu H, Zhu Y, Wang Y, Liu Z, Zhang J, Xie H, et al. Identification and validation of stromal immunotype predict survival and benefit from adjuvant chemotherapy in patients with muscle-invasive bladder cancer. Clin Cancer Res. 2018;24(13):3069–78.

    CAS  Article  Google Scholar 

  19. 19.

    Glmnet Vignette. http://www.web.stanford.edu/~hastie/Papers/Glmnet_Vignette.pdf Accessed 26 Dec 2020.

  20. 20.

    Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, et al. LightGBM: a highly efficient gradient boosting decision tree. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: Curran Associates Inc.; 2017. p. 3149–57.

    Google Scholar 

  21. 21.

    Chauhan VK, Chauhan VK, Dahiya K, Dahiya K, Sharma A, Sharma A. Problem formulations and solvers in linear SVM: a review. Artif Intel Rev. 2019;52(2):803–55.

  22.

    Liu J, Li S, Liu J, Liang B, Wang X, Wang H, et al. Longitudinal characteristics of lymphocyte responses and cytokine profiles in the peripheral blood of SARS-CoV-2 infected patients. EBioMedicine. 2020;55:102763.

  23.

    Xie L, Wu Q, Lin Q, Liu X, Lin W, Hao S, et al. Dysfunction of adaptive immunity is related to severity of COVID-19: a retrospective study. Ther Adv Respir Dis. 2020;14:1753466620942129.

  24.

    Heus P, Reitsma JB, Collins GS, Damen J, Scholten R, Altman DG, et al. Transparent reporting of multivariable prediction models in journal and conference abstracts: TRIPOD for abstracts. Ann Intern Med. 2020.

  25.

    Kuri-Cervantes L, Pampena MB, Meng W, Rosenfeld AM, Ittner CAG, Weisman AR, et al. Comprehensive mapping of immune perturbations associated with severe COVID-19. Sci Immunol. 2020;5(49):eabd7114.

  26.

    Vardhana SA, Wolchok JD. The many faces of the anti-COVID immune response. J Exp Med. 2020;217(6):e20200678. 

  27.

    Tay MZ, Poh CM, Rénia L, MacAry PA, Ng LFP. The trinity of COVID-19: immunity, inflammation and intervention. Nat Rev Immunol. 2020;20(6):363–74.

  28.

    Zhou Z, Ren L, Zhang L, Zhong J, Xiao Y, Jia Z, et al. Heightened innate immune responses in the respiratory tract of COVID-19 patients. Cell Host Microbe. 2020;27(6):883–90.e2.

  29.

    Walls AC, Park YJ, Tortorici MA, Wall A, McGuire AT, Veesler D. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell. 2020;181(2):281–92.e6.

  30.

    Giamarellos-Bourboulis EJ, Netea MG, Rovina N, Akinosoglou K, Antoniadou A, Antonakos N, et al. Complex immune dysregulation in COVID-19 patients with severe respiratory failure. Cell Host Microbe. 2020;27(6):992–1000.e3.

  31.

    Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet. 2020;395(10223):497–506.

  32.

    Diao B, Wang C, Tan Y, Chen X, Liu Y, Ning L, et al. Reduction and functional exhaustion of T cells in patients with coronavirus disease 2019 (COVID-19). Front Immunol. 2020;11:827.

  33.

    Song JW, Zhang C, Fan X, Meng FP, Xu Z, Xia P, et al. Immunological and inflammatory profiles in mild and severe cases of COVID-19. Nat Commun. 2020;11(1):3410.

  34.

    Zhu L, Yang P, Zhao Y, Zhuang Z, Wang Z, Song R, et al. Single-cell sequencing of peripheral mononuclear cells reveals distinct immune response landscapes of COVID-19 and influenza patients. Immunity. 2020;53(3):685–96.e3.

  35.

    Self WH, Balk RA, Grijalva CG, Williams DJ, Zhu Y, Anderson EJ, et al. Procalcitonin as a marker of etiology in adults hospitalized with community-acquired pneumonia. Clin Infect Dis. 2017;65(2):183–90.

  36.

    Liu D, Li R, Yu R, Wang Y, Feng X, Yuan Y, et al. Alteration of serum markers in COVID-19 and implications on mortality. Clin Transl Med. 2020;10(3):e119.

  37.

    Vazzana N, Dipaola F, Ognibene S. Procalcitonin and secondary bacterial infections in COVID-19: association with disease severity and outcomes. Acta Clinica Belg. 2020:1–5. published online ahead of print.

  38.

    Lansbury L, Lim B, Baskaran V, Lim WS. Co-infections in people with COVID-19: a systematic review and meta-analysis. J Infect. 2020;81(2):266–75.

  39.

    Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920–30.

  40.

    Horby P, Lim WS, Emberson JR, Mafham M, Bell JL, Linsell L, et al. Dexamethasone in hospitalized patients with covid-19 - preliminary report. New Engl J Med. 2020:NEJMoa2021436. published online ahead of print.

  41.

    Zhang W, Zhao Y, Zhang F, Wang Q, Li T, Liu Z, et al. The use of anti-inflammatory drugs in the treatment of people with severe coronavirus disease 2019 (COVID-19): the perspectives of clinical immunologists from China. Clin Immunol. 2020;214:108393.

  42.

    Liang W, Yao J, Chen A, Lv Q, Zanin M, Liu J, et al. Early triage of critically ill COVID-19 patients using deep learning. Nat Commun. 2020;11(1):3543.

  43.

    Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395(10229):1054–62.

  44.

    Tian J, Yuan X, Xiao J, Zhong Q, Yang C, Liu B, et al. Clinical characteristics and risk factors associated with COVID-19 disease severity in patients with cancer in Wuhan, China: a multicentre, retrospective, cohort study. Lancet Oncol. 2020;21(7):893–903.

  45.

    Wu C, Chen X, Cai Y, Xia J, Zhou X, Xu S, et al. Risk factors associated with acute respiratory distress syndrome and death in patients with coronavirus disease 2019 pneumonia in Wuhan, China. JAMA Intern Med. 2020;180(7):934–43.

  46.

    Mathew D, Giles JR, Baxter AE, Greenplate AR, Wu JE, Alanio C, et al. Deep immune profiling of COVID-19 patients reveals patient heterogeneity and distinct immunotypes with implications for therapeutic interventions. bioRxiv. 2020:2020.05.20.106401. published online ahead of print.

  47.

    Sominsky L, Walker DW, Spencer SJ. One size does not fit all - patterns of vulnerability and resilience in the COVID-19 pandemic and why heterogeneity of disease matters. Brain Behav Immun. 2020;87:1–3.

Acknowledgements

We thank all the health workers, and the people supporting them, who are fighting COVID-19.

Funding

This study was supported by the National Natural Science Foundation of China [grant numbers 81772787, 81873452, 81702572, 81702574, and 81974405].

Author information

Contributions

Q-LG had full access to all data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. YG designed the study. L-XC did the analysis. X-KF developed the website. YG, L-XC, and J-HC interpreted the data and wrote the paper. H-YL, S-QZ, S-YW, YW, R-DY, YY, and SX provided patients’ samples and clinical data, and double-checked the data and entered them into the database. Q-LG, S-CL, C-RL, and WZ advised on the conception and design of the study. All authors vouched for the respective data and analysis, approved the final version, and agreed to publish the manuscript.

Corresponding authors

Correspondence to Wei Zhang or Shuaicheng Li or Qinglei Gao.

Ethics declarations

Ethics approval and consent to participate

The Research Ethics Commission of Tongji Hospital of Huazhong University of Science and Technology approved this study and granted a waiver of informed consent (TJ-IRB20200406). This study was registered in the Chinese Clinical Trial Registry (ChiCTR2000032161).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

Differential variables between critically ill and non-critically ill patients. The significance test is the asymptotic two-sample Brown-Mood median test. Abbreviations: Th/Ts, T-helper/T-suppressor lymphocyte. IL-2R, interleukin 2 receptor. CRP, C reactive protein. IQR, interquartile range.

Additional file 2

Visualization of the denoising and filtering process. a, Heatmap of raw lab test data. b, Heatmap of lab test data after removing patients with 30% or more missing entries across the SF and OV hospitals. c, Heatmap of lab test data after removing lab test features with 30% or more missing entries across the SF and OV hospitals. Black tiles refer to missing entries. Abbreviations: NK, Natural killer cells. Th, T-helper lymphocyte. Ts, T-suppressor lymphocyte. C3, complement 3. C4, complement 4. CRP, C reactive protein. PCT, procalcitonin. IFN-γ, interferon-γ. TNF-α, tumor necrosis factor α. IL-1β, interleukin 1β. IL-2R, interleukin 2 receptor. IL-4, interleukin 4. IL-6, interleukin 6. IL-8, interleukin 8. IL-10, interleukin 10. IGA, immunoglobulin A. IGG, immunoglobulin G. IGM, immunoglobulin M. C-IGM, SARS-COV-2 specific antibody IgM. C-IGG, SARS-COV-2 specific antibody IgG. SF, Sino-French New City Campus of Tongji Hospital. OV, Optical Valley Campus of Tongji Hospital.
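The two filtering steps described for Additional file 2 can be illustrated with a minimal sketch. It is not the authors' code: the function name `filter_missing`, the use of pandas, the toy column names, and the choice to recompute feature missingness after patients have been removed are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def filter_missing(df: pd.DataFrame, threshold: float = 0.30) -> pd.DataFrame:
    """Drop patients (rows), then lab features (columns), with >= threshold missing entries.

    Hypothetical helper: the 30% cut-off comes from the caption, everything else is assumed.
    """
    # Step 1: remove patients whose fraction of missing lab values is >= threshold.
    row_missing = df.isna().mean(axis=1)
    kept_patients = df.loc[row_missing < threshold]

    # Step 2: on the remaining patients, remove features that are still >= threshold missing.
    col_missing = kept_patients.isna().mean(axis=0)
    return kept_patients.loc[:, col_missing < threshold]


# Toy data (made up): rows are patients, columns are lab tests.
toy = pd.DataFrame({
    "CRP": [1.0, np.nan, 3.0, 4.0],
    "IL6": [np.nan, np.nan, 2.0, 1.0],
    "PCT": [0.1, np.nan, 0.2, 0.3],
    "IL8": [np.nan, 5.0, np.nan, 6.0],
})
filtered = filter_missing(toy)
```

In this toy table the first two patients are removed for missing half or more of their values, and the last column is then removed because half of its remaining entries are missing.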

Additional file 3

Visualization of the imputation process. a, c Heatmap of SF and OV lab test data before imputation. b, d Heatmap of SF and OV lab test data after imputation. Black tiles refer to missing entries. Abbreviations: NK, Natural killer cells. Th, T-helper lymphocyte. Ts, T-suppressor lymphocyte. CRP, C reactive protein. PCT, procalcitonin. IFN-γ, interferon-γ. TNF-α, tumor necrosis factor α. IL-1β, interleukin 1β. IL-2R, interleukin 2 receptor. IL-4, interleukin 4. IL-6, interleukin 6. IL-8, interleukin 8. IL-10, interleukin 10. C-IGM, SARS-COV-2 specific antibody IgM. C-IGG, SARS-COV-2 specific antibody IgG. SF, Sino-French New City Campus of Tongji Hospital. OV, Optical Valley Campus of Tongji Hospital.

Additional file 4

Relative feature importance of the SVM, GBDT, NN and SPMCIIP models. Abbreviations: SVM, support vector machine. GBDT, Gradient Boosted Decision Tree. NN, neural network. SPMCIIP, Severity prediction model for COVID-19 by immune-inflammatory parameters. CRP, C reactive protein. IL-2R, interleukin 2 receptor. IL-6, interleukin 6. NK, Natural killer cells. PCT, procalcitonin. Th, T-helper lymphocyte. Ts, T-suppressor lymphocyte.

Additional file 5

Calibration curves of the SPMCIIP model in the validation cohorts. Calibration curves of the SPMCIIP model in a the internal validation cohort and b the external validation cohort. Each triangle represents a group of observations; each group contained an average of 20 observations. The dashed line is the ideal calibration curve. The vertical lines at the bottom show the distribution of the predicted probabilities. The red curve is the fitted nonparametric calibration curve. Abbreviations: AUC, Area under the curve.
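The grouped points in the calibration curves of Additional file 5 (groups averaging 20 observations) can be illustrated with a minimal sketch. It is not the authors' code: the function name `grouped_calibration`, the equal-size split of patients sorted by predicted probability, and the simulated data are assumptions, and the fitted nonparametric curve is not reproduced here.

```python
import numpy as np


def grouped_calibration(y_true, y_prob, group_size=20):
    """Return mean predicted probability and observed event rate per group.

    Hypothetical helper: patients are sorted by predicted probability and
    split into groups of roughly `group_size` observations each.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)

    # Sort patients by predicted risk so that each group covers a risk band.
    order = np.argsort(y_prob, kind="stable")
    n_groups = max(1, round(len(y_prob) / group_size))
    groups = np.array_split(order, n_groups)

    mean_pred = np.array([y_prob[g].mean() for g in groups])  # x-axis of the plot
    obs_rate = np.array([y_true[g].mean() for g in groups])   # y-axis of the plot
    return mean_pred, obs_rate


# Simulated example (made-up data): outcomes drawn from the predicted risks.
rng = np.random.default_rng(0)
probs = rng.uniform(0.0, 1.0, size=40)
outcomes = (rng.uniform(0.0, 1.0, size=40) < probs).astype(int)
mean_pred, obs_rate = grouped_calibration(outcomes, probs, group_size=20)
```

For a well-calibrated model, plotting `obs_rate` against `mean_pred` would place the points close to the diagonal, which corresponds to the dashed ideal line in the figure.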

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Gao, Y., Chen, L., Chi, J. et al. Development and validation of an online model to predict critical COVID-19 with immune-inflammatory parameters. J Intensive Care 9, 19 (2021). https://doi.org/10.1186/s40560-021-00531-1

Keywords

  • COVID-19
  • Critical illness
  • Machine learning
  • Immune-inflammatory parameters
  • Online model