Validation of a risk prediction model for COVID-19: the PERIL prospective cohort study
Abstract
Aim: This study aims to perform an external validation of a recently developed prognostic model for early prediction of the risk of progression to severe COVID-19. Patients & methods/materials: Patients were recruited at their initial diagnosis at two facilities within Hamad Medical Corporation in Qatar. 356 adults were included for analysis. Predictors for progression of COVID-19 were all measured at disease onset and first contact with the health system. Results: The C statistic was 83% (95% CI: 78%–87%) and the calibration plot showed that the model was well-calibrated. Conclusion: The published prognostic model for the progression of COVID-19 infection showed satisfactory discrimination and calibration and the model is easy to apply in clinical practice.
In December 2019, the novel coronavirus SARS-CoV-2, responsible for COVID-19, emerged in Wuhan city, China. It has since spread rapidly all over the world. By 19 January 2020, four countries had reported laboratory-confirmed cases of COVID-19. Four days later, on 23 January 2020, the city of origin was placed under lockdown, and multiple countries soon followed suit [1]. The WHO declared COVID-19 a pandemic in March 2020, and over 110 million confirmed cases were subsequently recorded, with more than 2.45 million deaths [2]. The pandemic has had a massive negative impact on the global economy and enormous detrimental effects on mental health worldwide [3,4]. It also placed a catastrophic burden on healthcare systems, and a reliable, validated tool to stratify patients from disease onset according to their risk of progression to severe illness was urgently needed to guide resource allocation.
Since the start of the pandemic, various prediction models have been created using patients' demographics, medical history, signs and symptoms or laboratory investigations to predict COVID-19 prognosis. These models can be categorized into two main types: those developed from variables measured at disease onset and those that used variables measured during the course of the illness [5–8]. According to a systematic review by Wynants et al., most existing prediction models for COVID-19 lacked validation, were inadequately reported or were at high risk of bias, which has discouraged their use because their predictions may be unreliable [9]. We previously developed one such model [10], and in this paper we aim to generate evidence of its reliability for triaging COVID-19 patients at disease onset, and thus to help alleviate the burden on the healthcare system.
Many attempts have been made to stratify the risk of severe COVID-19 at disease onset, with subsequent severity defined as the need for intensive care unit (ICU) admission, need for invasive ventilation or death [11]. These definitions of progression have clear advantages because they use hard end points that are reliably measurable. Biomarkers are an obvious choice for such prediction models, but no biomarker-based model has yet been extensively validated. The advantage of using biomarkers for patient triage is that they are likely to be more objective and reliable than patient-reported symptoms. In this study, we aim to externally validate our previously developed, biomarker-based Kuwait Prognosis Indicator (KPI) model [10].
Patients & methods
Study population of the external validation cohort
A prospective cohort study, Predicting Risk Early in COVID-19 (PERIL), was undertaken to validate the KPI model. Between 23 August 2020 and 22 November 2020, 356 adult patients were recruited at their initial diagnosis at two facilities within Hamad Medical Corporation in Qatar: the Communicable Disease Center (CDC), which dealt with referred cases that were asymptomatic at diagnosis and then allocated patients to an isolation facility, and the Hazm Mebaireek General Hospital, a secondary care facility that received patients considered to be symptomatic at presentation for observation. In both centers, biomarkers were measured on the day of referral, whether the referral followed symptom onset or a positive test on contact tracing (see recruitment flow chart in Figure 1). Qatar CDC guidelines were followed for all recruited patients, whether they were isolated at home, isolated within special facilities or admitted to the hospital for observation.
Ethical consideration
This study was approved by the medical ethics committee of Qatar University and Hamad Medical Corporation (protocol nos. QU-IEB 1434-E/20 and MRC 05-137, respectively) and written informed consent was obtained from all participants. Results have been reported to conform with the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) statement [12].
Predictor assessment
Predictors for the progression of COVID-19 (age, serum procalcitonin, CRP, lymphocyte percentage, monocyte percentage and serum albumin) were all measured at disease onset and first contact with the health system. These predictors were selected after a careful review of the COVID-19 literature as the most relevant for a risk prediction model with a focus on biomarkers. Participants were identified either through contact tracing, routine testing or self-reported symptoms. Table 1 provides information on the predictors and cutoffs used.
Table 1. Kuwait Prognosis Indicator score for COVID-19. Give the patient zero points if a criterion is not met.
Criterion | Points |
---|---|
Age ≥ 41 years | 4 |
CRP ≥ 7 mg/l | 2 |
Procalcitonin ≥ 0.05 ng/ml | 16 |
Lymphocyte % ≥ 31.5% | -9 |
Monocyte % ≥ 9.2% | -8 |
Albumin ≥ 39.5 g/l | -15 |
Total | |
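The scoring rule in Table 1 can be illustrated with a minimal Python sketch. The class and function names (`Biomarkers`, `kpi_score`, `risk_band`) are hypothetical and are not part of the published tool; the point values and cutoffs are taken from Table 1, and the risk bands from the score intervals reported in Table 3. The example patient is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Biomarkers:
    """Predictor values at first contact (field names are illustrative)."""
    age_years: float
    crp_mg_l: float
    procalcitonin_ng_ml: float
    lymphocyte_pct: float
    monocyte_pct: float
    albumin_g_l: float


def kpi_score(p: Biomarkers) -> int:
    """Sum the Table 1 points; a criterion that is not met contributes zero."""
    score = 0
    score += 4 if p.age_years >= 41 else 0
    score += 2 if p.crp_mg_l >= 7 else 0
    score += 16 if p.procalcitonin_ng_ml >= 0.05 else 0
    score -= 9 if p.lymphocyte_pct >= 31.5 else 0
    score -= 8 if p.monocyte_pct >= 9.2 else 0
    score -= 15 if p.albumin_g_l >= 39.5 else 0
    return score


def risk_band(score: int) -> str:
    """Map a score to the intervals reported in Table 3."""
    if score <= -7:
        return "low"
    if score <= 15:
        return "intermediate"
    return "high"


# Hypothetical patient: older, raised CRP and procalcitonin, low lymphocytes,
# low monocytes and low albumin, so only the three positive criteria are met.
patient = Biomarkers(52, 80.0, 0.3, 12.0, 5.0, 33.0)
print(kpi_score(patient), risk_band(kpi_score(patient)))
```

Because the positive criteria sum to 22 and the negative criteria to -32, the score can only take values between -32 and 22, which matches the range of the intervals in Table 3.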
Outcome assessment
Severe COVID-19 was defined as progression to ICU admission, need for invasive ventilation or in-hospital death. This was considered the most pragmatic and reliable definition of severity [11].
Sample size
For studies validating prognostic models, there is no solid sample size recommendation, but a minimum of 100 patients with events and at least 100 patients without events has been suggested [13].
Missing data
In the event of missing data, the use of multiple imputation by chained equations (MICE) was planned. This method imputes missing data multiple times to account for uncertainty.
Statistical analysis
Baseline characteristics of patients in the PERIL cohort were reported as median and interquartile range or number and percent and were stratified by severity status. The original prognostic model was applied to our study cohort exactly as it was previously published [10]. Discrimination was assessed using Harrell's C statistic. For a binary outcome, this is equal to the area under the receiver operating characteristic curve; it is the probability that a randomly chosen participant who developed severe disease was assigned a higher predicted risk than a randomly chosen participant who did not.
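For a binary outcome, the C statistic can be computed directly from its pairwise definition. The sketch below is a plain Python illustration, not the Stata routine used in the study; the function name `c_statistic` and the toy scores are assumptions made for the example.

```python
def c_statistic(predicted, observed):
    """Concordance for a binary outcome.

    Returns the proportion of (severe, non-severe) pairs in which the severe
    participant has the higher predicted risk; tied pairs count as one half.
    """
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    controls = [p for p, y in zip(predicted, observed) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one participant with and one without the outcome")
    concordant = 0.0
    for risk_case in cases:
        for risk_control in controls:
            if risk_case > risk_control:
                concordant += 1.0
            elif risk_case == risk_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (invented): KPI-like scores and outcomes (1 = severe).
scores = [22, 6, 14, -1, 18, -13]
severe = [1, 1, 0, 0, 0, 0]
print(c_statistic(scores, severe))
```

Any monotone transformation of the score gives the same value, so the raw sum score can be used in place of a predicted probability when only discrimination is of interest.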
Calibration of the validated model was assessed by comparing the predicted probabilities of severity with the observed outcomes in a calibration plot. If the predicted probabilities equal the observed proportions within each group, the model is well-calibrated. Statistically, a well-calibrated model has a calibration plot with an intercept of 0 and a slope of 1, and the groups of predicted probabilities should lie close to this line. All analyses were performed using Stata MP 15.1 (StataCorp, TX, USA).
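The idea behind the calibration plot can be sketched in a few lines of Python. This is a simplified illustration only: it groups participants by distinct predicted probability and fits an ordinary least-squares line on the probability scale, which is an assumption of this sketch and not necessarily the procedure run in Stata (calibration slopes are often estimated by logistic regression on the logit of the prediction). The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def calibration_groups(predicted, observed):
    """Return (predicted probability, observed proportion, n) per distinct prediction."""
    groups = defaultdict(list)
    for p, y in zip(predicted, observed):
        groups[p].append(y)
    return [(p, sum(ys) / len(ys), len(ys)) for p, ys in sorted(groups.items())]


def calibration_line(predicted, observed):
    """Least-squares intercept and slope of observed outcome on predicted probability."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    if sxx == 0:
        raise ValueError("predictions must take at least two distinct values")
    sxy = sum((p - mean_p) * (y - mean_y) for p, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_p
    return intercept, slope


# Toy data (invented): two risk groups whose observed event rates match the predictions.
pred = [0.25] * 4 + [0.75] * 4
obs = [1, 0, 0, 0, 1, 1, 1, 0]
print(calibration_groups(pred, obs))
print(calibration_line(pred, obs))
```

With a sum score model, the number of groups is limited by the number of distinct predicted probabilities, so the plot may contain only a handful of points.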
Comparison of development & validation studies
The setting, eligibility criteria and predictors in this validation study were similar to those in the development study [10]. However, the outcome definition we used in the development study was a composite outcome that combined soft and hard end points. In this validation study, we only used hard end points and no soft end points were collected.
Results
PERIL cohort
Three hundred and fifty-six participants were recruited (Figure 1). Of these, 209 (59%) were recruited through the CDC and 147 (41%) through the Hazm Mebaireek General Hospital. After a complete chart review and confirmation of the data collected, no missing data were encountered. Table 2 shows the baseline characteristics of the participants. Severe COVID-19 was diagnosed in 75 (21%) participants, of whom 7 (9%) died. Compared with the development study [10], the distributions of age (median 39 years vs 45 years) and sex (males 72.5 vs 75%) were similar, as was the percentage who died (2.9 vs 2%).
Characteristic | All | Percent/IQR | Non-severe (n = 281) | Percent/IQR | Severe (n = 75) | Percent/IQR |
---|---|---|---|---|---|---|
Demographics | ||||||
Age (years) | 45 | 36.5 to 56 | 44 | 34 to 55 | 49 | 43 to 58 |
Male | 267 | 75% | 193 | 68.7% | 74 | 98.7% |
Comorbidities | ||||||
Cardiovascular disease | 124 | 34.8% | 90 | 32.0% | 34 | 45.3% |
Diabetes | 112 | 31.5% | 75 | 26.7% | 37 | 49.3% |
COPD/asthma | 11 | 3.1% | 8 | 2.8% | 3 | 4.0% |
Renal disease | 18 | 5.1% | 8 | 2.8% | 10 | 13.3% |
Biomarkers | ||||||
Lymphocyte % | 23.2 | 14 to 33.8 | 28 | 17.9 to 37.15 | 11.7 | 7.2 to 18.5 |
Monocyte % | 7.7 | 5.3 to 10.9 | 8.4 | 6.05 to 11.4 | 4.8 | 3.2 to 8.2 |
CRP mg/l | 18.1 | 3.6 to 106.2 | 8.65 | 2.75 to 60.95 | 150.3 | 69.2 to 229.4 |
Albumin g/l | 37 | 32 to 40 | 38 | 34 to 41 | 32 | 28 to 34 |
Procalcitonin (ng/ml) | 0.07 | 0.04 to 0.21 | 0.06 | 0.03 to 0.11 | 0.65 | 0.18 to 1.28 |
Medical course | ||||||
Significant medications | 185 | 52% | 110 | 39.1% | 75 | 100.0% |
X-ray changes | 208 | 58.4% | 135 | 48.0% | 73 | 97.3% |
Length of stay (days) | 6 | 0 to 14 | 0 | 0 to 10 | 20 | 14 to 28 |
ICU admission | 75 | 21.1% | ||||
Death | 7 | 2.0% | ||||
Prognostic score | ||||||
KPI score | 6 | -10 to 22 | -1 | -13 to 14 | 22 | 18 to 22 |
Calibration & discrimination of the prognostic model
The C statistic for the model was 0.83 (95% CI 0.78–0.87; Figure 2). The model showed good calibration, as evidenced both by the calibration plot and the calibration slope and intercept (Figure 3). It was not possible to split the predicted probabilities into ten groups so the calibration plot had fewer than 10 points. The model included only a few categorical variables (i.e. was a sum score model) in which a limited number of predicted probabilities were possible [14].
Clinical performance was evaluated using interval likelihood ratios (Table 3). In the low-risk group, the likelihood ratio for progression was 0.036 (95% CI: 0.005–0.251). In the high-risk group, the likelihood ratio for progression was 3.138 (95% CI: 2.472–3.984).
Interval of KPI score | Severe COVID | Not severe COVID | Likelihood ratio | 95% CI |
---|---|---|---|---|
-32 to -7 (Low risk) | 1 | 105 | 0.0356 | 0.00504 to 0.251 |
-6 to 15 (Intermediate risk) | 16 | 106 | 0.564 | 0.356 to 0.892 |
16 to 22 (High risk) | 58 | 69 | 3.138 | 2.472 to 3.984 |
Total | 75 | 280 |
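The interval likelihood ratios in Table 3 follow from the counts in each score interval: the proportion of severe cases falling in the interval divided by the proportion of non-severe cases falling in it. The sketch below recomputes them in Python. The confidence interval uses a normal approximation on the log scale, which is an assumption of this sketch, since the method used in the analysis is not stated. The function name is hypothetical.

```python
import math


def interval_likelihood_ratio(events, non_events, total_events, total_non_events, z=1.96):
    """Likelihood ratio for one score interval with an approximate 95% CI.

    events / non_events: severe and non-severe counts inside the interval.
    total_events / total_non_events: column totals across all intervals.
    The CI is a log-scale normal approximation (an assumption of this sketch).
    """
    lr = (events / total_events) / (non_events / total_non_events)
    se_log = math.sqrt(
        1 / events - 1 / total_events + 1 / non_events - 1 / total_non_events
    )
    lower = math.exp(math.log(lr) - z * se_log)
    upper = math.exp(math.log(lr) + z * se_log)
    return lr, lower, upper


# Counts from Table 3: (label, severe, not severe), with totals 75 and 280.
rows = [("low", 1, 105), ("intermediate", 16, 106), ("high", 58, 69)]
for label, severe, not_severe in rows:
    lr, lower, upper = interval_likelihood_ratio(severe, not_severe, 75, 280)
    print(f"{label}: LR {lr:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

A likelihood ratio well below 1 in the low-risk interval means a score in that range makes progression much less likely than the baseline risk, while a ratio above 1 in the high-risk interval raises it.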
Discussion
In the development study [10], the KPI model was created to predict progression to severe disease, thus enabling pre-stratification of patients with COVID-19 according to their risk of progression. The KPI model was based on age and five laboratory tests at presentation. The development study reported an area under the curve (AUC) of 0.83 (95% CI: 0.78–0.89), and the model was externally validated on an open-source dataset, with an AUC of 0.89 (95% CI: 0.85–0.92). In this study, we aimed to prospectively validate the KPI model for early prediction of the risk of progression to severe COVID-19, as measured by a composite of three hard outcomes: ICU admission, need for invasive ventilation or death. Although this validation took place in a different health system, the model showed discrimination similar to that in the development cohort, with a C statistic of 0.83 (95% CI: 0.78–0.87). The model also showed good calibration.
Many prediction models have been created since the pandemic started [15–39]. Several of these used variables that were not useful for risk stratification [5–8]. Of those that considered baseline patient variables, some reported softer outcomes such as hospitalization, progression of signs and symptoms or imaging results (Table 4), making them less reliable and less useful for decision-making.
Variables included | AUC | External validation AUC | Calibration | Ref. |
---|---|---|---|---|
Demographics, biomarkers, vital signs | 0.80 to 0.84 | 0.72 to 0.83 | Yes | [28] |
Demographics, history, biomarkers, vital signs | 0.74 (ICU); 0.83 (mortality) | – | – | [29] |
Demographics, history | 0.80 | – | – | [37] |
Demographics, history | 0.83 | – | – | [38] |
Demographics, vital signs, biomarkers | 0.83 to 0.86 | 0.83 to 0.85 | Yes | [34] |
Demographics, history, biomarkers, vital signs | 0.79 | 0.77 | Yes | [30] |
Demographics, vital signs, biomarkers | 0.798 | – | – | [31] |
Demographics, history, signs and symptoms | 0.897 | 0.885 | – | [35] |
Demographics, biomarkers | 0.90 | 0.84 to 0.93 | Yes | [32] |
Demographics, history, biomarkers, vital signs | 0.87 | – | Yes | [39] |
Biomarkers | 0.92 | – | – | [33] |
Demographics, biomarkers | 0.83 | 0.89; 0.83 | Yes | [10] |
Like our model, eleven prediction models (Table 4) measured patient variables at baseline and used hard outcomes similar to ours, and many of these used one or more of the biomarkers in the KPI model [28–33]. Vaid et al. created an in-hospital mortality and critical events prediction model for COVID-19 using demographics, biomarkers and vital signs, enrolling 1514 patients and reporting that higher age and CRP were among the strongest predictors of mortality [28]. Another prediction model, developed by Zhao et al., used data from 454 patients and reported procalcitonin to be an important predictor of both ICU admission and death [29]. Moreover, the bootstrap analysis in the model created by Altschul et al. showed that, among the mortality predictors, age, oxygen saturation, urea nitrogen, CRP, international normalized ratio and procalcitonin were reproducibly selected more than 70% of the time [31]. However, procalcitonin was later removed from the model because of a large number of missing values. In another model, developed by Wu et al. in Wuhan, the selected variables included age, lymphocyte proportion and CRP, which are also found in our KPI model [32]. Hu et al. identified albumin at admission as an important early predictor of severe COVID-19 and constructed a prediction model that used only two biomarkers: albumin and lymphocyte count [33]. Across these models, the AUC ranged from 0.74 to 0.92, and five were externally validated [28,30,32,34,35]. Among these models, the Wu et al. model and the KPI model are the only ones that have now been validated in two different countries [10,32].
Wu et al. developed the model using demographic and laboratory variables and reported an AUC of 0.86 in the training dataset (n = 239) and an AUC of 0.9 in the testing dataset (n = 60) [32]. The model was further externally validated on five test datasets, which showed AUCs ranging from 0.84 to 0.93 with accuracies ranging from 74.4 to 87.5% in China, Italy, and Belgium. Although the model shows better discrimination than the KPI model, it has a higher risk of bias due to the smaller number of events per variable (EPV <10) [36].
A limitation of our model is the use of categorical data that may have resulted in decreased predictive ability, but this was a pragmatic choice. Nevertheless, the discrimination is comparable to what has been previously reported. In addition, the use of biomarker assays may be a limitation in rural and remote communities where they may not be available. A strength of this validation study is that patients were prospectively followed, and all tests and outcome measurements were conducted prospectively.
Conclusion
In conclusion, we externally validated our KPI model and assessed its calibration, presenting an easy-to-use six-variable tool that can aid clinicians in accurately stratifying patients likely to progress to more severe disease, defined as ICU admission, need for invasive ventilation or death. The model uses measures collected early at diagnosis, and given that risk is a continuum, it may also be expected to identify moderate-severity cases that need supportive care (e.g., oxygen) but do not end up in the ICU. The performance of the KPI model within the validation cohort suggests that this tool could speed up patient triage and thereby help optimize the allocation of resources in health systems overburdened by a COVID-19 wave.
The Kuwait Prognosis Indicator (KPI) is a novel biomarker-based risk prediction tool for COVID-19.
The KPI tool contains hard biomarker predictors as opposed to soft predictors such as patient signs and symptoms.
The KPI is easy to use and is now validated.
The KPI has good discrimination of those that will need intensive care unit care.
Financial disclosure
This project was supported by the Qatar University Emergency Response Grant (QUERG-CENG-2020-1). The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
Competing interests disclosure
The authors have no competing interests or relevant affiliations with any organization or entity with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
Writing disclosure
No writing assistance was utilized in the production of this manuscript.
Ethical conduct of research
This study was approved by the medical ethics committee of Qatar University and Hamad Medical Corporation (protocol nos. QU-IEB 1434-E/20 and MRC 05-137, respectively) and written informed consent was obtained from all participants.
Papers of special note have been highlighted as: • of interest; •• of considerable interest
References
- 1. CDC Museum COVID-19 Timeline (2023). www.cdc.gov/museum/timeline/covid19.html • This reference details the timeline of the COVID-19 outbreak.
- 2. WHO Coronavirus Disease (COVID-19) Dashboard (2021). https://covid19.who.int • This reference shows the impact and disease burden of COVID-19.
- 3. Economic burden of COVID-19: a systematic review. Clinicoecon Outcomes Res. 14, 293–307 (2022).
- 4. The outbreak of COVID-19 coronavirus and its impact on global mental health. Int. J. Soc. Psychiatry 66(4), 317–320 (2020).
- 5. Mortality prediction model for the triage of COVID-19, pneumonia, and mechanically ventilated ICU patients: a retrospective study. Ann. Med. Surg. (Lond) 59, 207–216 (2020).
- 6. Clinical features of COVID-19 mortality: development and validation of a clinical prediction model. Lancet Digit Health 2(10), e516–e525 (2020).
- 7. A machine learning-based model for survival prediction in patients with severe COVID-19 infection. medRxiv (2020). DOI: 10.1101/2020.02.27.20028027
- 8. Association of radiologic findings with mortality of patients infected with 2019 novel coronavirus in Wuhan, China. PLOS One 15(3), e0230548 (2020).
- 9. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal. BMJ 369, m1328 (2020). •• This study reviewed multiple prediction models created for COVID-19 and assessed their limitations.
- 10. A biomarker based severity progression indicator for COVID-19: the Kuwait Prognosis Indicator score. Biomarkers 25(8), 641–648 (2020). •• This is the study of the original model that is being validated in our paper.
- 11. Centers for Disease Control and Prevention (2021). www.cdc.gov/coronavirus/2019-ncov/need-extra-precautions/people-with-medical-conditions.html#:~:text=Severe%20illness%20from%20COVID%2D,ventilation%2C%20or%20death.
- 12. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration. Ann. Intern. Med. 162(1), W1–73 (2015). • This study is important for those developing prediction models in terms of reporting of results.
- 13. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J. Clin. Epidemiol. 58(5), 475–483 (2005). • This study is important for those validating prediction models in terms of effective sample size.
- 14. A calibration hierarchy for risk models was defined: from utopia to empirical data. J. Clin. Epidemiol. 74, 167–176 (2016).
- 15. A risk prediction model for evaluating the disease progression of COVID-19 pneumonia. Front Med (Lausanne) 7, 556886–556886 (2020).
- 16. Prediction of COVID-19 patients at high risk of progression to severe disease. Frontiers in Public Health 8(758), 574915 (2020). • This study proves procalcitonin to be an important predictor of both intensive care unit admission and death.
- 17. Nomogram for predicting COVID-19 disease progression based on single-center data: observational study and model development. JMIR Med. Inform. 8(9), e19588 (2020).
- 18. A tool for early prediction of severe coronavirus disease 2019 (COVID-19): a multicenter study using the risk nomogram in Wuhan and Guangdong, China. Clin. Infect. Dis. 71(15), 833–840 (2020).
- 19. Prognostic factors for COVID-19 pneumonia progression to severe symptom based on the earlier clinical features: a retrospective analysis. medRxiv (2020). DOI: 10.1101/2020.03.28.20045989
- 20. Individualized prediction nomograms for disease progression in mild COVID-19. J. Med. Virol. 92(10), 2074–2080 (2020).
- 21. Development and validation of a model for individualized prediction of hospitalization risk in 4,536 patients with COVID-19. PLOS One 15(8), e0237419 (2020).
- 22. Prediction for progression risk in patients with COVID-19 pneumonia: the CALL score. Clin. Infect. Dis. 71(6), 1393–1399 (2020).
- 23. CT quantification of pneumonia lesions in early days predicts progression to severe illness in a cohort of COVID-19 patients. Theranostics 10(12), 5613–5622 (2020). • This study is important for those developing prediction models in terms of assessing risk of bias.
- 24. A fully automatic deep learning system for COVID-19 diagnostic and prognostic analysis. Eur. Respir. J. 56(2), 2000775 (2020).
- 25. Machine learning-based CT radiomics method for predicting hospital stay in patients with pneumonia associated with SARS-CoV-2 infection: a multicenter study. Ann. Transl. Med 8(14), 859 (2020).
- 26. Novel biomarkers for the prediction of COVID-19 progression a retrospective, multi-center cohort study. Virulence 11(1), 1569–1581 (2020).
- 27. Development and validation a nomogram for predicting the risk of severe COVID-19: a multi-center study in Sichuan, China. PLOS One 15(5), e0233328 (2020).
- 28. Machine learning to predict mortality and critical events in a cohort of patients with COVID-19 in New York City: model development and validation. J. Med. Internet Res. 22(11), e24018–e24018 (2020).
- 29. Prediction model and risk scores of ICU admission and mortality in COVID-19. PLOS One 15(7), e0236618 (2020).
- 30. Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score. BMJ 370, m3339 (2020).
- 31. A novel severity score to predict inpatient mortality in COVID-19 patients. Sci. Rep. 10(1), 16726–16726 (2020).
- 32. Development of a clinical decision support system for severity risk prediction and triage of COVID-19 patients at hospital admission: an international multicentre study. Eur. Respir. J. 56(2), 2001104 (2020).
- 33. Early prediction and identification for severe patients during the pandemic of COVID-19: a severe COVID-19 risk model constructed by multivariate logistic regression analysis. J. Glob. Health 10(2), 020510–020510 (2020).
- 34. Development and external validation of a prediction risk model for short-term mortality among hospitalized U.S. COVID-19 patients: a proposal for the COVID-AID risk tool. PLOS One 15(9), e0239536 (2020).
- 35. An easy-to-use machine learning model to predict the prognosis of patients with COVID-19: retrospective cohort study. J. Med. Internet Res. 22(11), e24225–e24225 (2020).
- 36. PROBAST: a tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann. Intern. Med. 170(1), W1–W33 (2019).
- 37. Development and validation of the patient history COVID-19 (PH-Covid19) scoring system: a multivariable prediction model of death in Mexican patients with COVID-19. Epidemiol. Infect. 148, e286–e286 (2020).
- 38. Predicting mortality due to SARS-CoV-2: a mechanistic score relating obesity and diabetes to COVID-19 outcomes in Mexico. J. Clin. Endocrinol. Metab. 105(8), 2752–2761 (2020).
- 39. Development and validation of a clinical score to estimate progression to severe or critical state in COVID-19 pneumonia hospitalized patients. Sci. Rep. 10(1), 19794 (2020).