The development and validation of a “5A” severity scale for predicting in-hospital mortality after accidental hypothermia from J-point registry data
Journal of Intensive Care, volume 7, Article number: 27 (2019)
The Correction to this article has been published in Journal of Intensive Care 2019 7:34
Accidental hypothermia is a serious condition that requires immediate and accurate assessment to determine severity and treatment. Currently, accidental hypothermia is evaluated using the Swiss grading system which uses core body temperature and clinical findings; however, research has shown that core body temperature is not associated with in-hospital mortality in urban settings. Therefore, we developed and validated a severity scale for predicting in-hospital mortality among urban Japanese patients with accidental hypothermia.
Data for this multi-center retrospective cohort study were obtained from the J-point registry. We included patients with accidental hypothermia who were admitted to an emergency department. The total cohort was divided into a development cohort and a validation cohort based on the location of each institution. We developed a logistic regression model for predicting in-hospital mortality using the development cohort and assessed its internal validity using bootstrapping. The model was then subjected to external validation using the validation cohort.
Among the 572 patients in the J-point registry, 532 were ultimately included and divided into the development cohort (N = 288, six hospitals, in-hospital mortality 22.0%) and the validation cohort (N = 244, six hospitals, in-hospital mortality 27.0%). The 5 “A” scoring system based on age, activities-of-daily-living status, near arrest, acidemia, and serum albumin level was developed based on the variables’ coefficients in the development cohort. In the validation cohort, the prediction performance was validated.
Our “5A” severity scoring system could accurately predict the risk of in-hospital mortality among patients with accidental hypothermia.
Accidental hypothermia (AH) involves an unintentional decrease in core body temperature to ≤ 35 °C. This condition is associated with high risks of hemodynamic collapse and mortality (24–40%) [2, 3, 4], because cooling of the heart decreases cardiac output and causes electrical conduction abnormalities that lead to life-threatening dysrhythmias, such as bradycardic atrial fibrillation or ventricular fibrillation. Therefore, patients with AH must be immediately assessed to determine severity and to select appropriate advanced resuscitation and critical care techniques.
Although AH patients require immediate assessment of severity and critical care, there is no established risk assessment tool specialized for AH patients. This might lead to inappropriate decision-making due to a lack of accurate prognostic information. The severity of AH is traditionally evaluated using the Swiss grading system, which is based on core body temperature and simple clinical findings. However, other research has indicated that core body temperature is not associated with in-hospital mortality in urban settings [2, 4, 5]. Moreover, mortality is known to be associated with various other factors, such as age, activities of daily living (ADL), hemodynamic instability, hyperkalemia, and acidemia [1, 2, 4, 5, 6, 7, 8, 9]. Unfortunately, it is difficult to judge how these factors jointly influence mortality, especially in an emergency setting. Thus, a simple and user-friendly severity scale is needed to estimate mortality after AH in urban settings. The present study aimed to develop and validate a severity scaling system for predicting in-hospital mortality using data from Japanese patients who experienced AH in urban settings.
This multi-center retrospective cohort study complied with the TRIPOD statement (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) regarding the reporting of the study’s methods and results.
We obtained epidemiological and clinical information from the J-point registry database, which collects data from a network of Japanese centers that treat patients with AH. Eight centers are designated as Critical Care Medical Centers (CCMCs), and four sites are the emergency departments (EDs) of non-CCMC general hospitals in urban areas of the Kyoto, Osaka, and Shiga prefectures in Japan. Each year, the centers had a median of 19,651 ED visits (interquartile range 13,281–27,554 visits). In Japan, CCMCs are certified by the Ministry of Health, Labour and Welfare as EDs that treat patients for shock, trauma, resuscitation, and critical care and that serve approximately 500,000 residents in each region; in these CCMCs, advanced treatment such as extracorporeal membrane oxygenation (ECMO) is generally available. The non-CCMC centers are public or private general hospitals that cover a smaller regional community, where advanced treatment such as ECMO is generally unavailable.
The J-point registry includes patients who are retrospectively identified at each center using the International Classification of Diseases, Tenth Revision (ICD-10) code for hypothermia (T68). These patients were treated for hypothermia between April 1, 2011, and March 31, 2016, and had a body temperature that was either unknown or ≤ 35.0 °C. Patients were excluded from the registry if they or their family members explicitly refused to be included. Clinical data were extracted by emergency physicians using a predefined data extraction sheet. The collected data were re-checked by the J-Point Registry Working Group members and either confirmed or checked with the appropriate institution if there were concerns regarding the data’s validity. Based on these criteria, 572 patients were registered in the J-point registry. The ethics committee of each center approved the registry protocol and retrospective analysis of de-identified data.
The present study included adult patients (≥ 16 years old) with a core body temperature of ≤ 35 °C at ED admission and excluded patients with a non-AH core body temperature (> 35 °C or unknown) and those with missing data regarding age, sex, or mortality. The model was planned to undergo both internal and external validation [12, 13]. Thus, a development cohort was created based on centers from Kyoto city (four CCMCs and two non-CCMCs), while the validation cohort was created based on centers from the Shiga and Osaka prefectures and from Kyoto prefecture outside Kyoto city (four CCMCs and two non-CCMCs). This approach was selected because random sample splitting is not recommended for relatively small cohorts (to avoid over-fitting the data), which should instead be subdivided based on a time period or geographical location [12, 14]. The validation cohort was considered sufficient for external validation because the sample splitting was based on geographical location and not random allocation [12, 14].
Data collection and patient outcomes
The institutions were categorized as CCMC or non-CCMC, and the annual number of ED visits, the average number of hospital beds, and patients’ characteristics including sex, age, independent or disturbed ADLs, and comorbidities were collected (Additional file 1). The patients’ clinical characteristics were defined as vital signs at hospital arrival (core body temperature, systolic blood pressure [SBP], and Glasgow [GCS] and Japan [JCS] Coma Scales) and biological data (serum pH, potassium [K+][mEq/L], and albumin [g/dL]). Details of the patients’ clinical characteristics are provided in Additional file 1. Treatment characteristics were defined as external and minimally invasive rewarming methods (warm intravenous fluid, forced warm air, warm blanket, and others) and active internal rewarming (lavage, intravascular rewarming device, and veno-venous and veno-arterial ECMO) (Additional file 1: Table S1). The outcome of interest was defined as in-hospital mortality, which was also determined retrospectively.
Prognostic variable selection, data preparation, and handling missing data
Based on previous studies and expert opinions, we selected the admission values for age, ADL, body temperature, level of consciousness, hemodynamic state, serum pH, albumin, and K+ as potential predictor candidates of in-hospital mortality [1, 2, 4, 5, 6, 7, 8, 9]. To ensure that the model is user-friendly, especially for emergency settings, we categorized the potential covariates based on their normal limit or commonly used ranges. Level of consciousness was classified as mild (GCS 13–15 or JCS 0–3), moderate (GCS 9–12 or JCS 10–30), and severe (GCS < 9 or JCS 100–300). Details of the JCS are described in Additional file 1. A status of “near arrest” was defined as an SBP of ≤ 60 mmHg, an unmeasurable SBP, or cardiac arrest. In terms of missing values, variables with < 3% missing data were analyzed based on complete case analysis, as such an analysis might then be feasible. If missing values were > 3%, missing data were categorized as “unknown,” because unmeasured values might be informative in clinical settings (e.g., in minor cases, blood gas analysis tends to be omitted). Tables 1 and 2 show the distributions of the covariate categories for each cohort.
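The categorization rules described above can be written down compactly. The following Python sketch is illustrative only and is not the authors' code: the function names, the convention that GCS takes precedence when both scales are recorded, and the use of `None` to represent an unmeasurable SBP are all assumptions made for this example.

```python
from typing import Optional


def consciousness_level(gcs: Optional[int] = None, jcs: Optional[int] = None) -> str:
    """Classify level of consciousness as 'mild', 'moderate', or 'severe'.

    Cut-offs follow the text: GCS 13-15 / JCS 0-3 (mild),
    GCS 9-12 / JCS 10-30 (moderate), GCS < 9 / JCS 100-300 (severe).
    Assumption: GCS is used when both scales are supplied.
    """
    if gcs is not None:
        if not 3 <= gcs <= 15:
            raise ValueError("GCS must be between 3 and 15")
        if gcs >= 13:
            return "mild"
        if gcs >= 9:
            return "moderate"
        return "severe"
    if jcs is not None:
        if jcs in (0, 1, 2, 3):
            return "mild"
        if jcs in (10, 20, 30):
            return "moderate"
        if jcs in (100, 200, 300):
            return "severe"
        raise ValueError("JCS must be one of 0-3, 10-30, or 100-300")
    raise ValueError("either gcs or jcs is required")


def is_near_arrest(sbp_mmhg: Optional[float], cardiac_arrest: bool = False) -> bool:
    """Near arrest: SBP <= 60 mmHg, unmeasurable SBP, or cardiac arrest.

    Assumption: an unmeasurable SBP is passed as None.
    """
    return cardiac_arrest or sbp_mmhg is None or sbp_mmhg <= 60


print(consciousness_level(gcs=10))  # moderate
print(is_near_arrest(55))           # True
```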
We did not calculate the required sample size, because the J-point registry contains the largest number of AH cases among the available literature, and we aimed to empirically include all available data to maximize the model’s power and generalizability. There is a consensus on the importance of having an adequate sample size; however, there is no generally accepted approach for estimating the required sample size when developing and validating risk prediction models.
Development and evaluation of the prediction model
In the development cohort, predictors were selected from the potential predictor candidates mentioned above using a stepwise backward method based on the lowest Akaike’s information criterion. This allowed us to develop a parsimonious prediction model, and multivariable logistic regression was subsequently applied. Backward elimination is generally preferred as an automated selection procedure because all correlations between the predictors are considered in the modeling procedure. Each variable’s coefficient β and odds ratio were reported with the 95% confidence interval (CI). The model’s performance was evaluated based on Somers’ Dxy, the C index, the R2 value, the calibration intercept and slope, and the Brier score. Calibration plots were also created to graphically depict the association between the predicted and observed in-hospital mortality rates based on locally weighted scatterplot smoothing. Internal validation involved a bootstrapping procedure using 200 samples drawn with replacement from the original sample.
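For readers less familiar with the performance measures named above, the following Python sketch shows how the C index, Somers’ Dxy, and the Brier score can be computed for a binary outcome. It is a minimal illustration using only the standard library, not the "rms" implementation used in the study; the function names and the toy data are assumptions made for this example.

```python
from typing import Sequence


def c_index(y: Sequence[int], p: Sequence[float]) -> float:
    """Concordance (C) index for a binary outcome.

    Proportion of (death, survivor) pairs in which the death has the
    higher predicted risk; tied predictions count as one half.
    """
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def somers_dxy(y: Sequence[int], p: Sequence[float]) -> float:
    """Somers' Dxy, a rescaling of the C index: Dxy = 2 * (C - 0.5)."""
    return 2.0 * (c_index(y, p) - 0.5)


def brier_score(y: Sequence[int], p: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


# Hypothetical toy data: 1 = in-hospital death, 0 = survival.
outcome = [1, 0, 1, 0, 0]
predicted = [0.8, 0.3, 0.4, 0.5, 0.1]
print(round(c_index(outcome, predicted), 3))
print(round(somers_dxy(outcome, predicted), 3))
print(round(brier_score(outcome, predicted), 3))
```

A C index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, while a lower Brier score indicates better overall accuracy.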
The fixed model was applied to the validation cohort for external validation, and the discrimination and calibration performances were compared to those from the development cohort. Finally, we established a clinically useful simplified risk stratification using a simple integer risk score based on each variable’s coefficient β. To assess discrimination performance, we compared the c-index of our risk scoring system with that of the core body temperature on admission, categorized according to the Swiss grading system, which is commonly used to assess severity in AH. The diagnostic abilities [sensitivity, specificity, positive likelihood ratio (LR+), and negative likelihood ratio (LR−)] of each score were calculated. The calibration performance of risk stratification was graphically evaluated in terms of the relationship between the predicted and observed in-hospital mortality. All statistical results were considered significant at two-sided P values of < 0.05. Statistical analyses were performed using JMP Pro® 14 software (SAS Institute Inc., Cary, NC) and R software (version 1.1.456; R Studio Inc.) with the “rms” package.
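The diagnostic abilities listed above follow from a 2 × 2 table of test result against outcome. The Python sketch below illustrates the standard definitions; the function name, the tuple layout, and the example counts are hypothetical and are not taken from the study data.

```python
from typing import NamedTuple


class DiagnosticAbility(NamedTuple):
    sensitivity: float
    specificity: float
    lr_positive: float
    lr_negative: float


def diagnostic_ability(tp: int, fp: int, fn: int, tn: int) -> DiagnosticAbility:
    """Compute sensitivity, specificity, LR+ and LR- from a 2x2 table.

    tp/fn: deaths with a score at or above / below the chosen cut-off.
    fp/tn: survivors with a score at or above / below the cut-off.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)
    # LR+ = sensitivity / (1 - specificity); infinite if no false positives.
    lr_positive = (
        sensitivity / false_positive_rate if false_positive_rate > 0 else float("inf")
    )
    # LR- = (1 - sensitivity) / specificity; infinite if specificity is zero.
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return DiagnosticAbility(sensitivity, specificity, lr_positive, lr_negative)


# Hypothetical counts, for illustration only.
result = diagnostic_ability(tp=30, fp=10, fn=20, tn=90)
print(result)
```

An LR− well below 1 supports using a result below the cut-off to rule out the outcome, and an LR+ well above 1 supports using a result at or above the cut-off to rule it in.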
Among the 572 patients in the J-point registry, we excluded 31 patients with a non-AH body temperature (> 35 °C or unknown), 8 non-adult patients (< 16 years old), and 1 patient with missing data. Thus, 532 patients were ultimately included, with an overall in-hospital mortality of 24.4%. The patients were then divided into the development cohort (N = 288, six hospitals [four CCMCs and two non-CCMCs], in-hospital mortality 22.0%) and the validation cohort (N = 244, six hospitals [four CCMCs and two non-CCMCs], in-hospital mortality 27.0%) (Fig. 1). The characteristics of the institutions and patients are shown in Tables 1 and 2, with the characteristics and distributions being generally similar between the cohorts. Missing values in pH and albumin were > 3% in each variable; thus, these missing values were categorized as “unknown,” and we conducted a complete case analysis.
Performance and internal and external validation of the model
The 5 “A” predictors (age, ADL, near arrest state, acidemia, and albumin) were selected. The variables’ coefficient β, adjusted odds ratio with 95% CI, and the formula for predicted in-hospital mortality are shown in Additional file 1: Table S2 and Formula. Evaluation of the model and the calibration plots in the development and validation cohorts are shown in Additional file 1: Table S3 and Figure S1, respectively. The calibration plot in both cohorts revealed relatively good calibration, although the bias-corrected line revealed slight overestimation of the mortality risk.
Risk scores and their performance
Based on the coefficient β of each predictor, a severity scoring scale was created (Fig. 2, Additional file 1: Table S4). The scoring system was based on age (60–69 years, + 1 point; 70–79 years, + 2 points; ≥ 80 years, + 3 points), ADL status (disturbed, + 1 point), hemodynamic status (near arrest, + 2 points), pH (7.35–7.2, + 1 point; < 7.2, + 2 points), and serum albumin level (≤ 3 g/dL, + 1 point). The c-index of our scoring system was 0.776 and 0.731 in the development and validation cohorts, respectively. In the validation cohort, it was significantly higher than that of the Swiss grading system (0.731 vs 0.558) (Additional file 1: Table S5). The severity scale for predicting in-hospital mortality was defined as low risk (≤ 3 points), mild risk (4 points), moderate risk (5 points), and severe risk (≥ 6 points) (Fig. 2). In the validation cohort, the mean predicted mortality and observed mortality in each group were 7.1% (95% CI, 5.8–8.4%) and 12.6%, respectively, in the low-risk group; 20.5% (95% CI, 18.7–22.3%) and 26.3%, respectively, in the mild-risk group; 35.4% (95% CI, 33.3–37.6%) and 42.5%, respectively, in the moderate-risk group; and 67.4% (95% CI, 65.1–69.6%) and 55.6%, respectively, in the severe-risk group (Figs. 2 and 3). The diagnostic abilities of in-hospital mortality prediction in the low-risk group (≤ 3 points) were sensitivity 0.89 (95% CI, 0.82–0.97) and LR− 0.33 (95% CI, 0.16–0.68), which were suitable for rule-out, and in the severe-risk group (≥ 6 points), were specificity 0.91 (95% CI, 0.87–0.95) and LR+ 3.37 (95% CI, 1.86–6.10), which were slightly suitable for rule-in (Table 3). Graphical evaluation of the severity scoring system revealed good calibration with the actual results in the validation cohort, similar to that in the development cohort (Fig. 3).
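The point assignments and risk groups described above translate directly into a small calculation. The Python sketch below is an illustration and not a validated clinical tool. The function and parameter names are hypothetical, and three details are assumptions made for this example: albumin is taken in g/dL (the unit given in the data collection description), the 1-point pH band is interpreted as 7.2 ≤ pH < 7.35, and an unmeasured pH or albumin (passed as `None`) contributes 0 points, which may differ from how the "unknown" category is handled in the published score.

```python
from typing import Optional


def five_a_score(
    age_years: int,
    adl_disturbed: bool,
    near_arrest: bool,
    ph: Optional[float] = None,
    albumin_g_dl: Optional[float] = None,
) -> int:
    """Sum the 5A points (range 0-9) from the five predictors."""
    score = 0
    # Age: 60-69 -> 1, 70-79 -> 2, >= 80 -> 3.
    if age_years >= 80:
        score += 3
    elif age_years >= 70:
        score += 2
    elif age_years >= 60:
        score += 1
    # Activities of daily living: disturbed -> 1.
    if adl_disturbed:
        score += 1
    # Near arrest (SBP <= 60 mmHg, unmeasurable, or cardiac arrest) -> 2.
    if near_arrest:
        score += 2
    # Acidemia: pH < 7.2 -> 2; 7.2 <= pH < 7.35 -> 1 (boundary is an assumption).
    if ph is not None:
        if ph < 7.2:
            score += 2
        elif ph < 7.35:
            score += 1
    # Albumin: <= 3 g/dL -> 1. Unknown values score 0 here (assumption).
    if albumin_g_dl is not None and albumin_g_dl <= 3.0:
        score += 1
    return score


def risk_group(score: int) -> str:
    """Map a 5A score to the four risk groups defined in the text."""
    if score <= 3:
        return "low"
    if score == 4:
        return "mild"
    if score == 5:
        return "moderate"
    return "severe"


# Hypothetical patient: 72 years old, disturbed ADL, pH 7.30, albumin not measured.
points = five_a_score(72, adl_disturbed=True, near_arrest=False, ph=7.30)
print(points, risk_group(points))  # 4 mild
```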
The present study revealed that a “5A” severity scoring scale (based on age, ADL, near arrest, acidemia, and albumin) had better ability to predict mortality after AH than the Swiss grading system based on the core body temperature, with good discrimination and calibration values based on internal and external validation. Furthermore, the severity scoring system will help emergency physicians to rapidly predict a patient’s prognosis and make management decisions. To the best of our knowledge, this is the first scale to be subjected to internal and external validation for predicting prognosis among patients with AH in urban areas.
Previous literature and the present study’s strengths
Two reports have described methods for predicting prognosis after cardiac arrest due to AH [16, 17]. The ICE survival score (based on sex, asphyxiation, and serum K+) and the HOPE score (based on sex, asphyxia, age, K+, duration of cardiopulmonary resuscitation, and temperature) could predict prognosis after treatment using extracorporeal life support for AH cardiac arrest. However, these scores were developed based on a literature review, which included observational cohorts and case reports, and might have been affected by publication bias and selection bias. Moreover, these scores were not evaluated using bootstrapping as internal validation or a separate dataset for external validation, which might have increased the risk of overfitting; this raises questions regarding the applicability of these scores to other populations.
In contrast, the present study has several strengths. First, to our knowledge, ours is the largest cohort of patients with AH which allowed us to create two cohorts based on geographical location and subject the model to external validation. Second, we performed a bootstrapping procedure to assess overfitting and over-optimism during internal validation. Third, most patients were elderly which agrees with a recent report that indicated that most AH cases in Japan involve elderly people [2, 3, 9]. Population aging is a common public health issue in industrialized countries all over the world, and it is assumed that most victims of AH in developed countries will also be elderly. Most previous studies regarding AH have focused on younger patients [6,7,8, 16,17,18,19], with average ages of 35 years in the ICE score study and 36 years in the HOPE score study [16, 17]; therefore, these scores are not applicable for the general population. Thus, we believe that our model is more generalizable for patients who experience AH in urban areas.
The present study evaluated clinically relevant variables that can be summarized as the “5A”s (age, ADL, near arrest, acidemia, and albumin). In this context, the patient’s values for age, ADL, and serum albumin may reflect a vulnerable physiological status, and these variables are commonly used as prognostic factors in critical care [20, 21, 22, 23]. Hemodynamic instability and pH are also important factors in major critical care severity scoring scales, as they reflect the extent of vital organ hypoperfusion. Thus, we believe that the variables in our model could reflect hypothermia-related physiological changes. Similar to other studies, we did not incorporate body temperature as a predictor, as we hypothesized that a hypothermia-related decrease in organs’ oxygen consumption could protect them despite the presence of hypoperfusion, which would prevent body temperature from being strongly associated with prognosis.
During the model’s development, we used bootstrapping to account for slight over-optimism (e.g., correcting the C statistic from 0.794 to 0.746). We also found overestimation among the severe population in the bias-corrected calibration plot which appears to be mainly related to the small number of severe cases. The validation process also revealed slight overestimation among the severe population. Thus, we should interpret the findings carefully among severe cases.
We believe that this severity scoring scale allows clinicians to rapidly assess the severity of AH patients, provide patients and their families with accurate information, and improve prognosis by more appropriately selecting the severe cases that require advanced resuscitation and critical care. In settings associated with outdoor activities (e.g., skiing, climbing), most AH patients are healthy young athletes. In such situations, even if the probability of death is over 60%, aggressive treatment by physicians is reasonable. On the other hand, in urban areas, most AH patients are elderly [3, 9]. Because no established risk assessment tool is available for these patients, physicians may be uncertain about the appropriate treatment when prognostic information is insufficient. For instance, elderly patients with impending death might be treated too invasively without the prognosis being discussed with their relatives, or those with good survival prospects might undergo early withdrawal of treatment. The severity scoring system, which is based on easily accessible data (“5A”), enables easy prognosis assessment by physicians. Aggressive treatment might be reasonable for patients in the low-risk group (≤ 3 points), even if they are elderly. For patients in the severe-risk group (≥ 6 points), physicians can readily identify a condition that requires critical care and, based on the predicted prognosis, decide on the indication for treatment after discussion with the patient’s relatives. Therefore, our risk scoring system can support rational decision-making based on the estimated probability of death.
This study has several limitations. First, despite the generalizability of our model to similar urban areas, it is unclear whether this model can be applied to other settings, for instance, settings associated with outdoor activities (e.g., skiing or mountain climbing), in which most patients are healthy young athletes. Our model was developed from an urban population in which treatment is focused on elderly people who stay indoors [2, 9], so the populations and their characteristics differ substantially between these settings. A second limitation is the relatively small sample size, which could have increased the risk of overfitting and optimism. A third limitation is the absence of complete detailed data in the J-point registry regarding the AH event, the clinical course after admission, treatment, the neurological outcome, and the cause of death. Fourth, we did not compare the usefulness or diagnostic ability of our scale with general risk assessment tools for critically ill patients, such as SOFA or APACHE II. Therefore, further research is needed to determine the validity, generalizability, and clinical usefulness of our model in other cohorts.
The present study revealed that the 5A severity scale had good discrimination and calibration for predicting in-hospital mortality after AH based on internal and external validation. We believe that this severity scoring scale can be useful to rapidly assess the severity of patients with AH.
Brown DJ, Brugger H, Boyd J, Paal P. Accidental hypothermia. N Engl J Med. 2012;367(20):1930–8.
Matsuyama T, Morita S, Ehara N, Miyamae N, Okada Y, Jo T, Sumida Y, Okada N, Watanabe M, Nozawa M, et al. Characteristics and outcomes of accidental hypothermia in Japan: the J-point registry. Emerg Med J. 2018.
Japanese Association for Acute Medicine. The clinical characteristics of hypothermic patients in the winter of Japan—the final report of hypothermia STUDY 2011. Journal of Japanese Association for Acute Medicine. 2013;24:12.
Vassal T, Benoit-Gonin B, Carrat F, Guidet B, Maury E, Offenstadt G. Severe accidental hypothermia treated in an ICU: prognosis and outcome. Chest. 2001;120(6):1998–2003.
Okada Y, Matsuyama T, Morita S, Ehara N, Miyamae N, Jo T, Sumida Y, Okada N, Kitamura T, Iiduka R. Prognostic factors for patients with accidental hypothermia: a multi-institutional retrospective cohort study. Am J Emerg Med. 2018.
Schaller MD, Fischer AP, Perret CH. Hyperkalemia. A prognostic factor during acute severe hypothermia. JAMA. 1990;264(14):1842–5.
Mair P, Kornberger E, Furtwaengler W, Balogh D, Antretter H. Prognostic markers in patients with severe accidental hypothermia and cardiocirculatory arrest. Resuscitation. 1994;27(1):47–54.
Silfvast T, Pettila V. Outcome from severe accidental hypothermia in southern Finland--a 10-year review. Resuscitation. 2003;59(3):285–90.
Morita S, Matsuyama T, Ehara N, Miyamae N, Okada Y, Jo T, Sumida Y, Okada N, Watanabe M, Nozawa M, et al. Prevalence and outcomes of accidental hypothermia among elderly patients in Japan: data from the J-point registry. Geriatr Gerontol Int. 2018.
Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594.
Ministry of Health, Labour and Welfare website [https://www.mhlw.go.jp/index.html].
Steyerberg EW, Harrell FE Jr. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol. 2016;69:245–7.
Steyerberg EW. Clinical prediction models: a practical approach to development, validation, and updating. New York, London: Springer; 2009.
Moons KG, Altman DG, Reitsma JB, Ioannidis JP, Macaskill P, Steyerberg EW, Vickers AJ, Ransohoff DF, Collins GS. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162(1):W1–73.
Harrell FE Jr. Regression modeling strategies: with applications to linear models, logistic regression, and survival analysis. New York: Springer; 2010.
Saczkowski RS, Brown DJA, Abu-Laban RB, Fradet G, Schulze CJ, Kuzak ND. Prediction and risk stratification of survival in accidental hypothermia requiring extracorporeal life support: an individual patient data meta-analysis. Resuscitation. 2018;127:51–7.
Pasquier M, Hugli O, Paal P, Darocha T, Blancher M, Husby P, Silfvast T, Carron PN, Rousson V. Hypothermia outcome prediction after extracorporeal life support for hypothermic cardiac arrest patients: the HOPE score. Resuscitation. 2018;126:58–64.
Farstad M, Andersen KS, Koller ME, Grong K, Segadal L, Husby P. Rewarming from accidental hypothermia by extracorporeal circulation. A retrospective study. Eur J Cardiothorac Surg. 2001;20:58–64.
Ruttmann E, Weissenbacher A, Ulmer H, Muller L, Hofer D, Kilo J, Rabl W, Schwarz B, Laufer G, Antretter H, et al. Prolonged extracorporeal membrane oxygenation-assisted support provides improved survival in hypothermic patients with cardiocirculatory arrest. J Thorac Cardiovasc Surg. 2007;134(3):594–600.
Heyland D, Cook D, Bagshaw SM, Garland A, Stelfox HT, Mehta S, Dodek P, Kutsogiannis J, Burns K, Muscedere J, et al. The very elderly admitted to ICU: a quality finish? Crit Care Med. 2015;43(7):1352–60.
Yap FH, Joynt GM, Buckley TA, Wong EL. Association of serum albumin concentration and mortality risk in critically ill patients. Anaesth Intensive Care. 2002;30(2):202–7.
Sung J, Bochicchio GV, Joshi M, Bochicchio K, Costas A, Tracy K, Scalea TM. Admission serum albumin is predicitve of outcome in critically ill trauma patients. Am Surg. 2004;70(12):1099–102.
Kim SW, Han HS, Jung HW, Kim KI, Hwang DW, Kang SB, Kim CH. Multidimensional frailty score for the prediction of postoperative mortality risk. JAMA Surg. 2014;149(7):633–40.
Knaus WA, Draper EA, Wagner DP, Zimmerman JE. APACHE II: a severity of disease classification system. Crit Care Med. 1985;13(10):818–29.
We thank the members of the J-Point registry group for their contributions.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Availability of data and materials
Data sharing is not applicable to this article, as the ethics committee did not approve it.
Ethics approval and consent to participate
The ethics committee of each center approved the registry protocol and retrospective analysis of the de-identified data.
Consent for publication
The authors declare that they have no competing interests.
The definition of patient characteristics and laboratory data. Table S1. The range of the laboratory data on arrival at the emergency department. Table S2. Coefficient β and adjusted odds ratio with 95% confidence intervals. Table S3. Model performance in the development cohort assessed by bootstrap and that in validation cohort. Figure S1. Calibration Plot in each cohort. Table S4. The conversion of the coefficient values to the score. Table S5. Comparing the discrimination performance in validation cohort. (DOCX 353 kb)