A joint study by Smart119 Inc. and the Chiba University Graduate School of Medicine using machine learning to predict mortality and length of stay for patients admitted to the intensive care unit (ICU) has been published in the international scientific journal "Scientific Reports"

2022.11.01

Potential for contributing to more appropriate therapeutic strategies and allocation of medical resources in ICUs.

Smart119 Inc., a National Chiba University-based medical start-up company (Head office: Chiba, Japan; President/CEO: Taka-aki Nakada), has conducted a study in collaboration with the Department of Emergency and Critical Care Medicine and the Department of Artificial Intelligence Medicine (both at the Graduate School of Medicine, National Chiba University). The study used machine learning to establish an algorithm that predicts "life prognosis" and "length of stay" for patients in the ICU, verified its accuracy, and used cluster analysis to elucidate the risk factors for mortality. We are pleased to announce that a research paper summarizing the results of this study (first author: Shinya Iwase, corresponding author: Taka-aki Nakada) has been published in the international scientific journal "Scientific Reports" (published by Nature Research, UK).

◆Published in: "Scientific Reports", an international scientific journal
Prediction algorithm for ICU mortality and length of stay using machine learning
https://www.nature.com/articles/s41598-022-17091-5

Commenting on this announcement, Dr. Taka-aki Nakada, Smart119 Inc. CEO and Professor at National Chiba University, said, "We have confirmed that the latest AI technology can predict life prognosis and length of stay of ICU patients with high accuracy. It is Smart119's mission to optimize emergency care with advanced ICT."

Smart119 Inc. raised a total of 390 million yen from Sony Innovation Fund, Nissay Capital, Mitsui Sumitomo Insurance Capital, and others.

Potential to contribute to more accurate medical decisions and allocation of medical resources in ICUs

A huge amount of data is collected in ICUs, where critically ill patients are constantly monitored using advanced medical equipment. It is not easy for doctors and nurses on site to grasp and analyze all of this data. Such large datasets are, however, well suited to machine learning, and applying machine learning to vital data such as blood pressure, respiration, and heart rate, together with blood test data, may make it possible to predict the prognosis and severity of a patient's condition. If the prognosis and length of stay of ICU patients can be predicted with high accuracy, it may lead to better clinical decision-making and allocation of medical resources.

Based on this perspective, Shinya Iwase, Specially-Appointed Professor in the Department of Emergency and Critical Care Medicine, Graduate School of Medicine, Chiba University, and colleagues conducted a study to establish an algorithm that predicts "life prognosis," "stays within one week among survivors," and "stays over two weeks among survivors" for patients admitted to the ICU, to verify its accuracy, and to clarify the risk factors for death. In this research, which is expected to contribute to advances in predictive medicine for ICU patients, Smart119 Inc. was mainly responsible for analyzing the data using machine learning and developing the algorithms for predicting mortality and length of stay.

This study used electronic health record data collected on ICU admission for approximately 12,800 patients admitted to the Chiba University Hospital ICU from November 2010 to March 2019. We built the classification model using 80% of the sample for machine learning (approximately 10,200 patients) and validated it using the remaining 20% for testing (approximately 2,600 patients). The validation employed three different machine learning methods, including "Random Forest (RF)."
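The 80%/20% design described above can be illustrated with a minimal sketch. It does not reproduce the study's code or data: it assumes scikit-learn, uses a synthetic dataset in place of the ICU admission records, and the stratified split, the number of trees, and all variable names are assumptions made for illustration only.

```python
# Illustrative sketch only: synthetic data stands in for ICU admission records.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in cohort: 2,000 "patients", 20 admission variables,
# roughly 10% positive (death) labels. None of this is study data.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)

# Hold out 20% for testing; stratifying keeps the outcome ratio similar
# in both portions (an assumption, not something the study states).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit a random forest on the 80% training portion.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Score the held-out 20% using the predicted probability of the positive class.
test_scores = model.predict_proba(X_test)[:, 1]
test_auc = roc_auc_score(y_test, test_scores)
print(f"train={len(X_train)} test={len(X_test)} AUC={test_auc:.3f}")
```

Evaluating only on patients the model has never seen is what makes the reported accuracy an estimate of performance on future admissions rather than on the training data.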

Validation confirmed that the life prognosis and length of stay of ICU patients can be predicted with high accuracy from more than 100 types of vital data collected on admission. Specifically, the "Random Forest (RF)" *1 showed an AUC value of 0.945 *2 for ICU mortality, 0.881 for stays within one week among survivors, and 0.889 for stays over two weeks among survivors. The other machine learning methods were also found to have high predictive values. [Figures 1, 2-1, 2-2]

[Figure 1] Predictive accuracy and key variables for ICU mortality. Test data from approximately 2,600 patients were included in the analysis. (a) ROC curves *3 and AUCs for ICU mortality were obtained from three different machine learning methods (Random Forest, XGBoost, Neural Network) and logistic regression. (b) Relative importance of variables for ICU mortality in Random Forest.

[Figure 2-1] Predictive accuracy and key variables for length of ICU stay. Test data from approximately 2,600 patients were included in the analysis. (a, b) Random Forest and logistic regression were used to derive ROC curves and AUCs for short (a) and long (b) ICU stays.

[Figure 2-2] (c, d) Relative importance of variables for short (c) and long (d) ICU stays in Random Forest, showing that elective surgery, HR (heart rate), LDH (lactate dehydrogenase), and UN (urea nitrogen) are important for predicting the length of ICU stay.
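As a rough illustration of how a "relative importance of variables" ranking can be read from a random forest, the sketch below uses scikit-learn's impurity-based `feature_importances_` on synthetic data. Whether the study computed importance in this way is an assumption, and the variable names are hypothetical placeholders rather than clinical measurements.

```python
# Illustrative sketch only: synthetic data and hypothetical variable names.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# With shuffle=False the informative columns come first, so the first three
# names below carry the signal and the last three are pure noise.
names = ["var_a", "var_b", "var_c", "noise_1", "noise_2", "noise_3"]
X, y = make_classification(
    n_samples=1000, n_features=6, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# feature_importances_ holds impurity-based importances that sum to 1;
# sorting them gives a relative ranking of the input variables.
ranking = sorted(
    zip(names, model.feature_importances_), key=lambda p: p[1], reverse=True
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a run like this the informative columns would typically rank above the noise columns, which is the kind of pattern the importance panels in the figures summarize for the real admission variables.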

Based on the machine learning predictions and cluster analysis by UMAP *4, we found that the blood concentration of "LDH (lactate dehydrogenase)" is an important factor that strongly influences the life prognosis and length of stay of ICU patients, regardless of the nature of patients' diseases. This is a groundbreaking achievement: many studies have reported using machine learning to predict the mortality of ICU patients with high accuracy, but very few have analyzed which variables are important in the mortality prediction algorithms. [Figures 3-1, 3-2]

[Figure 3-1] Cluster analysis based on mortality risk in the ICU. Test data from approximately 2,600 patients were included in the analysis. (a) Cluster analysis of ICU patients by UMAP was based on ICU mortality risk and the distribution of each variable. The analysis resulted in five clusters of patients.

[Figure 3-2] (b) The top three variables (Lac (lactate), LDH, and PLT (platelet count)) contributed to predicting ICU mortality, while the other two factors (Diagnosis and Department) characterized each cluster.

Precise prediction of patient prognosis and length of stay from vital data on ICU admission would not only reduce the burden on healthcare professionals, but also improve the QOL (quality of life) of patients, and provide information and evidence for decision making regarding future treatment options to families of patients who have lost consciousness. Precise prediction at an early stage is expected to improve the quality of patient care and clinical outcomes in ICUs, and to optimize allocation of medical resources and healthcare costs.

*1 Random forest: A machine learning algorithm used for classification, regression, and clustering. It is considered to be capable of higher discrimination and prediction performance than ordinary decision trees.
*2 Area under the curve (AUC): A value that indicates the accuracy of a classification algorithm. An algorithm is generally considered highly accurate when its AUC exceeds 0.8.
*3 ROC (Receiver Operating Characteristic) curve: A two-dimensional graph representing the performance of a test or diagnostic agent; when an ROC curve is created, the area under the curve is the AUC (Area Under the Curve).
*4 Uniform Manifold Approximation and Projection: A method of nonlinear dimensionality reduction by machine learning.
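
Notes *2 and *3 can be made concrete with a small worked example. The sketch below is not taken from the study: it uses only the Python standard library, and the labels, scores, and function names (`roc_points`, `auc`) are hypothetical. It builds an ROC curve by sweeping the decision threshold over the predicted risks and then measures the area under that curve with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points for an ROC curve."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Visit cases from the highest predicted risk to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted risks.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.80, 0.70, 0.65, 0.40, 0.30, 0.20, 0.10]
curve = roc_points(labels, scores)
print(auc(curve))  # 0.75
```

For this toy data the result is 0.75, meaning that in 12 of the 16 possible pairings of one patient who died with one who survived, the patient who died received the higher predicted risk. A value of 0.5 corresponds to chance and 1.0 to perfect separation.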