Prediction and Assessment of Acute Respiratory Distress Syndrome: Effects of Assumptions in Imputation Methods – UROP Spring Symposium 2021

Sion Kim

Pronouns: He/His

Research Mentor(s): Jonathan Gryak, Assistant Research Scientist
Research Mentor School/College/Department: Department of Computational Medicine & Bioinformatics, Michigan Medicine
Presentation Date: Thursday, April 22, 2021
Session: Session 1 (10am-10:50am)
Breakout Room: Room 7
Presenter: 2


Acute respiratory distress syndrome (ARDS) is a life-threatening lung condition that is under-diagnosed in the clinical setting. With an in-hospital mortality rate of 43%, timely diagnosis is critical so that appropriate ventilation strategies can be instituted. The Biomedical and Clinical Informatics lab at U-M has been developing a real-time clinical decision support system that detects ARDS through machine learning methods, specifically a modified support vector machine. What distinguishes it from other ARDS-related support systems is the use of “privileged information”: data that is accessible when the machine learning model is trained but not available in deployment. In the context of this project, we define privileged information to consist of CT scans, which are required for ARDS diagnosis. Such information is not available to the clinician during the early period of a patient’s stay. Therefore, by incorporating such privileged information, we expect the model to lead to faster ARDS diagnosis in clinical practice. This research investigates the electronic health record (EHR) pipeline of the model. EHR data contain inconsistencies and missing values, so a feature recorded for one patient may be absent for another. We aim to improve the model’s accuracy in ARDS detection by testing a distinct data imputation method across different machine learning models and settings. Specifically, missing values will be imputed under the assumption that data are not missing at random but are instead missing because the clinician judged the patient to be healthy with respect to that measurement. This method is implemented by imputing values taken from Michigan Medicine’s established reference ranges when possible. The proposed method will be evaluated in combination with different machine learning models and feature reduction techniques.
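The imputation idea described above can be illustrated with a minimal sketch. It is not the project's implementation. The feature names, the numeric ranges (placeholders, not Michigan Medicine's actual reference ranges), the function name `impute_from_reference`, and the choice of the range midpoint as the imputed value are all assumptions made for illustration. Features with no reference range are left missing, reflecting the "when possible" qualifier.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical reference ranges as (low, high). These are placeholder numbers
# for illustration only, NOT Michigan Medicine's established ranges.
REFERENCE_RANGES: Dict[str, Tuple[float, float]] = {
    "heart_rate": (60.0, 100.0),
    "spo2": (95.0, 100.0),
}

Record = Dict[str, Optional[float]]


def impute_from_reference(
    records: List[Record],
    ranges: Dict[str, Tuple[float, float]],
) -> List[Record]:
    """Fill missing values (None) with the midpoint of the reference range.

    Assumption: a missing value means the clinician considered the patient
    normal for that measurement, so a value inside the normal range is used.
    Using the midpoint is an illustrative choice. Features without a known
    range stay missing.
    """
    imputed: List[Record] = []
    for record in records:
        filled = dict(record)  # copy so the input records are not mutated
        for feature, value in record.items():
            if value is None and feature in ranges:
                low, high = ranges[feature]
                filled[feature] = (low + high) / 2.0
        imputed.append(filled)
    return imputed


# Toy records: None marks a value absent from the EHR.
patients: List[Record] = [
    {"heart_rate": 112.0, "spo2": None, "lactate": None},
    {"heart_rate": None, "spo2": 91.0, "lactate": 2.4},
]
completed = impute_from_reference(patients, REFERENCE_RANGES)
```

In this sketch, observed values are never overwritten, and `lactate` for the first patient remains missing because no range is defined for it; a real pipeline would need a fallback strategy for such features.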

Authors: Sion Kim, Jonathan Gryak
Research Method: Computer Programming
