
Mining trauma injury data with imputed values

Penny, Kay I; Chesney, Thomas




Methods for analyzing trauma injury data with missing values, collected at a UK hospital, are reported. One measure of injury severity, the Glasgow coma score (GCS), which is known to be associated with patient death, is missing for 12% of patients in the dataset. To include these patients in the analysis, three different data imputation techniques are used to estimate the missing values. The imputed datasets are analyzed by an artificial neural network and logistic regression, and the results are compared in terms of sensitivity, specificity, positive predictive value and negative predictive value. Although there is little distinction between results for the three imputation methods on the overall dataset, the hot-deck imputation method appears to give more accurate results than the model-based or propensity score imputation methods when the comparison is restricted to the subset of patients with imputed GCS values. The results show that imputation does not reduce overall predictive accuracy following a data-mining analysis, demonstrating that all cases may be included when analyzing these trauma injury data.
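The abstract names two ideas that a short sketch can make concrete: hot-deck imputation and the four accuracy measures used for comparison. The Python below is an illustration only, not the authors' procedure. It assumes the simplest random hot-deck variant, in which each missing value is replaced by a value drawn from the observed cases, and it assumes outcomes coded 1 for death and 0 for survival. The function names `hot_deck_impute` and `diagnostic_metrics` and the example values are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence


def hot_deck_impute(values: Sequence[Optional[int]], seed: int = 0) -> List[int]:
    """Random hot-deck imputation (assumed variant, not the paper's exact method).

    Each missing entry (None) is replaced by a value drawn at random from the
    observed entries of the same variable. Observed entries are left unchanged.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    donors = [v for v in values if v is not None]
    if not donors:
        raise ValueError("no observed values available as donors")
    return [v if v is not None else rng.choice(donors) for v in values]


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # positives correctly identified
        "specificity": ratio(tn, tn + fp),  # negatives correctly identified
        "ppv": ratio(tp, tp + fp),          # predicted positives that are correct
        "npv": ratio(tn, tn + fn),          # predicted negatives that are correct
    }


# Hypothetical GCS values with two missing entries.
gcs = [15, None, 3, 14, None, 8]
gcs_filled = hot_deck_impute(gcs)

# Hypothetical outcomes and predictions (1 = death, 0 = survival).
observed = [1, 1, 0, 0, 0, 0, 0, 1]
predicted = [1, 0, 0, 0, 1, 0, 0, 1]
metrics = diagnostic_metrics(observed, predicted)
```

In practice, hot-deck donors are often restricted to cases that resemble the recipient on other variables. The unrestricted draw above is the simplest form and is used here only to show the mechanics.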


Penny, K. I., & Chesney, T. (2009). Mining trauma injury data with imputed values. Statistical Analysis and Data Mining, 2, 246-254.

Journal Article Type: Article
Publication Date: 2009-11
Deposit Date: Mar 23, 2012
Print ISSN: 1932-1864
Electronic ISSN: 1932-1872
Publisher: Wiley
Peer Reviewed: Yes
Volume: 2
Pages: 246-254
Keywords: Data mining; artificial neural network; logistic regression; missing data imputation; trauma injury