Machine Learning for Health and Social Care Demographics in Scotland

Buchanan, William J; Smales, Adrian; Lawson, Alistair; Chute, Chaloner




This paper presents an extensive study applying machine learning to publicly available health and social care data for Scotland, with a focus on identifying the most significant variables behind key health care outcomes, such as male life expectancy and premature deaths. It uses the publicly available ScotPHO Profiles data set and trains on the key metrics from the Profiles. The paper analyses 56 routinely available variables, reported by local authority region within Scotland, and first uses linear regression to match them to health risks. A forest regression method is then used to find the best prediction among the machine learning methods. Each training variable is then trained against three other variables, which gives 26,235 different models. These models are then assessed for their success using the complete dataset, and the top models are examined for the metrics they use. Finally, a frequency analysis determines the most frequently occurring variables for each of the variables being trained against. The results outline the significant factors that match key health care objectives using a best-match machine learning method. Other variables are, however, more gender-specific; for life expectancy, for example, these include crime rates in men and claiming pension credits in women. There is a range of success scores for the variables, with many giving a success rate of over 87%. There are also several significant findings, a key one being that obesity at primary school has a strong relationship with deaths among those aged 15-44. In conclusion, the method provides a way of analysing open-source data and offers new insights into contributory factors within health and social care conditions. It provides a ranked listing of the matches of variables to health and social care factors, and also an ordered list of the most significant variables. These can be used to better focus health population surveys.
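The figure of 26,235 models is consistent with choosing three predictors from the 55 variables that remain once one of the 56 is set aside as the training target. The following is a minimal sketch of that count under this reading of the abstract; the helper `predictor_triples` and the `var_N` names are placeholders for illustration, not ScotPHO indicator names or the authors' code.

```python
from itertools import combinations
from math import comb

N_VARIABLES = 56  # routinely available variables, per the abstract
GROUP_SIZE = 3    # each target is trained against three other variables


def predictor_triples(variables, target):
    """Return an iterator over every 3-variable predictor set that excludes the target."""
    others = [v for v in variables if v != target]
    return combinations(others, GROUP_SIZE)


# Placeholder names; the real indicator names come from ScotPHO Profiles.
variables = [f"var_{i}" for i in range(N_VARIABLES)]

# Count the candidate models for a single target by enumeration,
# and compare with the closed-form binomial coefficient C(55, 3).
n_models = sum(1 for _ in predictor_triples(variables, "var_0"))
print(n_models, comb(N_VARIABLES - 1, GROUP_SIZE))
```

Both numbers come out as 26,235, which suggests the model count in the abstract is per target variable rather than a total across all targets.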
Strengths and limitations of this study are: a new methodology, using machine learning, for assessing variables within health and social care and their linkages with gathered health assessment metrics; time-efficient processing for the selection of every possible model for 56 variables; new observations found within variables for health and social care conditions; a scope covering local authority regions in Scotland, which range from highly populated areas, such as cities, to less populated areas, although the metrics gathered can vary across different countries, such as in England; and short-listing of key variables for health and social care related metrics.
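The short-listing step can be illustrated with a simple frequency count over the best-scoring models. This is a sketch only: the function `variable_frequencies`, the `top_n` cut-off, and the scores below are assumptions made up for illustration, and none of the values are results from the paper.

```python
from collections import Counter


def variable_frequencies(scored_models, top_n):
    """Count how often each variable appears among the top_n models by score."""
    ranked = sorted(scored_models, key=lambda m: m[1], reverse=True)
    counts = Counter()
    for predictors, _score in ranked[:top_n]:
        counts.update(predictors)
    # Most frequent variables first.
    return counts.most_common()


# Made-up (predictors, score) pairs for illustration only.
scored_models = [
    (("a", "b", "c"), 0.91),
    (("a", "b", "d"), 0.89),
    (("a", "c", "e"), 0.88),
    (("d", "e", "f"), 0.40),
]

shortlist = variable_frequencies(scored_models, top_n=3)
print(shortlist)
```

In this toy input, variable `a` appears in all three top models and so heads the list, while `f` only appears in the low-scoring model and is left out.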


Buchanan, W. J., Smales, A., Lawson, A., & Chute, C. (2019, November). Machine Learning for Health and Social Care Demographics in Scotland. Paper presented at HEALTHINFO 2019, Valencia, Spain

Presentation Conference Type Conference Paper (unpublished)
Conference Name HEALTHINFO 2019
Conference Location Valencia, Spain
Start Date Nov 24, 2019
End Date Nov 28, 2019
Deposit Date Oct 16, 2019
Publicly Available Date Oct 17, 2019
Series ISSN 2519-8491
Keywords Open source data; machine learning; demographics; health profiles; ScotPHO


Machine Learning for Health and Social Care Demographics in Scotland (131 Kb)
