Canonical correlation analysis on data with censoring and error information

Sun, Jianyong; Keates, Simeon

Authors

Jianyong Sun

Simeon Keates



Abstract

We developed a probabilistic model for canonical correlation analysis (CCA) for the case in which the associated datasets are incomplete. This case arises when data entries either contain measurement errors or are censored (i.e., nonignorably missing), owing to uncertainties in instrument calibration and to physical limitations of devices and experimental conditions. The aim of our model is to estimate the true correlation coefficients by eliminating the effects of measurement errors and extracting useful information from the censored data. As exact inference is not possible for the proposed model, we developed a modified variational Expectation-Maximization (EM) algorithm in which the posteriors of the latent variables are approximated by normal distributions. In the experiments, we first demonstrated empirically the accuracy of the modified E-step approximation by comparing it with hybrid Monte Carlo (HMC) sampling. Further experiments were carried out on synthetic datasets with different numbers of censored data points and different correlation coefficient settings, comparing the proposed algorithm with a maximum a posteriori (MAP) solution and a Markov Chain-EM solution. The results showed that the variational EM solution compares favorably with the MAP solution and approaches the accuracy of the Markov Chain-EM solution while maintaining computational simplicity. Finally, we applied the proposed algorithm to finding the properties of galaxy groups that are most strongly correlated with X-ray luminosity.
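
To illustrate the problem the abstract describes, the following is a minimal sketch of classical (non-probabilistic) CCA applied to clean and to corrupted synthetic data. It is not the authors' probabilistic model or their variational EM algorithm. The data-generating setup, the function name `canonical_correlations`, the noise levels, and the detection limit are all assumptions made for illustration only. The sketch shows that naive CCA on data with measurement errors and censoring tends to underestimate the leading canonical correlation, which is the bias the proposed model aims to remove.

```python
import numpy as np

def canonical_correlations(x, y, ridge=1e-8):
    """Classical CCA: return the canonical correlations, largest first.

    x, y: arrays of shape (n_samples, n_features). `ridge` is a small
    regularizer (an assumption of this sketch) that keeps the covariance
    matrices positive definite.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cxx = x.T @ x / (n - 1) + ridge * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + ridge * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    # Whiten each view with its Cholesky factor; the singular values of the
    # whitened cross-covariance Lx^{-1} Cxy Ly^{-T} are the canonical correlations.
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    m = np.linalg.solve(lx, cxy)
    m = np.linalg.solve(ly, m.T).T
    return np.clip(np.linalg.svd(m, compute_uv=False), 0.0, 1.0)

# Hypothetical synthetic data: two views driven by one shared latent variable.
rng = np.random.default_rng(0)
n = 5000
z = rng.standard_normal((n, 1))
x_true = z @ np.array([[1.0, 0.5]]) + 0.3 * rng.standard_normal((n, 2))
y_true = z @ np.array([[0.8, -0.6]]) + 0.3 * rng.standard_normal((n, 2))

# Corrupt the data: additive measurement error on both views, and
# left-censoring of y at an assumed detection limit (values below the
# limit are recorded as the limit itself).
x_obs = x_true + 1.0 * rng.standard_normal((n, 2))
y_obs = y_true + 1.0 * rng.standard_normal((n, 2))
detection_limit = -0.5
y_obs = np.maximum(y_obs, detection_limit)

rho_clean = canonical_correlations(x_true, y_true)
rho_observed = canonical_correlations(x_obs, y_obs)
print("canonical correlations, clean data:   ", rho_clean)
print("canonical correlations, observed data:", rho_observed)
```

In this setup the leading canonical correlation computed from the observed data is lower than the one computed from the clean data. A model that accounts for the error and censoring mechanisms, such as the one described in the abstract, is intended to recover an estimate closer to the clean-data value.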

Citation

Sun, J., & Keates, S. (2013). Canonical correlation analysis on data with censoring and error information. IEEE Transactions on Neural Networks and Learning Systems, 24(12), 1909-1919. https://doi.org/10.1109/TNNLS.2013.2262949

Journal Article Type: Article
Acceptance Date: Apr 30, 2013
Online Publication Date: Dec 1, 2013
Publication Date: Jul 11, 2013
Deposit Date: Jan 30, 2019
Journal: IEEE Transactions on Neural Networks and Learning Systems
Print ISSN: 2162-237X
Electronic ISSN: 2162-2388
Publisher: Institute of Electrical and Electronics Engineers
Peer Reviewed: Yes
Volume: 24
Issue: 12
Pages: 1909-1919
DOI: https://doi.org/10.1109/TNNLS.2013.2262949
Keywords: Canonical correlation analysis (CCA), censored data, latent variable model, measurement errors
Public URL: http://researchrepository.napier.ac.uk/Output/1496972
Publisher URL: http://gala.gre.ac.uk/13813/