Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs

Ullah, Md Zia; Aono, Masaki; Seddiqui, Md Hanif

doi:10.1145/2700487

Authors

Dr Md Zia Ullah M.Ullah@napier.ac.uk
Lecturer

Masaki Aono

Md Hanif Seddiqui

Abstract

With vast amounts of medical knowledge available on the Internet, it is becoming increasingly practical to help doctors in clinical diagnostics by suggesting plausible diseases predicted with data and text mining technologies. Recently, Genome-Wide Association Studies (GWAS) have proved useful as a method for exploring phenotypic associations with diseases. However, because genetic diseases are difficult to diagnose owing to their low prevalence, large number, and broad diversity of symptoms, patients with genetic diseases are often misdiagnosed or experience long diagnostic delays. In this article, we propose a method for ranking genetic diseases for a set of clinical phenotypes. To this end, we associate a phenotype-gene bipartite graph (PGBG) with a gene-disease bipartite graph (GDBG) to produce a phenotype-disease bipartite graph (PDBG), and we estimate the candidate weights of diseases. In our approach, all paths from a phenotype to a disease through causative genes are explored, a weight is assigned based on path frequency, and the phenotype is linked to the disease in the new PDBG. We introduce the Bidirectionally induced Importance Weight (BIW) prediction method, which approximates the weights of the edges between diseases and phenotypes in the PDBG by considering link information from both sides of the bipartite graph. The performance of our system is compared with that of other known related systems using the Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision (MAP), and Kendall’s tau metrics. Further experiments are conducted with the well-known TF-IDF, BM25, and Jensen-Shannon divergence as baselines. The results show that our proposed method outperforms the known related tool Phenomizer in terms of NDCG@10, NDCG@20, MAP@10, and MAP@20; however, it performs worse than Phenomizer in terms of the Kendall’s tau-b metric at the top-10 ranks. Our proposed method also performs better overall than the baseline methods.
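
The graph association step described in the abstract can be illustrated with a minimal Python sketch. It composes a phenotype-gene edge list with a gene-disease edge list and counts, for each phenotype-disease pair, the number of connecting paths through a shared gene. This covers only the path-frequency idea; the BIW weighting is not specified in the abstract and is not reproduced here. The function names (`build_pdbg`, `rank_diseases`), the additive scoring across query phenotypes, and the toy identifiers (`P1`, `G1`, `D1`, ...) are assumptions made for illustration, not the authors' implementation or data.

```python
from collections import defaultdict


def build_pdbg(phenotype_gene, gene_disease):
    """Compose two bipartite edge lists into phenotype-disease path counts.

    phenotype_gene: iterable of (phenotype, gene) edges (the PGBG).
    gene_disease:   iterable of (gene, disease) edges (the GDBG).
    Returns {phenotype: {disease: number of phenotype-gene-disease paths}}.
    """
    gene_to_diseases = defaultdict(set)
    for gene, disease in gene_disease:
        gene_to_diseases[gene].add(disease)

    pdbg = defaultdict(lambda: defaultdict(int))
    # set() drops duplicate edges so each path is counted once
    for phenotype, gene in set(phenotype_gene):
        for disease in gene_to_diseases.get(gene, ()):
            pdbg[phenotype][disease] += 1
    return pdbg


def rank_diseases(pdbg, query_phenotypes):
    """Rank diseases by summed path counts over the query phenotypes.

    Assumption: plain summation stands in for the paper's BIW weighting.
    Ties are broken alphabetically by disease identifier.
    """
    scores = defaultdict(int)
    for phenotype in query_phenotypes:
        for disease, count in pdbg.get(phenotype, {}).items():
            scores[disease] += count
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy graphs, not real phenotype, gene, or disease identifiers.
toy_pgbg = [("P1", "G1"), ("P1", "G2"), ("P2", "G2"), ("P3", "G3")]
toy_gdbg = [("G1", "D1"), ("G2", "D1"), ("G2", "D2"), ("G3", "D3")]

toy_pdbg = build_pdbg(toy_pgbg, toy_gdbg)
toy_ranking = rank_diseases(toy_pdbg, ["P1", "P2"])
print(toy_ranking)
```

In this toy example, phenotype `P1` reaches disease `D1` through two genes and `D2` through one, so a query containing `P1` and `P2` places `D1` ahead of `D2`. A faithful reproduction would replace the summation with the BIW weights defined in the article.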

Citation

Ullah, M. Z., Aono, M., & Seddiqui, M. H. (2015). Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs. ACM transactions on intelligent systems and technology, 6(4), Article 56. https://doi.org/10.1145/2700487

Journal Article Type	Article
Acceptance Date	Dec 1, 2014
Online Publication Date	Jul 4, 2015
Publication Date	2015-08
Deposit Date	Mar 13, 2023
Journal	ACM Transactions on Intelligent Systems and Technology
Print ISSN	2157-6904
Electronic ISSN	2157-6912
Publisher	Association for Computing Machinery (ACM)
Peer Reviewed	Peer Reviewed
Volume	6
Issue	4
Article Number	56
DOI	https://doi.org/10.1145/2700487

