Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs

Ullah, Md Zia; Aono, Masaki; Seddiqui, Md Hanif

Authors

Masaki Aono

Md Hanif Seddiqui



Abstract

With vast amounts of medical knowledge available on the Internet, it is becoming increasingly practical to help doctors in clinical diagnostics by suggesting plausible diseases predicted by applying data and text mining technologies. Recently, Genome-Wide Association Studies (GWAS) have proved useful as a method for exploring phenotypic associations with diseases. However, since genetic diseases are difficult to diagnose because of their low prevalence, large number, and broad diversity of symptoms, genetic disease patients are often misdiagnosed or experience long diagnostic delays. In this article, we propose a method for ranking genetic diseases for a set of clinical phenotypes. In this regard, we associate a phenotype-gene bipartite graph (PGBG) with a gene-disease bipartite graph (GDBG) by producing a phenotype-disease bipartite graph (PDBG), and we estimate the candidate weights of diseases. In our approach, all paths from a phenotype to a disease are explored by considering causative genes to assign a weight based on path frequency, and the phenotype is linked to the disease in a new PDBG. We introduce the Bidirectionally induced Importance Weight (BIW) prediction method to PDBG for approximating the weights of the edges of diseases with phenotypes by considering link information from both sides of the bipartite graph. The performance of our system is compared to that of other known related systems by estimating Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision (MAP), and Kendall’s tau metrics. Further experiments are conducted with the well-known TF-IDF, BM25, and Jensen-Shannon divergence as baselines. The results show that our proposed method outperforms the known related tool Phenomizer in terms of NDCG@10, NDCG@20, MAP@10, and MAP@20; however, it performs worse than Phenomizer in terms of the Kendall’s tau-b metric at the top-10 ranks. It also turns out that our proposed method has overall better performance than the baseline methods.
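
The graph composition step described in the abstract can be illustrated with a minimal Python sketch. It covers only the path-frequency idea: each phenotype-gene-disease path adds one to the weight of the corresponding phenotype-disease edge. The function names (`compose_pdbg`, `rank_diseases`), the toy identifiers (`P1`, `G1`, `D1`, and so on), and the scoring rule that sums edge weights over the query phenotypes are assumptions made for illustration. The sketch does not implement the BIW method, whose details are not given in the abstract.

```python
from collections import defaultdict


def compose_pdbg(phenotype_genes, gene_diseases):
    """Build a phenotype-disease graph weighted by path frequency.

    phenotype_genes: dict mapping a phenotype to a set of genes (PGBG).
    gene_diseases:   dict mapping a gene to a set of diseases (GDBG).
    Returns a dict of dicts: pdbg[phenotype][disease] = number of
    phenotype -> gene -> disease paths.
    """
    pdbg = defaultdict(lambda: defaultdict(int))
    for phenotype, genes in phenotype_genes.items():
        for gene in genes:
            for disease in gene_diseases.get(gene, ()):
                pdbg[phenotype][disease] += 1
    return {p: dict(d) for p, d in pdbg.items()}


def rank_diseases(pdbg, query_phenotypes):
    """Rank diseases for a set of query phenotypes.

    Assumption: a disease's score is the sum of its edge weights over
    the query phenotypes. This is a simple stand-in, not BIW.
    Ties are broken alphabetically to keep the output deterministic.
    """
    scores = defaultdict(int)
    for phenotype in query_phenotypes:
        for disease, weight in pdbg.get(phenotype, {}).items():
            scores[disease] += weight
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data, not taken from the paper.
phenotype_genes = {"P1": {"G1", "G2"}, "P2": {"G2", "G3"}}
gene_diseases = {"G1": {"D1"}, "G2": {"D1", "D2"}, "G3": {"D2"}}

pdbg = compose_pdbg(phenotype_genes, gene_diseases)
ranking = rank_diseases(pdbg, ["P1"])
print(ranking)
```

In this toy example, phenotype `P1` reaches disease `D1` through two genes and `D2` through one, so `D1` ranks first for the query `["P1"]`.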

Citation

Ullah, M. Z., Aono, M., & Seddiqui, M. H. (2015). Estimating a Ranked List of Human Genetic Diseases by Associating Phenotype-Gene with Gene-Disease Bipartite Graphs. ACM Transactions on Intelligent Systems and Technology, 6(4), Article 56. https://doi.org/10.1145/2700487

Journal Article Type Article
Acceptance Date Dec 1, 2014
Online Publication Date Jul 4, 2015
Publication Date 2015-08
Deposit Date Mar 13, 2023
Journal ACM Transactions on Intelligent Systems and Technology
Print ISSN 2157-6904
Publisher Association for Computing Machinery (ACM)
Peer Reviewed Peer Reviewed
Volume 6
Issue 4
Article Number 56
DOI https://doi.org/10.1145/2700487