Babatunde Kazeem Olorisade
The use of bibliography enriched features for automatic citation screening
Olorisade, Babatunde Kazeem; Brereton, Pearl; Andras, Peter
Authors
Pearl Brereton
Prof Peter Andras P.Andras@napier.ac.uk
Dean of School of Computing Engineering and the Built Environment
Abstract
Context
Citation screening (also called study selection) is a phase of the systematic review process that has attracted growing interest in the use of text mining (TM) methods to reduce time and effort. Search results are usually imbalanced between the relevant and irrelevant classes of returned citations. Class imbalance, among other factors, is a persistent problem that impairs the performance of TM models, particularly in automatic citation screening for systematic reviews. As a result, classification models that use only the basic title and abstract data often fall short of expectations.
Objective
In this study, we explore the effects of using full bibliography data in addition to title and abstract on text classification performance for automatic citation screening.
Methods
We experiment with binary and Word2vec feature representations and SVM models using 4 software engineering (SE) and 15 medical review datasets. For each dataset, we build and compare 3 types of models (binary features with a non-linear kernel, Word2vec features with a linear kernel, and Word2vec features with a non-linear kernel) on the two feature sets: title and abstract only, and title and abstract enriched with bibliography data.
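As a rough illustration of what the two feature sets might look like under a binary representation, the sketch below builds presence/absence vectors from a basic view (title and abstract) and an enriched view of the same citations. The field names `journal` and `keywords`, the sample records, and the helper functions are all hypothetical; the abstract does not list which bibliography fields were used, and this is not the authors' pipeline.

```python
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on non-alphanumeric characters."""
    tokens, current = [], []
    for ch in text.lower():
        if ch.isalnum():
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def build_vocabulary(documents: List[str]) -> Dict[str, int]:
    """Map each distinct token to a column index, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for doc in documents:
        for tok in tokenize(doc):
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab


def binary_vector(doc: str, vocab: Dict[str, int]) -> List[int]:
    """Return a 0/1 vector marking which vocabulary tokens occur in the document."""
    vec = [0] * len(vocab)
    for tok in tokenize(doc):
        idx = vocab.get(tok)
        if idx is not None:
            vec[idx] = 1
    return vec


def citation_text(citation: Dict[str, str], fields: List[str]) -> str:
    """Concatenate the selected fields of a citation record into one string."""
    return " ".join(citation.get(f, "") for f in fields)


# Hypothetical citation records; the field names are assumptions for illustration.
citations = [
    {"title": "Text mining for study selection",
     "abstract": "We classify citations with SVM models.",
     "journal": "Journal of Biomedical Informatics",
     "keywords": "systematic reviews; text mining"},
    {"title": "Load forecasting with neural networks",
     "abstract": "We forecast residential load.",
     "journal": "Energy Reports",
     "keywords": "forecasting; energy"},
]

BASIC = ["title", "abstract"]
ENRICHED = ["title", "abstract", "journal", "keywords"]  # assumed enrichment fields

basic_docs = [citation_text(c, BASIC) for c in citations]
enriched_docs = [citation_text(c, ENRICHED) for c in citations]

basic_vocab = build_vocabulary(basic_docs)
enriched_vocab = build_vocabulary(enriched_docs)

basic_X = [binary_vector(d, basic_vocab) for d in basic_docs]
enriched_X = [binary_vector(d, enriched_vocab) for d in enriched_docs]
```

Either matrix could then be passed to an SVM implementation of the reader's choice. The enriched vocabulary is a superset of the basic one, so the classifier sees extra columns, such as journal terms, that title and abstract alone do not provide.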
Results
The bibliography-enriched data showed consistently improved performance in terms of recall, work saved over sampling (WSS) and Matthews correlation coefficient (MCC) in 3 of the 4 SE datasets that are fairly large. For the medical datasets the results vary; however, in the majority of cases the performance is the same or better.
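For readers unfamiliar with the three measures, the following minimal sketch computes them from confusion-matrix counts using their commonly cited definitions: recall = TP / (TP + FN), WSS = (TN + FN) / N − (1 − recall), and the standard MCC formula. The abstract does not spell out the exact formulation used in the study, so the WSS form here is an assumption, and the example counts are invented for illustration.

```python
import math


def recall(tp: int, fn: int) -> float:
    """Fraction of relevant citations that the classifier retrieved."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def wss(tp: int, fp: int, tn: int, fn: int) -> float:
    """Work saved over sampling: share of citations the reviewer can skip,
    minus the share that random sampling would skip at the same recall."""
    n = tp + fp + tn + fn
    return (tn + fn) / n - (1.0 - recall(tp, fn))


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Invented counts for an imbalanced screening task: 50 relevant out of 1000.
tp, fn, fp, tn = 45, 5, 150, 800

example_recall = recall(tp, fn)
example_wss = wss(tp, fp, tn, fn)
example_mcc = mcc(tp, fp, tn, fn)
```

MCC uses all four cells of the confusion matrix, which makes it less sensitive than accuracy to the class imbalance described in the Context paragraph.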
Conclusion
Including the bibliography data has the potential to improve model performance, but the results to date are inconclusive.
Citation
Olorisade, B. K., Brereton, P., & Andras, P. (2019). The use of bibliography enriched features for automatic citation screening. Journal of Biomedical Informatics, 94, Article 103202. https://doi.org/10.1016/j.jbi.2019.103202
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | May 3, 2019 |
Online Publication Date | May 7, 2019 |
Publication Date | 2019-06 |
Deposit Date | Nov 8, 2021 |
Journal | Journal of Biomedical Informatics |
Print ISSN | 1532-0464 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 94 |
Article Number | 103202 |
DOI | https://doi.org/10.1016/j.jbi.2019.103202 |
Keywords | Computing methodologies, Citation screening automation, Systematic reviews, Text mining, Feature enrichment |
Public URL | http://researchrepository.napier.ac.uk/Output/2808924 |