Babatunde K Olorisade
Reproducibility in machine Learning-Based studies: An example of text mining
Olorisade, Babatunde K; Brereton, Pearl; Andras, Peter
Authors
Pearl Brereton
Prof Peter Andras P.Andras@napier.ac.uk
Dean of School of Computing Engineering and the Built Environment
Abstract
Reproducibility is an essential requirement for computational studies, including those based on machine learning techniques. However, many machine learning studies are either not reproducible or are difficult to reproduce.
In this paper, we consider what information about text mining studies is crucial to successful reproduction of such studies. We identify a set of factors that affect reproducibility based on our experience of attempting to reproduce six studies proposing text mining techniques for the automation of the citation screening stage in the systematic review process. Subsequently, the reproducibility of 30 studies was evaluated based on the presence or otherwise of information relating to the factors.
While the studies provide useful reports of their results, they lack information on access to the dataset in the form and order used in the original study (as opposed to the raw data), the software environment used, randomization control, and the implementation of the proposed techniques. To increase the chances of their work being reproduced, researchers should ensure that their reports provide details about, or access to information about, these factors.
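As an illustration of what recording these factors might look like in practice, the following is a minimal Python sketch, not the authors' method. It fixes a random seed for a train/test split, fingerprints the dataset in the exact order used, and captures basic software environment details. The function names (`dataset_fingerprint`, `split_with_seed`, `reproducibility_manifest`), the manifest fields, and the sample records are all hypothetical and chosen only for this example.

```python
import hashlib
import json
import platform
import random
import sys


def dataset_fingerprint(records):
    """Hash the records in the order given, so a change in order or content is detectable."""
    digest = hashlib.sha256()
    for record in records:
        digest.update(record.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def split_with_seed(records, seed, train_fraction=0.8):
    """Shuffle a copy of the records with a dedicated seeded generator, then split it."""
    rng = random.Random(seed)  # isolated generator: no dependence on global state
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def reproducibility_manifest(records, seed):
    """Collect seed, dataset fingerprint, and environment details in one dict."""
    return {
        "python_version": sys.version,
        "platform": platform.platform(),
        "random_seed": seed,
        "dataset_sha256": dataset_fingerprint(records),
        "dataset_size": len(records),
    }


if __name__ == "__main__":
    # Hypothetical citation records (title | abstract), for illustration only.
    citations = [
        "title A | abstract A",
        "title B | abstract B",
        "title C | abstract C",
        "title D | abstract D",
        "title E | abstract E",
    ]
    train, test = split_with_seed(citations, seed=42)
    print(json.dumps(reproducibility_manifest(citations, seed=42), indent=2))
```

Publishing such a manifest alongside the results would let a reader check that they hold the same data in the same order and used the same seed. It does not capture library versions or the implementation itself, which would need to be reported separately.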
Citation
Olorisade, B. K., Brereton, P., & Andras, P. (2017, August). Reproducibility in machine Learning-Based studies: An example of text mining. Presented at ICML 2017 RML Workshop: Reproducibility in Machine Learning, Sydney, Australia.
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | ICML 2017 RML Workshop: Reproducibility in Machine Learning |
Start Date | Aug 11, 2017 |
Publication Date | 2017 |
Deposit Date | Nov 9, 2021 |
Book Title | ICML 2017 RML Workshop: Reproducibility in Machine Learning |
Keywords | Text mining, reproducibility, citation screening |
Public URL | http://researchrepository.napier.ac.uk/Output/2809095 |
Publisher URL | https://openreview.net/forum?id=By4l2PbQ- |