Publications Details
Measuring Reproduciblity of Machine Learning Methods for Medical Diagnosis
Ahmed, Hana A.; Tchoua, Roselyne; Lofstead, Gerald F.
The National Academy of Sciences, Engineering, and Medicine (NASEM) defines reproducibility as 'obtaining consistent computational results using the same input data, computational steps, methods, code, and conditions of analysis,' and replicability as 'obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data' [1]. Due to an increasing number of applications of artificial intelligence and machine learning (AI/ML) to fields such as healthcare and digital medicine, there is a growing need for verifiable AI/ML results, and therefore reproducible research and replicable experiments. This paper establishes examples of irreproducible AI/ML applications to medical sciences and quantifies the variance of common AI/ML models (Artificial Neural Network, Naive Bayes classifier, and Random Forest classifiers) for tasks on medical data sets.