Comparison of Different Classifiers for Diabetes Diagnosis
Shuchang Ye, Enqi Liu
Machine learning algorithms provide indispensable tools for intelligent medical data analysis. This paper presents a high-level comparison of the performance of different classifiers in diabetes diagnosis. Representative and widely used classifiers are chosen from several typical classifier categories, all of which are supported by the Waikato Environment for Knowledge Analysis (Weka). The dataset used is the Pima Indians Diabetes Database, which was collected by the National Institute of Diabetes and Digestive and Kidney Diseases in 1990. The study follows three steps: data preprocessing, applying a classification algorithm, and estimating its performance. The paper briefly introduces the nature of each classifier and its application scenarios. The details of data preprocessing, including feature selection, are explained, and the results are discussed. Existing studies leave out the interpretability of classifiers, which is crucial in medical prediction. To address this limitation, this paper takes interpretability and domain knowledge into consideration when estimating the performance of each model. The Naïve Bayes classifier achieves relatively high performance in this scenario.
Classifier, Machine Learning, Weka, Diabetes Diagnosis
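The procedure outlined in the abstract (prepare the data, apply a classifier, estimate performance) can be illustrated with a minimal sketch. This is not the authors' Weka workflow: it is a hand-written Gaussian Naïve Bayes in plain Python, and the function names (`fit_gaussian_nb`, `predict`, `accuracy`), the two features, and all numeric values are assumptions made up for illustration. None of the values come from the Pima Indians Diabetes Database, and preprocessing is reduced to a fixed train/test split.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-9)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the Gaussian density for this feature value.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the true label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


# Toy, made-up values (glucose, BMI); NOT taken from the Pima dataset.
train_X = [[85, 22.0], [90, 24.5], [95, 23.0],
           [160, 35.0], [170, 33.5], [155, 36.0]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[88, 23.5], [165, 34.0]]
test_y = [0, 1]

model = fit_gaussian_nb(train_X, train_y)
test_accuracy = accuracy(model, test_X, test_y)
```

Part of the interpretability argument for Naïve Bayes is visible here: the fitted model is just a class prior plus a mean and variance per feature, so each prediction can be traced back to per-feature contributions to the log-posterior.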