Intelligent Models for Diabetic Prediction Using Conventional Machine Learning Techniques and Ensemble Learning Algorithms

  • Original Research
SN Computer Science

Abstract

Discovering knowledge from medical databases with machine learning is both beneficial and challenging for diagnosis. Diabetes, if left undiagnosed, can affect many other organs of the human body (e.g., the kidneys and liver), and the disease is common across all age groups, from the young to adults. Much research has already been carried out on predicting diabetes with traditional machine learning algorithms such as artificial neural networks, Naïve Bayes and decision trees. However, improving the accuracy with which diabetes is identified, at a certain degree of confidence, remains a challenging task. Ensemble learning is one family of machine learning classifiers whose application to diabetes classification still leaves a research gap. This work presents classification algorithms for the prediction of diabetes based on two conventional machine learning classifiers (a Naïve Bayes classifier and a decision tree) and four ensemble classifiers (Random Forest (RF), Bagging, AdaBoost and Gradient Boosting). The performance of these algorithms is measured in terms of accuracy score. The dataset used to train and test the algorithms is retrieved from the Pima Indian Database. On the basis of the comparative evaluation, the most important feature for identifying diabetes is extracted. This research underscores the significance of ensemble learning in diabetes prediction by comparing its efficiency with that of traditional classifiers. The study enhances accuracy assessment and identifies key features crucial for diabetes identification. These findings contribute insights that support further advances in machine learning applications for healthcare diagnostics.
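The comparison described in the abstract, a single conventional classifier against an ensemble scored by accuracy, can be illustrated with a small sketch. This is not the authors' implementation. It uses only the Python standard library, synthetic two-feature data in place of the Pima Indian data, and one-level decision trees ("stumps") as the base learner for Bagging. The function names, the data generator, the train/test split and the choice of 25 estimators are all assumptions made for illustration.

```python
import random
from collections import Counter


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_stump(X, y):
    """Fit a one-level decision tree by exhaustive search.

    Returns (feature index, threshold, polarity), where polarity is the
    label predicted when the feature value is above the threshold.
    """
    best = None
    n = len(y)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            # Training errors when predicting 1 above the threshold.
            err = sum((row[j] > thr) != t for row, t in zip(X, y))
            # Predicting 0 above the threshold gives the complementary error.
            for pol, e in ((1, err), (0, n - err)):
                if best is None or e < best[0]:
                    best = (e, j, thr, pol)
    return best[1:]


def predict_stump(stump, X):
    j, thr, pol = stump
    return [pol if row[j] > thr else 1 - pol for row in X]


def fit_bagging(X, y, n_estimators, rng):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, X):
    """Aggregate the stumps' predictions by majority vote."""
    votes = [predict_stump(s, X) for s in stumps]
    return [Counter(col).most_common(1)[0][0] for col in zip(*votes)]


# Synthetic stand-in data (assumption): feature 0 carries the signal,
# feature 1 is pure noise. This is NOT the Pima Indian data.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([x0, x1])
    y.append(1 if x0 + 0.3 * rng.gauss(0, 1) > 0 else 0)

X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

# Conventional baseline: a single stump trained on all training rows.
single_pred = predict_stump(fit_stump(X_train, y_train), X_test)
single_acc = accuracy_score(y_test, single_pred)

# Ensemble: 25 stumps (an odd number, so votes cannot tie).
stumps = fit_bagging(X_train, y_train, n_estimators=25, rng=rng)
bagged_pred = predict_bagging(stumps, X_test)
bagged_acc = accuracy_score(y_test, bagged_pred)

print(f"single stump accuracy: {single_acc:.3f}")
print(f"bagged stumps accuracy: {bagged_acc:.3f}")
```

On data this simple the two scores are expected to be close, since Bagging mainly helps with high-variance base learners such as deep trees. In practice, a library such as scikit-learn provides `BaggingClassifier`, `RandomForestClassifier`, `AdaBoostClassifier` and `GradientBoostingClassifier`, which would be the more usual route for a study like this one.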

Acknowledgements

No funding was involved in this study.

Author information

Corresponding author

Correspondence to Madhubrata Bhattacharya.

Ethics declarations

Conflict of Interest

Both authors declare that they have no conflict of interest.

Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

About this article

Cite this article

Bhattacharya, M., Datta, D. Intelligent Models for Diabetic Prediction Using Conventional Machine Learning Techniques and Ensemble Learning Algorithms. SN COMPUT. SCI. 6, 29 (2025). https://doi.org/10.1007/s42979-024-03479-9

  • DOI: https://doi.org/10.1007/s42979-024-03479-9
