In this project, five machine learning (ML) models are trained and compared in terms of their ability to predict early-stage diabetes. The study uses a dataset collected at a hospital in Frankfurt, Germany, containing information on 2000 patients. The five models are random forest (RF), naive Bayes (NB), support vector machine (SVM), k-nearest neighbors (KNN), and logistic regression (LR).
• The objective of this project is to classify whether someone has diabetes or not.
• The dataset consists of several medical variables (independent) and one outcome variable (dependent).
• The independent variables in this dataset are: 'Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age'.
• The outcome variable is either 1 or 0, indicating whether a person has diabetes (1) or not (0).
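
The comparison described above could be set up roughly as follows. This is a minimal sketch using scikit-learn, not the project's actual code. Several things here are assumptions for illustration only: the target column name `Outcome`, the Gaussian variant of naive Bayes, feature scaling for SVM, KNN, and LR, the 80/20 split, accuracy as the metric, and all hyperparameters. The hypothetical helper `make_placeholder_data` generates random stand-in data with the same column names so the sketch runs without the real file; its scores say nothing about the Frankfurt dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Feature names as listed in the project description.
FEATURES = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
            "Insulin", "BMI", "DiabetesPedigreeFunction", "Age"]
TARGET = "Outcome"  # assumed column name for the 0/1 label


def make_placeholder_data(n_rows=2000, seed=0):
    """Hypothetical stand-in for the real dataset: random features, synthetic label."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.normal(size=(n_rows, len(FEATURES))), columns=FEATURES)
    # Arbitrary rule so the label is learnable; not derived from real patients.
    score = df["Glucose"] + 0.5 * df["BMI"] + rng.normal(scale=0.5, size=n_rows)
    df[TARGET] = (score > 0).astype(int)
    return df


def build_models(seed=0):
    """The five model families named in the text, with assumed settings."""
    return {
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "NB": GaussianNB(),
        # Distance- and margin-based models are scaled first (an assumption).
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    }


def compare_models(df, seed=0):
    """Fit each model on a training split and return held-out accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET],
        test_size=0.2, stratify=df[TARGET], random_state=seed,
    )
    scores = {}
    for name, model in build_models(seed).items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, acc in compare_models(make_placeholder_data()).items():
        print(f"{name}: {acc:.3f}")
```

To run this against the real data, replace the call to `make_placeholder_data()` with a DataFrame loaded from the project's dataset file, provided it has the eight feature columns and the label column. Accuracy alone can mislead when the two classes are imbalanced, so precision, recall, or ROC AUC may be worth reporting alongside it.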