Predicting the Onset of Diabetes using Clinical and Demographic Features
Abstract
The rising global prevalence of diabetes calls for effective diagnostic tools. This study explores the prediction of diabetes onset from clinical and demographic variables, including glucose level, BMI, age, and family history. Using a well-established dataset of 768 women from the Pima Indian population, we perform exploratory analysis and build a logistic regression model to estimate the probability that diabetes is present. The model reaches approximately 75% accuracy, with glucose level and BMI emerging as strong predictors. These findings highlight the potential of machine learning to improve early diabetes detection and prevention strategies.
Introduction
Diabetes mellitus, especially type 2 diabetes, is a chronic metabolic disorder characterized by elevated blood glucose levels. It is often detected only after complications arise, which motivates better early-warning mechanisms. This study applies machine learning techniques to a clinical dataset to identify the contributing features and predict diabetes onset. Such predictions can support timely interventions and potentially reduce the burden on healthcare systems.
Conclusion
This study used a logistic regression model to predict diabetes onset based on clinical and demographic data. Key takeaways include:
Glucose, BMI, and age are dominant features associated with diabetes.
Logistic regression achieved ~75% accuracy, offering a simple yet effective tool for early screening.
More advanced models or ensemble methods may further boost prediction quality.
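As a rough illustration of the modeling step summarized above, the following dependency-free Python sketch fits a logistic regression by batch gradient descent and scores it on a held-out split. It is not the study's code. The data are synthetic, and the helper names (such as `make_synthetic_patients`), the generating coefficients, the 80/20 split, the learning rate, and the epoch count are all assumptions made for illustration. The sketch therefore does not reproduce the reported ~75% accuracy. In practice a library implementation such as scikit-learn's `LogisticRegression` would likely be used instead.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1), in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_patients(n, seed=0):
    """Generate SYNTHETIC (glucose, bmi, age) rows with 0/1 labels.

    The coefficients below are invented for illustration only; they are
    not estimates from the Pima dataset.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        glucose = rng.gauss(120, 30)
        bmi = rng.gauss(32, 7)
        age = rng.uniform(21, 80)
        score = 0.04 * (glucose - 120) + 0.08 * (bmi - 32) + 0.03 * (age - 35) - 0.7
        labels.append(1 if rng.random() < sigmoid(score) else 0)
        rows.append([glucose, bmi, age])
    return rows, labels


def standardize(train, test):
    """Scale both splits using the training-set mean and standard deviation."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def predict_proba(w, b, x):
    """Estimated probability of the positive class for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of rows whose thresholded prediction (at 0.5) matches the label."""
    hits = sum((predict_proba(w, b, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(y)


# 768 rows mirrors the dataset size stated in the abstract; the rows themselves are synthetic.
rows, labels = make_synthetic_patients(768, seed=0)
split = int(0.8 * len(rows))  # assumed 80/20 train/test split
X_train, X_test = standardize(rows[:split], rows[split:])
y_train, y_test = labels[:split], labels[split:]

w, b = fit_logistic(X_train, y_train)
test_accuracy = accuracy(w, b, X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.2f}")
print("standardized weights (glucose, bmi, age):", [round(v, 2) for v in w])
```

Because the features are standardized before fitting, the magnitudes of the learned weights can be compared directly, which is one simple way to judge which predictors dominate. To work with real data, `make_synthetic_patients` would be replaced by a loader for the actual dataset.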