A Data-Driven Analysis for Predicting Polycystic Ovary Syndrome (PCOS) Using Clinical and Hormonal Indicators

Authors: Gottimukkula Vinayasri
DIN: IMJH-SVU-JAN-2025-1
Abstract

Polycystic Ovary Syndrome (PCOS) is a prevalent endocrine disorder among women of reproductive age, characterized by hormonal imbalance and irregular menstrual cycles. Early diagnosis is crucial to prevent long-term complications such as infertility, diabetes, and cardiovascular issues. This study analyzes a clinical dataset of 1,000 women to develop predictive models for PCOS based on features such as BMI, testosterone levels, menstrual irregularity, and antral follicle count. Machine learning classifiers are trained on these features and used to identify the most influential predictors. Results indicate that antral follicle count and testosterone levels are the most critical features and that the models achieve over 90% classification accuracy, supporting the viability of automated diagnostic tools in clinical practice.

Keywords
Polycystic Ovary Syndrome (PCOS) Prediction; Clinical and Hormonal Feature Analysis; Machine Learning in Healthcare; Antral Follicle Count and Androgen Levels; Predictive Modeling for Early Diagnosis
Introduction

PCOS affects millions of women globally, yet it remains underdiagnosed because its symptoms vary widely and standardized screening protocols are lacking. Key manifestations include irregular periods, elevated androgen levels, and polycystic ovaries. In this era of data-driven medicine, applying machine learning to clinical data offers a promising avenue for timely and accurate diagnosis.

This research analyzes structured clinical data to uncover patterns and risk factors associated with PCOS, and to develop predictive models that assist in early identification and intervention.

Conclusion

This study confirms the utility of machine learning models in diagnosing PCOS using basic clinical data. The Random Forest model achieved 91% accuracy, demonstrating robust performance and interpretability. Antral follicle count, testosterone levels, and menstrual history emerged as the most influential features. These models could serve as decision-support tools in gynecology clinics, improving early detection and management of PCOS.
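
The workflow summarized above (train a Random Forest on clinical features, measure held-out accuracy, rank the features by importance) can be illustrated with a minimal sketch. It is not the study's code or data: the records are synthetic, the feature names, units, and label rule are assumptions made up for illustration, and scikit-learn's impurity-based `feature_importances_` is only one plausible way to rank predictors. The numbers it prints say nothing about real patients.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical column names mirroring the features named in the abstract.
FEATURES = ["bmi", "testosterone", "menstrual_irregularity", "antral_follicle_count"]


def make_synthetic_cohort(n_samples=1000, seed=0):
    """Generate made-up records. These are NOT the study's data."""
    rng = np.random.default_rng(seed)
    bmi = rng.normal(25.0, 4.0, n_samples)
    testosterone = rng.normal(50.0, 15.0, n_samples)          # arbitrary units
    irregular = rng.integers(0, 2, n_samples).astype(float)   # 0 = regular, 1 = irregular
    follicles = rng.normal(12.0, 5.0, n_samples)
    # Assumed label rule: follicle count and testosterone dominate, plus noise.
    score = (
        0.30 * (follicles - 12.0)
        + 0.05 * (testosterone - 50.0)
        + 0.80 * (irregular - 0.5)
        + 0.02 * (bmi - 25.0)
        + rng.normal(0.0, 0.5, n_samples)
    )
    X = np.column_stack([bmi, testosterone, irregular, follicles])
    y = (score > 0).astype(int)
    return X, y


def train_and_rank(X, y, seed=0):
    """Fit a Random Forest; return held-out accuracy and features sorted by importance."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    ranking = sorted(
        zip(FEATURES, model.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return accuracy, ranking


if __name__ == "__main__":
    X, y = make_synthetic_cohort()
    accuracy, ranking = train_and_rank(X, y)
    print(f"held-out accuracy: {accuracy:.2f}")
    for name, importance in ranking:
        print(f"{name}: {importance:.3f}")
```

Evaluating on a stratified held-out split keeps the reported accuracy from reflecting memorized training records. Impurity-based importances can favor continuous features over binary ones, so a ranking like this is better read as a starting point than as a clinical finding.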
