Comparative Analysis of Decision Tree Attribute Selection Measures for Breast Cancer Diagnosis
Abstract
This paper evaluates the decision tree algorithm with two attribute selection measures, the Gini index and entropy (information gain), for classifying the Breast Cancer Wisconsin (Diagnostic) data set. The primary objectives were to assess the number of features selected for the root node and the resulting learning accuracy. Both attribute selection measures yield promising results, but entropy outperforms Gini in accuracy and precision. The study highlights the importance of feature selection in machine learning models for medical diagnosis.
Introduction
Breast cancer is caused by the uncontrolled growth of cells in the breast. It is one of the most frequent malignancies in women [1]. Breast tumors are classified as benign or malignant. Benign tumor cells grow only in the breast and do not spread to other tissues. A malignant tumor is made up of cancerous cells that can grow uncontrollably, spread to other regions of the body, and invade other tissues [5][7].
Breast cancer is a critical health concern worldwide, and early diagnosis is essential for effective treatment [10][11]. Machine learning algorithms, such as decision trees, have shown promise in diagnosing breast cancer from patient data. However, the choice of attribute selection measure can significantly affect the model's performance. In this study, we compare the performance of decision trees using two popular attribute selection measures, the Gini index and entropy (information gain), on the Breast Cancer Wisconsin (Diagnostic) data set.
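To make the two attribute selection measures concrete, the following is a minimal sketch of how each one scores a candidate split. It is an illustration, not the implementation used in this study. The function names and the toy labels ("M" for malignant, "B" for benign) are assumptions introduced here.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def impurity_decrease(parent, left, right, measure):
    """Parent impurity minus the size-weighted impurity of the two children.

    With measure=entropy this quantity is the information gain.
    """
    n = len(parent)
    weighted = (len(left) / n) * measure(left) + (len(right) / n) * measure(right)
    return measure(parent) - weighted


# Hypothetical toy node: 4 malignant and 4 benign samples, split in two.
parent = ["M"] * 4 + ["B"] * 4
left = ["M", "M", "M", "B"]
right = ["M", "B", "B", "B"]

gini_gain = impurity_decrease(parent, left, right, gini_impurity)
info_gain = impurity_decrease(parent, left, right, entropy)
print(f"Gini decrease:    {gini_gain:.4f}")
print(f"Information gain: {info_gain:.4f}")
```

A decision tree evaluates a score of this kind for every candidate feature and threshold and places the best-scoring split at the node. Because the two measures weight class proportions differently, they can rank candidate splits differently, which is why the selected root feature and the resulting accuracy may vary between them.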
Conclusion
In this study, we conducted a comparative analysis of decision trees using Gini and Entropy as attribute selection measures for breast cancer diagnosis. Our findings indicate that the choice of attribute selection measure significantly impacts the model's performance. Entropy, in particular, demonstrated superior accuracy, precision, and recall compared to Gini.
Ultimately, this research contributes to the ongoing efforts to improve breast cancer diagnosis through machine learning techniques and emphasizes the need for careful consideration of attribute selection methods in model development. Further investigations could explore additional attribute selection measures and their impact on the performance of decision tree models for breast cancer diagnosis.