AI Models for Predicting Mammographic Mass Severity

Authors: Bukke Devendranaik
DIN
IMJH-SVU-MAY-2024-2
Abstract

Mammography is the most cost-effective and efficient method for detecting cancer in its preclinical stages, and breast screening programs are designed specifically to identify cancer early. These programs typically yield vast amounts of data, standardized by the Breast Imaging Reporting and Data System (BI-RADS) established by the American College of Radiology. BI-RADS gives radiologists a standardized vocabulary for interpreting each finding. However, the low positive predictive value of breast biopsy following mammogram interpretation means that approximately 70% of biopsies are unnecessary and return benign outcomes. The primary objective of this study is therefore to develop AI models that predict mammography outcomes from a reduced set of interpreted mammography findings. Two data mining classification algorithms, Artificial Neural Network (ANN) and Support Vector Machine (SVM), are explored on a mammographic masses dataset. The accuracy of ANN and SVM on test samples is 80.3% and 81.9%, respectively. Our analysis indicates that, of these two classification models, SVM predicts the severity of breast cancer with the lower error rate and higher accuracy.

Keywords
Mammography; Breast Cancer Prediction; Artificial Neural Network (ANN); Support Vector Machine (SVM); BI-RADS Classification.
Introduction

Breast cancer remains one of the most prevalent diseases among women. For 2016 alone, it was estimated that nearly 246 thousand new cases of invasive breast cancer would be diagnosed, along with 61 thousand cases of non-invasive breast cancer. The journey for any cancer patient, and for their caregivers, is undeniably challenging. Early diagnosis of breast cancer is crucial, given its high mortality rate in later stages. Mammography is the most reliable method for diagnosing breast cancer in the modern era. The Breast Imaging Reporting and Data System (BI-RADS), established by the American College of Radiology, sorts mammogram results into categories, initially four and later expanded to six. Mammography is considered the most cost-effective and efficient technique for identifying risk at a preclinical stage, with breast screening programs specifically aimed at detecting disease in its earlier stages.

The diagnostic evaluation of a patient using the BI-RADS scale may necessitate a biopsy before the physician renders a final diagnosis based on the mammogram. Biopsy outcomes may indicate either a malignant or a benign tumor. If the tumor is benign, the biopsy could have been avoided, but a biopsy is required whenever the physician is uncertain about the BI-RADS assessment of a patient's mammogram. Nearly 70% of biopsies yield benign results, and a significant number of these could have been prevented. Radiologists also vary considerably in how they interpret mammography results. In such cases, Fine Needle Aspiration Cytology (FNAC) is employed; however, the average accuracy of FNAC identification is only 90%. The goal of BI-RADS identification is to assign a patient to either a benign category, indicating the absence of breast cancer, or a malignant category, indicating strong evidence of breast cancer. The aim of this study is to enhance the physician's ability to assess the severity of a mammographic mass lesion from its BI-RADS characteristics and the patient's age, thereby reducing unnecessary breast biopsies.[1,3,5,7]

Conclusion

In this paper, two classification models, an artificial neural network and a support vector machine, have been analyzed for predicting the severity of breast masses. The proposed stream imputes the missing values, then trains and optimizes the two models. The main focus of the paper is to establish an accurate classification model for the medical diagnosis of mammographic masses. The empirical results show that the SVM model outperforms the neural network (MLP) model in terms of learning accuracy and complexity.
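
The imputation step of the stream can be illustrated with a short sketch. The paper does not state which imputation rule was used, so the choices here are assumptions: the median for the numeric age column and the most frequent value for the categorical BI-RADS attributes. The column layout (age, shape, margin, density), the `impute` helper, and the sample rows are hypothetical and are not taken from the study's data.

```python
from statistics import median, mode

# Hypothetical rows laid out as (age, shape, margin, density).
# None marks a missing value. Values are made up for illustration only.
rows = [
    (67, 3, 5, 3),
    (43, 1, 1, None),
    (58, 4, 5, 3),
    (None, 1, 5, 3),
    (74, 1, None, 2),
    (65, None, 4, 3),
]


def impute(rows, numeric_cols=(0,)):
    """Fill missing entries column by column.

    Assumed rule (not stated in the paper): median for numeric columns,
    most frequent observed value for all other (categorical) columns.
    """
    n_cols = len(rows[0])
    fill = []
    for c in range(n_cols):
        observed = [r[c] for r in rows if r[c] is not None]
        fill.append(median(observed) if c in numeric_cols else mode(observed))
    # Rebuild each row, substituting the column's fill value where needed.
    return [
        tuple(fill[c] if v is None else v for c, v in enumerate(r))
        for r in rows
    ]


completed = impute(rows)
for row in completed:
    print(row)
```

The completed rows could then be passed to whichever ANN and SVM implementations are used for training; that step is left out here because the paper gives no details of the model configuration.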
