Informatics in Medicine Unlocked
Breast cancer risk assessment and early diagnosis using Principal Component Analysis and support vector machine techniques
Babafemi O. Macaulay [1], Boluwaji A. Akinnuwesi [2], Benjamin S. Aribisala [3]
[1] Corresponding author. Department of Computer Science, University of Eswatini, Kwaluseni Campus, M201, Eswatini; Department of Computer Science, Faculty of Science and Engineering, University of Eswatini, Kwaluseni M201, Swaziland; Department of Computer Sciences, Faculty of Science, Lagos State University, Lagos, Nigeria
Keywords: Breast cancer; Risk assessment; Early diagnosis; Multi pre-processing; Feature extraction; Support vector machine
Source: DOAJ
【Abstract】
Breast cancer (BCa) is one of the leading causes of cancer mortality among women globally. Its specific causes remain unknown, but studies have identified several risk factors associated with the disease. Breast cancer risk assessment and diagnosis can be carried out using the clinical acumen of physicians, medical imaging, and computational techniques, and early diagnosis has been identified as one way to reduce BCa mortality. However, diagnostic accuracy is not always guaranteed, owing to human error, divergent interpretations of medical images by radiologists, and computational errors arising from the use of flawed data. In this study, we therefore adopted a hybrid of Principal Component Analysis (PCA) and Support Vector Machine (SVM) to develop a BCa risk assessment and early diagnosis model (BC-RAED) capable of accurately detecting BCa at an early stage. PCA was used to extract features in the first pre-processing stage, and the features were further reduced in the second pre-processing stage. The multi pre-processed data were then assessed for breast cancer risk and diagnosis using SVM. BC-RAED achieved an accuracy of 97.62%, a sensitivity of 95.24%, and a specificity of 100% on BCa risk assessment and diagnosis. These values differed significantly from documented values in the literature at the 5% level of significance (p < 0.05), which confirmed the viability of BC-RAED. Based on this result, it was concluded that BC-RAED has the potential to multi pre-process breast cancer data, classify patients into likely and unlikely categories based on risk factors, and classify cancer cases as malignant or benign based on established technical indicators reported in the literature.
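The three reported measures follow from a binary confusion matrix. The sketch below shows how accuracy, sensitivity, and specificity are computed. The counts (20 true positives, 1 false negative, 21 true negatives, 0 false positives) are hypothetical: they are not given in the abstract and were chosen only because they happen to reproduce the reported percentages. The `ConfusionCounts` class and the function names are illustrative as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (positive = cancer present)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of positive cases that were detected (true positive rate)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of negative cases that were correctly ruled out (true negative rate)."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts, NOT taken from the study; assumed for illustration only.
hypothetical = ConfusionCounts(tp=20, fn=1, tn=21, fp=0)

for name, metric in (("accuracy", accuracy),
                     ("sensitivity", sensitivity),
                     ("specificity", specificity)):
    print(f"{name}: {100 * metric(hypothetical):.2f}%")
```

With a specificity of 100% there are no false positives, so in this illustration the single misclassified case is a missed positive, which is why sensitivity is the lowest of the three values.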
【License】
Unknown