Comparative Study of Machine Learning Algorithms for Breast Cancer Diagnosis: A Clinician–Engineer Collaborative Approach

Nargiza Pulatova; Ibrokhim Pulatov

doi:10.1051/bioconf/202520401017

Open Access

Issue: BIO Web Conf. Volume 204, 2025 International Conference on Advancing Science and Technologies in Health Science (IEM-HEALS 2025)
Article Number: 01017
Number of page(s): 16
DOI: https://doi.org/10.1051/bioconf/202520401017
Published online: 12 December 2025

BIO Web of Conferences 204, 01017 (2025)

Nargiza Pulatova¹^* and Ibrokhim Pulatov²

¹ Department of Clinical Pharmacology, Tashkent State Medical University
² Department of Computer Science, Specialised School named after Al-Khwarizmi

^* Corresponding author

Abstract

Breast cancer is the most common cancer and the second leading cause of cancer-related deaths among women globally. Early and precise diagnosis of malignant breast tumours improves patient survival [1]. In this study, we present a multidisciplinary clinician–engineer collaboration to demonstrate the potential of machine learning (ML) in the diagnosis of breast cancer. A publicly available dataset of 569 fine-needle aspirate samples (212 malignant, 357 benign) [2], each described by 30 quantitative cytological measures, was used to train and evaluate four classification models: Logistic Regression, Random Forest, Support Vector Machine (SVM), and Gradient Boosting. The data were randomly divided into 70% training and 30% testing sets, with standard normalisation applied. Model performance was evaluated in terms of accuracy, sensitivity, specificity and F1-score. SVM achieved the highest accuracy (96.5%), with a sensitivity of 93.7% and a specificity of 98.1%, slightly outperforming the other models. For clinical applications, the high sensitivity means that few cancers would be missed, and the high specificity means that few benign cases would raise a false alarm. Feature importance analysis showed that cell size and shape-based features (e.g., the “worst” radius, perimeter, and area features) contributed most to the prediction of malignancy, consistent with known pathology guidelines. We discuss the clinical impact of such ML tools and their potential pitfalls (dataset bias, absence of external validation), as well as future directions such as prospective validation and image integration. In conclusion, our results indicate that ML models can reliably differentiate benign from malignant breast cytological lesions and are a promising complement to clinical decision-making in oncology.
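The four evaluation metrics named in the abstract can all be derived from the counts of a binary confusion matrix, with malignant treated as the positive class. The following is a minimal sketch of those definitions, not the authors' code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the example counts are illustrative assumptions chosen only so that the rates round to the values reported for SVM on a test set of 171 samples (30% of 569); the study's actual confusion matrix is not given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with malignant as the positive class."""
    tp: int  # malignant samples predicted malignant
    fn: int  # malignant samples predicted benign (missed cancers)
    tn: int  # benign samples predicted benign
    fp: int  # benign samples predicted malignant (false alarms)


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, sensitivity, specificity and F1 as fractions in [0, 1].

    Assumes each denominator is non-zero, i.e. both classes are present
    and at least one sample is predicted positive.
    """
    total = c.tp + c.fn + c.tn + c.fp
    sensitivity = c.tp / (c.tp + c.fn)   # recall on the malignant class
    specificity = c.tn / (c.tn + c.fp)   # recall on the benign class
    precision = c.tp / (c.tp + c.fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts (an assumption, not data from the paper): 63 malignant
# and 108 benign test samples, with 4 missed cancers and 2 false alarms.
example = ConfusionCounts(tp=59, fn=4, tn=106, fp=2)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, sensitivity is 59/63 and specificity is 106/108, which shows how a small number of missed cancers or false alarms moves each rate on a test set of this size.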

Key words: Breast cancer / Machine learning / Support vector machine / Clinical application

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
