Optimization of machine learning algorithms and data minimization for breast cancer detection

                                             

Author

Garriga Guitart, Joan                                           

                                              

Abstract                                             

Breast cancer is one of the most common cancers in women and has a high mortality rate. Early diagnosis and prognosis are key to reducing that mortality. Machine learning, an application of artificial intelligence, enables computers to identify patterns in large, noisy, or complex databases. These techniques are well suited to medical applications and are used in the diagnosis, classification, and prediction of cancer.
This project analyzes machine learning classification methods for the prediction of breast cancer. The database used contains records from 569 patients, each described by 30 parameters. Four algorithms were evaluated: Logistic Regression, K Nearest Neighbors, Random Forests, and Neural Networks. A possible reduction in the number of parameters needed for cancer prediction was also analyzed.
The K Nearest Neighbors algorithm showed the best overall performance, obtaining the highest precision, F1 score, and ROC-AUC value. Parameter reduction showed promising results: more than 50% of the input parameters can be removed while still obtaining satisfactory results. This could have a major impact on the healthcare system by reducing the number of medical tests required, sparing patients time, expense, and inconvenience.
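As an illustration of the kind of comparison described above, the following is a minimal sketch and not the author's code. It assumes scikit-learn and uses its bundled breast cancer dataset, which also has 569 records and 30 features; the abstract does not name the database, so this match is an assumption. The hyperparameters, the univariate feature selector, the choice of 12 retained features, and the helper name mean_auc are likewise illustrative choices, not values taken from the thesis.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled dataset stands in for the thesis database.
X, y = load_breast_cancer(return_X_y=True)  # 569 records, 30 features

# The four classifier families named in the abstract; hyperparameters are illustrative.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                                    random_state=0),
}


def mean_auc(model, n_features):
    """Mean 5-fold cross-validated ROC-AUC using the n_features best-ranked inputs.

    Scaling and feature selection sit inside the pipeline, so they are refit
    on each training fold and never see the held-out fold.
    """
    pipe = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=n_features),  # univariate ANOVA F-test ranking
        model,
    )
    return cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()


# Compare all 30 features against a reduced set of 12 (a cut of more than 50%).
results = {
    name: {k: mean_auc(model, k) for k in (30, 12)}
    for name, model in models.items()
}

for name, scores in results.items():
    print(f"{name}: AUC with 30 features = {scores[30]:.3f}, "
          f"with 12 features = {scores[12]:.3f}")
```

Comparing the two columns of the printed output gives a rough sense of how much discriminative power is lost when fewer than half of the inputs are kept. The scores depend on the selector and hyperparameters chosen here, so they should not be read as reproducing the results reported in the thesis.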

Keywords: machine learning, breast cancer, Python.


Director                                          

Fernández Esmerats, Joan                                              

                                              

Degree                                        

IQS SE - Undergraduate Program in Biotechnology                                  

                                              

Date                                         

2020-06-15