Radiomics-based classification in imbalanced datasets: Complexity or interpretability / Boesso, S.; Farina, L.; Petti, M. (2024), pp. 6937-6942. (2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024, Lisbon, Portugal) [10.1109/BIBM62325.2024.10821807].

Radiomics-based classification in imbalanced datasets: Complexity or interpretability

Boesso S.; Farina L.; Petti M.
2024

Abstract

Radiomics is a specialized branch of medical imaging in which quantitative features are extracted from images. Performing classification with radiomics means solving two common problems: class imbalance, and a large number of features, which increases the risk of overfitting. Moreover, since its main application and impact are in the clinical field, there is a need for interpretable models whose results can be explained. The aim of this study is to compare two modelling approaches: logistic regression, known for its simplicity and interpretability, and RUSBoost, an ensemble method designed to handle class imbalance, with potentially higher complexity. The comparison addresses the question of whether higher complexity and lower interpretability are justified when dealing with radiomics data. Additionally, because a large body of literature recommends it, we analyze the impact of a feature selection step applied to these two classifiers. Test performance measured across 20 repeated splits on two datasets shows that the RUSBoost approach can capture more detailed patterns in the data, but this advantage is highly dependent on the dataset at hand.
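The evaluation protocol mentioned in the abstract (20 repeated splits on imbalanced data) can be illustrated with a minimal sketch. Everything specific here is an assumption and is not taken from the paper: the stratified splitting, the 30% test fraction, the seed, the toy labels, and the helper names `stratified_split`, `random_undersample` and `repeated_splits` are all hypothetical. The sketch also undersamples the majority class once per split, which is simpler than RUSBoost itself, where random undersampling is repeated inside each boosting round.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), splitting each class separately."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        idx = idx[:]                      # copy before shuffling
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def random_undersample(indices, labels, rng):
    """Subsample every class down to the size of the rarest class."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    n_min = min(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced.extend(rng.sample(idx, n_min))
    return sorted(balanced)


def repeated_splits(labels, n_repeats=20, test_fraction=0.3, seed=0):
    """Yield (balanced_train_idx, test_idx) for each repetition."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        train_idx, test_idx = stratified_split(labels, test_fraction, rng)
        # Only the training part is rebalanced; the test part keeps
        # the original class ratio.
        yield random_undersample(train_idx, labels, rng), test_idx


# Toy imbalanced labels (illustrative only): 90 negatives, 10 positives.
toy_labels = [0] * 90 + [1] * 10
splits = list(repeated_splits(toy_labels))
```

In a setup like this, each classifier would be fitted on the training indices of every split and scored on the corresponding untouched test indices, with the scores then summarized across the 20 repetitions.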
2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Feature Selection; Machine Learning; Precision Oncology; Radiomics
04 Publication in conference proceedings::04b Conference paper in volume
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1734511