Radiomics-based classification in imbalanced datasets: Complexity or interpretability / Boesso, S.; Farina, L.; Petti, M. - (2024), pp. 6937-6942. (2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024, Lisbon, Portugal) [10.1109/BIBM62325.2024.10821807].
Radiomics-based classification in imbalanced datasets: Complexity or interpretability
Boesso S.; Farina L.; Petti M.
2024
Abstract
Radiomics is a specialized branch of medical imaging in which quantitative features are extracted from images. Classification with radiomics features must address two common problems: class imbalance, and a large number of features, which increases the risk of overfitting. Moreover, since its main application and impact are in the clinical field, there is a need for interpretable models whose results can be explained. The aim of this study is to compare two modelling approaches: logistic regression, known for its simplicity and interpretability, and RUSBoost, an ensemble method designed to handle class imbalance, with potentially higher complexity. The comparison addresses the question of whether higher complexity and lower interpretability are justified when dealing with radiomics data. Additionally, because a large body of literature recommends it, we analyze the impact of a feature selection step applied to both classifiers. Test performance measured across 20 repeated splits on two datasets shows that RUSBoost can capture more detailed patterns in the data, but that this advantage depends strongly on the dataset at hand.
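The evaluation protocol described in the abstract (repeated splits, with and without a feature selection step) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. Everything specific in it is an assumption: the synthetic data from `make_classification`, the 85/15 class ratio, the 30% test size, `SelectKBest` with `k=20` as the feature selector, `class_weight="balanced"` as the imbalance handling for logistic regression, and balanced accuracy as the metric. Only the logistic regression arm is shown. A RUSBoost model, such as `RUSBoostClassifier` from the imbalanced-learn package, could be added to the dictionary returned by `make_models` in the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a radiomics table (assumption): few samples,
# many features, and a minority class of roughly 15%.
X, y = make_classification(
    n_samples=300, n_features=100, n_informative=10,
    weights=[0.85, 0.15], random_state=0,
)


def make_models():
    """Return fresh, unfitted pipelines: with and without feature selection."""
    lr_args = dict(class_weight="balanced", max_iter=2000)  # assumed settings
    return {
        "logreg": make_pipeline(
            StandardScaler(), LogisticRegression(**lr_args)
        ),
        # Selection sits inside the pipeline, so it is fitted on the
        # training part of each split only (no leakage from the test part).
        "logreg+fs": make_pipeline(
            StandardScaler(),
            SelectKBest(f_classif, k=20),
            LogisticRegression(**lr_args),
        ),
    }


# 20 repeated stratified splits, so each test set keeps the class ratio.
splitter = StratifiedShuffleSplit(n_splits=20, test_size=0.3, random_state=0)
scores = {name: [] for name in make_models()}

for train_idx, test_idx in splitter.split(X, y):
    for name, model in make_models().items():
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        scores[name].append(balanced_accuracy_score(y[test_idx], pred))

for name, vals in scores.items():
    print(f"{name}: {np.mean(vals):.3f} +/- {np.std(vals):.3f}")
```

Reporting the mean and spread over the 20 splits, rather than a single split, makes it easier to judge whether a difference between models is larger than the split-to-split variation.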


