Decision tree algorithm in locally advanced rectal cancer: an example of over-interpretation and misuse of a machine learning approach / De Felice, F.; Crocetti, D.; Parisi, M.; Maiuri, V.; Moscarelli, E.; Caiazzo, R.; Bulzonetti, N.; Musio, D.; Tombolini, V. - In: JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY. - ISSN 0171-5216. - 146:3(2020), pp. 761-765. [10.1007/s00432-019-03102-y]

Decision tree algorithm in locally advanced rectal cancer: an example of over-interpretation and misuse of a machine learning approach

De Felice F. (first author); Crocetti D. (second author); Parisi M.; Maiuri V.; Moscarelli E.; Tombolini V. (last author)
2020

Abstract

Purpose: To analyse the classification performance of a decision tree method applied to predictor variables of survival outcome in patients with locally advanced rectal cancer (LARC). The aim was to offer a critical analysis that helps apply tree-based approaches in clinical practice and improves their interpretation. Materials and methods: Data on patients with histologically proven LARC treated between 2007 and 2014 were reviewed. All patients were treated with a trimodality approach with curative intent. The Kaplan–Meier method was used to estimate overall survival (OS). Decision tree methods were used to select the important variables for outcome prediction. Results: A total of 100 patients were included. The 5-year and 7-year OS rates were 76.4% and 71.3%, respectively. Age, co-morbidities, tumor size, clinical tumor classification (cT) and clinical nodal classification (cN) were the important predictor variables in the tree's construction. Overall, 13 distinct groups of patients were defined. Patients aged < 65 years with cT3 disease and elderly patients with a tumor size < 5 cm seemed to have the highest survival rates. However, the process overfitted the data, leading to poor algorithm performance. Conclusion: We proposed a decision tree algorithm to identify known and new pre-treatment clinical predictors of survival in LARC. Our analysis confirmed that tree-based machine learning methods, especially classification trees, can be easily interpreted even by a non-expert in the field, but controlling cross-validation errors is mandatory to capture their statistical power. In addition, the trend in classification error must be analysed carefully when choosing the important predictor variables, especially in small datasets. Machine learning approaches should be considered a new, unexplored frontier in LARC. When based on big datasets, decision trees represent an opportunity to improve the decision-making process in clinical practice.
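
The abstract's point about overfitting and cross-validation error can be illustrated with a minimal sketch. It is not the authors' implementation and does not use their data: the 100 synthetic "patients", the five uniform random predictors, the weak event-risk rule, the Gini-based splitting and all function names (`grow`, `cv_error`, and so on) are assumptions made purely for illustration. The sketch grows classification trees of increasing depth and reports the training error next to the 5-fold cross-validated error.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def majority(labels):
    """Most frequent label (ties go to 0)."""
    return 1 if 2 * sum(labels) > len(labels) else 0

def grow(rows, labels, depth, max_depth):
    """Greedily grow a binary classification tree using Gini impurity."""
    if depth >= max_depth or len(set(labels)) < 2:
        return majority(labels)
    best = None  # (weighted impurity, feature index, threshold)
    for j in range(len(rows[0])):
        # Skipping the smallest value keeps both sides of every split non-empty.
        for t in sorted({r[j] for r in rows})[1:]:
            left = [y for r, y in zip(rows, labels) if r[j] < t]
            right = [y for r, y in zip(rows, labels) if r[j] >= t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # all rows identical, so no split is possible
        return majority(labels)
    _, j, t = best
    lo = [(r, y) for r, y in zip(rows, labels) if r[j] < t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[j] >= t]
    return (j, t,
            grow([r for r, _ in lo], [y for _, y in lo], depth + 1, max_depth),
            grow([r for r, _ in hi], [y for _, y in hi], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] < t else right
    return tree

def error_rate(tree, rows, labels):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(predict(tree, r) != y for r, y in zip(rows, labels))
    return wrong / len(labels)

def cv_error(rows, labels, max_depth, k=5):
    """Mean misclassification rate over k interleaved folds."""
    n = len(rows)
    errs = []
    for f in range(k):
        test_idx = set(range(f, n, k))
        train = [i for i in range(n) if i not in test_idx]
        tree = grow([rows[i] for i in train], [labels[i] for i in train],
                    0, max_depth)
        errs.append(error_rate(tree, [rows[i] for i in test_idx],
                               [labels[i] for i in test_idx]))
    return sum(errs) / k

# Synthetic cohort (assumption): 100 patients, 5 uniform random predictors.
random.seed(0)
N_PATIENTS, N_PREDICTORS = 100, 5
X = [[random.random() for _ in range(N_PREDICTORS)] for _ in range(N_PATIENTS)]
# Weak signal (assumption): event risk is 0.40 if predictor 0 > 0.5, else 0.10.
y = [int(random.random() < (0.40 if row[0] > 0.5 else 0.10)) for row in X]

depths = [1, 2, 4, 100]  # depth 100 lets the tree grow until every leaf is pure
train_errs = [error_rate(grow(X, y, 0, d), X, y) for d in depths]
cv_errs = [cv_error(X, y, d) for d in depths]
for d, tr, cv in zip(depths, train_errs, cv_errs):
    print(f"max_depth={d:3d}  training error={tr:.2f}  5-fold CV error={cv:.2f}")
```

In this sketch the training error can only fall as the tree deepens, because each extra split refines the same greedy tree, and it reaches zero once every leaf is pure. The cross-validated error is under no such constraint, so comparing the two columns is one simple way to inspect the classification error trend before reading clinical meaning into a deep tree fitted to a small cohort.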
big data; chemoradiotherapy; decision tree; machine learning; rectal cancer; surgery; survival
01 Journal publication::01a Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1360352