Classification and regression energy tree for functional data / Brandi, Marco. - (2018 Sep 13).

Classification and regression energy tree for functional data

BRANDI, MARCO
13/09/2018

Abstract

Tree-based methods for regression and classification have a long and successful history in statistics and data analysis. They are essentially based on a recursive partition of the covariate space, possibly driven by specific testing procedures designed to control branch creation. Starting from the conditional approach, in which the choice of the split variable and the choice of the split value are divided into two separate steps, allowing unbiased feature selection, this work introduces an energy-based testing scheme to validate each of these phases. Energy methods are based on metrics such as distance correlation which, under suitable conditions, is zero only when the variables are independent, and which is therefore more informative than standard association measures. Moreover, as distance correlation measures can be defined for (almost) any kind of variable, the proposed framework is flexible enough to accommodate multiple types of covariates. We focus in particular on the case of functional covariates, for which we show simulated and real data examples, as well as comparisons with more established functional data analysis methods.
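To make the notion of distance correlation mentioned in the abstract concrete, the following is a minimal sketch of the standard sample distance correlation, computed from double-centered pairwise distance matrices. It is not taken from the thesis: the function names are hypothetical, and functional covariates are assumed to be curves observed on a common grid, so that the Euclidean distance between rows stands in for the distance between curves.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x (one row per observation)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a vector as n scalar observations
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between paired observations x and y."""
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar_x = (a * a).mean()   # squared sample distance variance of x
    dvar_y = (b * b).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0            # a constant sample carries no dependence information
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Illustrative data (an assumption, not from the thesis): y depends on x
# only through a nonlinear, non-monotone relation.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=500)
y = x ** 2

pearson = np.corrcoef(x, y)[0, 1]
dcor = distance_correlation(x, y)
print(f"Pearson correlation:  {pearson:.3f}")
print(f"Distance correlation: {dcor:.3f}")
```

In this example the Pearson correlation is close to zero although y is a deterministic function of x, while the distance correlation is clearly positive. Because the function only needs a distance between observations, passing an array with one discretized curve per row applies the same computation to functional covariates under the common-grid assumption above.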
13-Sep-2018
Files attached to this item

File: Brandi doctoral thesis (open access)
Type: Doctoral thesis
License: Creative Commons
Size: 5.11 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1146102