We derive non-parametric confidence intervals for the eigenvalues of the Hessian at modes of a density estimate. This provides information about the strength and shape of modes and can also be used as a significance test. We use a data-splitting approach in which potential modes are identified by using the first half of the data and inference is done with the second half of the data. To obtain valid confidence sets for the eigenvalues, we use a bootstrap based on an elementary symmetric polynomial transformation. This leads to valid bootstrap confidence sets regardless of any multiplicities in the eigenvalues. We also suggest a new method for bandwidth selection, namely choosing the bandwidth to maximize the number of significant modes. We show by example that this method works well. Even when the true distribution is singular, and hence does not have a density (in which case cross-validation chooses a zero bandwidth), our method chooses a reasonable bandwidth.
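The elementary symmetric polynomial transformation mentioned above can be illustrated with a minimal sketch. It assumes only the standard fact that the elementary symmetric polynomials of the eigenvalues are, up to sign, the coefficients of the characteristic polynomial, so they are polynomial functions of the matrix entries even when eigenvalues coincide. The function names `elementary_symmetric` and `hessian_esp` are hypothetical and are not taken from the paper; this is not the authors' bootstrap procedure, only the transformation it is described as relying on.

```python
import numpy as np


def elementary_symmetric(eigenvalues):
    """Return e_1, ..., e_d, the elementary symmetric polynomials of the inputs."""
    # np.poly gives the monic polynomial with the given roots:
    # [1, -e_1, e_2, -e_3, ...], so undo the alternating signs.
    coeffs = np.poly(np.asarray(eigenvalues, dtype=float))
    signs = (-1.0) ** np.arange(1, len(coeffs))
    return signs * coeffs[1:]


def hessian_esp(hessian):
    """Map a symmetric Hessian to the elementary symmetric polynomials of its eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(hessian)  # real eigenvalues of a symmetric matrix
    return elementary_symmetric(eigenvalues)


# Illustrative Hessian with a repeated eigenvalue (-2, -2), as at a symmetric mode.
H = np.array([[-2.0, 0.0], [0.0, -2.0]])
esp = hessian_esp(H)  # for a 2x2 matrix: e_1 is the trace, e_2 is the determinant
```

In two dimensions the transformed quantities are the trace and the determinant of the Hessian, which vary smoothly with the matrix entries, whereas the ordered eigenvalues themselves are not differentiable where they coincide.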
Non-parametric inference for density modes / Genovese, C.; Perone Pacifico, Marco; Verdinelli, Isabella; Wasserman, L. - In: Journal of the Royal Statistical Society, Series B (Statistical Methodology). - ISSN 1369-7412. - Electronic. - 78:1(2015), pp. 99-126. [10.1111/rssb.12111]
Non-parametric inference for density modes
Perone Pacifico, Marco; Verdinelli, Isabella
2015
File | Size | Format | |
---|---|---|---|
Genovese_Non-parametric_2015.pdf. Authorized users only. Type: post-print document (version following peer review and accepted for publication). License: all rights reserved | 1.21 MB | Adobe PDF | Contact the author |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.