Alternative Approaches for Estimating Highest‐Density Regions / Deliu, Nina; Liseo, Brunero. - In: INTERNATIONAL STATISTICAL REVIEW. - ISSN 1751-5823. - (2024). [10.1111/insr.12592]

Alternative Approaches for Estimating Highest‐Density Regions

Nina Deliu; Brunero Liseo
2024

Abstract

Among the various statistical intervals, highest-density regions (HDRs) stand out for their ability to summarise a distribution or sample effectively, revealing its most salient features. An HDR is the smallest set that attains a given probability coverage, and current methods for computing it require knowledge or estimation of the underlying probability distribution or density f. In this work, we illustrate a broader framework for computing HDRs, which generalises the classical density quantile method. The framework is based on neighbourhood measures, that is, measures that preserve the order induced in the sample by f and that include the density f as a special case. We explore a number of suitable distance-based measures, such as the k-nearest neighbourhood distance, and some probabilistic variants based on copula models. An extensive comparison is provided, showing the advantages of the copula-based strategy, especially in scenarios that exhibit complex structures (e.g. multimodality or particular dependencies). Finally, we discuss the practical implications of our findings for estimating HDRs in real-world applications.
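
The idea summarised in the abstract can be illustrated with a small sketch: score every sample point with a measure that ranks points roughly as the density f would, then keep the highest-scoring fraction of the sample. This is only an illustration under stated assumptions, not the authors' implementation. It assumes a one-dimensional toy sample, a hand-written Gaussian kernel density estimate standing in for the density quantile method, and the negated distance to the k-th nearest neighbour as a distance-based measure (that distance tends to be small where the density is high). The function names, the bandwidth of 0.4 and k = 10 are hypothetical choices made for the example; the copula-based variants are not covered.

```python
import math
import random


def gaussian_kde_scores(sample, bandwidth):
    """Gaussian kernel density estimate evaluated at each sample point (1-D)."""
    n = len(sample)
    norm = n * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in sample) / norm
        for x in sample
    ]


def knn_scores(sample, k):
    """Negated distance from each point to its k-th nearest neighbour (1-D).

    The sign is flipped so that, like a density, a larger score means 'denser'.
    """
    scores = []
    for i, x in enumerate(sample):
        dists = sorted(abs(x - y) for j, y in enumerate(sample) if j != i)
        scores.append(-dists[k - 1])
    return scores


def hdr_members(scores, coverage):
    """Flag the points whose score is at or above an empirical cut-off.

    The cut-off is the score of the ceil(coverage * n)-th best point, so at
    least that many points are retained (more only if scores are tied).
    """
    n = len(scores)
    n_keep = math.ceil(coverage * n)
    threshold = sorted(scores, reverse=True)[n_keep - 1]
    return [s >= threshold for s in scores]


random.seed(0)
# Bimodal toy sample (an assumption for illustration): two separated clusters.
sample = [random.gauss(-3.0, 0.5) for _ in range(150)]
sample += [random.gauss(3.0, 0.5) for _ in range(150)]

# Density-based ranking versus distance-based ranking, both at 90% coverage.
in_hdr_kde = hdr_members(gaussian_kde_scores(sample, bandwidth=0.4), coverage=0.9)
in_hdr_knn = hdr_members(knn_scores(sample, k=10), coverage=0.9)

agreement = sum(a == b for a, b in zip(in_hdr_kde, in_hdr_knn)) / len(sample)
print(f"points kept (KDE): {sum(in_hdr_kde)}, points kept (kNN): {sum(in_hdr_knn)}")
print(f"share of points classified identically: {agreement:.2f}")
```

Only the scoring function changes between the two calls to `hdr_members`; the thresholding step is shared, which is the sense in which a density-based ranking and a distance-based ranking fit the same framework.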
Keywords: anomaly detection; copula models; density estimation; k-nearest neighbourhood; statistical intervals
Type: 01 Journal publication::01a Journal article
Files attached to this record
File: Liseo_Alternative-Approaches_2024.pdf
Access: open access
Note: PDF file
Type: Publisher's version (published version with the publisher's layout)
Licence: Creative Commons
Size: 5.29 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1717212