Clustering for ranking multivariate data by Linear Ordered Partitions / Bottazzi Schenone, Mariaelena; Vichi, Maurizio. - In: ASTA ADVANCES IN STATISTICAL ANALYSIS. - ISSN 1863-8171. - (2025), pp. 1-32. [10.1007/s10182-025-00534-5]

Clustering for ranking multivariate data by Linear Ordered Partitions

Mariaelena Bottazzi Schenone; Maurizio Vichi
2025

Abstract

This paper explores the use of clustering to rank multivariate observations, linking ranking to clustering through the concept of a Linear Ordered Partition (LOP). A LOP allows optimal clustering into ordered “equivalence classes”. Unlike a simple ordering of units, cluster ranking identifies classes within which units are “incomparable”. The aim is to partition units into clusters with statistically distinct centroids, yielding an optimally ranked total order of clusters in which the units within each cluster are treated as “ties”. The proposed model finds the best least-squares (LS) LOP together with a univariate transformation of the observed variables: it identifies the LS LOP by orthogonally projecting the multivariate units onto a line, thereby creating a composite indicator that summarizes the observed variables. The model’s theoretical properties are discussed, and a large simulation study demonstrates its performance across different scenarios. Three real-data applications highlight the method’s potential in different fields.
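
The idea of ranking through projection and ordered clustering can be illustrated with a minimal sketch. It is not the authors' model, which the abstract describes as finding the LS LOP and the univariate transformation together; the sketch assumes a simpler two-step stand-in: project the centred units onto the first principal axis (found by power iteration), then compute the exact least-squares partition of the one-dimensional scores into K ordered classes by dynamic programming. All function names (`first_principal_axis`, `ls_partition_1d`, `rank_by_projection`) and the sign convention for the axis are assumptions made for illustration.

```python
def first_principal_axis(X, iters=200):
    """Leading eigenvector of the covariance matrix via power iteration.

    Assumption: the all-equal start vector is not orthogonal to the
    leading eigenvector; the sign is fixed so the loadings sum to >= 0.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    w = [1.0 / p ** 0.5] * p
    for _ in range(iters):
        scores = [sum(r[j] * w[j] for j in range(p)) for r in Xc]
        w_new = [sum(Xc[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w_new) ** 0.5
        if norm == 0.0:
            break
        w = [v / norm for v in w_new]
    if sum(w) < 0:
        w = [-v for v in w]
    return w, Xc


def ls_partition_1d(y, K):
    """Exact least-squares split of 1-D values into K ordered classes.

    Returns a class label per unit: 1 for the lowest class, K for the highest.
    Requires len(y) >= K.
    """
    order = sorted(range(len(y)), key=lambda i: y[i])
    s = [y[i] for i in order]
    n = len(s)
    p1, p2 = [0.0], [0.0]  # prefix sums of values and squared values
    for v in s:
        p1.append(p1[-1] + v)
        p2.append(p2[-1] + v * v)

    def sse(a, b):
        # Within-class sum of squares of the sorted slice s[a:b].
        t = p1[b] - p1[a]
        return (p2[b] - p2[a]) - t * t / (b - a)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(K + 1)]
    back = [[0] * (n + 1) for _ in range(K + 1)]
    cost[0][0] = 0.0
    for k in range(1, K + 1):
        for b in range(k, n + 1):
            for a in range(k - 1, b):
                c = cost[k - 1][a] + sse(a, b)
                if c < cost[k][b]:
                    cost[k][b], back[k][b] = c, a

    labels = [0] * n
    b = n
    for k in range(K, 0, -1):  # walk the optimal cut points backwards
        a = back[k][b]
        for pos in range(a, b):
            labels[order[pos]] = k
        b = a
    return labels


def rank_by_projection(X, K):
    """Two-step sketch: project onto a line, then form K ordered classes."""
    w, Xc = first_principal_axis(X)
    scores = [sum(r[j] * w[j] for j in range(len(w))) for r in Xc]
    return ls_partition_1d(scores, K), scores, w


if __name__ == "__main__":
    toy = [[0, 0], [0.1, 0.2], [0.2, 0.1],
           [5, 5], [5.1, 5.2], [5.2, 4.9],
           [10, 10], [10.1, 9.9], [9.9, 10.2]]
    classes, scores, axis = rank_by_projection(toy, K=3)
    print(classes)
```

Units that share a class label are treated as ties, and the labels themselves give the total order of the classes. The projected scores play the role of a composite indicator in this sketch only; a joint estimation of axis and partition, as the abstract describes, may select a different line.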
Keywords: linear ordered partition; equivalence classes; ranking clusters; multivariate observations
01 Journal publication::01a Journal article
Files attached to this item
File: BottazziSchenone_clustering-for-ranking_2025.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 2.22 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1743382