This paper explores the use of clustering to rank multivariate observations, linking ranking to clustering through the concept of a Linear Ordered Partition (LOP). A LOP allows optimal clustering into ordered “equivalence classes”. Unlike a simple ordering of units, cluster ranking identifies classes within which units are “incomparable”. The aim is to partition units into clusters with statistically distinct centroids, leading to an optimally ranked total order of clusters, where the units within each cluster are considered “ties”. The proposed model finds the best least-squares (LS) LOP along with a univariate transformation of the observed variables: it identifies the LS LOP by orthogonally projecting the multivariate units onto a line, thus creating a composite indicator that summarizes the observed variables. The model’s theoretical properties are discussed, and a large simulation study demonstrates its performance across different scenarios. Three real data applications highlight the method’s potential across different fields.
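To make the idea of projecting units onto a line and grouping them into ordered classes concrete, here is a minimal sketch under explicit assumptions. It is not the authors' model: it substitutes the first principal component for the projection that the paper estimates jointly with the partition, and a plain one-dimensional k-means for the LS LOP estimation. The function name `rank_by_projection`, the parameter `k`, and the sign convention (larger variable values mean a higher rank) are hypothetical choices made for illustration.

```python
import numpy as np


def rank_by_projection(X, k, n_iter=100):
    """Illustrative two-step stand-in (PCA, then 1-D k-means).

    This is NOT the paper's LS LOP estimator, which finds the projection
    and the ordered partition together.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre the variables

    # Direction of the first principal component (first right singular vector).
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    w = vt[0]
    # Assumed sign convention: orient the line so larger values score higher.
    if w.sum() < 0:
        w = -w

    # Orthogonal projection of every unit onto the line: a composite indicator.
    scores = Xc @ w

    # One-dimensional k-means (Lloyd iterations), initialised at quantiles.
    centers = np.quantile(scores, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(scores[:, None] - centers[None, :]), axis=1)
        new_centers = np.array([
            scores[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Relabel clusters so that rank 0 has the lowest centroid on the line.
    order = np.argsort(centers)
    rank_of = np.empty(k, dtype=int)
    rank_of[order] = np.arange(k)
    return scores, rank_of[labels], centers[order]
```

Units that receive the same rank play the role of “ties” within one ordered class, while the sorted centroids give the total order of the clusters. Because the projection and the partition are computed in two separate steps here, the result is generally not the least-squares optimum described in the abstract.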
Clustering for ranking multivariate data by Linear Ordered Partitions / Bottazzi Schenone, Mariaelena; Vichi, Maurizio. - In: ASTA ADVANCES IN STATISTICAL ANALYSIS. - ISSN 1863-8171. - (2025), pp. 1-32. [10.1007/s10182-025-00534-5]
Clustering for ranking multivariate data by Linear Ordered Partitions
Mariaelena Bottazzi Schenone; Maurizio Vichi
2025
| File | Size | Format |
|---|---|---|
| BottazziSchenone_clustering-for-ranking_2025.pdf | 2.22 MB | Adobe PDF |

Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved


