Many relevant multidimensional phenomena, such as well-being, climate change, sustainable development and poverty, are defined by nested latent concepts, which can be represented by a tree-shaped structure that assumes hierarchical relationships among observed variables. In the literature, several methodologies have been proposed both to model the relationships among observed variables that reflect unobserved ones and to assess the existence of "higher-order" unobserved variables. Nonetheless, these methodologies are usually developed with sequential procedures that do not optimize a single objective function, and/or with a confirmatory approach, i.e., by setting the relationships between observed and unobserved variables a priori. This dissertation discusses new simultaneous, exploratory and parsimonious models for hierarchical dimensionality reduction that overcome the limitations of the existing methodologies. The proposals introduced herein are based, directly or indirectly, upon the definition of an ultrametric matrix, which differs from the well-known definition of an ultrametric distance matrix and is in one-to-one correspondence with a hierarchy of latent concepts. The first proposal models a nonnegative correlation matrix via an ultrametric correlation matrix by detecting reliable concepts, associated with disjoint groups of variables, and the hierarchical relationships among them. The second work compares the first proposal with the traditional agglomerative hierarchical clustering algorithms applied to variables, after a transformation of correlations into distances, and highlights the need for specific models to inspect the hierarchical relationships among variables. The third proposal extends the definition of an ultrametric matrix to a generic one by relaxing the non-negativity assumption and applies it to a covariance matrix. The extended ultrametric covariance matrix is then used to model the covariance structures of a Gaussian mixture model, both defining a new parsimonious parameterization of a covariance matrix and inspecting the hierarchical structure underlying multidimensional phenomena in heterogeneous populations. The fourth proposal introduces a quantification of latent concepts via a hierarchical extension of Disjoint Principal Component Analysis. Although not directly based on the definition of an ultrametric matrix, this proposal likewise aims at pinpointing nested partitions of variables into groups, each associated with a component. The proposed models are illustrated via both simulation studies and real data applications in order to study their performance and capabilities.
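To make the central object concrete, the following is a minimal sketch of a check for the ultrametric-matrix conditions as they are usually stated in linear algebra: a symmetric, nonnegative matrix with a_ij >= min(a_ik, a_kj) for all i, j, k and each diagonal entry at least as large as the other entries in its row. It is an assumption that this matches the definition used in the dissertation, and the extended version mentioned above relaxes non-negativity, which this sketch does not cover. The function name and the two example matrices are hypothetical.

```python
from itertools import product

def is_ultrametric_matrix(a, tol=1e-12):
    """Check the usual ultrametric-matrix conditions on a square list of lists."""
    n = len(a)
    idx = range(n)
    # The matrix must be square, symmetric and entrywise nonnegative.
    if any(len(row) != n for row in a):
        return False
    if any(abs(a[i][j] - a[j][i]) > tol for i, j in product(idx, idx)):
        return False
    if any(a[i][j] < -tol for i, j in product(idx, idx)):
        return False
    # Ultrametric inequality: a_ij >= min(a_ik, a_kj) for all i, j, k.
    for i, j, k in product(idx, idx, idx):
        if a[i][j] < min(a[i][k], a[k][j]) - tol:
            return False
    # Each diagonal entry dominates the off-diagonal entries of its row.
    return all(a[i][i] >= a[i][k] - tol for i in idx for k in idx if k != i)

# Hypothetical correlation matrix for four variables forming two groups,
# {0, 1} and {2, 3}, joined at a lower between-group correlation.
r_tree = [
    [1.0, 0.8, 0.3, 0.3],
    [0.8, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.6],
    [0.3, 0.3, 0.6, 1.0],
]

# Same matrix with one between-group entry lowered, which violates the
# inequality (0.1 < min(0.8, 0.3)) and so no longer encodes a hierarchy.
r_flat = [row[:] for row in r_tree]
r_flat[0][2] = r_flat[2][0] = 0.1

print(is_ultrametric_matrix(r_tree), is_ultrametric_matrix(r_flat))
```

In the first matrix the within-group correlations (0.8 and 0.6) exceed the single between-group correlation (0.3), which is the pattern a two-level hierarchy of variable groups would produce under these assumptions.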

Ultrametric models for hierarchical dimensionality reduction / Zaccaria, Giorgia. - (2022 Feb 22).

Ultrametric models for hierarchical dimensionality reduction

ZACCARIA, GIORGIA
22/02/2022

Files attached to this record
File: Tesi_dottorato_Zaccaria.pdf
Access: open access
Type: Doctoral thesis
License: All rights reserved
Size: 6.41 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1628179