Composite Likelihood Inference for Simultaneous Clustering and Dimensionality Reduction in the Chinese Longitudinal Healthy Longevity Survey / Ranalli, Monia; Rocci, Roberto; Maruotti, Antonello. - (2025), pp. 474-478. (Paper presented at the IES 2025 conference, held in Bressanone).
Composite Likelihood Inference for Simultaneous Clustering and Dimensionality Reduction in the Chinese Longitudinal Healthy Longevity Survey
Monia Ranalli; Roberto Rocci; Antonello Maruotti
2025
Abstract
We propose a multivariate hidden Markov model (HMM) for mixed-type variables, covering both continuous and ordinal data. Since some variables may not contribute to the clustering structure, the model distinguishes between discriminative and non-informative dimensions. Specifically, the observed variables are modeled as linear combinations of two independent sets of latent factors: one captures the cluster structure through an HMM, and the other represents noise, which follows a multivariate normal distribution and remains constant over time. Although the model is parsimonious in its parameterization, its estimation is computationally demanding. To address this, we adopt a composite likelihood approach for parameter estimation, which keeps estimation feasible in practical applications. The proposed framework is validated through an empirical study on the Chinese Longitudinal Healthy Longevity Survey (CLHLS), analyzing lifestyle and health-related factors in the elderly population.
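
The latent structure described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' implementation: the function name `simulate_mixed_hmm`, the number of states, the factor dimensions, the square loading matrix, the transition probabilities, and the thresholds are all hypothetical choices. It also reads "remains constant over time" as meaning that the noise factors have the same normal distribution at every occasion, which is an interpretation rather than something the abstract states. Ordinal variables are obtained here by thresholding underlying continuous variables, a common device that the abstract does not confirm. The sketch covers data generation only; it does not implement composite likelihood estimation.

```python
import numpy as np


def simulate_mixed_hmm(n_subjects=200, n_times=4, seed=0):
    """Simulate mixed-type longitudinal data from a two-block factor structure.

    All numerical settings below are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    n_states = 2            # hidden states (clusters), assumed
    q_disc, q_noise = 2, 2  # discriminative and noise factor dimensions, assumed
    n_cont, n_ord = 2, 2    # continuous and ordinal observed variables, assumed
    p = n_cont + n_ord      # equals q_disc + q_noise, so the loading matrix is square

    init = np.array([0.6, 0.4])                  # initial state probabilities
    trans = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition matrix (rows sum to 1)
    state_means = np.array([[-1.5, -1.5], [1.5, 1.5]])  # state-specific factor means

    loadings = rng.normal(size=(p, q_disc + q_noise))   # linear combination weights
    thresholds = np.array([-0.5, 0.5])                  # cut points: 3 ordinal categories

    states = np.empty((n_subjects, n_times), dtype=int)
    latent = np.empty((n_subjects, n_times, p))

    for i in range(n_subjects):
        s = rng.choice(n_states, p=init)
        for t in range(n_times):
            if t > 0:
                s = rng.choice(n_states, p=trans[s])
            states[i, t] = s
            # Discriminative factors: distribution depends on the hidden state.
            disc = state_means[s] + rng.normal(size=q_disc)
            # Noise factors: same standard normal distribution at every time,
            # independent of the hidden state (interpretive assumption).
            noise = rng.normal(size=q_noise)
            latent[i, t] = loadings @ np.concatenate([disc, noise])

    continuous = latent[:, :, :n_cont]
    # Ordinal variables: categories 0, 1, 2 from thresholding the last columns.
    ordinal = np.digitize(latent[:, :, n_cont:], thresholds)
    return states, continuous, ordinal


states, continuous, ordinal = simulate_mixed_hmm()
print(states.shape, continuous.shape, ordinal.shape)
```

In this sketch only the first block of factors carries information about the hidden states, so any clustering signal in the observed variables enters through the corresponding columns of the loading matrix.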


