Fuzzy clustering methodologies have gained attention for their ability to handle scenarios with unclear cluster assignments. They allow membership degrees to vary within [0,1], offering a more nuanced representation of clustering results. Among these techniques, Fuzzy K-Means (FKM) stands out by extending the conventional K-Means (KM) algorithm to accommodate fuzzy membership assignments. Moreover, in situations where certain variables obscure the underlying clustering structure, dimensionality reduction can enhance the identification of latent clustering dimensions. Factorial and Reduced K-Means address this in hard partitioning scenarios by simultaneously performing clustering and dimensionality reduction. This work introduces a novel approach in which fuzzy clustering and dimensionality reduction are performed simultaneously, with the goal of simplifying the interpretation of the identified dimensions. Unlike FKM, which treats all dimensions equally, the proposed method identifies only the dimensions relevant for clustering, while retaining the flexibility of FKM to assign non-binary membership degrees. This flexible model is formalized as a convex linear combination of the Factorial and Reduced K-Means objective functions in a fuzzy framework. It adapts to real-world scenarios where only some variables show a clustering structure, while the others do not clearly have this feature and may in fact mask the heterogeneity observed in the data. An ad-hoc methodology is developed to estimate the model's optimal parameters. The model's performance in recovering the clustering structure underlying the data is evaluated through an extensive simulation study under different scenarios. Moreover, the methodology is applied to classify a benchmark dataset to show the flexible features of the novel proposal compared to KM, FKM and standard dimensionality reduction models, all of which are particular cases of the proposed one.
In conclusion, our findings underscore the efficacy of the novel model as a tool for handling complex datasets. Results demonstrate its effectiveness in discovering the data structure and its main features, offering tailored and flexible clustering solutions.
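To make the notion of membership degrees in [0,1] concrete, the following is a minimal sketch of the standard Fuzzy K-Means alternating updates only. It is not the model proposed in this work: it performs no dimensionality reduction and does not combine the Factorial and Reduced K-Means objectives. The function name `fuzzy_k_means`, the fuzzifier `m=2.0`, the stopping tolerance and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def fuzzy_k_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard Fuzzy K-Means sketch (illustrative, not the proposed model).

    X          : (n, p) data matrix
    n_clusters : number of clusters
    m          : fuzzifier > 1 (assumed value; larger means softer memberships)
    Returns the (n, n_clusters) membership matrix and the centroids.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalised so that each row sums to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centroid update: means of the units weighted by u_ik ** m.
        W = U ** m
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distances between units and centroids.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d2_ik ** (-1 / (m - 1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centroids

# Toy data (assumed): two well-separated groups in two dimensions.
rng = np.random.default_rng(42)
X_demo = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
                    rng.normal(8.0, 1.0, size=(50, 2))])
U_demo, centroids_demo = fuzzy_k_means(X_demo, n_clusters=2)
print(U_demo[:3].round(3))  # each row holds degrees in [0, 1] summing to 1
```

A hard partition, as in KM, corresponds to the limiting case in which every row of the membership matrix contains a single 1 and zeros elsewhere.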
Advances in fuzzy clustering and dimensionality reduction / BOTTAZZI SCHENONE, Mariaelena; Vichi, Maurizio. - (2024), pp. 35-35. (Paper presented at the conference MBC2 2024 - Models and Learning in Clustering and Classification, 7th International Workshop, held in Catania).
Advances in fuzzy clustering and dimensionality reduction
Mariaelena Bottazzi Schenone; Maurizio Vichi
2024


