In the context of human activity pattern analysis, we adopt a fuzzy clustering around medoids approach to classify ordered sequences (paths). These sequences represent patterns of individual behavior in an actual or virtual space-time domain. A fuzzy approach suits path data because sequences of human activities are typically characterized by switching behaviors, which are likely to produce overlapping clusters. We adopt a partitioning around medoids strategy because, in human activity pattern analysis, it is useful to represent each cluster by an observed (not fictitious) prototype (medoid). To measure the distance between each pair of sequences we use the Levenshtein distance, which allows sequences of different lengths to be compared and explicitly takes into account the sequential nature of the data. We also consider two robust versions of the fuzzy clustering algorithm, based respectively on the noise cluster and on the trimming technique. Robust algorithms deal with noisy observations, which are likely to occur in this framework, and could therefore improve on the standard model. We show several applications to sequence data from different research areas, such as Web usage mining, travel behavior, and tourist and shopping paths. (C) 2012 Elsevier B.V. All rights reserved.
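The two core ingredients named in the abstract, the Levenshtein distance between paths and a fuzzy partition around observed medoids, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`levenshtein`, `memberships`, `update_medoids`, `fuzzy_c_medoids`), the fuzzifier `m = 2`, the toy paths, and the initial medoids are all assumptions made for illustration. The membership rule used here is the inverse-distance form commonly associated with fuzzy c-medoids, and the robust variants (noise cluster, trimming) are not covered.

```python
from typing import List, Sequence, Tuple


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute (or match)
        prev = curr
    return prev[-1]


def memberships(dist: List[List[float]], medoids: List[int],
                m: float = 2.0) -> List[List[float]]:
    """Fuzzy membership of every object in every cluster (each row sums to 1).

    Assumed rule: membership proportional to (1 / distance) ** (1 / (m - 1)).
    """
    u = []
    for i in range(len(dist)):
        d = [dist[i][c] for c in medoids]
        zeros = [k for k, v in enumerate(d) if v == 0]
        if zeros:
            # The object coincides with one or more medoids: split membership among them.
            row = [1.0 / len(zeros) if k in zeros else 0.0 for k in range(len(d))]
        else:
            w = [(1.0 / v) ** (1.0 / (m - 1.0)) for v in d]
            total = sum(w)
            row = [x / total for x in w]
        u.append(row)
    return u


def update_medoids(dist: List[List[float]], u: List[List[float]],
                   m: float = 2.0) -> List[int]:
    """For each cluster, pick the observed object with the smallest weighted distance sum."""
    n, k = len(dist), len(u[0])
    chosen: List[int] = []
    for c in range(k):
        # Skip objects already used as medoids so that medoids stay distinct.
        candidates = [q for q in range(n) if q not in chosen]
        best = min(candidates,
                   key=lambda q: sum((u[i][c] ** m) * dist[i][q] for i in range(n)))
        chosen.append(best)
    return chosen


def fuzzy_c_medoids(seqs: Sequence[Sequence], init_medoids: List[int],
                    m: float = 2.0, max_iter: int = 50
                    ) -> Tuple[List[int], List[List[float]]]:
    """Alternate membership and medoid updates until the medoids stop changing."""
    n = len(seqs)
    dist = [[levenshtein(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    medoids = list(init_medoids)
    for _ in range(max_iter):
        u = memberships(dist, medoids, m)
        new_medoids = update_medoids(dist, u, m)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, memberships(dist, medoids, m)


# Hypothetical toy paths: each letter stands for a visited location or page.
paths = ["ABCD", "ABCE", "ABDD", "XYZ", "XYZZ", "XYW"]
medoids, u = fuzzy_c_medoids(paths, init_medoids=[0, 3])
```

Because every medoid is an index into `paths`, each cluster is represented by an observed sequence rather than an artificial average, and each row of `u` shows how strongly a path belongs to each cluster.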
Fuzzy clustering of human activity patterns / D'Urso, Pierpaolo; Massari, Riccardo. - In: FUZZY SETS AND SYSTEMS. - ISSN 0165-0114. - ELECTRONIC. - 215:215(2013), pp. 29-54. [10.1016/j.fss.2012.05.009]
Fuzzy clustering of human activity patterns
D'URSO, Pierpaolo; MASSARI, Riccardo
2013