Anomaly detection across different domains: the role of generative models, self-supervised learning, hyperbolic neural networks and large language models / Flaborea, Alessandro. - (2024 May 28).
Anomaly detection across different domains: the role of generative models, self-supervised learning, hyperbolic neural networks and large language models
FLABOREA, ALESSANDRO
28/05/2024
Abstract
Anomaly detection plays a crucial role in assisting humans across various domains, from healthcare to security. In healthcare, it can identify both observable anomalies such as heart attacks and subtler ones such as signs of social isolation or depression. Similarly, in a surveillance environment it can detect incidents such as fights or crimes. Procedural mistake detection helps ensure that tasks and procedures are executed safely, preventing the harm that may follow when they are performed incorrectly. Recognizing anomalies means identifying rare and unexpected events or behaviors that deviate from normal patterns. The task is challenging because anomalies are subjective, context-dependent, and open-set, meaning that novel types of anomalies may emerge. This thesis presents novel methods for detecting anomalies in various scenarios, building on advances in graph neural networks, hyperbolic geometry, and generative models. The proposed approaches address limitations of existing methods, namely uncertainty estimation, the handling of multimodal human actions, and the scarcity of online and egocentric procedural mistake detection methods. First, HypAD is introduced, a novel method that uses hyperbolic neural networks to estimate uncertainty and detect anomalies in univariate and multivariate time series. HypAD outperforms the current state-of-the-art for univariate anomaly detection on established benchmarks based on data from NASA, Yahoo, Numenta, Amazon, and Twitter. HypAD is also tested on a multivariate dataset of anomalous activities in elderly home residences, where it detects anomalies in the daily routine of patients and provides explainable features. Next, two novel methods for detecting human-related anomalies in videos are presented: COSKAD and MoCoDAD. COSKAD uses graph convolutional networks and three different latent spaces (Euclidean, hyperbolic, and spherical) to model human poses and detect anomalies. All variants of COSKAD surpass the state-of-the-art on the UBnormal dataset, for which we contribute a human-related version with annotated skeletons. MoCoDAD is a novel generative model for video anomaly detection, which assumes that both normality and abnormality are multimodal. MoCoDAD leverages a diffusion probabilistic model to generate an array of possible future human poses and detect anomalies in human activities from videos. It is validated on four established benchmarks, where it surpasses state-of-the-art results. Then, PREGO, the first online one-class classification model for mistake detection in PRocedural EGOcentric videos, is introduced. PREGO uses an online action recognition component to model the current action and a symbolic reasoning module to predict the following actions. A mistake is detected when the predictions of the two modules do not match. PREGO is evaluated on two procedural egocentric video datasets, which we rearrange for online benchmarking of procedural mistake detection. Finally, collaborative human pose forecasting, which involves predicting the future poses of multiple individuals interacting with each other, is investigated. This line of work aims to improve anomaly detection systems by providing additional insights and features that enhance the understanding of abnormal behavior in human interactions and activities. In summary, this thesis advances the state-of-the-art in anomaly detection by proposing novel methods and techniques that can assist people in complex environments. We demonstrate the effectiveness and efficiency of our methods on various data modalities and scenarios and provide new datasets and benchmarks for future research. We hope our contributions inspire further research and lead to more robust and reliable anomaly detection systems.

| File | Size | Format |
|---|---|---|
| Tesi_dottorato_Flaborea.pdf (open access). Note: complete thesis. Type: doctoral thesis. License: Creative Commons. | 21.6 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.