Topological Anomalies for Trojaned Neural Networks detection / La Commare, Stefano; Ceccaroni, Riccardo; Brutti, Pierpaolo. - (2025). (Statistical Methods for Data Analysis and Decision Sciences (SDS25), Milano).
Topological Anomalies for Trojaned Neural Networks detection
Stefano La Commare; Riccardo Ceccaroni; Pierpaolo Brutti
2025
Abstract
Deep neural networks (DNNs) are increasingly vulnerable to adversarial attacks, particularly Trojan attacks, which embed malicious behaviors that are triggered by specific inputs. Detecting these attacks is challenging because the model manipulations are subtle and access to data is limited. Topological data analysis (TDA) has shown promise in identifying structural deviations within neural networks through Persistent Homology (PH). Prior work has leveraged PH to distinguish Trojaned models by summarizing topological features. We propose an enhanced detection framework that uses persistence images as features for classification. Our approach improves the robustness and accuracy of detection. Additionally, we introduce a randomized variant of the method that significantly reduces execution time while maintaining classification accuracy.
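To make the notion of a persistence image concrete, the following is a minimal sketch of how a persistence diagram can be turned into a fixed-length feature vector. It is not the authors' implementation: the function names `persistence_image` and `flatten`, the grid resolution, the Gaussian width `sigma`, the linear persistence weighting, and the toy diagram are all assumptions chosen for illustration. The Gaussian is evaluated at pixel centers rather than integrated over each pixel, which is a common simplification.

```python
import math


def persistence_image(diagram, resolution=8, sigma=0.1, bounds=(0.0, 1.0)):
    """Rasterize a persistence diagram onto a resolution x resolution grid.

    diagram: list of (birth, death) pairs with death >= birth.
    Each point is mapped to (birth, persistence), weighted linearly by its
    persistence, and smoothed with an isotropic Gaussian of width sigma.
    """
    lo, hi = bounds
    step = (hi - lo) / resolution
    # Pixel centers, shared by the birth axis and the persistence axis.
    centers = [lo + (i + 0.5) * step for i in range(resolution)]
    image = [[0.0] * resolution for _ in range(resolution)]
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    for birth, death in diagram:
        pers = death - birth
        if pers <= 0:
            continue  # points on the diagonal receive zero weight
        for r, y in enumerate(centers):      # rows index persistence
            for c, x in enumerate(centers):  # columns index birth
                d2 = (x - birth) ** 2 + (y - pers) ** 2
                image[r][c] += pers * norm * math.exp(-d2 / (2.0 * sigma ** 2))
    return image


def flatten(image):
    """Flatten the grid row by row into a feature vector."""
    return [value for row in image for value in row]


# Hypothetical toy diagram: one long-lived feature, one short-lived, one trivial.
diagram = [(0.1, 0.9), (0.2, 0.3), (0.5, 0.5)]
features = flatten(persistence_image(diagram))
```

The resulting vector has a fixed length regardless of how many points the diagram contains, which is what allows it to be passed to a standard classifier.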


