DARD: Deceptive Approaches for Robust Defense Against IP Theft / Mongardini, A. M.; La Morgia, M.; Jajodia, S.; Mancini, L. V.; Mei, A.. - In: IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY. - ISSN 1556-6013. - 19:(2024), pp. 5591-5606. [10.1109/TIFS.2024.3402433]
DARD: Deceptive Approaches for Robust Defense Against IP Theft
Mongardini A. M.; La Morgia M. (Co-first author; Investigation); Mancini L. V.; Mei A.
2024
Abstract
With the rise of smart working and recent global events, the risk of cyberattacks is increasing steadily. Some adversaries focus on stealing valuable data such as intellectual property (IP): they exfiltrate a large volume of IP documents from a target company and then use automated methods to identify the documents of interest. In this work, we propose DARD (Deceptive Approaches for Robust Defense against IP Theft), a framework designed to deceive adversaries who rely on automated approaches to classify exfiltrated documents. Starting from an original repository of documents, DARD automatically generates a new deceptive repository that misleads popular automated approaches, so that the resulting document clusters differ significantly from the actual ones. DARD thereby aims to hinder both accurate clustering and topic identification by such adversaries. The paper presents four deceptive operations (Basic Shuffle, Shuffle Increment, Shuffle Reduction, and Change Topic) that DARD uses to create a deceptive repository. We evaluate the efficacy of our approach against three types of adversaries with varying levels of knowledge and expertise. We show experimentally that DARD can deceive both topic modeling and document clustering techniques, including commercial tools such as Amazon Comprehend. As a result, our solution provides a robust defense mechanism against IP theft.