The binary similarity problem consists of determining whether two functions are similar, considering only their compiled form. Advanced techniques for binary similarity have recently gained momentum because they can be applied in several fields, such as copyright disputes, malware analysis, and vulnerability detection. In this paper we describe SAFE, a novel architecture for function representation based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions, and is more general, as it works on stripped binaries and on multiple architectures. Results from our experimental evaluation show that SAFE improves on the performance of previous solutions. Furthermore, we show that SAFE can be used in widely different use cases, thus providing a general solution for several application scenarios.

Function Representations for Binary Similarity / Massarelli, Luca; Di Luna, Giuseppe Antonio; Petroni, Fabio; Querzoni, Leonardo; Baldoni, Roberto. - In: IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING. - ISSN 1545-5971. - (2021), pp. -15. [10.1109/TDSC.2021.3051852]

Function Representations for Binary Similarity

Massarelli, Luca; Di Luna, Giuseppe Antonio; Querzoni, Leonardo; Baldoni, Roberto
2021

binary analysis; binary similarity; deep learning; malware
01 Journal publication::01a Journal article
Files attached to this item
File: Massarelli_postprint_Function-Representations_2021.pdf
Access: Open Access from 16/01/2022
Note: Article in press. https://ieeexplore.ieee.org/document/9325042
Type: Post-print document (version following peer review and accepted for publication)
License: All rights reserved
Size: 5.39 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1477115
Citations
  • PMC: n/a
  • Scopus: 3
  • ISI: 4