As Virtual Reality (VR) technologies advance, their application in privacy-sensitive contexts, such as meetings, lectures, simulations, and training, expands. These environments often involve conversations that contain privacy-sensitive information about users and the individuals with whom they interact. The presence of advanced sensors in modern VR devices raises concerns about possible side-channel attacks that exploit these sensor capabilities. In this paper, we introduce IMMERSPY, a novel acoustic side-channel attack that exploits motion sensors in VR devices to extract sensitive speech content from on-device speakers. We analyze two powerful attacker scenarios: informed attacker, where the attacker possesses labeled data about the victim, and uninformed attacker, where no prior victim information is available. We design a Mel-spectrogram CNN-LSTM model to extract digit information (e.g., social security or credit card numbers) by learning the speech-induced vibrations captured by motion sensors. Our experiments show that IMMERSPY detects four consecutive digits with 74% accuracy and 16-digit sequences, such as credit card numbers, with 62% accuracy. Additionally, we leverage Generative AI text-to-speech models in our attack experiments to illustrate how the attackers can create training datasets even without the need to use the victim’s labeled data. Our findings highlight the critical need for security measures in VR domains to mitigate evolving privacy risks. To address this, we introduce a defense technique that emits inaudible tones through the Head-Mounted Display (HMD) speakers, showing its effectiveness in mitigating acoustic side-channel attacks.
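The defense mentioned at the end of the abstract, emitting inaudible tones through the HMD speakers, can be pictured with a minimal sketch. This is only an illustration under stated assumptions, not the authors' implementation: the 20 kHz tone frequency, the 48 kHz sample rate, the 0.2 amplitude, and the helper names `masking_tone` and `mix` are all hypothetical and do not come from the paper.

```python
import math

def masking_tone(freq_hz, duration_s, sample_rate=48_000, amplitude=0.2):
    """Return sine samples in [-amplitude, amplitude] for a single tone.

    All default values are illustrative assumptions, not values from the paper.
    """
    if freq_hz >= sample_rate / 2:
        raise ValueError("tone must be below the Nyquist frequency")
    n = round(duration_s * sample_rate)
    step = 2.0 * math.pi * freq_hz / sample_rate
    return [amplitude * math.sin(step * i) for i in range(n)]

def mix(speech, tone):
    """Add the tone to the speech sample by sample, clipping to [-1, 1]."""
    return [max(-1.0, min(1.0, s + t)) for s, t in zip(speech, tone)]

# Hypothetical usage: a 10 ms tone at 20 kHz mixed into silent playback.
tone = masking_tone(freq_hz=20_000, duration_s=0.01)
playback = mix([0.0] * len(tone), tone)
```

In a sketch like this the tone is simply added to whatever the speaker plays, so any extra vibration it causes would reach the motion sensors together with the speech-induced vibration. Whether a given frequency and amplitude actually degrade the attack is an empirical question that the paper evaluates.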

Speak Up, I'm Listening: Extracting Speech from Zero-Permission VR Sensors / Cayir, Derin; Aburas, Reham Mohamed; Lazzeretti, Riccardo; Angelini, Marco; Acar, Abbas; Conti, Mauro; Celik, Z. Berkay; Uluagac, Selcuk. - (2025). ( Network and Distributed System Security (NDSS) Symposium 2025 San Diego; USA ) [10.14722/ndss.2025.240164].

Speak Up, I'm Listening: Extracting Speech from Zero-Permission VR Sensors

Lazzeretti, Riccardo; Uluagac, Selcuk
2025

2025
Network and Distributed System Security (NDSS) Symposium 2025
Virtual Reality Security; Side Channel Attack
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Cayir_Speak-up_2025.pdf
Access: open access
Note: https://doi.org/10.14722/ndss.2025.240164
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 6.52 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1754596