Research Products Catalog

Speak Up, I'm Listening: Extracting Speech from Zero-Permission VR Sensors

Cayir, Derin; Aburas, Reham Mohamed; Lazzeretti, Riccardo; Angelini, Marco; Acar, Abbas; Conti, Mauro; Celik, Z. Berkay; Uluagac, Selcuk

2025

Abstract

As Virtual Reality (VR) technologies advance, their application in privacy-sensitive contexts, such as meetings, lectures, simulations, and training, is expanding. These environments often involve conversations that contain private information about users and the individuals with whom they interact. The advanced sensors in modern VR devices raise concerns about side-channel attacks that exploit these sensor capabilities. In this paper, we introduce IMMERSPY, a novel acoustic side-channel attack that exploits motion sensors in VR devices to extract sensitive speech content from on-device speakers. We analyze two powerful attacker scenarios: an informed attacker, who possesses labeled data about the victim, and an uninformed attacker, who has no prior information about the victim. We design a Mel-spectrogram CNN-LSTM model to extract digit information (e.g., social security or credit card numbers) by learning the speech-induced vibrations captured by motion sensors. Our experiments show that IMMERSPY detects four consecutive digits with 74% accuracy and 16-digit sequences, such as credit card numbers, with 62% accuracy. Additionally, we use generative AI text-to-speech models in our attack experiments to show how attackers can create training datasets without the victim’s labeled data. Our findings highlight the critical need for security measures in VR to mitigate evolving privacy risks. To address this need, we introduce a defense technique that emits inaudible tones through the Head-Mounted Display (HMD) speakers and show that it is effective in mitigating acoustic side-channel attacks.

Publication year	2025
Conference name	Network and Distributed System Security (NDSS) Symposium 2025
Keywords	Virtual Reality Security; Side Channel Attack
Type	04 Publication in conference proceedings::04b Conference paper in volume
Citation	Speak Up, I'm Listening: Extracting Speech from Zero-Permission VR Sensors / Cayir, Derin; Aburas, Reham Mohamed; Lazzeretti, Riccardo; Angelini, Marco; Acar, Abbas; Conti, Mauro; Celik, Z. Berkay; Uluagac, Selcuk. - (2025). (Network and Distributed System Security (NDSS) Symposium 2025, San Diego, USA) [10.14722/ndss.2025.240164].
Belongs to type	04b Conference paper in volume

Files attached to this product

File	Size	Format
Cayir_Speak-up_2025.pdf (open access; Note: https://doi.org/10.14722/ndss.2025.240164; Type: Publisher's version (published version with the publisher's layout); License: Creative Commons)	6.52 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1754596
