LJ-TTS: A Paired Real and Synthetic Speech Dataset for Single-Speaker TTS Analysis / Negroni, Viola; Salvi, Davide; Comanducci, Luca; Wani, Taiba Majid; Uecker, Madleen; Amerini, Irene; Tubaro, Stefano; Bestagini, Paolo. - In: ELECTRONICS. - ISSN 2079-9292. - 15:1(2026). [10.3390/electronics15010169]
LJ-TTS: A Paired Real and Synthetic Speech Dataset for Single-Speaker TTS Analysis
Wani, Taiba Majid; Amerini, Irene
2026
Abstract
In this paper, we present LJ-TTS, a large-scale single-speaker dataset of real and synthetic speech designed to support research in text-to-speech (TTS) synthesis and analysis. The dataset builds upon high-quality recordings of a single English speaker, alongside outputs generated by 11 state-of-the-art TTS models, including both autoregressive and non-autoregressive architectures. By maintaining a controlled single-speaker setting, LJ-TTS enables precise comparison of speech characteristics across different generative models, isolating the effects of synthesis methods from speaker variability. Unlike multi-speaker datasets lacking alignment between real and synthetic samples, LJ-TTS provides exact utterance-level correspondence, allowing fine-grained analyses that are otherwise impractical. The dataset supports systematic evaluation of synthetic speech across multiple dimensions, including deepfake detection, source tracing, and phoneme-level analyses. LJ-TTS provides a standardized resource for benchmarking generative models, assessing the limits of current TTS systems, and developing robust detection and evaluation methods. The dataset is publicly available to the research community to foster reproducible and controlled studies in speech synthesis and synthetic speech detection.


