The purpose of this project was to derive a reliable estimate of the frequency of occurrence of the 30 phonemes of the Italian language (plus the geminated counterparts of the consonants), based on four selected written texts. Since no comparable dataset was found in the previous literature, the present analysis may serve as a reference for future studies. Four textual sources were considered: Come si fa una tesi di laurea: le materie umanistiche by Umberto Eco, I promessi sposi by Alessandro Manzoni, a recent article in Corriere della Sera (a popular Italian daily newspaper), and In altre parole by Jhumpa Lahiri. The sources were chosen to represent varied genres, subject matter, time periods, and writing styles. The analysis, which also included an analysis of variance, showed that, for all four sources, the frequencies of occurrence reached relatively stable values after about 6000 phonemes (approximately 1250 words), varying by less than 0.025%. Estimated frequencies are provided for each individual source and as an average across sources.
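The stabilization claim can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not say how the texts were transcribed into phonemes or exactly how variation was measured. The sketch assumes the text has already been converted to a sequence of phoneme symbols, and it reads "varying by less than 0.025%" as "no phoneme's running relative frequency, in percentage points, changes by 0.025 or more between consecutive checkpoints". The function names, the `step` parameter, and the `tol` parameter are all hypothetical. For scale, the figures in the abstract imply roughly 6000 / 1250 = 4.8 phonemes per word.

```python
from collections import Counter


def running_frequencies(phonemes, step=500):
    """Yield (n, freqs) after every `step` phonemes.

    `freqs` maps each phoneme symbol seen so far to its relative
    frequency, in percent, over the first n phonemes.
    """
    counts = Counter()
    for n, symbol in enumerate(phonemes, start=1):
        counts[symbol] += 1
        if n % step == 0:
            yield n, {p: 100.0 * c / n for p, c in counts.items()}


def stabilization_point(phonemes, step=500, tol=0.025):
    """Return the first checkpoint n at which every phoneme's running
    frequency changed by less than `tol` percentage points since the
    previous checkpoint, or None if that never happens.

    Assumption: this stability criterion is one possible reading of the
    abstract, not a documented part of the original analysis.
    """
    previous = None
    for n, freqs in running_frequencies(phonemes, step):
        if previous is not None:
            symbols = set(freqs) | set(previous)
            max_change = max(
                abs(freqs.get(s, 0.0) - previous.get(s, 0.0)) for s in symbols
            )
            if max_change < tol:
                return n
        previous = freqs
    return None
```

Applied to each of the four sources separately, a function like `stabilization_point` would give one checkpoint per text. Those checkpoints could then be compared with the approximately 6000 phonemes reported in the abstract.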

Estimation of the frequency of occurrence of Italian phonemes in text / Arango, Javier; Decaprio, Alec; Yao, Stephanie; Baik, Sunwoo; Shattuck-Hufnagel, Stefanie; Di Benedetto, Maria-Gabriella. - In: THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA. - ISSN 0001-4966. - 148:4(2020), pp. 2809-2809. [10.1121/1.5147826]

Estimation of the frequency of occurrence of Italian phonemes in text

Di Benedetto, Maria-Gabriella
2020

consonants; phoneme frequency; Italian
01 Journal publication::01h Abstract in journal


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1642894