Modeling the process by which a listener derives the words intended by a speaker requires a hypothesis on how lexical items are stored in memory. Stevens' model (2002) postulates that lexical items are stored in memory in terms of distinctive features, and that these features are hierarchically organized. The model highlights the importance of abrupt acoustic events, called landmarks, in the perception process. In this model, the detection of landmarks is primary in human perception and corresponds to the first phase of recognition; the temporal region around each landmark is then further processed by the listener. Based on this model, the Speech Communication Group of the Massachusetts Institute of Technology (MIT) developed a speech recognition system for spoken English over a span of more than 20 years. In the current work (LaMIT project, Lexical access Model for Italian), the model is applied to Italian. Exploring a new language will provide insight into the extent to which Stevens' approach applies across languages, with relevant implications for understanding how the human brain recognizes speech.

K. N. Stevens, "Toward a model for lexical access based on acoustic landmarks and distinctive features," J. Acoust. Soc. Am. 111(4), 1872–1891 (2002).
Speech recognition of spoken Italian based on detection of landmarks and other acoustic cues to distinctive features / Di Benedetto, Maria-Gabriella; Choi, Jeung-Yoon; Shattuck-Hufnagel, Stefanie; De Nardis, Luca; Budoni, Sara; Vivaldi, Jacopo; Arango, Javier; Decaprio, Alec; Yao, Stephanie. - In: THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA. - ISSN 0001-4966. - 148:4(2020), pp. 2808-2808. [10.1121/1.5147823]
Speech recognition of spoken Italian based on detection of landmarks and other acoustic cues to distinctive features
Di Benedetto, Maria-Gabriella; De Nardis, Luca
2020