Speech and Language Impairment Detection by Means of AI-Driven Audio-Based Techniques / Corvitto, L.; Faiella, L.; Napoli, C.; Puglisi, A.; Russo, S. - 3869 (2024), pp. 19-31. (Paper presented at the 9th International Conference of Yearly Reports on Informatics, Mathematics, and Engineering, ICYRIME 2024, held in Catania, Italy).
Speech and Language Impairment Detection by Means of AI-Driven Audio-Based Techniques
Napoli C. (co-first author; Supervision); Puglisi A. (co-first author; Investigation); Russo S. (co-first author; Conceptualization)
2024
Abstract
Speech and Language Impairments (SLI) affect a large and heterogeneous group of people. We propose a novel, easy-to-use, and immediate detection tool that helps diagnose SLI from speech audio signals, along with a new dataset of English speakers affected by SLI. We experiment with feature extraction methods such as Mel spectrograms and wav2vec 2.0, and with classification methods such as SVMs, CNNs, and linear neural networks. We also apply audio data augmentation to mitigate the data scarcity that is common in the medical field. Overall, the wav2vec 2.0 feature extractor paired with a linear classifier performs best, with an accuracy of over 96%.

File | Size | Format
---|---|---
Corvitto_Speech-Language-Impairment_2024.pdf (open access) | 1.52 MB | Adobe PDF

Note: https://ceur-ws.org/Vol-3869/p03.pdf
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.