This study describes and evaluates a multi-method approach to identifying and extracting collocations for a collocation dictionary aimed at learners of Italian. The approach integrates part-of-speech tagging and dependency parsing to extract six syntactic relations from a reference corpus of Italian. The initial set of candidates was progressively reduced using frequency, dispersion, and association measures. The resulting set was then evaluated by comparing it with existing collocation dictionaries and by gathering expert judgments on which collocations should be included. Combining these two evaluations further refined the list. The effect of the statistical measures on expert judgments was also investigated. Results revealed that dispersion and association measures positively influenced human evaluations, while higher frequency often correlated with negative ratings. This triangulation of corpus-based and statistical methods, human judgments, and comparison with existing dictionaries captures collocations that are widely used across genres and suitable for inclusion in a learner dictionary, offering a useful tool for learners while contributing to corpus-based collocation research.
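The filtering step summarized above (reducing candidates by frequency, dispersion, and association) can be illustrated with a minimal sketch. The abstract does not name the specific measures or thresholds used, so everything here is an assumption made for illustration: raw co-occurrence count stands in for frequency, range (the number of corpus sections in which a pair occurs) stands in for dispersion, and pointwise mutual information stands in for the association measure. The function names `score_candidates` and `filter_candidates`, the toy data, and the cut-off values are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def score_candidates(sections):
    """Score word pairs extracted from a corpus split into sections.

    `sections` maps a section (e.g. genre) name to a list of (word1, word2)
    pairs. Assumed measures, not the study's: raw frequency, range across
    sections as a simple dispersion proxy, and pointwise mutual information.
    """
    pair_freq = Counter()   # how often each pair occurs overall
    w1_freq = Counter()     # how often each word fills the first slot
    w2_freq = Counter()     # how often each word fills the second slot
    pair_range = Counter()  # in how many sections each pair occurs

    for pairs in sections.values():
        pair_freq.update(pairs)
        w1_freq.update(w1 for w1, _ in pairs)
        w2_freq.update(w2 for _, w2 in pairs)
        pair_range.update(set(pairs))  # count each pair once per section

    total = sum(pair_freq.values())
    scores = {}
    for (w1, w2), freq in pair_freq.items():
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        pmi = math.log2(freq * total / (w1_freq[w1] * w2_freq[w2]))
        scores[(w1, w2)] = {
            "freq": freq,
            "range": pair_range[(w1, w2)],
            "pmi": pmi,
        }
    return scores


def filter_candidates(scores, min_freq, min_range, min_pmi):
    """Keep pairs that pass all three (hypothetical) thresholds."""
    return sorted(
        pair
        for pair, s in scores.items()
        if s["freq"] >= min_freq
        and s["range"] >= min_range
        and s["pmi"] >= min_pmi
    )


# Invented toy data: verb-object pairs from two hypothetical sections.
toy_sections = {
    "news": [("prendere", "decisione"), ("prendere", "decisione"),
             ("fare", "cosa"), ("fare", "domanda")],
    "fiction": [("prendere", "decisione"), ("fare", "cosa"),
                ("fare", "cosa"), ("prendere", "cosa")],
}
toy_scores = score_candidates(toy_sections)
kept = filter_candidates(toy_scores, min_freq=2, min_range=2, min_pmi=0.9)
```

In this toy setting, a frequent but weakly associated pair such as "fare cosa" passes the frequency and range thresholds yet fails the association threshold, which is the kind of case where combining measures matters more than frequency alone.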

Developing a learner dictionary of collocations: description and evaluation of a multi-method approach / Spina, S.; Fioravanti, I.; Zanda, F.; Forti, L.; Perri, D.; Gervasi, O. - In: CORPUS LINGUISTICS AND LINGUISTIC THEORY. - ISSN 1613-7027. - (2026). [10.1515/cllt-2025-0008]

Developing a learner dictionary of collocations: description and evaluation of a multi-method approach

Zanda, F.
2026

collocation; learner dictionary; L2 Italian; frequency; dispersion; association measures
01 Journal publication::01a Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1758628