In this paper, we study the expansion of pluralia tantum, i.e., defective nouns which lack a singular form, like scissors. We base our work on an annotation framework specifically developed for the study of lexicalization of pluralia tantum, namely Lexicalization profiles. On a corresponding hand-annotated test set, we show that the OpenAI and DeepSeek models are useful annotators for semantic, syntactic and sense categories, with accuracy ranging from 51% to 89%, averaged across all feature groups and languages. Next, we turn to a large-scale investigation of pluralia tantum. Using dictionaries, we extract candidate words for Italian, Russian and English and keep those for which a changing ratio of singular to plural forms is evident in a corresponding reference corpus. We use an LLM to annotate each instance from the reference corpora according to the annotation framework. We show that the large number of automatically annotated sentences for each feature can be used to perform in-depth linguistic analysis. Focusing on the correlation between an annotated feature and the grammatical form (singular vs. plural), we note patterns of morpho-semantic change.
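The candidate-filtering step mentioned in the abstract (keeping words whose singular/plural ratio changes in a reference corpus) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the corpus is split into time slices, and the names `PeriodCounts`, `ratio_shift`, `keep_candidates`, the `min_shift` threshold, and the toy counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PeriodCounts:
    """Occurrences of one lemma in one corpus slice (hypothetical structure)."""
    period: str
    singular: int
    plural: int


def plural_ratio(c: PeriodCounts) -> Optional[float]:
    """Share of plural occurrences in a slice, or None if the lemma is absent."""
    total = c.singular + c.plural
    return c.plural / total if total else None


def ratio_shift(counts: List[PeriodCounts]) -> float:
    """Largest difference in plural share across the slices where the lemma occurs."""
    ratios = [r for r in (plural_ratio(c) for c in counts) if r is not None]
    if len(ratios) < 2:
        return 0.0
    return max(ratios) - min(ratios)


def keep_candidates(corpus_counts: Dict[str, List[PeriodCounts]],
                    min_shift: float = 0.2) -> List[str]:
    """Keep lemmas whose plural share moves by at least min_shift (assumed threshold)."""
    return [w for w, counts in corpus_counts.items()
            if ratio_shift(counts) >= min_shift]


# Toy counts invented for illustration only; they are not results from the paper.
toy_counts = {
    "scissors": [PeriodCounts("early", 40, 60), PeriodCounts("late", 5, 95)],
    "table": [PeriodCounts("early", 70, 30), PeriodCounts("late", 68, 32)],
}
kept = keep_candidates(toy_counts)
```

With these toy counts, only the first lemma passes the filter, since its plural share moves from 0.60 to 0.95 while the second barely changes. The max-minus-min measure is a deliberately simple stand-in; the paper may define "changing ratio" differently.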

Elections go bananas: A First Large-scale Multilingual Study of Pluralia Tantum using LLMs / Spaziani, Elena; Zeinalipour, Kamyar; Cassotti, Pierluigi; Tahmasebi, Nina. - (2026), pp. 6547-6570. (19th Conference of the European Chapter of the Association for Computational Linguistics, Rabat, Morocco) [10.18653/v1/2026.eacl-long.308].

Elections go bananas: A First Large-scale Multilingual Study of Pluralia Tantum using LLMs

Spaziani, Elena
First author
2026

19th Conference of the European Chapter of the Association for Computational Linguistics
pluralia tantum
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1763942