Word Sense Disambiguation (WSD) is a key task in Natural Language Processing (NLP), involving selecting the correct meaning of a word based on its context. With Pretrained Language Models (PLMs) like BERT and DeBERTa now well established, significant progress has been made in understanding contextual semantics. Nevertheless, how well these models inherently disambiguate word senses remains uncertain. In this work, we evaluate several encoder-only PLMs across two popular inventories (i.e., WordNet and the Oxford Dictionary of English, ODE) by analyzing their ability to separate word senses without any task-specific fine-tuning. We compute centroids of word senses and measure similarity to assess performance across different layers. Our results show that DeBERTa-v3 delivers the best performance on the task, with the middle layers (specifically the 7th and 8th layers) achieving the highest accuracy, outperforming the output layer by approximately 15 percentage points. Our experiments also explore the inherent structure of WordNet and ODE sense inventories, highlighting their influence on the overall model behavior and performance. Finally, based on our findings, we develop a small, efficient model for the WSD task that attains robust performance while significantly reducing the carbon footprint.
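
The centroid-and-similarity evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, mean-pooled centroids, the toy two-dimensional vectors, the sense labels, and all function names (`sense_centroids`, `predict_sense`, `layer_accuracy`) are assumptions made for illustration. In a real setting, the embeddings would come from one hidden layer of a frozen encoder, and the procedure would be repeated per layer.

```python
import numpy as np

def sense_centroids(embeddings, sense_labels):
    """Average all embeddings that share a sense label into one centroid per sense."""
    centroids = {}
    for sense in sorted(set(sense_labels)):
        rows = [e for e, s in zip(embeddings, sense_labels) if s == sense]
        centroids[sense] = np.mean(rows, axis=0)
    return centroids

def cosine(u, v):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def predict_sense(embedding, centroids):
    """Pick the sense whose centroid is most similar to the given embedding."""
    return max(centroids, key=lambda s: cosine(embedding, centroids[s]))

def layer_accuracy(train_emb, train_labels, test_emb, test_labels):
    """Nearest-centroid accuracy for embeddings taken from a single layer."""
    centroids = sense_centroids(train_emb, train_labels)
    hits = sum(predict_sense(e, centroids) == y
               for e, y in zip(test_emb, test_labels))
    return hits / len(test_labels)

# Toy stand-ins for contextual embeddings of the word "bank" (hypothetical data).
train_emb = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]])
train_labels = ["bank%finance", "bank%finance", "bank%river", "bank%river"]
test_emb = np.array([[0.8, 0.2], [0.2, 0.9]])
test_labels = ["bank%finance", "bank%river"]

acc = layer_accuracy(train_emb, train_labels, test_emb, test_labels)
print(f"nearest-centroid accuracy: {acc:.2f}")
```

Calling `layer_accuracy` once per layer and comparing the scores is one way to see which layers separate senses best without any fine-tuning.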

How Much Do Encoder Models Know About Word Senses? / Teglia, Simone; Tedeschi, Simone; Navigli, Roberto. - (2025), pp. 2266-2277. (Association for Computational Linguistics, Vienna, Austria) [10.18653/v1/2025.acl-long.113].

How Much Do Encoder Models Know About Word Senses?

Teglia, Simone; Tedeschi, Simone; Navigli, Roberto
2025

Association for Computational Linguistics
nlp; wsd; encoder models
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Teglia_How-much_2025.pdf
Access: open access
Note: https://aclanthology.org/2025.acl-long.113.pdf
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 410.07 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1747704