Annotating large numbers of sentences with senses is the heaviest requirement of current Word Sense Disambiguation. We present Train-O-Matic, a language-independent method for generating millions of sense-annotated training instances for virtually all meanings of words in a language’s vocabulary. The approach is fully automatic: no human intervention is required and the only type of human knowledge used is a WordNet-like resource. Train-O-Matic achieves consistently state-of-the-art performance across gold-standard datasets and languages, while at the same time removing the burden of manual annotation. All the training data is available for research purposes at http://trainomatic.org.

Train-O-Matic: large-scale supervised Word Sense Disambiguation in multiple languages without manual training data / Pasini, Tommaso; Navigli, Roberto. - ELECTRONIC. - --:(2017), pp. 78-88. (Paper presented at the EMNLP conference held in Copenhagen, 7/9/2017 - 11/09/2017) [10.18653/v1/D17-1008].

Train-O-Matic: large-scale supervised Word Sense Disambiguation in multiple languages without manual training data

Tommaso Pasini; Roberto Navigli
2017

EMNLP
WSD; Word Sense Disambiguation; Knowledge Acquisition Bottleneck
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Pasini_Train_2017.pdf
Access: open access
Type: Publisher's version (published with the publisher's layout)
License: All rights reserved
Size: 582.03 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1023434
Citations
  • PubMed Central: not available
  • Scopus: 45
  • Web of Science (ISI): not available