Inducing word senses to improve Web search result clustering / Navigli, Roberto; Crisafulli, G. - PRINT. - (2010), pp. 116-126. (Paper presented at the Conference on Empirical Methods in Natural Language Processing, EMNLP 2010, held in Cambridge, Boston, USA, October 9-11, 2010).
Inducing word senses to improve Web search result clustering
NAVIGLI, ROBERTO
2010
Abstract
In this paper, we present a novel approach to Web search result clustering based on the automatic discovery of word senses from raw text, a task referred to as Word Sense Induction (WSI). We first acquire the senses (i.e., meanings) of a query by means of a graph-based clustering algorithm that exploits cycles (triangles and squares) in the co-occurrence graph of the query. Then we cluster the search results based on their semantic similarity to the induced word senses. Our experiments, conducted on datasets of ambiguous queries, show that our approach improves search result clustering in terms of both clustering quality and degree of diversification. © 2010 Association for Computational Linguistics.
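The two-step pipeline described in the abstract (induce senses from the query's co-occurrence graph, then assign search results to those senses) can be illustrated with a minimal sketch. The abstract only states that the algorithm exploits triangles and squares; everything concrete below is an assumption made for illustration: the edge score (share of possible triangles through an edge that actually exist), the 0.5 threshold, the use of connected components as senses, the word-overlap similarity, and all function names. Squares are not handled here, and the toy snippets are invented, not taken from the paper's datasets.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(snippets, query):
    """Undirected graph: words are nodes, edges link words sharing a snippet."""
    graph = defaultdict(set)
    for snippet in snippets:
        words = set(snippet.lower().split()) - {query}
        for u, v in combinations(sorted(words), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


def triangle_score(graph, u, v):
    """Assumed score: existing triangles through (u, v) over possible ones."""
    possible = min(len(graph[u]), len(graph[v])) - 1
    if possible <= 0:
        return 0.0
    return len(graph[u] & graph[v]) / possible


def induce_senses(graph, threshold=0.5):
    """Drop weakly supported edges; each remaining component is one sense."""
    pruned = defaultdict(set)
    for u in graph:
        for v in graph[u]:
            if u < v and triangle_score(graph, u, v) >= threshold:
                pruned[u].add(v)
                pruned[v].add(u)
    senses, seen = [], set()
    for start in sorted(pruned):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(pruned[node] - component)
        seen |= component
        senses.append(component)
    return senses


def cluster_results(snippets, senses, query):
    """Assign each snippet to the sense sharing the most words (-1 if none)."""
    labels = []
    for snippet in snippets:
        words = set(snippet.lower().split()) - {query}
        overlaps = [len(words & sense) for sense in senses]
        best = max(range(len(senses)), key=overlaps.__getitem__, default=-1)
        labels.append(best if best >= 0 and overlaps[best] > 0 else -1)
    return labels


# Toy data (hypothetical): two meanings of "jaguar" plus one noisy snippet
# that links the two meanings through the edge speed-prey.
snippets = [
    "jaguar car engine speed",
    "jaguar car engine dealer",
    "jaguar cat jungle prey",
    "jaguar cat jungle habitat",
    "jaguar speed prey",
]
graph = build_cooccurrence_graph(snippets, "jaguar")
senses = induce_senses(graph)
labels = cluster_results(snippets, senses, "jaguar")
```

In this toy run the edge between "speed" and "prey" lies on no triangle, so it is pruned and the graph splits into a car-related and an animal-related component; the first four snippets then fall into those two clusters. The last snippet overlaps both senses equally, which is the kind of tie a real similarity measure would need to resolve.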