In this paper we introduce the problem of query covering as a means to efficiently cache query results. The general idea is to populate the cache with documents that contribute to the result pages of a large number of queries, as opposed to caching the top documents for each query. It turns out that the problem is hard and solving it requires knowledge of the structure of the queries and the results space, as well as knowledge of the input query distribution. We formulate the problem under the framework of stochastic optimization; theoretically it can be seen as a stochastic universal version of set multicover. While the problem is NP-hard to be solved exactly, we show that for any distribution it can be approximated using a simple greedy approach. Our theoretical findings are complemented by experimental activity on real datasets, showing the feasibility and potential interest of query-covering approaches in practice. Copyright 2011 ACM.

Stochastic query covering / Anagnostopoulos, Aristidis; Becchetti, Luca; Leonardi, Stefano; Ida, Mele; Piotr, Sankowski. - (2011), pp. 725-734. (Intervento presentato al convegno 4th ACM International Conference on Web Search and Data Mining, WSDM 2011 tenutosi a Hong Kong; China nel 9 February 2011 through 12 February 2011) [10.1145/1935826.1935923].

Stochastic query covering

ANAGNOSTOPOULOS, ARISTIDIS;BECCHETTI, Luca;LEONARDI, Stefano;
2011

Abstract

In this paper we introduce the problem of query covering as a means to efficiently cache query results. The general idea is to populate the cache with documents that contribute to the result pages of a large number of queries, as opposed to caching the top documents for each query. It turns out that the problem is hard and solving it requires knowledge of the structure of the queries and the results space, as well as knowledge of the input query distribution. We formulate the problem under the framework of stochastic optimization; theoretically it can be seen as a stochastic universal version of set multicover. While the problem is NP-hard to be solved exactly, we show that for any distribution it can be approximated using a simple greedy approach. Our theoretical findings are complemented by experimental activity on real datasets, showing the feasibility and potential interest of query-covering approaches in practice. Copyright 2011 ACM.
2011
4th ACM International Conference on Web Search and Data Mining, WSDM 2011
caching; query covering; stochastic analysis
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Stochastic query covering / Anagnostopoulos, Aristidis; Becchetti, Luca; Leonardi, Stefano; Ida, Mele; Piotr, Sankowski. - (2011), pp. 725-734. (Intervento presentato al convegno 4th ACM International Conference on Web Search and Data Mining, WSDM 2011 tenutosi a Hong Kong; China nel 9 February 2011 through 12 February 2011) [10.1145/1935826.1935923].
File allegati a questo prodotto
File Dimensione Formato  
VE_2011_11573-377393.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 434.96 kB
Formato Adobe PDF
434.96 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/377393
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? ND
social impact