Modern information retrieval systems use several levels of caching to speed up computation by exploiting frequent, recent or costly data used in the past. Previous studies show that caching techniques are crucial in search engines, as they help reduce query response times and processing workloads on search servers. In this work we propose and evaluate a static cache that acts simultaneously as a list cache and an intersection cache, offering a more efficient way of handling cache space. We also use a query resolution strategy that takes advantage of the existence of this cache to reorder the query execution sequence. In addition, we propose effective strategies to select the term pairs that should populate the cache. We also represent the cached data in both raw and compressed forms and evaluate the differences between them using different configurations of cache sizes. The results show that the proposed Integrated Cache outperforms the standard posting lists cache in most cases, taking advantage not only of the intersection cache but also of the query resolution strategy.
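As a rough illustration of the idea described in the abstract, the following Python sketch shows how a single static cache entry per term pair could answer both posting-list and intersection lookups, and how a query could be resolved by covering its terms with cached pairs first. It is a minimal sketch under stated assumptions, not the authors' implementation: the three-part entry layout (documents with only the first term, documents with both, documents with only the second), the greedy pair-covering order, and the names `IntegratedCache`, `posting_list`, `intersection` and `resolve` are all hypothetical, and the in-memory `index` stands in for a real inverted index.

```python
from itertools import combinations


class IntegratedCache:
    """Toy static cache: one entry per term pair serves lists and intersections."""

    def __init__(self, index, pairs):
        # index: term -> sorted list of doc ids (hypothetical in-memory index).
        # pairs: term pairs chosen offline to populate the static cache.
        self._by_pair = {}
        self._by_term = {}
        for a, b in pairs:
            docs_a, docs_b = set(index[a]), set(index[b])
            entry = {
                "terms": (a, b),
                "only": {a: sorted(docs_a - docs_b), b: sorted(docs_b - docs_a)},
                "both": sorted(docs_a & docs_b),  # the cached intersection
            }
            self._by_pair[frozenset((a, b))] = entry
            self._by_term.setdefault(a, entry)
            self._by_term.setdefault(b, entry)

    def intersection(self, a, b):
        """Return the cached intersection of a and b, or None on a miss."""
        entry = self._by_pair.get(frozenset((a, b)))
        return None if entry is None else entry["both"]

    def posting_list(self, term):
        """Rebuild the full list of a cached term, or return None on a miss."""
        entry = self._by_term.get(term)
        if entry is None:
            return None
        return sorted(entry["only"][term] + entry["both"])


def resolve(terms, cache, index):
    """Conjunctive query: use cached pair intersections first, then single lists."""
    remaining = set(terms)
    partials = []
    # Assumed greedy strategy: cover query terms with disjoint cached pairs.
    for a, b in combinations(sorted(remaining), 2):
        if a in remaining and b in remaining:
            hit = cache.intersection(a, b)
            if hit is not None:
                partials.append(set(hit))
                remaining -= {a, b}
    # Terms not covered by a pair come from the cache if possible, else the index.
    for term in remaining:
        cached = cache.posting_list(term)
        partials.append(set(cached if cached is not None else index[term]))
    if not partials:
        return []
    partials.sort(key=len)  # intersect the shortest partial results first
    result = partials[0]
    for partial in partials[1:]:
        result &= partial
    return sorted(result)


index = {
    "cache": [1, 2, 3, 5, 8],
    "search": [2, 3, 5, 7],
    "engine": [3, 5, 8, 9],
}
cache = IntegratedCache(index, [("cache", "search")])
answer = resolve(["cache", "search", "engine"], cache, index)
```

In this toy run the pair ("cache", "search") is answered from the cached intersection, so only the list for "engine" has to be fetched from the index before the final intersection. The sketch stores doc ids as plain lists, which corresponds loosely to the raw form mentioned in the abstract; a compressed representation is left out.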
Performance improvements for search systems using an integrated cache of lists + intersections / Tolosa, Gabriel; Feuerstein, Esteban; Becchetti, Luca; MARCHETTI SPACCAMELA, Alberto. - In: INFORMATION RETRIEVAL. - ISSN 1386-4564. - PRINT. - 20:3(2017), pp. 172-198. [10.1007/s10791-017-9299-5]
Performance improvements for search systems using an integrated cache of lists + intersections
BECCHETTI, Luca;MARCHETTI SPACCAMELA, Alberto
2017
File | Access | Type | License | Size | Format
---|---|---|---|---|---
Tolosa_Preprint_Performance-eImprovements_2017.pdf | Open access | Pre-print (manuscript submitted to the publisher, before peer review) | All rights reserved | 933.71 kB | Adobe PDF
Tolosa_Performance-eImprovements_2017.pdf | Archive managers only | Publisher's version (published version with the publisher's layout) | All rights reserved | 2.89 MB | Adobe PDF

Note (pre-print file): https://link.springer.com/article/10.1007/s10791-017-9299-5
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.