Catalogo dei prodotti della ricerca

We introduce the similarity caching problem, a variant of classical caching in which an algorithm can return an element from the cache that is similar, but not necessarily identical, to the query element. We are motivated by buffer management questions in approximate nearest-neighbor applications, especially in the context of caching targeted advertisements on the web. Formally, we assume the queries lie in a metric space, with distance function d(.,.). A query p is considered a cache hit if there is a point q in the cache that is sufficiently close to p, i.e., for a threshold radius r, we have d(p, q) <= r. The goal is then to minimize the number of cache misses, vis-a-vis the optimal algorithm. As with classical caching, we use the competitive ratio to measure the performance of different algorithms. While similarity caching is a strict generalization of classical caching, we show that unless the algorithm is allowed extra power (either in the size of the cache or the threshold r) over the optimal offline algorithm, the problem is intractable. We then proceed to quantify the hardness as a function of the complexity of the underlying metric space. We show that the problem becomes easier as we proceed from general metric spaces to those of bounded doubling dimension, and to Euclidean metrics. Finally, we investigate several extensions of the problem: dependence of the threshold r on the query and a smoother trade-off between the cache-miss cost and the query-query similarity.

Similarity Caching / Chierichetti, F., Ravi, K., Sergei, V.. - (2009), pp. 127-135. (28th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS) Providence, RI JUN 29-JUL 01, 2009) [10.1145/1559795.1559815].

Similarity Caching

CHIERICHETTI, FLAVIO;Ravi Kumar;Sergei Vassilvitskii

2009

Abstract

We introduce the similarity caching problem, a variant of classical caching in which an algorithm can return an element from the cache that is similar, but not necessarily identical, to the query element. We are motivated by buffer management questions in approximate nearest-neighbor applications, especially in the context of caching targeted advertisements on the web. Formally, we assume the queries lie in a metric space, with distance function d(.,.). A query p is considered a cache hit if there is a point q in the cache that is sufficiently close to p, i.e., for a threshold radius r, we have d(p, q) <= r. The goal is then to minimize the number of cache misses, vis-a-vis the optimal algorithm. As with classical caching, we use the competitive ratio to measure the performance of different algorithms. While similarity caching is a strict generalization of classical caching, we show that unless the algorithm is allowed extra power (either in the size of the cache or the threshold r) over the optimal offline algorithm, the problem is intractable. We then proceed to quantify the hardness as a function of the complexity of the underlying metric space. We show that the problem becomes easier as we proceed from general metric spaces to those of bounded doubling dimension, and to Euclidean metrics. Finally, we investigate several extensions of the problem: dependence of the threshold r on the query and a smoother trade-off between the cache-miss cost and the query-query similarity.

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2009
			
	Nome convegno
	
				28th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS)
			
	Parole chiave
	
				buffer management; caching; competitive analysis; nearest-neighbor
			
	Tipologia
	
				04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
			
	Citazione
	
				Similarity Caching / Chierichetti, F., Ravi, K., Sergei, V.. - (2009), pp. 127-135. (28th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS) Providence, RI JUN 29-JUL 01, 2009) [10.1145/1559795.1559815].
			
	Appartiene alla tipologia:
	
				04b Atto di convegno in volume

File allegati a questo prodotto

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/497784

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

19

17

social impact