We study the problem of computing similarity rankings in large-scale multi-categorical bipartite graphs, where the two sides of the graph represent actors and items, and the items are partitioned into an arbitrary set of categories. The problem has several real-world applications, including identifying competing advertisers and suggesting related queries in an online advertising system or finding users with similar interests and suggesting content to them. In these settings, we are interested in computing on-the-fly rankings of similar actors, given an actor and an arbitrary subset of categories of interest. Two main challenges arise: First, the bipartite graphs are huge and often lopsided (e.g. the system might receive billions of queries while presenting only millions of advertisers). Second, the sheer number of possible combinations of categories prevents the pre-computation of the results for all of them. We present a novel algorithmic framework that addresses both issues for the computation of several graph-theoretical similarity measures, including # common neighbors, and Personalized PageRank. We show how to tackle the imbalance in the graphs to speed up the computation and provide efficient real-time algorithms for computing rankings for an arbitrary subset of categories. Finally, we show experimentally the accuracy of our approach with real-world data, using both public graphs and a very large dataset from Google AdWords. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Reduce and Aggregate: Similarity Ranking in Multi-Categorical Bipartite Graphs / Epasto, Alessandro; J., Feldman; Lattanzi, Silvio; Leonardi, Stefano; V., Mirrokni. - STAMPA. - (2014), pp. 349-360. (Intervento presentato al convegno 23rd International World Wide Web Conference tenutosi a Seoul; Korea, Republic of nel April 7-11, 2014) [10.1145/2566486.2568025].

Reduce and Aggregate: Similarity Ranking in Multi-Categorical Bipartite Graphs

EPASTO, ALESSANDRO;LATTANZI, SILVIO;LEONARDI, Stefano;
2014

Abstract

We study the problem of computing similarity rankings in large-scale multi-categorical bipartite graphs, where the two sides of the graph represent actors and items, and the items are partitioned into an arbitrary set of categories. The problem has several real-world applications, including identifying competing advertisers and suggesting related queries in an online advertising system or finding users with similar interests and suggesting content to them. In these settings, we are interested in computing on-the-fly rankings of similar actors, given an actor and an arbitrary subset of categories of interest. Two main challenges arise: First, the bipartite graphs are huge and often lopsided (e.g. the system might receive billions of queries while presenting only millions of advertisers). Second, the sheer number of possible combinations of categories prevents the pre-computation of the results for all of them. We present a novel algorithmic framework that addresses both issues for the computation of several graph-theoretical similarity measures, including # common neighbors, and Personalized PageRank. We show how to tackle the imbalance in the graphs to speed up the computation and provide efficient real-time algorithms for computing rankings for an arbitrary subset of categories. Finally, we show experimentally the accuracy of our approach with real-world data, using both public graphs and a very large dataset from Google AdWords. Copyright is held by the International World Wide Web Conference Committee (IW3C2).
2014
23rd International World Wide Web Conference
Graph mining; Similarity ranking; Bipartite graphs; Parsonalized PageRank; Markov Chains; Random Walks
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Reduce and Aggregate: Similarity Ranking in Multi-Categorical Bipartite Graphs / Epasto, Alessandro; J., Feldman; Lattanzi, Silvio; Leonardi, Stefano; V., Mirrokni. - STAMPA. - (2014), pp. 349-360. (Intervento presentato al convegno 23rd International World Wide Web Conference tenutosi a Seoul; Korea, Republic of nel April 7-11, 2014) [10.1145/2566486.2568025].
File allegati a questo prodotto
File Dimensione Formato  
VE_2014_11573-539412.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 495.69 kB
Formato Adobe PDF
495.69 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/539412
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 8
  • ???jsp.display-item.citation.isi??? 3
social impact