We propose a model for the synthetic generation of information cascades in social media. In our model the information “memes” propagating in the social network are characterized by a probability distribution in a topic space, accompanied by a textual description, i.e., a bag of keywords coherent with the topic distribution. Similarly, every social media user is described by a vector of interests defined over the same topic space. Information cascades are governed by the topic of the meme, its level of virality, the interests of each user, community pressure, and social influence. The main technical challenge we face towards our goal is the generation of realistic interest vectors, given a known network structure and a tunable level of homophily. We tackle this problem by means of a method based on non-negative matrix factorization, which is shown experimentally to outperform non-trivial baselines based on label propagation and random-walk-based graph embedding. As we showcase in our experiments, our model offers a small set of simple and easily interpretable “knobs” which allow one to study, in vitro, how each set of assumptions affects the resulting propagations. Finally, we show how to generate synthetic cascades whose macro-statistics are similar to those of real-world cascades, for a dataset containing both the network and the cascades.
Generating realistic interest-driven information cascades / Cinus, Federico; Bonchi, Francesco; Monti, Corrado; Panisson, André. - (2020), pp. 107-118. (Paper presented at the 14th International AAAI Conference on Web and Social Media, ICWSM 2020, held in Atlanta, Virtual).
Generating realistic interest-driven information cascades
Cinus Federico; Bonchi Francesco
2020
File | Size | Format |
---|---|---|
Cinus_Generating_2020.pdf (open access) | 1.04 MB | Adobe PDF |

Note: https://ojs.aaai.org/index.php/ICWSM/article/view/7283
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.