Social resource sharing systems like YouTube and del.icio.us have acquired a large number of users within the last few years. They provide rich resources for data analysis, information retrieval, and knowledge discovery applications. A first step towards this end is to gain better insight into the content and structure of these systems. In this paper, we analyse the main network characteristics of two of these systems. We consider their underlying data structures, so-called folksonomies, as tripartite hypergraphs, and adapt classical network measures like characteristic path length and clustering coefficient to them. Subsequently, we introduce a network of tag co-occurrence and investigate some of its statistical properties, focusing on correlations in node connectivity and pointing out features that reflect emergent semantics within the folksonomy. We show that simple statistical indicators unambiguously spot non-social behavior such as spam. © 2007 - IOS Press and the authors. All rights reserved.
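As an illustration of the tag co-occurrence network mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes a folksonomy is given as (user, tag, resource) triples, and that two tags co-occur when the same user assigns both to the same resource; the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(assignments):
    """Build a weighted tag co-occurrence network.

    assignments: iterable of (user, tag, resource) triples, i.e. the
    hyperedges of the tripartite hypergraph.
    Assumption: two tags co-occur when one user attaches both of them
    to the same resource (one "post").
    Returns a Counter mapping sorted tag pairs to co-occurrence counts.
    """
    # Group tags by post, where a post is one (user, resource) pair.
    posts = {}
    for user, tag, resource in assignments:
        posts.setdefault((user, resource), set()).add(tag)

    # Every unordered pair of tags within a post adds 1 to the edge weight.
    weights = Counter()
    for tags in posts.values():
        for a, b in combinations(sorted(tags), 2):
            weights[(a, b)] += 1
    return weights


def node_strength(weights):
    """Sum the edge weights incident to each tag (weighted connectivity)."""
    strength = Counter()
    for (a, b), w in weights.items():
        strength[a] += w
        strength[b] += w
    return strength


# Hypothetical sample data, for illustration only.
sample = [
    ("u1", "python", "r1"),
    ("u1", "code", "r1"),
    ("u2", "python", "r1"),
    ("u2", "code", "r1"),
    ("u2", "web", "r1"),
    ("u3", "web", "r2"),
]
edges = tag_cooccurrence(sample)
strengths = node_strength(edges)
```

Node strength is only one possible connectivity statistic; it is used here as a simple stand-in for the kind of indicator the abstract refers to.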
Network properties of folksonomies / Schmitz, C.; Grahl, M.; Hotho, A.; Stumme, G.; Cattuto, C.; Baldassarri, A.; Loreto, Vittorio; Servedio, Vito Domenico Pietro. - Print. - 20:4 (2007), pp. 245-262. (Paper presented at the Sixteenth International World Wide Web Conference (WWW2007), held in Banff, Canada, May 8-12, 2007).
Network properties of folksonomies
LORETO, Vittorio; SERVEDIO, Vito Domenico Pietro
2007