Privacy in communications calls primarily for encryption of information flows. Privacy breaches of packet traffic flows have been widely demonstrated in point-to-point communications, owing to information leakage from observable traffic features such as packet length, timestamp, and direction. We address a point-to-multipoint system, namely a Content Delivery Network, where user clients maintain and use connections with a number of servers. Specifically, we address Google search services, which are conveyed over TLS connections using HTTPS, whether from within a user account or without logging in as a Google services user. HTTPS is provided to protect communications privacy. Yet we show that, by collecting the encrypted traffic and extracting simple features related to traffic activity and possibly the amount of data sent by servers to clients, effective classifiers of user activity can be realized. Specifically, we are able to distinguish which type of search a user is carrying out among a given set of alternatives (text, images, maps, video, video on YouTube, news), with average success rates that can exceed 90%. © 2013 IEEE.
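The abstract names only the kind of information used (traffic activity and the amount of data sent by servers to clients) and does not specify the features or the classifier. As an illustration only, and not the authors' method, the idea can be sketched as follows. The packet record layout, the four features, and the nearest-centroid rule are all assumptions introduced here.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical packet record: (timestamp_s, server_id, direction, length_bytes),
# where direction is "down" (server -> client) or "up" (client -> server).
# Only metadata visible on an encrypted connection is used, never payload.


def extract_features(packets):
    """Summarize one session by coarse, payload-independent features (assumed set)."""
    down = [p for p in packets if p[2] == "down"]
    times = [p[0] for p in packets]
    return [
        float(len({p[1] for p in packets})),        # number of servers contacted
        float(sum(p[3] for p in down)),             # bytes sent by servers to the client
        float(len(down)),                           # downstream packet count
        max(times) - min(times) if times else 0.0,  # duration of traffic activity
    ]


def train_centroids(labeled_sessions):
    """Average the feature vectors of each search type (nearest-centroid model)."""
    grouped = defaultdict(list)
    for label, packets in labeled_sessions:
        grouped[label].append(extract_features(packets))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(packets, centroids):
    """Return the search type whose centroid is closest to this session."""
    feats = extract_features(packets)

    def distance(centroid):
        # Relative differences, so that large byte counts do not dominate.
        return sqrt(sum(((f - m) / (abs(m) + 1.0)) ** 2
                        for f, m in zip(feats, centroid)))

    return min(centroids, key=lambda label: distance(centroids[label]))
```

A classifier of this kind would be trained on labeled sessions for each search type and then applied to unlabeled captures; the success rates quoted in the abstract refer to the authors' own classifier and data, not to this sketch.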
What are you Googling? - Inferring search type information through a statistical classifier / Iacovazzi, Alfonso; Baiocchi, Andrea; Ludovico, Bettini. - Electronic. - (2013), pp. 747-753. (Paper presented at the 2013 IEEE Global Communications Conference, GLOBECOM 2013, held in Atlanta, United States, from 9 December 2013 through 13 December 2013) [10.1109/glocom.2013.6831162].
What are you Googling? - Inferring search type information through a statistical classifier
IACOVAZZI, ALFONSO;BAIOCCHI, Andrea;
2013