Influenza-like illness surveillance on twitter through automated learning of naïve language

Gesualdo, F.; Stilo, Giovanni; Agricola, E.; Gonfiantini, M. V.; Pandolfi, E.; Velardi, Paola; Tozzi, A. E.

doi:10.1371/journal.pone.0082489

Twitter has the potential to be a timely and cost-effective source of data for syndromic surveillance. When speaking of an illness, Twitter users often report a combination of symptoms, rather than a suspected or final diagnosis, using naïve, everyday language. We developed a minimally trained algorithm that exploits the abundance of health-related web pages to identify all jargon expressions related to a specific technical term. We then translated an influenza case definition into a Boolean query, each symptom being described by a technical term and all related jargon expressions, as identified by the algorithm. Subsequently, we monitored all tweets that reported a combination of symptoms satisfying the case definition query. In order to geolocalize messages, we defined 3 localization strategies based on codes associated with each tweet. We found a high correlation coefficient between the trend of our influenza-positive tweets and ILI trends identified by US traditional surveillance systems. © 2013 Gesualdo et al.
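The query step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the symptom lexicon, the lay expressions, and the rule "fever AND (cough OR sore throat)" are hypothetical stand-ins, not the case definition or the jargon lists produced by the authors' algorithm. The sketch also ignores negation, so a message such as "no fever" would still count as reporting fever.

```python
import re

# Hypothetical lexicon: each technical term maps to illustrative lay
# expressions. These lists are assumptions, not the paper's learned output.
SYMPTOM_TERMS = {
    "fever": ["fever", "high temperature", "burning up"],
    "cough": ["cough", "coughing"],
    "sore throat": ["sore throat", "throat hurts", "pharyngitis"],
    "myalgia": ["myalgia", "muscle ache", "body aches", "aching all over"],
}


def _compile(expressions):
    """Build one case-insensitive, whole-phrase pattern for a symptom."""
    # Longest expressions first so "coughing" is tried before "cough".
    ordered = sorted(expressions, key=len, reverse=True)
    alternatives = "|".join(re.escape(e) for e in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


PATTERNS = {symptom: _compile(exprs) for symptom, exprs in SYMPTOM_TERMS.items()}


def symptoms_in(text):
    """Return the set of symptoms whose expressions occur in the text."""
    return {symptom for symptom, pattern in PATTERNS.items() if pattern.search(text)}


def matches_case_definition(text):
    """Hypothetical Boolean query: fever AND (cough OR sore throat)."""
    found = symptoms_in(text)
    return "fever" in found and bool(found & {"cough", "sore throat"})
```

Under these assumptions, `matches_case_definition("I'm burning up and my throat hurts")` is true, because each symptom is matched through a lay expression rather than its technical term, while a message mentioning only a cough does not satisfy the query.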

Influenza-like illness surveillance on twitter through automated learning of naïve language / Gesualdo, F.; Stilo, Giovanni; Agricola, E.; Gonfiantini, M. V.; Pandolfi, E.; Velardi, Paola; Tozzi, A. E. - In: PLOS ONE. - ISSN 1932-6203. - Electronic. - 8:12(2013). [10.1371/journal.pone.0082489]

	Year of publication
	
				2013
			
	Type
	
				01 Journal publication::01a Journal article
			
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/543269
