A matter of words: NLP for quality evaluation of Wikipedia medical articles

Cozza, Vittoria; Petrocchi, Marinella; Spognardi, Angelo

doi:10.1007/978-3-319-38791-8_31

Automatic quality evaluation of Web information is a task of great relevance with many fields of application, especially in critical domains such as the medical one. We start from the intuition that the quality of the content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical or technical). In this paper, we propose to leverage domain-specific features to improve the results of the evaluation of Wikipedia medical articles, relying on Natural Language Processing (NLP) and dictionary-based techniques. The results of our experiments confirm that, by considering domain-oriented features, it is possible to improve on existing solutions, mainly for those articles that other approaches classify less accurately.

A matter of words: NLP for quality evaluation of Wikipedia medical articles / Cozza, V., Petrocchi, M., Spognardi, A. - 9671:(2016), pp. 448-456. (16th International Conference on Web Engineering, ICWE 2016, Lugano, Switzerland) [10.1007/978-3-319-38791-8_31].

Publication year: 2016

Conference name: 16th International Conference on Web Engineering, ICWE 2016

Keywords: Theoretical Computer Science; Computer Science (all)

Type: 04 Publication in conference proceedings::04b Conference paper in volume

Files attached to this item

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/960185

Warning: the data displayed have not been validated by the university.
