In many research fields, such as Psychology, Linguistics, Cognitive Science, Biomedicine, and Artificial Intelligence, computing semantic similarity between words is an important issue. In this paper we present a new semantic similarity metric that exploits some notions of early work on a feature-based theory of similarity and translates them into the information-theoretic domain, which leverages the notion of Information Content (IC). In particular, the proposed metric exploits the notion of intrinsic IC, which quantifies IC values by examining how concepts are arranged in an ontological structure. To evaluate this metric, we conducted an online experiment asking the community of researchers to rank a list of 65 word pairs. The experiment's web setup allowed us to collect 101 similarity ratings and to differentiate between native and non-native English speakers. Such a large and diverse dataset makes it possible to evaluate similarity metrics confidently by correlating them with human assessments. Experimental evaluations using WordNet indicate that our metric, coupled with the notion of intrinsic IC, yields results above the state of the art. Moreover, the intrinsic IC formulation also improves the accuracy of other IC-based metrics. We implemented our metric and several others in the Java WordNet Similarity Library. © 2008 Springer Berlin Heidelberg.
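The abstract's notion of intrinsic IC derives a concept's informativeness purely from how the taxonomy is arranged: concepts subsuming many hyponyms are less informative than leaves. The sketch below illustrates one common intrinsic IC formulation, IC(c) = 1 − log(hypo(c)+1)/log(max), over a toy is-a hierarchy; the `children` dictionary and concept names are illustrative assumptions, not taken from the paper or from WordNet.

```python
import math

# Toy is-a taxonomy: parent -> list of children (illustrative only, not WordNet).
children = {
    "entity": ["animal", "artifact"],
    "animal": ["dog", "cat"],
    "artifact": ["car"],
    "dog": [], "cat": [], "car": [],
}

def hyponym_count(concept):
    """Number of concepts subsumed by `concept`, excluding itself."""
    total = 0
    for child in children[concept]:
        total += 1 + hyponym_count(child)
    return total

MAX_CONCEPTS = len(children)  # total number of concepts in the taxonomy

def intrinsic_ic(concept):
    """Intrinsic IC: 1 - log(hypo(c) + 1) / log(max_concepts).

    Leaves (no hyponyms) get the maximum IC of 1.0; the root, which
    subsumes every other concept, gets the minimum IC of 0.0.
    """
    return 1.0 - math.log(hyponym_count(concept) + 1) / math.log(MAX_CONCEPTS)

print(intrinsic_ic("dog"))     # leaf concept -> 1.0
print(intrinsic_ic("entity"))  # root concept -> 0.0
```

No corpus frequencies are needed, which is the practical appeal of the intrinsic approach: IC values come entirely from the ontology's structure.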
Design, implementation and evaluation of a new semantic similarity metric combining features and intrinsic information content / Pirro', Giuseppe; Seco, N. - 5332:2 (2008), pp. 1271-1288. (Paper presented at the OTM 2008 Confederated International Conferences CoopIS, DOA, GADA, IS, and ODBASE 2008, held in Monterrey, Mexico) [10.1007/978-3-540-88873-4_25].
Design, implementation and evaluation of a new semantic similarity metric combining features and intrinsic information content
Pirro', Giuseppe; Seco, N.
2008