A semantic comparison of clustering algorithms for the evaluation of web-based similarity measures / Franzoni, V., Milani, A. - PRINT. - 9790:(2016), pp. 438-452. (16th International Conference on Computational Science and Its Applications, ICCSA 2016, Beijing, China, 4 July 2016 through 7 July 2016) [10.1007/978-3-319-42092-9_34].

A semantic comparison of clustering algorithms for the evaluation of web-based similarity measures

Franzoni, Valentina; Milani, Alfredo

2016

Abstract

The explosion of the Internet and the massive diffusion of mobile devices have led to the creation of a worldwide collaborative system, used daily by millions of users through search engines and application interfaces. New paradigms make it possible to calculate the similarity of terms using only the statistical information returned by a query, or using additional features; older algorithms and measures have also been applied to new domains and scopes in order to find word clusters on the Web efficiently. The problem of evaluating such techniques and algorithms in new domains therefore emerges, and it highlights a field of experimentation that is still open. In this paper, preliminary tests were carried out on different semantic proximity measures (average confidence, NGD, PMI, χ², PMING Distance), and several of the clustering algorithms most used in the literature (e.g. k-means, Expectation-Maximization, spectral clustering) were compared as a means of evaluating such measures. The suitability of the considered measures and methods for calculating semantic proximity was verified against the state of the art, and problems were identified by comparing the measurement results to a ground truth provided by models of contextualized knowledge, clustering and human perception of semantic relations, whose data have already been studied in the literature.
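
As an illustration of how term similarity can be derived purely from query statistics, the following minimal sketch computes two of the measures named in the abstract, PMI and NGD, from search-engine hit counts, using their standard definitions from the literature. It is not the authors' implementation: the function names, the parameter names and the example counts are hypothetical, and all counts are assumed to be strictly positive. The other measures listed (average confidence, χ², PMING Distance) are not reproduced here.

```python
import math


def pmi(fx: float, fy: float, fxy: float, n: float) -> float:
    """Pointwise Mutual Information from hit counts.

    fx, fy: hits for each term alone; fxy: hits for both terms together;
    n: assumed total number of indexed pages. All values must be > 0.
    """
    # P(x, y) / (P(x) * P(y)) simplifies to (fxy * n) / (fx * fy).
    return math.log2((fxy * n) / (fx * fy))


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance from hit counts (all values must be > 0)."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


if __name__ == "__main__":
    # Hypothetical hit counts, chosen only for illustration.
    print(pmi(1000, 100, 10, 100000))
    print(ngd(1000, 100, 10, 100000))
```

A higher PMI indicates stronger association, whereas NGD is a distance: terms that always co-occur have an NGD of 0, and larger values indicate weaker relatedness.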

Year of publication: 2016
Conference name: 16th International Conference on Computational Science and Its Applications, ICCSA 2016
Keywords: Data mining; Clustering; Semantic evaluation; Semantic similarity; Information retrieval
Type: 04 Publication in conference proceedings::04b Conference paper in volume

Files attached to this product

File	Type	License	Access	Size	Format
Franzoni_A-Semantic-Comparison_2016.pdf	Publisher's version (published version with the publisher's layout)	All rights reserved	Archive managers only	1.46 MB	Adobe PDF
Franzoni_Frontespizio-indice_A-Semantic-Comparison_2016.pdf	Other attached material	All rights reserved	Archive managers only	614.36 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/948029
