Catalogue of research products

Ontology population for open-source intelligence: A GATE-based solution / Ganino, Giulio; Lembo, Domenico; Mecella, Massimo; Scafoglieri, Federico. - In: SOFTWARE-PRACTICE & EXPERIENCE. - ISSN 0038-0644. - 48:12(2018), pp. 2303-2330. [10.1002/spe.2640]

Ontology population for open-source intelligence: A GATE-based solution

Ganino, Giulio; Lembo, Domenico; Mecella, Massimo; Scafoglieri, Federico

2018

Abstract

Open-source intelligence (OSINT) is intelligence based on publicly available sources such as news sites, blogs, and forums. The Web is the primary source of information, but once data are crawled, they need to be interpreted and structured. Ontologies can play a crucial role in this process, but because of the vast number of documents available, automatic mechanisms are needed to populate them from the crawled text. This paper presents an approach for automatically populating predefined ontologies with data extracted from text, and discusses the design and realization of a pipeline based on the General Architecture for Text Engineering (GATE) system, which is of interest to both researchers and practitioners in the field. It also reports experimental results that are encouraging in terms of correctly extracted instances of the ontology. Finally, the paper describes an alternative approach, with additional experiments, for one phase of the pipeline that requires predefined dictionaries of relevant entities. This variant reduced the manual workload required in that phase while still obtaining promising results.

Publication year: 2018
Keywords: general architecture for text engineering (GATE); information extraction; internet as a data source; ontology population; open-source intelligence
Type: 01 Journal publication::01a Journal article

Files attached to this record

File	Size	Format
Ganino_Postprint_Ontology-population_2018.pdf (open access since 17 September 2019; post-print: version accepted for publication after peer review; license: all rights reserved; note: https://onlinelibrary.wiley.com/doi/full/10.1002/spe.2640)	1.31 MB	Adobe PDF
Ganino_Ontology-population_2018.pdf (archive managers only; publisher's version: published version with the publisher's layout; license: all rights reserved)	2.14 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1184181
