AIML knowledge base construction from text corpora

De Gasperis, Giovanni; Chiari, Isabella; Florio, Niva

doi:10.1007/978-3-642-29694-9-12

Text mining (TM) and computational linguistics (CL) are computationally intensive fields in which a growing number of tools are available for studying large text corpora and exploiting them for various purposes. In this chapter we address the problem of building conversational agents, or chatbots, from corpora for domain-specific educational purposes. After addressing some linguistic issues relevant to the development of chatbot tools from corpora, we present a methodology for systematically analyzing large text corpora about a limited knowledge domain. Taking the Artificial Intelligence Markup Language (AIML) as the assembly language of artificial intelligence conversational agents, we present a way of using text corpora as a seed from which a set of source files can be derived. More specifically, we illustrate how to use corpus data to extract relevant keywords and multiword expressions, build a glossary, and identify text patterns in order to build an AIML knowledge base that can later be used to build interactive conversational systems. The approach we propose does not require deep understanding techniques for the analysis of text. As a case study, we show how to build, from a children's story, the knowledge base of an English conversational agent for educational purposes that can answer questions about the characters, facts and episodes of the story. The final part of the chapter discusses the main linguistic and methodological issues and possible further improvements.

AIML knowledge base construction from text corpora / De Gasperis, Giovanni; Chiari, Isabella; Florio, Niva. - PRINT. - 427(2013), pp. 287-318. - STUDIES IN COMPUTATIONAL INTELLIGENCE. [10.1007/978-3-642-29694-9-12].

Year of publication: 2013
Volume title: Artificial Intelligence, Evolutionary Computation and Metaheuristics. In the footsteps of Alan Turing
ISBN: 9783642296932
Type: 02 Publication in a volume::02a Chapter or article
Citation: AIML knowledge base construction from text corpora / De Gasperis, Giovanni; Chiari, Isabella; Florio, Niva. - PRINT. - 427(2013), pp. 287-318. - STUDIES IN COMPUTATIONAL INTELLIGENCE. [10.1007/978-3-642-29694-9-12].
Belongs to type: 02a Chapter or article

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/462912
