This paper first reconstructs some essential phases in the evolution of the automatic analysis of texts and then defines the steps of an ideal strategy for the statistical analysis of textual data. It describes the characteristics of lexical and textual analysis, along with some information extraction techniques that employ resources both endogenous and exogenous to the texts under examination. To show the potential of textual statistics and of the most recent Text Mining applications, it illustrates some relevant case studies concerning statistical surveys and document analysis.
Statistica testuale e text mining: alcuni paradigmi applicativi / Bolasco, Sergio. - In: QUADERNI DI STATISTICA. - ISSN 1594-3739. - STAMPA. - 7:(2005), pp. 17-53.
Statistica testuale e text mining: alcuni paradigmi applicativi
BOLASCO, Sergio
2005