In this paper we discuss a novel mathematical approach to authorship attribution, which we recently implemented to address a concrete problem of author recognition. The fundamental ideas behind our methods come from statistical mechanics and information theory. We combine two approaches, both of which use similarity measures between pairs of texts as indicators of stylistic closeness: the first is based on comparing the frequencies of fixed-length substrings (n-grams) across the texts; the second relies on a suitable use of compression algorithms as approximators of relative entropy, in the spirit of the so-called Ziv-Merhav theorem. The two methods were developed separately and then combined, together with a suitable and theoretically founded ranking analysis, to produce an original authorship attribution procedure that yielded very successful results on the specific problem to which it was applied. This ranking analysis could also be of interest in other fields of application.
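The two kinds of similarity measure named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the particular n-gram dissimilarity formula, and the use of `zlib` as the compressor are all assumptions made for illustration. In particular, the compression function is only a crude concatenate-and-compress proxy for cross entropy, not the Ziv-Merhav parsing the abstract alludes to, and zlib's 32 KB window makes it meaningful only for short reference texts.

```python
import zlib
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every overlapping character n-gram in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_distance(a: str, b: str, n: int = 3) -> float:
    """Illustrative dissimilarity between relative n-gram frequencies.

    Averages ((fa - fb) / (fa + fb))**2 over the n-grams seen in either
    text. The formula is an assumption of this sketch, not the paper's.
    """
    ca, cb = ngram_counts(a, n), ngram_counts(b, n)
    ta, tb = sum(ca.values()), sum(cb.values())
    if ta == 0 or tb == 0:
        raise ValueError("both texts must contain at least n characters")
    grams = set(ca) | set(cb)
    total = 0.0
    for g in grams:
        fa, fb = ca[g] / ta, cb[g] / tb  # a Counter returns 0 for unseen n-grams
        total += ((fa - fb) / (fa + fb)) ** 2
    return total / len(grams)


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (maximum compression)."""
    return len(zlib.compress(data, 9))


def cross_compression_cost(reference: str, sample: str) -> float:
    """Extra compressed bits per byte of sample when it follows reference.

    A rough stand-in for a cross-entropy estimate: the better the
    compressor can reuse material from reference, the lower the cost.
    """
    ref = reference.encode("utf-8")
    smp = sample.encode("utf-8")
    if not smp:
        raise ValueError("sample must not be empty")
    extra = compressed_size(ref + smp) - compressed_size(ref)
    return 8 * extra / len(smp)
```

Under these assumptions, `ngram_distance` is 0 for identical texts and grows as the n-gram profiles diverge, while `cross_compression_cost(reference, sample)` is lower when `sample` resembles `reference`. An attribution procedure would compare an unknown text against reference texts of known authorship with each measure and then combine the resulting rankings; how that combination is done is not specified here.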

An example of mathematical authorship attribution / Basile, Chiara; Benedetto, Dario; Caglioti, Emanuele; Degli Esposti, M. - In: JOURNAL OF MATHEMATICAL PHYSICS. - ISSN 0022-2488. - PRINT. - 49:12(2008), pp. 125211-125231. [10.1063/1.2996507]

An example of mathematical authorship attribution

BENEDETTO, Dario; CAGLIOTI, Emanuele
2008

entropy; statistical mechanics
01 Journal publication::01a Journal article
Files attached to this record
File: Basile_An-example_2008.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 557.11 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/228980