We present and explore a real case of authorship attribution (A.A.), combining the traditional philological approach with novel mathematical techniques. The problem concerns the extensive output of Basil of Caesarea and his brother Gregory of Nyssa, two influential 4th-century Christian theologians, and the attribution of specific disputed works in their corpora. Our method relies on two similarity (pseudo-)distances, based respectively on n-gram statistics and on zip-like compression algorithms, and on a new ranking/voting system that infers the attribution from the distances between the unknown texts and the texts of the training corpus. The main results are, on the one hand, the attribution of the letter (Epistula 38) to one of the two authors with 97% precision and, on the other, the strong agreement of the numerical explorations with both the philological analysis and the previously known results for all the works in the two corpora.
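The abstract names three ingredients (an n-gram distance, a compression-based distance, and a ranking/voting step) without giving their definitions. The following is only a minimal sketch of how such a pipeline could look, not the authors' method: the relative-difference n-gram formula, the use of the normalized compression distance with `zlib`, the 1/rank vote weighting, and all function names and parameters (`ngram_distance`, `compression_distance`, `attribute`, `n`, `k`) are assumptions made for illustration.

```python
import zlib
from collections import Counter


def ngram_profile(text, n=3):
    """Relative frequencies of the character n-grams in `text`."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def ngram_distance(a, b, n=3):
    """Assumed formula: mean squared relative frequency difference (0 to 1)."""
    pa, pb = ngram_profile(a, n), ngram_profile(b, n)
    grams = set(pa) | set(pb)
    if not grams:  # both texts shorter than n characters
        return 0.0
    total = 0.0
    for g in grams:
        fa, fb = pa.get(g, 0.0), pb.get(g, 0.0)
        total += ((fa - fb) / (fa + fb)) ** 2
    return total / len(grams)


def compressed_size(text):
    """Length in bytes of the zlib-compressed UTF-8 encoding of `text`."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def compression_distance(a, b):
    """Normalized compression distance, used here as a zip-like stand-in."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def attribute(unknown, corpus, k=3):
    """Pick an author by weighted votes from the k nearest training texts.

    `corpus` is a list of (author, text) pairs. For each distance, the
    training text at rank r among the k nearest adds a vote of weight 1/r
    (an assumed weighting) to its author.
    """
    votes = Counter()
    for dist in (ngram_distance, compression_distance):
        ranked = sorted(corpus, key=lambda item: dist(unknown, item[1]))
        for rank, (author, _) in enumerate(ranked[:k], start=1):
            votes[author] += 1.0 / rank
    return votes.most_common(1)[0][0]
```

In this sketch, `attribute(unknown_text, [("Basil", text1), ("Gregory", text2)])` would return the label whose training texts sit closest to the unknown text under both distances; real use would need the Greek corpora and a tuned choice of `n` and `k`.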
The Puzzle of Basil's Epistula 38: A Mathematical Approach to a Philological Problem / Benedetto, Dario; Degli Esposti, Mirko; Maspero, Giulio. - In: JOURNAL OF QUANTITATIVE LINGUISTICS. - ISSN 0929-6174. - PRINT. - 20:4(2013), pp. 267-287. [10.1080/09296174.2013.830549]
The Puzzle of Basil's Epistula 38: A Mathematical Approach to a Philological Problem
BENEDETTO, Dario;
2013