Authorship attribution is a fascinating field at the crossroads of linguistics and information science. Its relevance goes well beyond the specific predictions that different tools can make about authors whose identity is uncertain or hidden behind known “noms de plume”. Correctly identifying the unknown author of a text is far from reflecting a “keyhole” attitude; it is instead the tip of an iceberg whose main body is made of solid tools and algorithms able to extract syntactic, and possibly semantic, information from generic strings of characters. Here we follow a data-compression approach to authorship attribution, through which we define a notion of similarity between generic strings of characters (in particular, literary texts). We start by assessing the overall performance of our set of tools in authorship attribution, both on the wide corpus adopted in this volume and on an extended corpus. We then concentrate on the well-known “affaire Ferrante” (originally treated by some of us back in 2006), confirming and strengthening our original claim that, within the corpus considered, Domenico Starnone is the most likely author behind Elena Ferrante. We stress again that, despite the strong hints pointing to Starnone, we cannot rule out the possibility that Ferrante’s signature hides another author (or several authors) not included in the corpus. Specific analyses are still needed to shed light on this last point.
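
The abstract does not spell out the similarity measure, so what follows is only a minimal sketch of the general idea of compression-based attribution, not the authors' implementation. It assumes Python's standard `zlib` module (DEFLATE, which builds on LZ77) as a stand-in compressor, and it estimates how costly it is to encode an unknown text once the compressor has already seen a reference text by a candidate author. The function names `compressed_size`, `cross_cost`, and `attribute` are hypothetical and introduced here for illustration only.

```python
import zlib

# Assumption: zlib's DEFLATE is used as a generic LZ77-style compressor.
# It can only refer back 32 KiB, so only that much of a reference is useful.
WINDOW = 32 * 1024


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of `data` after compression at level 9."""
    return len(zlib.compress(data, 9))


def cross_cost(reference: str, sample: str) -> float:
    """Extra compressed bytes per sample byte when `sample` follows `reference`.

    A lower value means the sample is cheaper to encode given the reference,
    which is taken here as a rough proxy for stylistic similarity.
    """
    ref = reference.encode("utf-8")[-WINDOW:]  # keep only the usable tail
    smp = sample.encode("utf-8")
    if not smp:
        raise ValueError("sample must not be empty")
    extra = compressed_size(ref + smp) - compressed_size(ref)
    return extra / len(smp)


def attribute(sample: str, references: dict[str, str]) -> str:
    """Return the candidate whose reference text gives the lowest cross_cost."""
    if not references:
        raise ValueError("at least one reference text is required")
    return min(references, key=lambda name: cross_cost(references[name], sample))
```

In use, `references` would map each candidate author to a body of text known to be theirs, and `attribute(unknown_text, references)` would return the closest candidate among them. As the abstract stresses, such a result only ranks authors inside the corpus and says nothing about authors who are not included.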

Data-compression approach to authorship attribution / Lalli, Margherita; Tria, Francesca; Loreto, Vittorio. - (2018). (Paper presented at the conference Drawing Elena Ferrante's Profile, held in Padua).

Data-compression approach to authorship attribution

Margherita Lalli; Francesca Tria; Vittorio Loreto
2018

Drawing Elena Ferrante's Profile
Authorship attribution; data compression approach; cross-entropy; LZ77
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Lalli_Data-compression_2018.pdf
Access: archive managers only
Type: Post-print (version following peer review and accepted for publication)
License: All rights reserved
Size: 731.63 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1290272