Generative models for inference: an application to authorship attribution / TANI RAFFAELLI, Giulio. - (2022 May 26).
Generative models for inference: an application to authorship attribution
TANI RAFFAELLI, GIULIO
26/05/2022
Abstract
Computer-aided stylometry is a powerful tool for authorship attribution. Recent models can identify the author of an anonymous text among thousands of candidates or distinguish the different contributors to a single text. However, most methods are quite complex and language-dependent. We propose a new authorship attribution method based on inference using a stochastic process. Every author is associated with the process that is most likely to reproduce their known corpus. We assign a text to the author whose process gives the highest probability of producing that text. We find high attribution rates regardless of the language of the text or the tokenisation. Inference using stochastic processes offers exciting opportunities for stylometry and information retrieval.

File | Size | Format
---|---|---
Tesi_dottorato_TaniRaffaelli.pdf (open access; type: doctoral thesis; licence: Creative Commons) | 5.69 MB | Adobe PDF
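The decision rule described in the abstract (fit one process per author, then assign the text to the author whose process makes it most probable) can be illustrated with a minimal sketch. The stochastic process used in the thesis is not specified on this page, so the sketch substitutes a first-order Markov chain over tokens with additive smoothing as a stand-in. The function names, the whitespace tokenisation, the smoothing parameter `alpha`, and the toy corpora are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count token-to-token transitions in an author's known corpus."""
    transitions = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        transitions[prev][cur] += 1
    return transitions


def log_likelihood(model, tokens, vocab_size, alpha=1.0):
    """Log-probability of the transitions in `tokens` under `model`.

    Additive smoothing (alpha) keeps unseen transitions from having
    zero probability. This is a stand-in, not the thesis's process.
    """
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        row = model.get(prev, Counter())  # empty row for unseen contexts
        numerator = row[cur] + alpha
        denominator = sum(row.values()) + alpha * vocab_size
        total += math.log(numerator / denominator)
    return total


def attribute(corpora, text_tokens):
    """Return the author whose model gives the text the highest log-likelihood."""
    vocab = set(text_tokens)
    for tokens in corpora.values():
        vocab.update(tokens)
    scores = {
        author: log_likelihood(train_bigram_model(tokens), text_tokens, len(vocab))
        for author, tokens in corpora.items()
    }
    return max(scores, key=scores.get), scores


# Toy corpora (hypothetical data, whitespace tokenisation assumed).
corpora = {
    "A": "the cat sat on the mat the cat sat".split(),
    "B": "a dog ran in a park a dog ran".split(),
}
best, scores = attribute(corpora, "the cat sat on the mat".split())
print(best, scores)
```

Working in log-probabilities avoids numerical underflow on long texts, and because the rule only compares scores across authors, any tokenisation can be substituted as long as it is applied consistently to the known corpora and the disputed text.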