A Computational Measure for the Semantic Readability of Segmented Texts

Santucci, Valentino; Bartoccini, Umberto; Mengoni, Paolo; Zanda, Fabio

doi:10.1007/978-3-031-10536-4_8

In this paper we introduce a computational procedure for measuring the semantic readability of a segmented text. The procedure consists of three main steps. First, natural language processing tools and unsupervised machine learning techniques are used to obtain a vectorized numerical representation of each section or segment of the input text. Semantically similar or related segments are thus modeled by nearby points in a vector space, and the shortest and longest Hamiltonian paths passing through these points are then computed. Lastly, the lengths of these two paths and the length of the path given by the original ordering of the segments are combined into an arithmetic expression to derive an index, which may be used to gauge the semantic difficulty that a reader is expected to experience when reading the text. A preliminary experimental study is conducted on seven classic narrative texts written in English, obtained from the well-known Project Gutenberg. The experimental results appear to be in line with our expectations.
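
The three steps in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the segment vectors are given directly as small tuples of coordinates (the embedding step is skipped), distances are Euclidean, the Hamiltonian paths are found by brute force over all permutations (feasible only for a handful of segments), and the index is a simple min-max normalization. The abstract does not state the actual arithmetic expression or the path-finding method, so the names `path_length` and `readability_index` and the formula itself are hypothetical.

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Sum of Euclidean distances between consecutive points visited in `order`."""
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def readability_index(points):
    """Hypothetical index in [0, 1] (an assumption, not the paper's formula).

    0.0 means the original segment order is already a shortest Hamiltonian
    path through the segment vectors; 1.0 means it is a longest one.
    """
    n = len(points)
    original = path_length(points, list(range(n)))
    # Brute force: evaluate every ordering of the segments (n! of them).
    lengths = [path_length(points, p) for p in permutations(range(n))]
    shortest, longest = min(lengths), max(lengths)
    if longest == shortest:
        # All orderings have the same length, so the order carries no signal.
        return 0.0
    return (original - shortest) / (longest - shortest)


# Toy "segment vectors": consecutive segments are semantically close.
smooth = [(0, 0), (1, 0), (2, 0), (3, 0)]
# Same vectors, but the reading order jumps back and forth between topics.
jumpy = [(1, 0), (3, 0), (0, 0), (2, 0)]

print(readability_index(smooth))
print(readability_index(jumpy))
```

Under these assumptions, a lower value corresponds to a text whose reading order moves smoothly between semantically close segments, and a higher value to one that jumps between distant segments.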

A Computational Measure for the Semantic Readability of Segmented Texts / Santucci, Valentino; Bartoccini, Umberto; Mengoni, Paolo; Zanda, Fabio. - 13377:(2022), pp. 107-119. (International Conference on Computational Science and its Applications (ICCSA 2022), Malaga) [10.1007/978-3-031-10536-4_8].

Year of publication: 2022

Conference name: International Conference on Computational Science and its Applications (ICCSA 2022)

Keywords: Semantic readability of texts; Natural Language Processing; Unsupervised machine learning; Hamiltonian path

Type: 04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1758265
