Digital annotation of web pages raises two problems that are unknown to traditional annotation and that stem from the dynamic and open nature of the Web. The first arises because a document can be replicated over multiple sites, so that it can be retrieved at different URLs or with different queries. A web page must therefore be associated with all the annotations pertaining to its content, even if they were created while accessing the same content under a different URL. The second concerns the dynamics of individual HTML pages, which often undergo insertions, deletions or movements of page segments. Annotations attached to portions of a page that have moved within the page itself should still be retrieved and shown to the user. To reduce the impact of these phenomena on the usefulness of the annotation process, our annotation system MADCOW incorporates two algorithms: one assesses the identity of two pages under two different URLs, and the other assesses the differences between two versions of a page under the same URL. Both take the proper actions to retrieve all the pertinent annotations.

Differences and Identities in Document Retrieval in an Annotation Environment / Bottoni, Paolo Gaspare; Cuomo, M.; LEVIALDI GHIRON, Stefano; Panizzi, Emanuele; Passavanti, M.; Trinchese, R. - PRINT. - 4777:(2007), pp. 139-153. (Paper presented at the DNIS 2007 conference, held in Aizu Wakamatsu, Japan, October 17-19, 2007) [10.1007/978-3-540-75512-8_11].

Differences and Identities in Document Retrieval in an Annotation Environment

BOTTONI, Paolo Gaspare; LEVIALDI GHIRON, Stefano; PANIZZI, Emanuele
2007

DNIS 2007
04 Publication in conference proceedings::04b Conference paper in a volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/242100