Research Products Catalog

Multimodal Neural Databases / Trappolini, G.; Santilli, A.; Rodola, E.; Halevy, A.; Silvestri, F. - (2023), pp. 2619-2628. (Paper presented at the ACM International Conference on Research and Development in Information Retrieval, held in Taipei, Taiwan) [10.1145/3539618.3591930].

Multimodal Neural Databases

Trappolini G.; Santilli A.; Rodola E.; Halevy A.; Silvestri F.

2023

Abstract

The rise in loosely structured data available through text, images, and other modalities has called for new ways of querying them. Multimedia Information Retrieval has filled this gap and has witnessed exciting progress in recent years. Tasks such as search and retrieval of extensive multimedia archives have undergone massive performance improvements, driven to a large extent by recent developments in multimodal deep learning. However, methods in this field remain limited in the kinds of queries they support; in particular, they cannot answer database-like queries. For this reason, inspired by recent work on neural databases, we propose a new framework, which we name Multimodal Neural Databases (MMNDBs). MMNDBs can answer complex database-like queries that involve reasoning over different input modalities, such as text and images, at scale. In this paper, we present the first architecture able to fulfill this set of requirements and test it with several baselines, showing the limitations of currently available models. The results show the potential of these new techniques to process unstructured data coming from different modalities, paving the way for future research in the area.

Full record

Year of publication: 2023
Conference name: ACM International Conference on Research and Development in Information Retrieval
Keywords: multimedia information retrieval; databases; neural networks
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Belongs to type: 04b Conference paper in volume

Files attached to this product

File	Size	Format
Trappolini_Multimodal_2023.pdf (open access; Type: Publisher's version, published with the publisher's layout; License: All rights reserved)	3.07 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1699202
