Recommendation pipelines involve several stages that can critically affect performance and reproducibility. However, early pipeline stages remain under-standardized, limiting comparability and interoperability across studies. This tutorial addresses this gap by providing both theoretical insights and hands-on experience with tools and practices for standardized data processing in recommender systems. In the first part, we introduce DataRec, a Python library for reproducible and interoperable data management, and discuss data filtering, splitting, and topological analysis techniques. In the second part, we explore multimodal feature extraction in domains such as fashion, music, and movies, focusing on the challenges of meaningful multimodal integration. We introduce Ducho, a unified framework for extracting audio, visual, and textual features using modern backends, and demonstrate its integration with the evaluation framework Elliot. The tutorial targets researchers and practitioners with an interest in recommender systems, data preprocessing, and multimodal modeling. All materials, including slides, code, datasets, and recordings, will be openly available on a dedicated tutorial website: https://sites.google.com/view/dd4rec-tutorial/.

Standard Practices for Data Processing and Multimodal Feature Extraction in Recommendation with DataRec and Ducho (D&D4Rec) / Mancino, Alberto Carlo Maria; Attimonelli, Matteo; Di Fazio, Angela; Malitesta, Daniele; Di Noia, Tommaso. - (2025), pp. 1432-1434. (19th ACM Conference on Recommender Systems, RecSys 2025, Prague, Czech Republic) [10.1145/3705328.3748009].

Standard Practices for Data Processing and Multimodal Feature Extraction in Recommendation with DataRec and Ducho (D&D4Rec)

Mancino, Alberto Carlo Maria; Attimonelli, Matteo; Di Fazio, Angela; Malitesta, Daniele; Di Noia, Tommaso
2025

19th ACM Conference on Recommender Systems, RecSys 2025
Multimodal Recommendation; Python Library; Recommendation Datasets; Reproducibility
04 Publication in conference proceedings::04b Conference paper in volume

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1753391
Warning: the data displayed have not been validated by the university.
