
Do RAG Systems Really Suffer From Positional Bias? / Cuconasu, Florin; Filice, Simone; Horowitz, Guy; Maarek, Yoelle; Silvestri, Fabrizio. - (2025), pp. 28010-28024. ( EMNLP 2025 - Empirical Methods in Natural Language Processing Suzhou; China ) [10.18653/v1/2025.emnlp-main.1422].

Do RAG Systems Really Suffer From Positional Bias?

Cuconasu, Florin (first author); Silvestri, Fabrizio
2025

Abstract

Retrieval Augmented Generation (RAG) enhances LLM accuracy by adding passages retrieved from an external corpus to the LLM prompt. This paper investigates how positional bias, the tendency of LLMs to weight information differently depending on its position in the prompt, affects not only the LLM's ability to capitalize on relevant passages but also its susceptibility to distracting ones. Through extensive experiments on three benchmarks, we show that state-of-the-art retrieval pipelines, while attempting to retrieve relevant passages, systematically bring highly distracting ones to the top ranks: over 60% of queries contain at least one highly distracting passage among the top-10 retrieved passages. As a result, the impact of LLM positional bias, often reported as very prominent in controlled settings by related work, is actually marginal in real scenarios, since relevant and distracting passages are both, in turn, penalized. Indeed, our findings reveal that sophisticated strategies that rearrange passages according to the LLM's positional preferences perform no better than random shuffling.
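The comparison described in the abstract, position-aware reordering of retrieved passages versus a random-shuffle baseline, can be sketched as below. This is a minimal illustration, not the paper's implementation: the `edges` strategy is a hypothetical example of a positional-preference heuristic (alternating the top-ranked passages between the start and end of the prompt), and all function and parameter names are assumptions.

```python
import random

def order_passages(passages, strategy="random", seed=0):
    """Return retrieved passages in the order they will appear in the prompt.

    'ranked': keep the retriever's order (best first).
    'edges' : hypothetical positional-preference heuristic that alternates
              top-ranked passages between the front and back of the prompt.
    'random': random-shuffle baseline, seeded for reproducibility.
    """
    if strategy == "ranked":
        return list(passages)
    if strategy == "edges":
        ordered = [None] * len(passages)
        front, back = 0, len(passages) - 1
        for i, p in enumerate(passages):  # passages assumed ranked best-first
            if i % 2 == 0:
                ordered[front] = p
                front += 1
            else:
                ordered[back] = p
                back -= 1
        return ordered
    shuffled = list(passages)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```

For example, with four passages ranked `["p1", "p2", "p3", "p4"]`, the `edges` strategy yields `["p1", "p3", "p4", "p2"]`, placing the two highest-ranked passages at the prompt's extremes.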
2025
EMNLP 2025 - Empirical Methods in Natural Language Processing
RAG; LLM
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item

File: Cuconasu_Do-RAG-Systems_2025.pdf (open access)
Note: DOI: 10.18653/v1/2025.emnlp-main.1422
Type: Publisher's version (published with the publisher's layout)
License: Creative Commons
Size: 497.44 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1756368