NLP-Based Management of Large Multiple-Choice Test Item Repositories / Albano, Valentina; Firmani, Donatella; Laura, Luigi; Mathew, Jerin George; Lucia Paoletti, Anna; Torrente, Irene. - In: Journal of Learning Analytics. - ISSN 1929-7750. - (2023). [10.18608/jla.2023.7897]

NLP-Based Management of Large Multiple-Choice Test Item Repositories

Donatella Firmani; Jerin George Mathew
2023

Abstract

Multiple-choice questions (MCQs) are widely used in educational assessments and professional certification exams. Managing large repositories of MCQs, however, poses several challenges due to the high volume of questions and the need to maintain their quality and relevance over time. One of these challenges is the presence of questions that duplicate concepts but are formulated differently. Such questions can elude syntactic checks yet add no value to the repository. In this paper, we focus on this specific challenge and propose a workflow for the discovery and management of potential duplicate questions in large MCQ repositories. The workflow comprises three main steps: MCQ preprocessing, similarity computation, and a graph-based exploration and analysis of the resulting similarity values. For the preprocessing phase, we consider three main strategies: (i) removing the list of candidate answers from each question, (ii) augmenting each question with the correct answer, or (iii) augmenting each question with all candidate answers. Then, we use deep learning–based natural language processing (NLP) techniques, based on the Transformer architecture, to compute semantic similarities between MCQs. Finally, we propose a new approach to graph exploration based on graph communities to analyze the similarities and relationships between MCQs in the graph. We illustrate the approach with a case study of the Competenze Digitali program, a large-scale assessment project by the Italian government.
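The three-step workflow described in the abstract can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the MCQ dictionary fields (`stem`, `options`, `correct`), the strategy names, and all function names are hypothetical. To keep the sketch self-contained, a bag-of-words cosine similarity stands in for the Transformer-based semantic similarity, and connected components of a thresholded graph stand in for graph community detection. The threshold value is arbitrary.

```python
import math
import re
from collections import Counter
from itertools import combinations


def preprocess(mcq, strategy):
    """Build the text to compare, following one of the three strategies."""
    stem = mcq["stem"]
    if strategy == "stem_only":          # (i) drop the candidate answers
        return stem
    if strategy == "stem_plus_correct":  # (ii) append the correct answer
        return f"{stem} {mcq['options'][mcq['correct']]}"
    if strategy == "stem_plus_all":      # (iii) append all candidate answers
        return " ".join([stem, *mcq["options"]])
    raise ValueError(f"unknown strategy: {strategy}")


def embed(text):
    """Placeholder embedding: word counts (ASSUMPTION, not a Transformer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_graph(mcqs, strategy, threshold):
    """Link two MCQs whenever their similarity reaches the threshold."""
    vectors = [embed(preprocess(q, strategy)) for q in mcqs]
    adjacency = {i: set() for i in range(len(mcqs))}
    for i, j in combinations(range(len(mcqs)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def groups(adjacency):
    """Connected components (ASSUMPTION: a simple stand-in for communities)."""
    seen, result = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Made-up example items, for illustration only.
mcqs = [
    {"stem": "Which protocol is used to send email?",
     "options": ["SMTP", "FTP", "HTTP"], "correct": 0},
    {"stem": "Which protocol is used to send email messages?",
     "options": ["SMTP", "POP3", "DNS"], "correct": 0},
    {"stem": "What does a spreadsheet cell contain?",
     "options": ["A value or formula", "A folder", "A printer"], "correct": 0},
]

candidate_duplicates = groups(similarity_graph(mcqs, "stem_only", threshold=0.8))
```

Groups containing more than one index are candidate duplicates for manual review. A word-count similarity only catches near-identical wording. Detecting questions that repeat a concept in different words, which is the case the abstract targets, would require replacing `embed` with a semantic sentence encoder.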
Multiple-choice question management; natural language processing; deep learning; similarity computation; graph visualisation; learning analytics
01 Journal publication::01a Journal article
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1695891