Discrete models for data imputation / Bruni, Renato. - In: DISCRETE APPLIED MATHEMATICS. - ISSN 0166-218X. - Print. - 144:1-2(2004), pp. 59-69. [10.1016/j.dam.2004.04.004]

Discrete models for data imputation

BRUNI, Renato
2004

Abstract

The paper is concerned with the problem of automatically detecting and correcting inconsistent or out-of-range data in a general process of statistical data collection. The proposed approach is able to deal with hierarchical data containing both qualitative and quantitative values. As is customary, erroneous data records are detected by formulating a set of rules. Erroneous records should then be corrected by modifying the erroneous data as little as possible, while causing minimum perturbation to the original frequency distributions of the data. This process is called imputation. By encoding the rules as linear inequalities, we convert imputation problems into integer linear programming problems. The proposed procedure is tested on a real-world census case. Results are extremely encouraging from both the computational and the data quality points of view.
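To make the idea of rules encoded as linear inequalities concrete, here is a minimal sketch in Python. It is not the paper's model: the record layout (age, married, years_married), the three rules, the value ranges, and the function names are all hypothetical. It also replaces the integer linear programming solver with brute-force enumeration over a tiny domain, and it covers only the "change as little as possible" objective, not the preservation of frequency distributions.

```python
from itertools import product

# Hypothetical record layout: (age, married, years_married).
# Domains are kept tiny so that exhaustive search stays cheap.
DOMAINS = [range(0, 111), range(0, 2), range(0, 111)]

# Each rule is (coefficients, bound), read as: sum(c_i * x_i) <= bound.
# A record is consistent only if it satisfies every rule.
RULES = [
    ((-1, 16, 0), 0),    # 16*married <= age: married implies age >= 16
    ((0, -110, 1), 0),   # years_married <= 110*married: unmarried implies 0 years
    ((-1, 16, 1), 0),    # years_married + 16*married <= age
]


def is_consistent(record):
    """Return True if the record satisfies every linear inequality."""
    return all(
        sum(c * x for c, x in zip(coeffs, record)) <= bound
        for coeffs, bound in RULES
    )


def impute(record):
    """Return the consistent record closest to the input.

    Closeness is lexicographic: first the number of changed fields,
    then the total absolute shift of the values (an assumed tie-breaker).
    """
    def cost(candidate):
        changed = sum(a != b for a, b in zip(candidate, record))
        shift = sum(abs(a - b) for a, b in zip(candidate, record))
        return (changed, shift)

    feasible = (c for c in product(*DOMAINS) if is_consistent(c))
    return min(feasible, key=cost)
```

For example, the record `(10, 1, 0)` describes a married ten-year-old and violates the first rule; `impute((10, 1, 0))` changes a single field and returns `(10, 0, 0)`. On realistic data the enumeration would be replaced by an integer programming solver, with binary variables marking which fields are changed.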
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/443274