Errors Detection and Correction in Large Scale Data Collecting / Bruni, Renato; Sassano, Antonio. - PRINT. - (2001), pp. 84-94. - LECTURE NOTES IN COMPUTER SCIENCE.

Errors Detection and Correction in Large Scale Data Collecting

BRUNI, Renato; SASSANO, Antonio
2001

Abstract

This paper addresses the automatic detection and correction of inconsistent or out-of-range data in a general process of statistical data collection. In this setting, errors are usually detected by formulating a set of rules that data records must satisfy in order to be declared correct. As a first key point, the set of rules itself is checked for inconsistency and redundancy by encoding it as a propositional logic formula and solving a sequence of satisfiability problems. The rule set is then used to detect erroneous records. In the subsequent error correction phase, the corrected records must satisfy the rules, the erroneous records should be altered as little as possible, and the frequency distributions of the correct data should be preserved. As a second key point, error correction is modeled by encoding the rules as linear inequalities and solving a sequence of set covering problems. The proposed procedure is tested on a real-world Census case.
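
The rule-checking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the variables, rules, and function names below are hypothetical, and a brute-force enumeration of truth assignments stands in for a real satisfiability solver, so it only works for toy sizes. A rule set is treated as inconsistent when no assignment satisfies all rules, and a rule as redundant when the other rules together with its negation are unsatisfiable.

```python
from itertools import product


def satisfiable(variables, constraints):
    """Brute-force satisfiability check (assumption: toy sizes only).

    Tries every truth assignment and returns True if one satisfies
    all constraints.
    """
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(constraint(assignment) for constraint in constraints):
            return True
    return False


def redundant_rules(variables, rules):
    """Return names of rules implied by the remaining rules.

    A rule is implied when (other rules AND NOT rule) is unsatisfiable.
    """
    found = []
    for name, rule in rules.items():
        others = [r for n, r in rules.items() if n != name]
        negated = lambda a, rule=rule: not rule(a)
        if not satisfiable(variables, others + [negated]):
            found.append(name)
    return found


def violated_rules(record, rules):
    """Return names of the rules that a data record fails to satisfy."""
    return [name for name, rule in rules.items() if not rule(record)]


# Hypothetical boolean fields and rules, invented for illustration only.
VARIABLES = ["minor", "married", "employed"]
RULES = {
    "minor_not_married": lambda a: not (a["minor"] and a["married"]),
    "minor_not_employed": lambda a: not (a["minor"] and a["employed"]),
    "minor_not_married_and_employed": lambda a: not (
        a["minor"] and a["married"] and a["employed"]
    ),
}

consistent = satisfiable(VARIABLES, list(RULES.values()))
redundant = redundant_rules(VARIABLES, RULES)
violated = violated_rules(
    {"minor": True, "married": True, "employed": False}, RULES
)

print("consistent:", consistent)
print("redundant:", redundant)
print("violated by sample record:", violated)
```

In this toy rule set the third rule is implied by the first, so it is reported as redundant. Note that if the rule set were inconsistent, this redundancy test would flag every rule, so consistency is worth checking first. The correction phase described in the abstract (linear inequalities and set covering) is not covered by this sketch.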
Advances in Intelligent Data Analysis
Information Reconstruction; Propositional Logic; Set Covering Problems
02 Publication in volume::02a Chapter or Article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/497519