Errors in large data sets can be detected by means of rule-based techniques (e.g. Fellegi-Holt). These essentially consist in checking whether each record satisfies a number of rules. Records that do not satisfy all the rules are declared erroneous. Whenever data collection has a cost, which is the case in the majority of situations, we are interested in correcting such data. The correction consists in changing a number of fields of the erroneous record until it satisfies the above rules. This should generally be performed by modifying the erroneous data as little as possible, while causing minimum perturbation to the original frequency distributions of the data. Such a process is called data imputation, and is of great relevance in the field of statistics. A new procedure for data imputation based on a discrete optimization model is presented here.
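The detect-then-correct idea in the abstract can be illustrated with a small sketch. It is not the discrete optimization model of the paper: it is a brute-force search, under invented assumptions, that finds a valid record differing from the erroneous one in as few fields as possible. The field names, domains, and rules (`DOMAINS`, `RULES`) are hypothetical, and the second criterion mentioned in the abstract, preserving the original frequency distributions, is not modeled here.

```python
from itertools import combinations, product

# Hypothetical field domains, invented for illustration only.
DOMAINS = {
    "age": range(0, 101),
    "marital_status": ["single", "married", "widowed"],
    "employment": ["child", "student", "employed", "retired"],
}

# Hypothetical rules. Each rule returns True when the record satisfies it.
RULES = [
    lambda r: not (r["age"] < 16 and r["marital_status"] != "single"),
    lambda r: not (r["age"] < 16 and r["employment"] in ("employed", "retired")),
    lambda r: not (r["age"] < 50 and r["employment"] == "retired"),
]


def is_valid(record):
    """Error detection: a record is valid only if it satisfies every rule."""
    return all(rule(record) for rule in RULES)


def impute(record):
    """Return a valid record that changes as few fields as possible.

    Tries all sets of 1 field, then 2 fields, and so on, so the first
    valid candidate found changes a minimum number of fields.
    Returns None if no assignment of the fields satisfies the rules.
    """
    if is_valid(record):
        return dict(record)
    fields = list(DOMAINS)
    for k in range(1, len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if is_valid(candidate):
                    return candidate
    return None


erroneous = {"age": 10, "marital_status": "married", "employment": "retired"}
corrected = impute(erroneous)
print(corrected)
```

For this record all three hypothetical rules are violated, yet changing the single field `age` is enough to satisfy them. Exhaustive search grows exponentially with the number of fields, which is one reason an optimization formulation is of interest for realistic data sets.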
Error Correction Models / Bruni, Renato; A., Reale; R., Torelli. - Print. - (2002). (Paper presented at the AIRO annual conference, held in L'Aquila).
Error Correction Models
BRUNI, Renato;
2002