Data Clustering for Improving the Selection of Donors for Data Imputation / Bianchi, G.; Bruni, Renato; Nucara, R.; Reale, A. - Print. - (2005).
Data Clustering for Improving the Selection of Donors for Data Imputation
BRUNI, Renato
2005
Abstract
Error detection is generally approached by formulating a set of rules that each household must satisfy in order to be declared correct. In the subsequent correction process, the incorrect values of erroneous records are replaced by new, correct ones, with the aim of restoring their unknown original values. The correction methodology relies on correct records, called donors. However, when the set of donors is very large, as in the case of a Census, iteratively comparing every erroneous record e with all possible donors could require unacceptable computational time. We therefore propose a new approach for reducing the number of donors that must be examined: the large set of donors is partitioned in advance by means of a new clustering algorithm.
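The idea of searching only one cluster of donors, rather than the whole donor set, can be illustrated with a minimal sketch. The abstract does not describe the clustering algorithm itself, so the code below swaps in a plain k-means partition as a stand-in. The two-field record layout (age, years of schooling), the edit rule in `is_correct`, the choice of squared Euclidean distance, and the assumption that the erroneous fields are already known are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Set, Tuple

Record = Tuple[float, ...]


def is_correct(rec: Record) -> bool:
    # Hypothetical edit rule: years of schooling cannot exceed age.
    age, schooling = rec
    return 0 <= schooling <= age


def sq_dist(a: Sequence[float], b: Sequence[float], fields: Sequence[int]) -> float:
    # Squared Euclidean distance restricted to the given field indices.
    return sum((a[i] - b[i]) ** 2 for i in fields)


def assign(donors: List[Record], centers: List[Record]) -> List[List[Record]]:
    # Put every donor into the cluster of its nearest center.
    fields = range(len(centers[0]))
    clusters: List[List[Record]] = [[] for _ in centers]
    for d in donors:
        k = min(range(len(centers)), key=lambda j: sq_dist(d, centers[j], fields))
        clusters[k].append(d)
    return clusters


def cluster_donors(donors: List[Record], init_centers: List[Record], rounds: int = 10):
    # Plain k-means (stand-in, NOT the paper's algorithm): alternate
    # assignment and center update for a fixed number of rounds.
    centers = [tuple(c) for c in init_centers]
    for _ in range(rounds):
        clusters = assign(donors, centers)
        centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
    return centers, assign(donors, centers)


def impute(rec: Record, bad_fields: Set[int], centers, clusters) -> Record:
    # Compare only on the trusted fields, pick the nearest non-empty cluster,
    # then the nearest donor inside it, and copy the donor's values
    # into the erroneous fields.
    good = [i for i in range(len(rec)) if i not in bad_fields]
    candidates = [j for j in range(len(centers)) if clusters[j]]
    k = min(candidates, key=lambda j: sq_dist(rec, centers[j], good))
    donor = min(clusters[k], key=lambda d: sq_dist(rec, d, good))
    return tuple(donor[i] if i in bad_fields else rec[i] for i in range(len(rec)))


# Donors are the records that pass the rule; (40, 60) is rejected.
records = [(30, 12), (32, 13), (70, 5), (72, 8), (40, 60)]
donors = [r for r in records if is_correct(r)]
centers, clusters = cluster_donors(donors, init_centers=[(30, 12), (70, 5)])

erroneous = (69, 90)  # schooling exceeds age, so field 1 is treated as wrong
fixed = impute(erroneous, bad_fields={1}, centers=centers, clusters=clusters)
print(fixed, is_correct(fixed))
```

In this sketch the erroneous record is compared against two cluster centers and then against the two donors of one cluster, instead of against all four donors. On a Census-sized donor set, the saving would depend on the number and balance of the clusters, which this toy example does not attempt to measure.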