

Author: Seppo Laaksonen
Publisher: Taylor & Francis Ltd
ISSN: 1360-0532
Source: Journal of Applied Statistics, Vol. 30, Iss. 9, 2003-11, pp. 1009-1020
Abstract
This paper deals with imputation techniques and strategies. Imputation proper usually begins after the first round of data editing, although many preparatory operations are needed before that point. In the editing step, missing or deficient items are identified and coded, and it is then decided which of them, if any, should be replaced by imputed values. Many imputation methods exist, each with several possible specifications. It is therefore not obvious which method should finally be chosen, especially when one method is best in one respect and another method is best in a different respect. We examine these questions using four imputation methods: (i) random hot decking, (ii) logistic regression imputation, (iii) linear regression imputation, and (iv) regression-based nearest neighbour hot decking. The last two methods are each applied with two different specifications. Two metric variables are used in the empirical tests: the first is very complex, whereas the second is more ordinary and thus easier to handle. The empirical examples are based on simulations, which clearly show the biases of the various methods and their specifications. In general, method (iv) appears to be the one to recommend, although its results are not perfect either.
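To make the contrast between methods (i) and (iv) concrete, the following is a minimal sketch, not the paper's own specification. It assumes a single covariate `x`, a target `y` with missing entries coded as `None`, and an ordinary least squares line as the regression model; the function names and the toy data are hypothetical and chosen only for illustration.

```python
import random


def random_hot_deck(y, rng):
    """Method (i), sketched: fill each missing value (None) with a
    randomly drawn observed value (a "donor")."""
    donors = [v for v in y if v is not None]
    return [v if v is not None else rng.choice(donors) for v in y]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with one covariate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def regression_nn_hot_deck(x, y):
    """Method (iv), sketched: predict y for every unit from a regression
    fitted on the observed units, then give each unit with missing y the
    observed value of the donor whose prediction is nearest."""
    obs = [i for i, v in enumerate(y) if v is not None]
    a, b = fit_line([x[i] for i in obs], [y[i] for i in obs])
    pred = [a + b * xi for xi in x]
    out = list(y)
    for i, v in enumerate(y):
        if v is None:
            donor = min(obs, key=lambda j: abs(pred[j] - pred[i]))
            out[i] = y[donor]  # an observed value, not the model prediction
    return out


# Hypothetical toy data: two of six units have a missing y.
x = [1.0, 2.4, 3.0, 4.0, 5.5, 6.0]
y = [2.1, None, 6.2, 7.9, None, 12.1]

hot = random_hot_deck(y, random.Random(0))
nn = regression_nn_hot_deck(x, y)
print(hot)
print(nn)
```

Both functions impute only values that were actually observed, which is what distinguishes hot decking from plain regression imputation. The difference is in how the donor is chosen: at random in the first case, and by closeness of the regression predictions in the second.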