EXACT DISTRIBUTION OF SQUARED WELSCH-KUH DISTANCE AND IDENTIFICATION OF INFLUENTIAL OBSERVATIONS
Abstract
This paper derives the exact distribution of squared DFFITS, also known as the squared Welsch-Kuh ($WK^2$) distance, a measure used to identify influential observations in multiple linear regression analysis. The authors express $WK^2$ in terms of two independent F-ratios and obtain its density function as a complicated series expression involving the Gauss hypergeometric function, with two shape parameters p and n. The mean and variance of the distribution are derived in terms of the shape parameters, and an upper control limit for $WK^2$ is established. Critical points of $WK^2$ are also computed at the 5% and 1% significance levels for different sample sizes and varying numbers of predictors. Finally, a numerical example illustrates the identification of influential observations; the results obtained from the proposed approach are more scientific and systematic, and their exactness outperforms the traditional Welsch-Kuh approach.
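For readers who want to see the statistic itself, the following is a minimal sketch of how squared DFFITS can be computed for an ordinary least squares fit. It is not the authors' procedure: it does not use the exact distribution or the critical points derived in the paper, and compares against the conventional rule-of-thumb cutoff 4p/n only for illustration. The function name `squared_dffits`, the simulated data, and the convention that p counts the columns of the design matrix (intercept included) are all assumptions made for this example.

```python
import numpy as np


def squared_dffits(X, y):
    """Squared DFFITS for each observation of an OLS fit.

    X is the n-by-p design matrix (assumed full column rank; include a
    column of ones if an intercept is wanted). y is the response vector.
    """
    n, p = X.shape
    # Leverages h_ii are the squared row norms of Q from a reduced QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    # Ordinary residuals from the full fit.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    sse = e @ e
    # Leave-one-out residual variance, via the standard deletion identity.
    s2_loo = (sse - e ** 2 / (1.0 - h)) / (n - p - 1)
    # Squared externally studentized residual times h / (1 - h).
    t2 = e ** 2 / (s2_loo * (1.0 - h))
    return t2 * h / (1.0 - h)


# Illustrative data (hypothetical): 30 observations, 2 predictors + intercept.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=(n, 2))
x[5] = [3.0, 3.0]                       # make observation 5 high leverage
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=n)
y[5] += 10.0                            # and shift its response

p = X.shape[1]
wk2 = squared_dffits(X, y)
cutoff = 4.0 * p / n                    # rule-of-thumb cutoff, not the paper's
flagged = np.flatnonzero(wk2 > cutoff)
print("flagged observations:", flagged)
```

The deletion identity avoids refitting the model n times, which keeps the sketch short. An exact critical point from the paper's tables would replace `cutoff` in the comparison.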