Variable selection for heteroscedastic data through variance estimation

Baek S., Karaman F., Ahn H.

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, vol.34, no.3, pp.567-583, 2005 (SCI-Expanded)


In this article, we extend some variable selection criteria in regression analysis to heteroscedastic models. First, a sequential test procedure is proposed to identify potential heteroscedasticity of the error variances. Next, we develop a variance estimation method to estimate the variance-covariance matrix for data with unequal variances. We improve Mallows' C_p and AIC using the proposed variance estimation method. This work is motivated by the poor behavior of C_p in highly heteroscedastic models and by the fact that C_p can be written as a linear function of an F statistic for testing the fit of a regression model. The proposed method performs well for both homoscedastic and heteroscedastic data. Simulation results show that our method is superior to C_p for data with significant heteroscedasticity and is comparable in accuracy for homoscedastic models. The new method is illustrated with real data.
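
The linear relation between Mallows' criterion and the F statistic mentioned in the abstract can be checked numerically. The sketch below uses the standard textbook form of the identity, which may differ from the article's own notation: for a submodel with p coefficients nested in a full model with k coefficients, C_p = (k - p) F + 2p - k, where F is the partial F statistic for testing the submodel against the full model. The simulated data, the variance function, and the helper names (sse, mallows_cp, partial_f) are illustrative assumptions; the code does not implement the variance estimation method proposed in the article.

```python
import numpy as np


def sse(X, y):
    """Residual sum of squares from an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def mallows_cp(X_full, y, cols):
    """Classical Mallows' C_p for the submodel using the columns in `cols`.

    The error variance is estimated from the full model, as in the
    usual homoscedastic definition.
    """
    n, k = X_full.shape
    p = len(cols)
    s2 = sse(X_full, y) / (n - k)
    return sse(X_full[:, cols], y) / s2 - (n - 2 * p)


def partial_f(X_full, y, cols):
    """Partial F statistic for testing the submodel against the full model."""
    n, k = X_full.shape
    p = len(cols)
    sse_full = sse(X_full, y)
    sse_sub = sse(X_full[:, cols], y)
    return ((sse_sub - sse_full) / (k - p)) / (sse_full / (n - k))


# Simulated data (an assumption for illustration): intercept plus four
# predictors, with error standard deviation growing with the first predictor.
rng = np.random.default_rng(0)
n = 60
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
sigma = np.exp(0.5 * X_full[:, 1])
y = 1.0 + 2.0 * X_full[:, 1] - 1.0 * X_full[:, 2] + sigma * rng.normal(size=n)

cols = [0, 1, 2]          # candidate submodel: intercept and two predictors
k = X_full.shape[1]
p = len(cols)

cp = mallows_cp(X_full, y, cols)
f_stat = partial_f(X_full, y, cols)
cp_from_f = (k - p) * f_stat + 2 * p - k

print(f"C_p computed directly:  {cp:.6f}")
print(f"C_p from F statistic:   {cp_from_f:.6f}")
```

The two printed values agree up to floating-point error because both are built from the same two residual sums of squares. This is also why a poor variance estimate under heteroscedasticity affects C_p and the F test alike.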