New Robust Penalized Estimators for Linear and Logistic Regression

Thesis Type: PhD

Institution Where the Thesis Was Conducted: Yıldız Teknik Üniversitesi, Statistics, Turkey

Thesis Advisors: Evren A.A., Filzmoser P.

Thesis Approval Date: 2017

Thesis Language: English

Open Archive Collection: AVESİS Open Access Collection

Abstract:

The least squares (LS) regression estimator can be very sensitive to multicollinearity among the predictors and to outliers in the data. As a solution, we introduce a new robust version of the Liu estimator. Although the proposed estimator is useful for low-dimensional data, it has some limitations for high-dimensional data, namely computational problems. To address this, a new robust Liu-type estimator based on a similar idea is introduced for high-dimensional data. By using weights, not only are the resulting estimators highly robust, but the estimates of the biasing parameters are also robustified.
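
For orientation, the classical (non-robust) Liu estimator that is being robustified is commonly written as below. The notation is the standard one from the literature and is an assumption here, not taken from the thesis: X is the predictor matrix, y the response, I the identity matrix, and d the biasing parameter.

```latex
\hat{\beta}_{d} = \left(X^{\top}X + I\right)^{-1}\left(X^{\top}y + d\,\hat{\beta}_{\mathrm{LS}}\right),
\qquad 0 < d < 1,
\qquad \hat{\beta}_{\mathrm{LS}} = \left(X^{\top}X\right)^{-1}X^{\top}y
```

Setting d = 1 recovers the LS estimator, while smaller values of d shrink the estimate and stabilize it under multicollinearity.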

The main focus of this thesis is to contribute to the literature a family of estimators that can deal with multicollinearity among the predictors and outliers in the data, particularly in high-dimensional data. Because it improves interpretability and the predictive ability of the model in high-dimensional data, variable selection has attracted much research interest. Modern regularization methods have become a popular choice because they perform intrinsic variable selection and parameter estimation simultaneously. However, estimation becomes more difficult and challenging when the data contain outliers. As a solution, researchers have recently started to develop robust versions of these regularization methods. With this aim, fully robust versions of the elastic net estimator are introduced for linear regression. For the binary response case, the idea is extended to logistic regression. The algorithms for computing the newly proposed estimators are based on the idea of repeatedly applying the non-robust classical estimators to data subsets only. It is shown how outlier-free subsets can be identified efficiently, and how appropriate tuning parameters for the elastic net penalties can be selected for the corresponding model. A final reweighting step is intended to improve the efficiency of the estimators.
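
The idea of repeatedly applying a classical estimator to data subsets can be illustrated with a minimal sketch for the linear regression case. This is not the algorithm from the thesis: it is a generic trimmed elastic net with concentration steps, written under several assumptions. The function names, the random starting subsets of size 3, the MAD-based residual scale, and the cut-off of 2.5 in the reweighting step are all hypothetical choices made for illustration, and the tuning parameters `lam` and `alpha` are fixed rather than selected.

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator used by the lasso part of the penalty.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net(X, y, lam, alpha, n_iter=100):
    # Classical (non-robust) elastic net by coordinate descent, minimizing
    # 1/(2n) ||y - b0 - X b||^2 + lam * (alpha ||b||_1 + (1 - alpha)/2 ||b||_2^2).
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    col_sq = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    resid = yc.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = Xc[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            resid += Xc[:, j] * (beta[j] - new)
            beta[j] = new
    return y_mean - x_mean @ beta, beta

def trimmed_elastic_net(X, y, h, lam=0.05, alpha=0.5,
                        n_starts=20, max_csteps=50, seed=0):
    # Assumed scheme: several random starts, each refined by concentration
    # steps that keep the h observations with the smallest squared residuals.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    best_obj, best_fit = np.inf, None
    for _ in range(n_starts):
        subset = np.sort(rng.choice(n, size=3, replace=False))
        for _ in range(max_csteps):
            b0, b = elastic_net(X[subset], y[subset], lam, alpha)
            r2 = (y - b0 - X @ b) ** 2
            new_subset = np.sort(np.argsort(r2)[:h])
            if np.array_equal(new_subset, subset):
                break
            subset = new_subset
        # Refit on the final subset and evaluate the trimmed, penalized objective.
        b0, b = elastic_net(X[subset], y[subset], lam, alpha)
        r2 = (y - b0 - X @ b) ** 2
        penalty = lam * (alpha * np.abs(b).sum() + 0.5 * (1 - alpha) * (b ** 2).sum())
        obj = r2[subset].sum() / (2 * h) + penalty
        if obj < best_obj:
            best_obj, best_fit = obj, (b0, b)
    # Reweighting step (assumed rule): flag observations whose residual is far
    # from the median relative to a MAD scale, then refit on the rest.
    b0, b = best_fit
    resid = y - b0 - X @ b
    center = np.median(resid)
    scale = 1.4826 * np.median(np.abs(resid - center))
    keep = np.abs(resid - center) <= 2.5 * scale
    b0, b = elastic_net(X[keep], y[keep], lam, alpha)
    return b0, b, keep
```

Calling `trimmed_elastic_net(X, y, h)` with `h` set to roughly 75% of the sample size returns the intercept, the coefficient vector, and a boolean mask of the observations kept in the final refit. In practice the tuning parameters would be chosen by a robust cross-validation criterion rather than fixed.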

Simulation studies compare the proposed estimators with non-robust and other competing robust estimators and reveal the superiority of the newly proposed methods. This is also supported by reasonable computation times. Additionally, some real data examples show the advantages of the proposed estimators.