Performances of some high dimensional regression methods: sparse principal component regression


KURNAZ F. S.

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, vol.50, no.9, pp.2529-2543, 2021 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 50 Issue: 9
  • Publication Date: 2021
  • DOI Number: 10.1080/03610918.2021.1898638
  • Title of Journal: COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION
  • Page Numbers: pp.2529-2543
  • Keywords: High dimensional data, Lasso type penalty, Principal component regression, Sparsity

Abstract

Principal component analysis (PCA) is a widely used technique in data processing and dimensionality reduction, but it has some drawbacks, since each principal component is a linear combination of all explanatory variables. As an alternative, sparse PCA (SPCA) is a very appealing method that produces principal components with sparse loadings. On the other hand, combining PCA on the explanatory variables with least squares regression yields principal component regression (PCR). In PCR, the components are obtained using only the explanatory variables, without considering the effect of the dependent variable. Sparse PCR (SPCR) takes the dependent variable into account and makes it possible to obtain sparse principal component loadings, but its main drawback is its computational cost. Taking into consideration the general structure of PCR, we combine (S)PCA with some sparse regression methods and compare the resulting methods with classical PCR and the most recently introduced method, SPCR. Extensive simulation studies and real data examples are carried out to show their performances. The results are supported by a computation time study.
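
The two-step structure of PCR described in the abstract (PCA on the explanatory variables, then a regression on the component scores) can be illustrated with a minimal NumPy sketch. The function name `pcr_fit` and the parameters `n_components` and `lasso_penalty` are hypothetical and do not come from the paper. The optional lasso step on the scores is only one possible reading of "combining PCA with a sparse regression method", not necessarily the author's procedure, and the sketch assumes that the leading singular values are nonzero.

```python
import numpy as np


def pcr_fit(X, y, n_components, lasso_penalty=0.0):
    """Principal component regression on centered data (illustrative sketch).

    With lasso_penalty > 0 the least squares step on the component scores
    is replaced by a lasso step. This is an assumption made for
    illustration, not necessarily the method studied in the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean

    # Step 1: PCA via SVD of the centered design matrix.
    # The rows of Vt are the principal component loadings.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T      # p x k loading matrix
    Z = Xc @ V                   # n x k component scores

    # Step 2: regress y on the scores. The score columns are orthogonal
    # with squared norms s_j**2, so minimizing
    #   0.5 * ||yc - Z @ gamma||**2 + lasso_penalty * ||gamma||_1
    # reduces to soft-thresholding Z.T @ yc, one coordinate at a time.
    # With lasso_penalty = 0 this is ordinary least squares on the scores.
    c = Z.T @ yc
    c = np.sign(c) * np.maximum(np.abs(c) - lasso_penalty, 0.0)
    gamma = c / s[:n_components] ** 2   # assumes nonzero singular values

    # Map the component coefficients back to the original variables.
    beta = V @ gamma
    intercept = y_mean - x_mean @ beta
    return beta, intercept, gamma
```

In this sketch the lasso penalty can set whole components to zero, but the loadings in `V` remain dense. Obtaining sparse loadings would require replacing the SVD step with a sparse PCA routine, which is not shown here.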