Robust Feature Selection With LSTM Recurrent Neural Networks for Artificial Immune Recognition System


Creative Commons License

Sahin C. B., Diri B.

IEEE ACCESS, vol.7, pp.24165-24178, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 7
  • Publication Date: 2019
  • DOI: 10.1109/access.2019.2900118
  • Journal Name: IEEE ACCESS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.24165-24178
  • Yıldız Technical University Affiliated: Yes

Abstract

Stability and robustness of feature selection techniques are of great importance for high-dimensional, small-sample data. Instability is a neglected problem in feature selection. Therefore, an ensemble gene selection framework is used to obtain stable and accurate results from feature selection algorithms. Sequence modeling from high-dimensional data is an important research area for the discovery of biomarkers. Identifying biomarkers requires robust gene selection methods, which make it possible to find important tumor-related genes with high accuracy. The main aim of this paper is to create a model that learns long sequences with the artificial immune recognition system (AIRS) for robust feature selection. Long short-term memory (LSTM) recurrent neural networks are trained with the AIRS to obtain long-lived unit cells for use in the feature selection process. LSTM was used to better understand the mechanisms behind the "remember" feature of the immune response. We applied a theory suggested by immunologists to develop a stable associative memory that is capable of solving robustness and optimization tasks. We examined the initial gene selection step based on different types of group formation algorithms to analyze the most informative selected features. Results on microarray datasets show remarkable increases in robustness and classification accuracy. The suggested framework is evaluated on six commonly used microarray datasets.
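
The general idea of ensemble feature selection and its stability can be illustrated with a minimal sketch. This is not the LSTM-AIRS method of the paper: it assumes a simple signal-to-noise filter score as a stand-in selector, bootstrap resampling as the ensemble mechanism, selection frequency as the aggregation rule, and mean pairwise Jaccard similarity as the stability measure. All function names and the toy data are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def score_features(X, y):
    """Score each feature with a simple two-class signal-to-noise ratio (assumed filter)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-12  # guard against division by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores


def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(order[:k])


def bootstrap_subsets(X, y, k, n_rounds, rng):
    """Select k features on each of n_rounds bootstrap resamples of the data."""
    subsets = []
    n = len(X)
    while len(subsets) < n_rounds:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y[i] for i in idx]
        if len(set(labels)) < 2:
            continue  # a resample must contain both classes to be scored
        subsets.append(top_k(score_features([X[i] for i in idx], labels), k))
    return subsets


def ensemble_select(subsets, k):
    """Aggregate by selection frequency; ties are broken by feature index."""
    counts = {}
    for subset in subsets:
        for j in subset:
            counts[j] = counts.get(j, 0) + 1
    return sorted(counts, key=lambda j: (-counts[j], j))[:k]


def jaccard_stability(subsets):
    """Mean pairwise Jaccard similarity of the selected subsets (1.0 = identical)."""
    pairs = list(combinations(subsets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def make_toy_data(rng, n_samples=20, n_features=30, n_informative=3):
    """Small-sample, higher-dimensional toy data; the first features carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X, y = make_toy_data(rng)
subsets = bootstrap_subsets(X, y, k=3, n_rounds=25, rng=rng)
selected = ensemble_select(subsets, k=3)
stability = jaccard_stability(subsets)
print(selected, round(stability, 3))
```

A stability value near 1 means the resampled selectors agree on almost the same features, while a value near 0 means the selected subsets change heavily with small perturbations of the data, which is the instability problem the abstract refers to.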