Metadata-Integrated Deep Semi-Autoencoder for Implicit-Feedback Recommendation Under Data Sparsity and Cold Start


Durdu U., Kemalbay G.

IEEE Access, vol.14, pp.47990-48008, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 14
  • Publication Date: 2026
  • DOI: 10.1109/access.2026.3678156
  • Journal Name: IEEE Access
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Pages: pp.47990-48008
  • Keywords: Autoencoders, cold-start problem, collaborative filtering, data sparsity, implicit feedback, metadata, recommender systems, top-N recommendation
  • Yıldız Technical University Affiliated: Yes

Abstract

Collaborative filtering under implicit feedback suffers from data sparsity and cold start, both of which degrade ranking quality. The basic semi-autoencoder (BSAE) integrates metadata to improve top-k performance, but its single-hidden-layer design may limit its ability to capture higher-order user–item relations. Dense re-feeding, a training algorithm introduced in prior work to mitigate sparsity in deep autoencoders, operates on reconstructed interactions and does not use metadata, so its value in hybrid settings remains unclear. To address this gap, we propose a deep semi-autoencoder (DSAE) for implicit-feedback recommender systems that uses depth as an inductive bias to hierarchically separate and recompose interaction and metadata signals, improving top-10 ranking under data sparsity and cold start. We adapt dense re-feeding to the hybrid DSAE by re-feeding only the reconstructed interaction block while keeping the metadata fixed, and we train this variant (DSAE+RA) to quantify any incremental benefit. DSAE is trained with binary cross-entropy, and the model is selected by validation NDCG@10 with early stopping. We select capacity and depth in two stages, using information criteria only as tie-breakers. All experiments follow a reproducible user-level 70/15/15 split with cold-item filtering and full-item evaluation without negative sampling. On MovieLens-1M, DSAE outperforms BSAE, DSAE+RA, and a variational baseline (Mult-VAE) on both validation and test sets, achieving test NDCG@10 = 0.866, Precision@10 = 0.833, Recall@10 = 0.092, and mAP@10 = 0.170. Robustness is assessed through threshold sensitivity, a sparsity-faithfulness analysis at τ = 3.5, metadata ablations, and bucketed cold-start evaluations over user interaction-frequency regimes. The results indicate that dense re-feeding can aid a shallow baseline, whereas DSAE yields stronger and more stable top-10 ranking under sparsity and cold-start conditions.
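
The abstract reports top-10 metrics computed by ranking the full item catalogue without negative sampling. The following is a minimal sketch of how such per-user metrics are commonly computed under binary relevance. It is an illustration, not the authors' evaluation code: the function name `rank_metrics_at_k`, its arguments, and the masking of training items are assumptions, and the paper's exact metric definitions (in particular for mAP@10, which is omitted here because conventions vary) may differ.

```python
import numpy as np

def rank_metrics_at_k(scores, relevant, seen, k=10):
    """Binary-relevance NDCG@k, Precision@k, and Recall@k for one user.

    scores   : 1-D array of predicted scores over ALL items (full-item ranking)
    relevant : set of held-out item indices the user interacted with
    seen     : set of training item indices to exclude from the ranking
               (an assumption of this sketch, not stated in the abstract)
    """
    scores = np.asarray(scores, dtype=float).copy()
    if seen:
        scores[list(seen)] = -np.inf  # push training items to the bottom

    # Indices of the k highest-scoring items, best first.
    top_k = np.argsort(-scores, kind="stable")[:k]
    hits = np.array([item in relevant for item in top_k], dtype=float)

    # Position discounts 1/log2(rank + 1) for ranks 1..k.
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    dcg = float((hits * discounts[:len(hits)]).sum())

    # Ideal DCG: all relevant items (up to k) placed at the top.
    ideal_hits = min(len(relevant), k)
    idcg = float(discounts[:ideal_hits].sum())

    n_hits = float(hits.sum())
    return {
        "ndcg": dcg / idcg if idcg > 0 else 0.0,
        "precision": n_hits / k,
        "recall": n_hits / len(relevant) if relevant else 0.0,
    }
```

Under these assumptions, dataset-level figures would be obtained by calling the function once per test user and averaging each metric across users.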