KFPT: Reliability and uncertainty filtered self-distillation for language model training


YÜCE M. K., Fatih Amasyali M.

Knowledge-Based Systems, vol. 343, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 343
  • Publication Date: 2026
  • DOI: 10.1016/j.knosys.2026.115880
  • Journal Name: Knowledge-Based Systems
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Library, Information Science & Technology Abstracts (LISTA)
  • Keywords: Continual pretraining, KFPT, Large language models, Reliability weighting, Self-distillation, Two-phase training, Uncertainty (entropy)-based filtering
  • Yıldız Technical University Affiliated: Yes

Abstract

Transformer-based large language models are typically trained on large text corpora with the next-token cross-entropy (CE) objective. Although CE is scalable and stable, in practice it can exhibit limitations such as overconfidence, weak learning signals on hard or rare tokens, and a mismatch between the training objective and generation-time behavior. In this work, we propose Knowledge-Filtered Phase Training (KFPT), a two-phase scheme that strengthens the training signal without requiring an additional teacher model. In the first phase, KFPT augments CE with a selective regularization term (RU) and, at fixed intervals, performs a second forward pass over the same text with small blocks masked out of the attention mask, averaging the two CE losses to stabilize updates. In the second phase, KFPT adds a one-way KL-consistency term whose target is the distribution from a span-drop-induced second view; this term is selectively weighted, strengthened only at useful positions, based on the reference view's reliability (gold-token margin and correctness) and the student's uncertainty (entropy). We also analyze, through mathematical theorems and proofs, why the additional terms in Phase 1 and Phase 2 can be effective. In comprehensive experiments spanning multiple model architectures and training regimes, we compare KFPT against a strong CE baseline and prior teacher-free objective-improvement methods. The results show that KFPT generally improves accuracy and reduces perplexity, outperforming the teacher-free alternatives in the literature.
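To make the two-phase recipe above concrete, the following is a minimal PyTorch sketch of one plausible reading of the abstract. It is not the paper's implementation: the helper inputs (`block_mask`, precomputed span-drop reference logits), the gating thresholds, and the entropy-based stand-in for the RU regularizer are all assumptions, and next-token position shifting is omitted in the Phase 2 gate for brevity.

```python
# Illustrative sketch of the KFPT two-phase objective; names and thresholds
# are assumptions, not the paper's definitions.
import torch
import torch.nn.functional as F


def token_ce(logits, labels):
    # Next-token cross-entropy: predict token t+1 from position t.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)), labels[:, 1:].reshape(-1)
    )


def phase1_loss(model, input_ids, labels, attention_mask, block_mask, lambda_ru=0.1):
    """Phase 1: CE on two views of the same text (the second view masks
    small blocks out of the attention mask), averaged for stability, plus
    a placeholder for the selective RU term."""
    logits1 = model(input_ids=input_ids, attention_mask=attention_mask).logits
    logits2 = model(input_ids=input_ids,
                    attention_mask=attention_mask * block_mask).logits
    ce = 0.5 * (token_ce(logits1, labels) + token_ce(logits2, labels))

    # Placeholder RU: reward entropy to discourage overconfidence. The
    # paper's actual RU is selective; this is only a stand-in.
    probs = logits1.softmax(-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(-1)
    return ce - lambda_ru * entropy.mean()


def phase2_kl(student_logits, ref_logits, labels, margin_th=0.1, ent_th=2.0):
    """Phase 2: one-way KL toward the span-drop reference view, gated per
    position by the reference's reliability (gold-token margin and
    correctness) and the student's uncertainty (entropy)."""
    ref_p = ref_logits.softmax(-1).detach()  # target view: no gradient
    kl = (ref_p * (ref_p.clamp_min(1e-9).log()
                   - student_logits.log_softmax(-1))).sum(-1)

    # Reliability of the reference view at each position.
    gold_p = ref_p.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    top2_v, top2_i = ref_p.topk(2, dim=-1)
    rival = torch.where(top2_i[..., 0] == labels, top2_v[..., 1], top2_v[..., 0])
    reliable = (ref_p.argmax(-1) == labels) & (gold_p - rival > margin_th)

    # Student uncertainty: strengthen the term only where entropy is high.
    s_p = student_logits.softmax(-1)
    s_ent = -(s_p * s_p.clamp_min(1e-9).log()).sum(-1)

    gate = (reliable & (s_ent > ent_th)).float()
    return (gate * kl).sum() / gate.sum().clamp_min(1.0)
```

In this reading, `ref_logits` would come from a second forward pass over the span-dropped view of the same text, treated as the fixed target of the one-way KL, and `phase2_kl` would be added to the CE loss with a weight schedule the abstract does not specify.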