SelfCCL: Curriculum Contrastive Learning by Transferring Self-Taught Knowledge for Fine-Tuning BERT

Creative Commons License

Dehghan S., AMASYALI M. F.

Applied Sciences (Switzerland), vol.13, no.3, 2023 (SCI-Expanded) identifier identifier

  • Publication Type: Article / Article
  • Volume: 13 Issue: 3
  • Publication Date: 2023
  • Doi Number: 10.3390/app13031913
  • Journal Name: Applied Sciences (Switzerland)
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Agricultural & Environmental Science Database, Applied Science & Technology Source, Communication Abstracts, INSPEC, Metadex, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: transfer learning, curriculum learning, contrastive learning, self-taught learning, sentence embedding, natural language processing, semantic textual similarity, BERT
  • Yıldız Technical University Affiliated: Yes


© 2023 by the authors.BERT, the most popular deep learning language model, has yielded breakthrough results in various NLP tasks. However, the semantic representation space learned by BERT has the property of anisotropy. Therefore, BERT needs to be fine-tuned for certain downstream tasks such as Semantic Textual Similarity (STS). To overcome this problem and improve the sentence representation space, some contrastive learning methods have been proposed for fine-tuning BERT. However, existing contrastive learning models do not consider the importance of input triplets in terms of easy and hard negatives during training. In this paper, we propose the SelfCCL: Curriculum Contrastive Learning model by Transferring Self-taught Knowledge for Fine-Tuning BERT, which mimics the two ways that humans learn about the world around them, namely contrastive learning and curriculum learning. The former learns by contrasting similar and dissimilar samples. The latter is inspired by the way humans learn from the simplest concepts to the most complex concepts. Our model also performs this training by transferring self-taught knowledge. That is, the model figures out which triplets are easy or difficult based on previously learned knowledge, and then learns based on those triplets in the order of curriculum using a contrastive objective. We apply our proposed model to the BERT and Sentence BERT(SBERT) frameworks. The evaluation results of SelfCCL on the standard STS and SentEval transfer learning tasks show that using curriculum learning together with contrastive learning increases average performance to some extent.