Cyclical Curriculum Learning

KESGİN H. T., AMASYALI M. F.

IEEE Transactions on Neural Networks and Learning Systems, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Publication Date: 2023
  • DOI: 10.1109/tnnls.2023.3265331
  • Journal Name: IEEE Transactions on Neural Networks and Learning Systems
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Biotechnology Research Abstracts, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE, Metadex, Civil Engineering Abstracts
  • Keywords: Artificial neural networks, Curriculum learning (CL), Data models, deep learning, optimization, Spirals, Task analysis, Text categorization, Training, Training data
  • Affiliated with Yıldız Technical University: Yes

Abstract

Artificial neural networks (ANNs) are inspired by human learning. However, unlike human education, classical ANN training does not use a curriculum. Curriculum learning (CL) refers to training an ANN on samples in a meaningful order. Under CL, training either begins with a subset of the dataset and adds new samples as training progresses, or begins with the entire dataset and gradually reduces the number of samples used. With these changes in training-set size, curriculum, anti-curriculum, or random-curriculum methods can outperform the vanilla method. However, no single CL method has been shown to be generally effective across architectures and datasets. In this article, we propose cyclical CL (CCL), in which the amount of data used during training changes cyclically rather than simply increasing or decreasing. Instead of using only the vanilla method or only the curriculum method, alternating between the two cyclically, as in CCL, yields more successful results. We tested the method on 18 datasets and 15 architectures across image and text classification tasks and obtained better results than no-CL and existing CL methods. We have also shown theoretically that applying CL and vanilla training cyclically is less erroneous than using only CL or only the vanilla method. The code of the cyclical curriculum is available at https://github.com/CyclicalCurriculum/Cyclical-Curriculum.
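The core idea, a training-set size that oscillates between a curriculum subset and the full dataset, can be sketched as follows. This is an illustrative example, not the authors' exact schedule: the triangular cycle shape, the `cycle_len` and `min_frac` parameters, and the difficulty-based selection are assumptions made for the sketch.

```python
import numpy as np

def cyclical_subset_sizes(n_samples, n_epochs, cycle_len=10, min_frac=0.5):
    """Per-epoch training-set sizes that oscillate between
    min_frac * n_samples and the full dataset (triangular cycle)."""
    sizes = []
    for epoch in range(n_epochs):
        pos = (epoch % cycle_len) / cycle_len   # position within the cycle, in [0, 1)
        tri = 1.0 - abs(2.0 * pos - 1.0)        # triangle wave rising 0 -> 1 -> 0
        frac = min_frac + (1.0 - min_frac) * tri
        sizes.append(int(round(frac * n_samples)))
    return sizes

def select_subset(difficulty_scores, size):
    """Pick the `size` easiest samples (lowest difficulty first),
    in curriculum-style easy-to-hard order."""
    order = np.argsort(difficulty_scores)
    return order[:size]

# Example: 1000 samples, 20 epochs, cycling between 50% and 100% of the data
sizes = cyclical_subset_sizes(1000, 20, cycle_len=10, min_frac=0.5)
scores = np.random.rand(1000)            # placeholder difficulty scores
subset = select_subset(scores, sizes[0])  # indices to train on in epoch 0
```

At the start of each cycle the model trains on the easiest half of the data (curriculum phase); mid-cycle it sees the full dataset (vanilla phase), then the size shrinks again, alternating the two regimes as the abstract describes.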