IEEE Access, vol. 13, pp. 108743-108755, 2025 (SCI-Expanded)
In this paper, we consider the model merging process for large language models (LLMs) under a two-stage optimization framework. Traditional merging methods usually apply fixed blending rates to all layers, ignoring how differently individual layers contribute to the output. To overcome this limitation, we first group layers with similar functionality and use Bayesian optimization to determine the optimal blending rate for each group. Bayesian optimization offers a globally effective discovery strategy in high-dimensional and computationally expensive search spaces. However, an approach based solely on global search can miss fine-grained local optima. Therefore, in the second stage, we perform a linear search by applying controlled variations around the ratios obtained with Bayesian optimization, yielding more precise, locally optimized results. Extensive experiments on the Turkish versions of the ARC, HellaSwag, MMLU, and GSM8K datasets show that our proposed Bayes+Linear strategy outperforms existing methods such as SLERP, Linear, TIES, and Breadcrumbs. In addition to improving model accuracy and generalization capacity, the KAFA approach works without any additional fine-tuning, making it possible to reuse large language models effectively across different tasks.
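The two-stage idea described above can be illustrated with a minimal pure-Python sketch. Everything here is an assumption for illustration: the coarse grid search stands in for Bayesian optimization (a real implementation would use a library such as Optuna or scikit-optimize), and `merged_model_score` is a synthetic surrogate objective, not an actual benchmark evaluation of a merged LLM.

```python
import itertools

# Hypothetical surrogate objective: in the real setting this would merge
# the models with the given per-group blending rates and evaluate the
# result on a validation benchmark. Here we fake it with a quadratic
# bowl around assumed "best" rates for three layer groups.
def merged_model_score(rates):
    targets = [0.3, 0.7, 0.5]  # assumed optimal rate per layer group
    return -sum((r - t) ** 2 for r, t in zip(rates, targets))

def coarse_global_search(candidates, n_groups):
    # Stage 1 stand-in for Bayesian optimization: score a coarse grid of
    # per-group blending rates and keep the best combination.
    best = max(itertools.product(candidates, repeat=n_groups),
               key=merged_model_score)
    return list(best)

def local_linear_search(rates, step=0.05, radius=2):
    # Stage 2: controlled variations around each group's rate, refined
    # one coordinate at a time and clipped to the valid [0, 1] range.
    refined = list(rates)
    for i in range(len(refined)):
        options = [refined[i] + k * step for k in range(-radius, radius + 1)]
        options = [min(1.0, max(0.0, r)) for r in options]
        refined[i] = max(
            options,
            key=lambda r: merged_model_score(refined[:i] + [r] + refined[i + 1:]),
        )
    return refined

coarse = coarse_global_search([0.0, 0.25, 0.5, 0.75, 1.0], n_groups=3)
refined = local_linear_search(coarse)
```

On this toy objective the coarse stage lands near the optimum and the local linear search tightens each group's rate; the refined rates always score at least as well as the coarse ones, mirroring why the second stage can only help.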