Implementation of Hyperparameter Algorithms on Big Data Platforms: A Case Study


Mangliyeva M., Tanrıverdi B., AKTAŞ M. S., KALIPSIZ O., Balçık E.

4th International Conference on Computer Science and Engineering (UBMK 2019), 11 - 15 September 2019

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI Number: 10.1109/ubmk.2019.8907145
  • Keywords: Hyperparameter Selection, Big Data, Data Analytics, Spark, Simulated Annealing, Bayesian Search, Tree-structured Parzen Estimators, Differential Evolution, Basin Hopping
  • Affiliated with Yıldız Technical University: Yes

Abstract

The algorithms used at each step of a data analytics application have hyperparameters, which are set independently of the data itself. Choosing hyperparameters is one of the most time-consuming parts of data analytics, since it cannot be done precisely without heuristic or empirical methods. In this project, we implemented five hyperparameter selection algorithms (Simulated Annealing, Bayesian Search, Tree-structured Parzen Estimators, Differential Evolution, and Basin Hopping) on Spark, a distributed big data processing platform. Performance is measured by comparing the results of each algorithm with those of the Random Search algorithm. We also tested the scalability and parallelizability of the algorithms.
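
As an illustration of the comparison described in the abstract, the following is a minimal single-machine sketch that pits a Random Search baseline against Simulated Annealing on a toy problem. It is not the authors' implementation and does not use Spark. The objective function, the two hyperparameters (a learning rate and a regularization strength), their bounds, and the cooling schedule are all assumptions chosen only to keep the example self-contained.

```python
import math
import random

# Hypothetical search space: (learning rate, regularization strength).
BOUNDS = [(1e-4, 1.0), (1e-5, 1.0)]


def objective(params):
    """Toy stand-in for a validation loss; minimum at lr=0.1, reg=0.01."""
    lr, reg = params
    return (math.log10(lr) + 1.0) ** 2 + (math.log10(reg) + 2.0) ** 2


def sample(rng):
    """Draw one configuration log-uniformly within BOUNDS."""
    return [10 ** rng.uniform(math.log10(lo), math.log10(hi)) for lo, hi in BOUNDS]


def random_search(n_iter, rng):
    """Baseline: evaluate n_iter independent samples, keep the best."""
    best, best_loss = None, float("inf")
    for _ in range(n_iter):
        candidate = sample(rng)
        loss = objective(candidate)
        if loss < best_loss:
            best, best_loss = candidate, loss
    return best, best_loss


def simulated_annealing(n_iter, rng, t0=1.0, cooling=0.95, step=0.5):
    """Perturb the current point in log10 space; accept worse points
    with probability exp(-delta / temperature)."""
    current = sample(rng)
    current_loss = objective(current)
    best, best_loss = list(current), current_loss
    temp = t0
    for _ in range(n_iter - 1):
        candidate = []
        for value, (lo, hi) in zip(current, BOUNDS):
            log_v = math.log10(value) + rng.gauss(0.0, step)
            # Clip the proposal so it stays inside the search space.
            log_v = min(max(log_v, math.log10(lo)), math.log10(hi))
            candidate.append(10 ** log_v)
        candidate_loss = objective(candidate)
        delta = candidate_loss - current_loss
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_loss = candidate, candidate_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
        # Geometric cooling, floored to avoid division by zero.
        temp = max(temp * cooling, 1e-12)
    return best, best_loss


if __name__ == "__main__":
    budget = 200  # same number of objective evaluations for both methods
    for name, search in [("random search", random_search),
                         ("simulated annealing", simulated_annealing)]:
        params, loss = search(budget, random.Random(0))
        print(f"{name}: loss={loss:.4f} params={params}")
```

Both methods receive the same evaluation budget, so the best loss each one reaches can be compared directly. In a distributed setting the independent evaluations in Random Search parallelize trivially, whereas Simulated Annealing is sequential within a single chain; how the paper parallelizes each algorithm on Spark is not specified in the abstract.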