Disentangling Technical and Content Attributes in Search Engine Ranking: A Comparative Study of Google and Bing

Cebeci, Göker; DİRİ, Banu

doi:10.1109/access.2026.3657977

IEEE Access, vol. 14, pp. 14777-14793, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 14
Publication Date: 2026
DOI: 10.1109/access.2026.3657977
Journal Name: IEEE Access
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
Page Numbers: pp. 14777-14793
Keywords: Comparative analysis, content relevance, ranking factors, semantic similarity, system profiling, technical performance
Yıldız Technical University Affiliated: Yes

Abstract

This study presents a novel empirical methodology to characterize and compare the ranking environments of major information retrieval systems, specifically Google and Bing. By analyzing technical and content attributes in a dataset of 14,465 Search Engine Results Page (SERP) items collected for 500 queries in a homogeneous commercial discount domain, we aim to characterize observable associative patterns between resource attributes and ranking outcomes. The dataset includes Lighthouse performance metrics and advanced content features, such as Sentence-BERT-based semantic similarity. Using K-Means clustering, we identify five resource profiles representing emergent optimization archetypes. The analysis reveals that content-related factors have a higher aggregate importance than technical factors for both systems (Google: 70.1%, Bing: 61.8%). Specifically, Random Forest feature importance analysis indicates that for Bing, content volume is a dominant predictor, whereas for Google, semantic relevance signals outweigh pure keyword targeting. We further contextualize these findings within an "Authority–Optimization Trade-off" framework, suggesting that Google’s negative associations for certain on-page optimization signals likely reflect a ranking function that heavily weights latent domain authority over explicit on-page compliance. These findings highlight how modern learning-to-rank systems may differentially weight explicit content features and latent authority signals when balancing relevance, diversity, and quality.
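
As a rough illustration of what an "aggregate importance" share for content versus technical factors can mean, the sketch below sums per-feature importance scores by category and converts the totals to percentages. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `aggregate_importance`, the feature names, the category assignments, and all numeric values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def aggregate_importance(importances, groups):
    """Sum per-feature importances by group and return each group's share in percent."""
    totals = defaultdict(float)
    for feature, value in importances.items():
        totals[groups[feature]] += value
    grand_total = sum(totals.values())
    return {group: 100.0 * total / grand_total for group, total in totals.items()}


# Hypothetical feature importances (for example, as a tree ensemble might report them).
# These numbers are invented for illustration and do not come from the study.
importances = {
    "semantic_similarity": 0.30,
    "word_count": 0.25,
    "keyword_in_title": 0.20,
    "lighthouse_performance": 0.15,
    "page_load_time": 0.10,
}

# Hypothetical assignment of each feature to a category.
groups = {
    "semantic_similarity": "content",
    "word_count": "content",
    "keyword_in_title": "content",
    "lighthouse_performance": "technical",
    "page_load_time": "technical",
}

shares = aggregate_importance(importances, groups)
for group, share in sorted(shares.items()):
    print(f"{group}: {share:.1f}%")
```

Normalizing by the grand total keeps the shares summing to 100% even when the raw importance scores do not sum to one, which makes shares from two separately fitted models comparable.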