33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025, İstanbul, Türkiye, 25-28 June 2025 (Full-Text Paper)
Wireless communication is inherently exposed to attacks, and various methods have therefore been developed to protect data and preserve its integrity during transmission. This study aims to protect the integrity and content of data against attacks during wireless transmission. An environment with nine channels and twelve time slots was constructed for a Q-learning-based reinforcement learning method; within this environment, a jammer attacks specific channels at designated times. To enable the transmitter to avoid the channels and time slots in which the jammer is active and to transmit data successfully, the ϵ-greedy (EG) and upper confidence bound (UCB) policies were compared under the Q-learning algorithm. The results show that the UCB policy outperformed the EG policy.
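As an illustration of the setup described above, the following is a minimal sketch of tabular Q-learning with EG and UCB action selection on a grid of nine channels and twelve time slots. Only the grid size and the two policy names come from the abstract. The jammer pattern (`is_jammed`), the reward of +1 or -1, the treatment of the time slot as the state, and all hyperparameters (`ALPHA`, `GAMMA`, `EPSILON`, `UCB_C`) are assumptions made for this example and are not the configuration used in the study.

```python
import numpy as np

N_CHANNELS, N_SLOTS = 9, 12   # grid size taken from the abstract
ALPHA, GAMMA = 0.1, 0.5       # assumed learning rate and discount factor
EPSILON, UCB_C = 0.1, 1.0     # assumed exploration parameters


def is_jammed(slot, channel):
    """Hypothetical jammer: a 3-channel window that shifts with the time slot."""
    return (channel - slot) % N_CHANNELS < 3


def select_action(policy, q_row, counts_row, rng):
    """Pick a channel for the current slot with the EG or UCB rule."""
    if policy == "eg":
        if rng.random() < EPSILON:
            return int(rng.integers(N_CHANNELS))  # explore uniformly
        return int(np.argmax(q_row))              # exploit current estimate
    # UCB: try every channel once, then add an optimism bonus to Q.
    untried = np.flatnonzero(counts_row == 0)
    if untried.size > 0:
        return int(untried[0])
    bonus = UCB_C * np.sqrt(np.log(counts_row.sum()) / counts_row)
    return int(np.argmax(q_row + bonus))


def train(policy, n_steps=6000, seed=0):
    """Run Q-learning; return the Q-table and the fraction of unjammed sends."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_SLOTS, N_CHANNELS))       # state = time slot, action = channel
    counts = np.zeros((N_SLOTS, N_CHANNELS))  # visit counts used by UCB
    successes = 0
    slot = 0
    for _ in range(n_steps):
        channel = select_action(policy, q[slot], counts[slot], rng)
        counts[slot, channel] += 1
        jammed = is_jammed(slot, channel)
        reward = -1.0 if jammed else 1.0      # assumed reward shaping
        successes += not jammed
        next_slot = (slot + 1) % N_SLOTS      # time advances cyclically
        td_target = reward + GAMMA * q[next_slot].max()
        q[slot, channel] += ALPHA * (td_target - q[slot, channel])
        slot = next_slot
    return q, successes / n_steps


if __name__ == "__main__":
    for policy in ("eg", "ucb"):
        q_table, success_rate = train(policy)
        print(policy, "success rate:", round(success_rate, 3))
        print(policy, "greedy channel per slot:", q_table.argmax(axis=1))
```

In this sketch the chosen channel does not influence the next time slot, so the task behaves like a contextual bandit and the discount factor only shifts the Q-values within each slot. The success rates it prints depend on the assumed jammer and parameters and should not be read as a reproduction of the reported comparison.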