Crowd Density Estimation by Using Attention Based Capsule Network and Multi-Column CNN

Creative Commons License

Kizrak M. A., BOLAT B.

IEEE ACCESS, vol.9, pp.75435-75445, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 9
  • Publication Date: 2021
  • Doi Number: 10.1109/access.2021.3081529
  • Journal Name: IEEE ACCESS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp.75435-75445
  • Keywords: Feature extraction, Estimation, Task analysis, Adaptation models, Distortion, Predictive models, Analytical models, Capsule attention, crowd counting, density map, multi-column CNN, CONVOLUTIONAL NEURAL-NETWORK, COUNTING PEOPLE, TRACKING, LOCALIZATION, RECOGNITION, MODEL
  • Yıldız Technical University Affiliated: Yes


We propose a strategy for estimating the number of people in a crowd, one of the aims of crowd analysis, from static images or video frames. Whereas the first studies on crowd analysis relied on manual feature extraction with pixel- and regression-based methods, recent studies use models based on Convolutional Neural Networks (CNNs). However, it is still difficult to extract spatial information such as position, orientation, posture, and angular value for crowd estimation from a density map. This study uses capsule networks and the routing-by-agreement algorithm as an attention module. Our proposed approach combines CNN and capsule network-based attention modules in a two-column deep neural network architecture. We evaluate our approach against other state-of-the-art methods on five well-known datasets: UCF-QNRF, UCF_CC_50, UCSD, ShanghaiTech Part A, and WorldExpo'10.
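
The abstract names routing by agreement as the mechanism behind the attention module but does not spell it out. As a rough illustration only, the sketch below implements the generic dynamic-routing procedure commonly used in capsule networks, written with NumPy. It is not the authors' implementation: the function names (`squash`, `softmax`, `route`), the tensor layout, and the choice of three iterations are assumptions made for this example, and the prediction vectors `u_hat` are random placeholders rather than features from a crowd image.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-8):
    """Shrink each vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def route(u_hat, iters=3):
    """Generic routing by agreement (illustrative, hypothetical layout).

    u_hat: array of shape (n_in, n_out, dim) holding the prediction that
           each input capsule makes for each output capsule.
    Returns the output capsules v with shape (n_out, dim) and the coupling
    coefficients c with shape (n_in, n_out) that produced them.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))          # routing logits, start uniform
    for _ in range(iters):
        c = softmax(b, axis=1)           # each input distributes weight over outputs
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted sum per output capsule
        v = squash(s)                    # (n_out, dim)
        # Agreement: dot product between each prediction and the output it targets.
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)
    return v, c


# Placeholder data: 6 input capsules, 3 output capsules, 4-dimensional poses.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v, c = route(u_hat, iters=3)
print(v.shape, c.shape)
```

The coupling coefficients `c` behave like attention weights: inputs whose predictions agree with an output capsule are routed to it more strongly on the next iteration. How the paper wires this into the two-column architecture is not stated in the abstract and is not assumed here.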