Improving real-time energy decision-making model with an actor-critic agent in modern microgrids with energy storage devices

Bio Gassi K., BAYSAL M.

Energy, vol.263, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 263
  • Publication Date: 2023
  • Doi Number: 10.1016/
  • Journal Name: Energy
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Aquatic Science & Fisheries Abstracts (ASFA), CAB Abstracts, Communication Abstracts, Computer & Applied Sciences, Environment Index, INSPEC, Metadex, Pollution Abstracts, Public Affairs Index, Veterinary Science Database, Civil Engineering Abstracts
  • Keywords: Actor-critic, Reinforcement learning, Linear programming, Microgrid, Energy management system, Power flow
  • Yıldız Technical University Affiliated: Yes


© 2022 Elsevier Ltd. This study combines a reinforcement learning agent and a myopic optimization model to improve real-time energy decisions in microgrids with renewable sources and energy storage devices. The reinforcement learning agent is built as an actor-critic agent that makes the aggregated near-optimal charging/discharging energy decisions for the microgrid energy storage devices from a discrete action space, relying on a reward related to the value of the microgrid's online optimal objective function. The energy levels of the storage devices at the next time step are then computed and passed as parameters to the myopic optimization-based decision-making model, which finds the resulting optimal power flow within the microgrid that minimizes the real-time microgrid energy cost. The real-time measurements of the microgrid's stochastic parameters, together with the current energy levels of the electrical and heat storage, are input to the agent as observation states. The actor-critic agent's approximators are modeled as deep neural networks optimized using the Adam gradient descent algorithm with a gradient threshold. Although training the proposed model with a 2-kWh increment of the charging/discharging energy is time-consuming, the model achieved a 100% rate of optimal microgrid energy decisions and improved online energy decisions by 90.98% compared to the myopic model alone.
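The decision loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the action set, the `Storage` class, all function names, and the prices are hypothetical, and the myopic linear program with power flow is replaced here by a single-bus energy balance purely to show how a discrete storage action, the next-step energy level, and a cost-based reward fit together. Only the 2-kWh increment is taken from the abstract.

```python
from dataclasses import dataclass

STEP_KWH = 2.0                 # 2-kWh charging/discharging increment (from the abstract)
ACTIONS = [-2, -1, 0, 1, 2]    # hypothetical discrete action set, in multiples of STEP_KWH


@dataclass
class Storage:
    """Hypothetical aggregated storage device."""
    level_kwh: float
    capacity_kwh: float


def next_level(storage: Storage, action_index: int) -> float:
    """Next-step energy level, clipped to the range [0, capacity]."""
    requested = ACTIONS[action_index] * STEP_KWH
    return min(max(storage.level_kwh + requested, 0.0), storage.capacity_kwh)


def myopic_cost(load_kwh, pv_kwh, delta_storage_kwh, buy_price, sell_price):
    """Stand-in for the myopic optimization model (an assumption, not the paper's LP).

    With a single bus and no network constraints, the grid exchange is fixed
    by the energy balance, so the cost follows directly.
    """
    net_import = load_kwh - pv_kwh + delta_storage_kwh
    if net_import >= 0:
        return net_import * buy_price   # energy bought from the grid
    return net_import * sell_price      # negative cost: revenue from exported energy


def step(storage, action_index, load_kwh, pv_kwh, buy_price, sell_price):
    """One environment step: apply the action, price it, return (new storage, reward)."""
    new_level = next_level(storage, action_index)
    delta = new_level - storage.level_kwh           # energy actually charged (+) or discharged (-)
    cost = myopic_cost(load_kwh, pv_kwh, delta, buy_price, sell_price)
    reward = -cost                                  # lower real-time cost means higher reward
    return Storage(new_level, storage.capacity_kwh), reward


if __name__ == "__main__":
    battery = Storage(level_kwh=4.0, capacity_kwh=10.0)
    battery, reward = step(battery, action_index=4, load_kwh=3.0, pv_kwh=1.0,
                           buy_price=0.2, sell_price=0.1)
    print(battery.level_kwh, reward)
```

In an actor-critic setup, the actor would choose `action_index` from the observation (measured load, renewable output, prices, and storage levels), and the critic would be trained on the returned reward; both networks are omitted here.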