
Description
Deep reinforcement learning (DRL) has received much attention and has found successful applications in important fields including games, robotics, transportation, and science. Despite this continuing success, DRL still faces several major challenges, including accurate value function estimation, sample efficiency, and efficient practical implementation. In this talk, we present our recent results on tackling these issues.

(i) Using the Boltzmann softmax operator to improve value function estimates in single-agent DRL. We show that properly incorporating the softmax operator in continuous control smooths the optimization landscape and enables efficient policy search and optimization. We then present the Softmax Deep Double Deterministic Policy Gradient (SD3) algorithm, which effectively mitigates both overestimation and underestimation bias and outperforms state-of-the-art methods.

(ii) Using regularization and softmax for efficient policy search in multi-agent RL (MARL). We first identify a gradient explosion issue in existing methods that severely degrades value function estimation. We then propose RES, a novel softmax- and regularization-based update scheme that penalizes large joint action values deviating from a baseline, and demonstrate its effectiveness in policy learning.

(iii) Applying DRL to sustainable computing applications. We develop highly scalable and efficient DRL algorithms for large-scale dockless bike sharing and network optimization problems that significantly outperform state-of-the-art methods.
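As a rough illustration of the Boltzmann softmax operator in item (i), the sketch below computes a temperature-weighted value estimate from a set of Q-value samples. The function name, the NumPy setting, and the 50-sample example are illustrative; SD3's actor-critic machinery and action-sampling scheme are omitted.

```python
import numpy as np

def boltzmann_softmax_value(q_values: np.ndarray, beta: float) -> float:
    """Boltzmann softmax operator over a batch of Q-value samples:

        softmax_beta(Q) = sum_i exp(beta * q_i) * q_i / sum_j exp(beta * q_j)

    beta -> 0 recovers the mean and beta -> inf the hard max, so the
    temperature trades off between averaged and greedy backups.
    """
    z = beta * (q_values - q_values.max())   # shift by max for numerical stability
    weights = np.exp(z) / np.exp(z).sum()
    return float(np.dot(weights, q_values))

# Q-value estimates at 50 actions sampled near the current policy (illustrative).
rng = np.random.default_rng(0)
q_samples = rng.normal(loc=1.0, scale=0.5, size=50)
print(boltzmann_softmax_value(q_samples, beta=0.0))   # ~= q_samples.mean()
print(boltzmann_softmax_value(q_samples, beta=50.0))  # ~= q_samples.max()
```

Because the operator is a smooth interpolation rather than a hard max, it avoids the sharp overestimation of max-based backups while keeping the target close to the greedy value, which is the intuition behind the bias reduction claimed for SD3.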
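For item (ii), the abstract describes RES as combining a softmax backup with a penalty on joint action values that deviate from a baseline. The following sketch is only a guess at that shape: the squared-hinge penalty, the baseline, and the hyperparameters beta and lam are assumptions, not the exact RES update.

```python
import numpy as np

def res_style_loss(q_sa: float, reward: float, gamma: float,
                   q_next: np.ndarray, baseline: float,
                   beta: float = 5.0, lam: float = 0.1) -> float:
    """Hypothetical regularized-softmax TD loss in the spirit of RES.

    The target uses a softmax backup over next-state joint-action values,
    and a squared-hinge penalty discourages the current joint action value
    from rising far above a baseline. Penalty form and hyperparameters
    are assumptions, not the published update.
    """
    z = beta * (q_next - q_next.max())              # stable softmax weights
    weights = np.exp(z) / np.exp(z).sum()
    target = reward + gamma * float(np.dot(weights, q_next))
    td_loss = (q_sa - target) ** 2                  # standard squared TD error
    penalty = lam * max(q_sa - baseline, 0.0) ** 2  # penalize values above baseline
    return td_loss + penalty

# Example: joint-action values at the next state and a running-average baseline.
q_next = np.array([0.8, 1.1, 0.9, 1.4])
print(res_style_loss(q_sa=1.6, reward=0.5, gamma=0.99, q_next=q_next, baseline=1.0))
```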
Presenters
Longbo Huang
ComSoc Member Price: $0.00
IEEE Member Price: $4.99
Non-Member Price: $9.99