Journal of System Simulation

Abstract

Deep reinforcement learning (DRL) has achieved remarkable success in various domains. Nevertheless, existing policy networks in DRL still face significant challenges in generalizability, multi-task adaptability, and sample efficiency. Policy representation, a crucial research direction for enhancing DRL capabilities, aims to improve an agent's adaptability to environmental changes and novel tasks by constructing more efficient and generalizable forms of policy expression. This paper provides a concise overview of key research advances in policy representation. It first introduces diverse policy architectures, ranging from traditional multi-layer perceptron (MLP)-based policies to those based on pointer networks, sequence generation models, diffusion models, hypernetworks, modular designs, mixture-of-experts models, and cross-modal policies based on serialized tokens. It then reviews recent research on policy representation methods, focusing on how semantic information in policy inputs and intermediate representations is encoded and optimized. It concludes with a summary and a discussion of prospects for future development.

First Page

1753

Last Page

1769

CLC

TP391.9

Recommended Citation

Chen Zhen, Wu Zhuoyi, Zhang Lin. Research on Policy Representation in Deep Reinforcement Learning[J]. Journal of System Simulation, 2025, 37(7): 1753-1769

Corresponding Author

Zhang Lin

DOI

10.16182/j.issn1004731x.joss.25-0533
