
Journal of System Simulation

Abstract

Abstract: When an agent cannot perceive its surrounding environment, it cannot reliably avoid obstacles, so reinforcement learning generalizes poorly to robot motion planning in difficult terrain. To address this, a multimodal deep reinforcement learning approach that learns to fuse proprioceptive states with high-dimensional depth sensor inputs is proposed for the motion planning of unmanned vehicles. Specifically, proprioceptive states provide contact measurements for immediate reaction, while the onboard visual sensors allow the unmanned vehicle to learn to anticipate environmental changes and proactively steer around obstacles and uneven terrain many time steps in advance. TransProAct (transformer-based proactive action), a novel end-to-end multimodal Transformer fusion model, is proposed: proprioceptive states and visual data are fused through its self-attention mechanism, and the deep reinforcement learning algorithm PPO is then used to train the unmanned vehicle to learn motion planning on its own. In addition, multimodal delay randomization is introduced to bridge the gap between simulation and reality. Evaluated in challenging simulation environments with a variety of obstacles and uneven ground, the proposed approach achieves notable gains over the baseline and a marked improvement in generalization ability.
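As a rough illustration only (not the paper's implementation), the sketch below shows one way a self-attention fusion of proprioceptive and depth tokens of the kind the abstract describes could be written in PyTorch. The 48-dim proprioceptive state, 64×64 depth image, 12-dim action output, and all layer sizes are assumed for illustration; the resulting network would serve as the policy backbone inside a standard PPO training loop.

```python
# Minimal sketch (assumed dimensions, not the authors' code) of a
# transformer-based fusion of proprioceptive and depth observations.
import torch
import torch.nn as nn

class TransformerFusionPolicySketch(nn.Module):
    def __init__(self, proprio_dim=48, embed_dim=128, n_heads=4,
                 n_layers=2, action_dim=12):
        super().__init__()
        # Project the proprioceptive state vector into a single token.
        self.proprio_proj = nn.Linear(proprio_dim, embed_dim)
        # Split the depth image into patch tokens with a strided convolution
        # (64x64 input -> 4x4 = 16 patch tokens).
        self.depth_patch = nn.Conv2d(1, embed_dim, kernel_size=16, stride=16)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, batch_first=True)
        self.fusion = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Read the action from the fused proprioceptive token.
        self.action_head = nn.Linear(embed_dim, action_dim)

    def forward(self, proprio, depth):
        # proprio: (B, proprio_dim), depth: (B, 1, 64, 64)
        p_tok = self.proprio_proj(proprio).unsqueeze(1)             # (B, 1, D)
        d_tok = self.depth_patch(depth).flatten(2).transpose(1, 2)  # (B, 16, D)
        tokens = torch.cat([p_tok, d_tok], dim=1)                   # (B, 17, D)
        fused = self.fusion(tokens)       # self-attention across both modalities
        return self.action_head(fused[:, 0])  # action mean for a PPO policy

# Example usage with random inputs.
model = TransformerFusionPolicySketch()
action = model(torch.randn(2, 48), torch.randn(2, 1, 64, 64))
print(action.shape)  # torch.Size([2, 12])
```

In such a setup, sim-to-real delay randomization would be applied to the observations before they reach the network, e.g. by feeding each modality a sample drawn from a short buffer of recent frames rather than the latest one.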

First Page

2631

Last Page

2643

CLC

TP242.6

Recommended Citation

Ding Kaiyuan, Askar Hamdulla, Zhu Bin, et al. End-to-end Motion Planning of Unmanned Vehicles Based on Multimodal Deep Reinforcement Learning[J]. Journal of System Simulation, 2024, 36(11): 2631-2643.

Corresponding Author

Askar Hamdulla

DOI

10.16182/j.issn1004731x.joss.23-0939
