Journal of System Simulation
Abstract
To improve the accuracy, controllability, and realism of text-driven human motion generation, a method is proposed that integrates fine-grained textual semantics with spatial control signals. Within the diffusion model framework, both global text tokens and body-part-level local tokens are introduced. Both are encoded with CLIP, and the resulting features are fed into the motion diffusion model to enable fine control over individual body parts. Spatial guidance dynamically adjusts joint positions during the diffusion denoising process so that the generated motion adheres to the spatial constraints. Realism guidance is incorporated to improve the naturalness and overall coordination of the uncontrolled joints. For the experiments on the HumanML3D dataset, 44,970 text samples were rewritten at a fine-grained level using ChatGPT-4o to improve semantic alignment between text and motion. Results show that the proposed method outperforms existing approaches in motion semantic consistency, spatial control accuracy, and generation quality, producing human motions that meet user expectations in both semantic alignment and motion quality.
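As a rough illustration of the spatial guidance idea described in the abstract, the sketch below nudges the controlled joints of a motion estimate toward target positions with one gradient step on a squared-distance loss. It is a minimal sketch under stated assumptions, not the authors' formulation: the function names, the `(frames, joints, 3)` array layout, the loss, and the `step_size` value are all hypothetical, and a real sampler would apply such a correction inside each denoising step.

```python
import numpy as np

def spatial_guidance_step(motion, targets, mask, step_size=0.5):
    """Move controlled joints one gradient step toward their targets.

    motion:  (frames, joints, 3) current estimate of joint positions
    targets: (frames, joints, 3) desired positions (ignored where mask is False)
    mask:    (frames, joints) boolean, True for joints under spatial control
    All names and shapes are illustrative assumptions.
    """
    # Gradient of 0.5 * ||motion - targets||^2, restricted to controlled joints.
    grad = (motion - targets) * mask[..., None]
    return motion - step_size * grad

def control_error(motion, targets, mask):
    """Mean Euclidean distance between controlled joints and their targets."""
    dist = np.linalg.norm(motion - targets, axis=-1)
    return float((dist * mask).sum() / max(mask.sum(), 1))
```

In this sketch the uncontrolled joints are left untouched by the guidance step; in the method summarized above, their naturalness is the concern of the separate realism guidance.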
Recommended Citation
Jiang, Binze; Song, Wenfeng; Hou, Xia; and Li, Shuai (2026) "Diffusion Model for Human Motion Generation with Fine-grained Text and Spatial Control Signals," Journal of System Simulation: Vol. 38: Iss. 1, Article 11.
DOI: 10.16182/j.issn1004731x.joss.25-0832
Available at: https://dc-china-simulation.researchcommons.org/journal/vol38/iss1/11
First Page
136
Last Page
157
CLC
TP391.9
Recommended Citation
Jiang Binze, Song Wenfeng, Hou Xia, et al. Diffusion Model for Human Motion Generation with Fine-grained Text and Spatial Control Signals[J]. Journal of System Simulation, 2026, 38(1): 136-157.
DOI
10.16182/j.issn1004731x.joss.25-0832
Included in
Artificial Intelligence and Robotics Commons, Computer Engineering Commons, Numerical Analysis and Scientific Computing Commons, Operations Research, Systems Engineering and Industrial Engineering Commons, Systems Science Commons