Journal of System Simulation

Abstract

To improve semantic discrimination in point cloud semantic segmentation, this paper proposes PL-Mamba, a 3D point cloud semantic segmentation network built on the fusion of two modalities: point cloud (P) and language (L). The method uses PointMamba as the backbone network to exploit its long-sequence modeling and global perception capabilities. It introduces a language prompt mechanism in which the pretrained language model BERT encodes the context of category labels to obtain semantically rich text features. These text features serve as language-guided tokens and are fused with the point cloud features through a cross-modal attention mechanism, achieving semantic alignment and region enhancement and alleviating the weak semantic expressiveness and severe category confusion of point cloud features alone. Experiments on the large-scale indoor point cloud segmentation dataset ScanNet show that PL-Mamba achieves 78.21% mIoU, which is 0.21 percentage points higher than the baseline BFANet (78.00%) and also higher than Mamba-based models such as FEAST-Mamba (77.80%).
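The fusion step described in the abstract can be illustrated with a minimal sketch of cross-modal attention, in which per-point features act as queries and per-category text features act as keys and values. This is not the authors' implementation: the random arrays stand in for PointMamba features and BERT label embeddings, and the projection matrices, the residual connection, the dimensions, and the function names are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(point_feats, text_feats, w_q, w_k, w_v):
    """Fuse text features into point features (hypothetical helper).

    point_feats: (N, d) per-point features (stand-in for backbone output).
    text_feats:  (C, d) per-category text features (stand-in for BERT output).
    Returns the fused (N, d) features and the (N, C) attention weights.
    """
    q = point_feats @ w_q                      # queries from points, (N, d)
    k = text_feats @ w_k                       # keys from text, (C, d)
    v = text_feats @ w_v                       # values from text, (C, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot products, (N, C)
    attn = softmax(scores, axis=-1)            # each point attends over categories
    fused = point_feats + attn @ v             # assumed residual connection
    return fused, attn

# Toy sizes chosen only for the example: 6 points, 4 categories, 8 channels.
rng = np.random.default_rng(0)
N, C, d = 6, 4, 8
point_feats = rng.standard_normal((N, d))
text_feats = rng.standard_normal((C, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

fused, attn = cross_modal_attention(point_feats, text_feats, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

Each row of `attn` is a distribution over category labels for one point, which is one way to read the "semantic alignment" the abstract mentions; the actual network may combine the modalities differently.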

First Page

73

Last Page

83

CLC

TP391.41

Recommended Citation

Zhu He, Zhou Feng, Zhang Qi, et al. PL-Mamba: A 3D Point Cloud Semantic Segmentation Network Based on Bimodal Fusion[J]. Journal of System Simulation, 2026, 38(1): 73-83.

Corresponding Author

Zhou Feng

DOI

10.16182/j.issn1004731x.joss.25-0858
