Novel Part Segmentation in Point Cloud with Adaptive Prompts

Mengyuan Liu1, Zhongbin Fang2, 4, Xia Li3, Joachim M. Buhmann3, Deheng Ye4,
Xiangtai Li5, Chen Change Loy5
1National Key Laboratory of General Artificial Intelligence, Shenzhen Graduate School, Peking University
2Sun Yat-sen University  3Department of Computer Science, ETH Zurich  4Tencent Inc.  5S-Lab, Nanyang Technological University 
Teaser Image

Illustration of our proposed Adaptive Prompting Model (APM). (a) Our proposed APM for multiple part segmentation datasets in 3D point clouds. (b) Our APM achieves state-of-the-art results on diverse part segmentation tasks after a single unified training. (c) Our APM can generalize to novel part segmentation tasks never seen before.

Overview.

Our main contributions are summarized as follows:

  • Within the 3D prompting framework, we propose the Adaptive Prompting Model (APM). Specifically, we propose Adaptive Labeling and Prompt Enhancing to improve performance and generalization capability across diverse 3D point cloud part segmentation tasks after a single unified training. Furthermore, our APM can seamlessly integrate additional segmentation datasets without redundant label points.
  • We establish the Multi-Entity Segmentation datasets, which comprise four available point cloud datasets on human and object segmentation: ShapeNetPart, Human3D, BEHAVE, and AKB-48. Our goal is to fully evaluate the performance of models trained jointly on multiple segmentation datasets, as well as their generalization to unseen datasets.
  • We conduct extensive experiments on the Multi-Entity Segmentation datasets to validate our APM framework. Compared to other models, our APM achieves SOTA performance. Furthermore, we show that APM can produce excellent results on novel part segmentation datasets or customized part segmentation tasks, which makes our work more applicable in real-world scenarios.
    Abstract

    Novel part segmentation in 3D point clouds involves segmenting parts not included in the training dataset or arising from unseen combinations of parts. In this paper, we attempt to address novel part segmentation on unseen datasets or tasks by using prompting technology, owing to its efficient model adaptation and domain-generalization capability. Recent works explore prompting in 3D point cloud segmentation by associating label points with fixed XYZ coordinates for each category. However, using fixed label points limits the model's ability to generalize to new datasets and introduces redundancy in label points as datasets accumulate, hindering model performance. To overcome these limitations, we propose the Adaptive Prompting Model (APM) based on a vanilla transformer architecture. Our APM comprises two core training strategies: Adaptive Labeling (AL) and Prompt Enhancing (PE). Adaptive Labeling replaces fixed label point coordinates with dynamic label points, aligning the semantics of shared parts by randomly assigning label points within each point cloud. Prompt Enhancing employs various corruption operations to provide the model with more diverse point cloud pairs, enabling it to learn more robust mapping relationships from the prompt pairs. Notably, our APM can seamlessly integrate new part segmentation datasets without causing label redundancy; it requires only the necessary prompts. Furthermore, our APM requires only a single training stage and no fine-tuning. Despite this, extensive experiments demonstrate that APM significantly outperforms other models in one-shot testing on the novel segmentation dataset, and it can execute novel part segmentation tasks through customized prompts. Remarkably, our APM achieves state-of-the-art performance in multi-dataset joint training.
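The Adaptive Labeling strategy described above can be sketched in a few lines: instead of a fixed global label map, each training sample draws fresh random label coordinates for its parts, so every point of a given part maps to the same (per-sample) label point. This is a minimal illustrative sketch; the actual sampling range and pairing logic in APM may differ.

```python
import numpy as np

def adaptive_labeling(part_ids, num_parts, rng):
    """Assign a random 3D label point to each part in this sample.

    part_ids : (N,) int array, per-point part index for one point cloud.
    num_parts: number of distinct parts in the sample.
    Returns an (N, 3) target cloud where every point of part k is mapped
    to the same randomly drawn label coordinate for this sample.
    """
    # Re-sampled every iteration, so no fixed global label map is ever
    # memorized (hypothetical sketch of Adaptive Labeling; the paper's
    # actual coordinate range may differ).
    label_points = rng.uniform(-1.0, 1.0, size=(num_parts, 3))
    return label_points[part_ids]

rng = np.random.default_rng(0)
part_ids = np.array([0, 0, 1, 2, 1])
targets = adaptive_labeling(part_ids, num_parts=3, rng=rng)
```

Because shared parts in the input and prompt pair would be assigned the same random label point, the semantics of shared parts stay aligned without any dataset-specific label table.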

    Model Architecture

    Teaser Image
    Overall scheme of our Adaptive Prompting Model (APM). Top: Training pipeline of the Masked Point Modeling (MPM) framework with the Adaptive Labeling and Prompt Enhancing strategies. During training, each sample comprises two pairs of input and target point clouds that tackle the same task. These pairs are fed into the transformer model to perform the masked point reconstruction task, which follows a random masking process. Bottom: Inference on the Multi-Entity Segmentation Datasets. Our APM can infer results across various part segmentation datasets, including ShapeNetPart, Human3D, BEHAVE, and AKB-48.
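The random masking step in the MPM pipeline above can be sketched as follows: a fixed fraction of point-patch tokens is hidden, and the transformer is trained to reconstruct them. The masking ratio and patching scheme here are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def random_mask(num_tokens, mask_ratio, rng):
    """Sample a boolean mask over point-patch tokens for MPM training.

    True = masked (to be reconstructed), False = visible to the model.
    """
    num_masked = int(num_tokens * mask_ratio)
    perm = rng.permutation(num_tokens)      # random token order
    mask = np.zeros(num_tokens, dtype=bool)
    mask[perm[:num_masked]] = True          # hide the first num_masked
    return mask

rng = np.random.default_rng(42)
mask = random_mask(64, mask_ratio=0.75, rng=rng)
```

The reconstruction loss would then be computed only on the masked positions, which is the standard masked-modeling recipe this sketch assumes.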


    Features

    New benchmark for 3D point cloud multi-dataset novel part segmentation
    • A new multi-dataset joint training benchmark comprising four available point cloud datasets on human and object segmentation, including ShapeNetPart, Human3D, BEHAVE, and AKB-48.
    Superior segmentation performance and stronger generalization capability
    • APM achieves SOTA results on the multi-entity novel part segmentation benchmark. Compared to PIC, APM is more adept at integrating multiple datasets in segmentation tasks.
    • APM achieves SOTA results in one-shot testing on the out-of-domain dataset (AKB-48), which is not included in the training set.
    • APM can accurately generate unique part segmentation results via customized prompts.


    Adaptive Labeling & Prompt Enhancing

    Teaser Image
    (a) Comparison of target generation between Adaptive Labeling (our APM) and the pre-defined label map (PIC). P represents the total number of parts (P·N_B > |C_i|). (b) Illustration of Prompt Enhancing. We first randomly apply common corruptions to a clean point cloud and construct in-context pairs. Then we train our APM on the combined training set of the Multi-Entity Segmentation Datasets and the constructed prompt pairs.
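The Prompt Enhancing construction can be sketched as below: a clean point cloud is corrupted by one of several common operations, and the (corrupted, clean) pair serves as an extra in-context training pair. The specific corruption set here (Gaussian jitter, random dropout, z-axis rotation) is an illustrative assumption; the paper may use different operations.

```python
import numpy as np

def corrupt(points, rng):
    """Apply one randomly chosen common corruption to a clean cloud."""
    op = rng.choice(["jitter", "dropout", "rotate"])
    if op == "jitter":
        # Add small Gaussian noise to every coordinate.
        return points + rng.normal(0.0, 0.01, size=points.shape)
    if op == "dropout":
        # Randomly drop roughly 30% of the points.
        keep = rng.random(len(points)) > 0.3
        return points[keep]
    # Rotate the cloud about the z-axis by a random angle.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

def make_prompt_pair(clean, rng):
    """Build one in-context (input, target) pair for Prompt Enhancing:
    the corrupted cloud is the input, the clean cloud is the target."""
    return corrupt(clean, rng), clean

rng = np.random.default_rng(1)
clean = rng.uniform(-1.0, 1.0, size=(1024, 3))
inp, tgt = make_prompt_pair(clean, rng)
```

Mixing such pairs into the combined training set exposes the model to more diverse input-target mappings, which is what the diversity claim in the caption above refers to.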


    Visualization


    Visualization of APM
    Visualization of predictions obtained by our APM and their corresponding ground truth on the Multi-Entity Segmentation Datasets
    Teaser Image

    Comparison with PIC-G
    Visualization of comparison results between two versions of PIC: PIC-S (extended version) and PIC-G (vanilla version).
    Teaser Image

    Generalization of APM
    We use customized prompts to guide the model to perform specified part segmentation. The red boxes indicate the output of the APM.
    Teaser Image

    BibTeX

    If you find our work useful in your research, please consider citing:
    
          @article{liu2024pointincontext,
            title={Point-In-Context: Understanding Point Cloud via In-Context Learning},
            author={Liu, Mengyuan and Fang, Zhongbin and Li, Xia and Buhmann, Joachim M and Li, Xiangtai and Loy, Chen Change},
            journal={arXiv preprint arXiv:2401.08210},
            year={2024}
          }
          @article{fang2024explore,
            title={Explore in-context learning for 3d point cloud understanding},
            author={Fang, Zhongbin and Li, Xiangtai and Li, Xia and Buhmann, Joachim M and Loy, Chen Change and Liu, Mengyuan},
            journal={Advances in Neural Information Processing Systems},
            volume={36},
            year={2024}
          }