With the emergence of large-scale models trained on diverse datasets, in-context learning has emerged as a promising paradigm for multitasking, notably in natural language processing and image processing. However, its application in 3D point cloud tasks remains largely unexplored. In this work, we introduce Point-In-Context (PIC), a novel framework for 3D point cloud understanding via in-context learning. We address the technical challenge of extending masked point modeling to 3D point clouds effectively by introducing a Joint Sampling module and propose a vanilla version of PIC called Point-In-Context-Generalist (PIC-G). PIC-G is designed as a generalist model for various 3D point cloud tasks, with both inputs and outputs modeled as coordinates. In this paradigm, the challenging segmentation task is achieved by assigning coordinates for each category; thus, the closest to the predictions is chosen as the final prediction.
To break the limitation by the fixed label-coordinate assignment, which has poor generalization upon novel classes, we propose two novel training strategies, In-Context Labeling and In-Context Enhancing, forming an extended version of PIC named Point-In-Context-Segmenter (PIC-S), targeting improving dynamic context labeling and model training. By utilizing dynamic in-context labels and extra in-context pairs, PIC-S achieves enhanced performance and generalization capability in and across part segmentation datasets. It is worth noting that PIC is a general framework so that other tasks or datasets can be seamlessly introduced into our PIC through a unified data format. We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks and segmenting multi-datasets. Our PIC-S is especially capable of generalizing unseen datasets and performing novel part segmentation by customizing prompts.
@article{liu2024pointincontext,
title={Point-In-Context: Understanding Point Cloud via In-Context Learning},
author={Liu, Mengyuan and Fang, Zhongbin and Li, Xia and Buhmann, Joachim M and Li, Xiangtai and Loy, Chen Change},
journal={arXiv preprint arXiv:2401.08210},
year={2024}
}
@article{fang2024explore,
title={Explore in-context learning for 3d point cloud understanding},
author={Fang, Zhongbin and Li, Xiangtai and Li, Xia and Buhmann, Joachim M and Loy, Chen Change and Liu, Mengyuan},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}