MSPF-LMFF: Category-Level 6D Object Pose Estimation via Multi-Scale Prior Point Cloud Fusion and Lightweight Multi-Feature Fusion

Authors

DOI:

https://doi.org/10.5755/j01.itc.54.4.40202

Keywords:

Feature fusion, fused point cloud, lightweight model, multi-scale feature fusion, lightweight multi-feature fusion.

Abstract

Object pose estimation is a critical task in machine vision. Existing pose estimation methods often have large parameter counts, complex architectures, and high computational costs, which limit their applicability in real-world scenarios. To address these issues, we propose a novel category-level object pose estimation model, named MSPF-LMFF. The model does not rely on attention mechanisms or precise 3D models, significantly reduces computational complexity, and improves pose estimation accuracy, demonstrating superior performance on both real and synthetic datasets. Specifically, the MSPF module enriches the prior point cloud features by integrating them with multi-scale image texture features, bringing the fused point cloud closer to the target object point cloud. Subsequently, the LMFF module combines the geometric features of the fused point cloud, depth image features, and the geometric features of the target object point cloud to enhance the robustness of the model. At the same time, this module fuses adaptive point cloud features with the target object’s geometric features to improve the reliability of shape information, thereby enhancing the model’s generalization across different instances of the same category. Following this, a multi-layer perceptron (MLP) generates deformation and mapping matrices to reconstruct the target object’s normalized object coordinate space (NOCS) model. Finally, based on the NOCS model, the point cloud registration module computes the target object’s 6D pose and 3D dimensions. Experimental results demonstrate that MSPF-LMFF outperforms existing methods on the NOCS-REAL and NOCS-CAMERA datasets while significantly reducing parameter count and training time. Moreover, the proposed model generalizes well to the Wild 6D dataset, further validating its effectiveness.
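The last two steps of the abstract (reconstructing the NOCS model from deformation and mapping matrices, then registering it to the observed points) can be illustrated with a minimal sketch. The abstract does not name the registration algorithm or give the exact form of the matrices, so everything below is an assumption: the reconstruction follows the common shape-prior formulation (mapping matrix applied to the deformed prior), the registration uses the Umeyama similarity alignment often paired with NOCS predictions, and the function names `reconstruct_nocs`, `umeyama_similarity`, and `object_dimensions` are hypothetical.

```python
import numpy as np


def reconstruct_nocs(prior, deformation, mapping):
    """Assumed formulation: NOCS coordinates = mapping @ (prior + deformation).

    prior:       (M, 3) categorical prior point cloud
    deformation: (M, 3) per-point offsets predicted by the network
    mapping:     (N, M) soft assignment from observed points to prior points
    returns:     (N, 3) NOCS coordinates, one per observed point
    """
    return mapping @ (prior + deformation)


def umeyama_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t.

    src, dst: (N, 3) corresponding points (here: NOCS coordinates and
    observed camera-frame points).
    """
    n = src.shape[0]
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.ones(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        d[-1] = -1.0

    R = U @ np.diag(d) @ Vt
    var_src = (src_c ** 2).sum() / n
    s = (S * d).sum() / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t


def object_dimensions(nocs, scale):
    """Assumed size estimate: axis-aligned extent of the NOCS model times scale."""
    return scale * (nocs.max(axis=0) - nocs.min(axis=0))
```

Under these assumptions, `R` and `t` give the 6D pose and `object_dimensions(nocs, s)` gives the 3D size. In practice the correspondences are noisy, so a robust wrapper such as RANSAC around the alignment is a common addition; the sketch omits it for brevity.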

Published

2025-12-19

Issue

Section

Articles