Open-Pose 3D zero-shot learning: Benchmark and challenges
Zhao, Weiguang; Yang, Guanyu; Zhang, Rui; Jiang, Chenru; Yang, Chaolong; Yan, Yuyao; Hussain, Amir; Huang, Kaizhu
Abstract
With the explosive growth of 3D data, the urgency of leveraging zero-shot learning to facilitate data labeling becomes evident. Recently, methods that transfer language or language-image pre-training models such as Contrastive Language-Image Pre-training (CLIP) to 3D vision have made significant progress on the 3D zero-shot classification task. These methods primarily focus on 3D object classification with an aligned pose; such a setting is, however, rather restrictive, as it overlooks the recognition of 3D objects with open poses typically encountered in real-world scenarios, such as an overturned chair or a lying teddy bear. To this end, we propose a more realistic and challenging scenario named open-pose 3D zero-shot classification, which focuses on recognizing 3D objects regardless of their orientation. First, we revisit the current research on 3D zero-shot classification and propose two benchmark datasets specifically designed for the open-pose setting. We empirically validate many of the most popular methods on the proposed open-pose benchmark. Our investigations reveal that most current 3D zero-shot classification models suffer from poor performance, indicating substantial room for exploration in this new direction. Furthermore, we study a concise pipeline with an iterative angle refinement mechanism that automatically optimizes an ideal viewing angle for classifying these open-pose 3D objects. In particular, to make the validation more compelling and not limited to existing CLIP-based methods, we also pioneer the exploration of knowledge transfer based on diffusion models. While the proposed solutions can serve as a new benchmark for open-pose 3D zero-shot classification, we discuss the complexities and challenges of this scenario that remain for further research. The code is publicly available at https://github.com/weiguangzhao/Diff-OP3D.
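To make the rendered-view matching idea in the abstract concrete, below is a minimal sketch of CLIP-based zero-shot classification over rendered views with a simple coarse-to-fine angle search standing in for an iterative angle refinement mechanism. This is not the authors' implementation (see the linked GitHub repository for the official code): `render_view`, the example `class_names`, the prompt template, and the search schedule are all illustrative assumptions.

```python
# Sketch: zero-shot classification of an arbitrarily posed 3D object by
# rendering views, scoring them against CLIP text prompts, and refining the
# viewing angle.  Assumes the OpenAI `clip` package and PyTorch are installed.
import torch
import clip                      # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

class_names = ["chair", "teddy bear", "table", "airplane"]   # example labels only
text_tokens = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)
with torch.no_grad():
    text_feats = model.encode_text(text_tokens)
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)

def render_view(obj, azimuth_deg, elevation_deg) -> Image.Image:
    """Hypothetical renderer: return an RGB image of `obj` seen from the given
    angles (e.g. implemented with PyTorch3D or Open3D offscreen rendering)."""
    raise NotImplementedError("plug in a mesh/point-cloud renderer here")

def class_scores(obj, azimuth_deg, elevation_deg) -> torch.Tensor:
    """Cosine similarity between one rendered view and every class prompt."""
    image = preprocess(render_view(obj, azimuth_deg, elevation_deg)).unsqueeze(0).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(image)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    return (img_feat @ text_feats.T).squeeze(0)

def classify_open_pose(obj, rounds=3):
    """Coarse-to-fine search for the viewing angle whose best class score is highest."""
    best = (0.0, 0.0)                      # (azimuth, elevation) in degrees
    step = 45.0
    for _ in range(rounds):
        candidates = [(best[0] + da, best[1] + de)
                      for da in (-step, 0.0, step)
                      for de in (-step, 0.0, step)]
        scored = [(class_scores(obj, az, el).max().item(), (az, el))
                  for az, el in candidates]
        _, best = max(scored)              # keep the most confident view so far
        step /= 2.0                        # shrink the search window each round
    final = class_scores(obj, *best)
    return class_names[int(final.argmax())], best
```

The design choice here is that the viewing angle is treated as a latent variable and the class prediction is read off the most confident view, which is one plausible reading of "optimizing an ideal angle" for open-pose objects; the paper's actual refinement and its diffusion-based variant may differ.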
Citation
Zhao, W., Yang, G., Zhang, R., Jiang, C., Yang, C., Yan, Y., Hussain, A., & Huang, K. (2025). Open-Pose 3D zero-shot learning: Benchmark and challenges. Neural Networks, 181, Article 106775. https://doi.org/10.1016/j.neunet.2024.106775
| Field | Value |
| --- | --- |
| Journal Article Type | Article |
| Acceptance Date | Sep 29, 2024 |
| Online Publication Date | Oct 9, 2024 |
| Publication Date | 2025-01 |
| Deposit Date | Nov 6, 2024 |
| Journal | Neural Networks |
| Print ISSN | 0893-6080 |
| Publisher | Elsevier |
| Peer Reviewed | Peer Reviewed |
| Volume | 181 |
| Article Number | 106775 |
| DOI | https://doi.org/10.1016/j.neunet.2024.106775 |
| Keywords | Zero-shot, 3D classification, Open-pose, Text–image matching |