😎 Awesome lists of papers and codes about open-vocabulary perception, including both 3D and 2D

yangcaoai/Awesome-Open-Vocabulary-Perception

Awesome-Open-Vocabulary-Perception

Papers and code for open-vocabulary perception (3D and 2D). 😎

This repo mainly focuses on open-vocabulary perception tasks (both 3D and 2D). To recommend a paper, please open a pull request or email me at yangcao.cs@gmail.com.

If you are interested in related tasks, you can reach me on Discord (yangcao#9724) or WeChat (85298328912).

3D

Open-Vocabulary 3D Object Detection

  1. [CoDAv2] Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection, arXiv 2024. [Code]
  2. [CoDA] Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection, NeurIPS2023. [Code]
  3. [OV-3DET] Open-Vocabulary Point-Cloud Object Detection without 3D Annotation, CVPR2023. [Code]
  4. [FM-OV3D] FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection, AAAI2024. [Code]

Open-Vocabulary 3D Segmentation

  1. [OpenMask3D] OpenMask3D: Open-Vocabulary 3D Instance Segmentation, NeurIPS2023. [Code]
  2. [OpenScene] OpenScene: 3D Scene Understanding with Open Vocabularies, CVPR2023. [Code]
  3. [3D-OVS] Weakly Supervised 3D Open-vocabulary Segmentation, NeurIPS2023. [Code]
  4. [PLA] PLA: Language-Driven Open-Vocabulary 3D Scene Understanding, CVPR2023. [Code]
  5. [Open3DIS] Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance, CVPR2024. [Code]
  6. [MaskClustering] MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation, CVPR2024. [Code]
  7. [LEGaussians] LEGaussians: Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding, CVPR2024. [Code]

2D

Open-Vocabulary 2D Object Detection

  1. [DetCLIP] DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection, NeurIPS2022.
  2. [DetCLIPv2] DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment, CVPR2023.
  3. [DetCLIPv3] DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection, CVPR2024.
  4. [YOLO-World] YOLO-World: Real-Time Open-Vocabulary Object Detection, CVPR2024. [Code]

Open-Vocabulary 2D Segmentation

  1. [ODISE] Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models, CVPR2023 Highlight. [Code]
  2. [FreeDA] Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation, CVPR2024. [Code]
  3. [OVAM] Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models, CVPR2024. [Code]
  4. [PnP-OVSS] Plug-and-Play, Dense-Label-Free Extraction of Open-Vocabulary Semantic Segmentation from Vision-Language Models, CVPR2024. [Code]
  5. [OVFoodSeg] OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation, CVPR2024.
  6. [SED] SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation, CVPR2024.
