GitHub - XuZhang1211/PVPUFormer: Implementation of "VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation"

VPUFormer: Visual Prompt Unified Transformer for Interactive Image Segmentation

Xu Zhang · Kailun Yang · Jiacheng Lin · Jin Yuan · Zhiyong Li · Shutao Li

Paper

🛠️ 👷 🚀

🔥 We will release code and checkpoints in the future. 🔥

Update

2023.06.08 Init repository.
2023.06.11 Release the arXiv version.

TODO List

Code release.

Abstract

Integrating diverse visual prompts such as clicks, scribbles, and boxes into interactive image segmentation can significantly facilitate user interaction and improve interaction efficiency. Most existing studies handle a single type of visual prompt by simply concatenating prompts and images as input for segmentation prediction, which leads to inefficient prompt representation and weak interaction. This paper proposes a simple yet effective Visual Prompt Unified Transformer (VPUFormer), which introduces a concise unified prompt representation with deeper interaction to boost segmentation performance. Specifically, we design a Prompt-unified Encoder (PuE) that uses Gaussian mapping to generate a unified one-dimensional vector for click, box, and scribble prompts; this vector captures users' intentions and provides a denser representation of user prompts. In addition, we present a Prompt-to-Pixel Contrastive Loss (P$^2$CL) that leverages user feedback to gradually refine candidate semantic features. On this basis, our approach injects prompt representations as queries into Dual-cross Merging Attention (DMA) blocks to perform a deeper interaction between image and query inputs.
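
Since the code is not yet released, the following is only a minimal sketch of what "Gaussian mapping to a unified one-dimensional vector" could look like. It is not the authors' PuE. Every name here (`gaussian_map`, `box_to_points`, `encode_prompt`), the choice to represent a box by two opposite corners, the `sigma` value, and the row-by-row flattening are assumptions made for illustration.

```python
import math


def gaussian_map(points, height, width, sigma=2.0):
    """Max-of-Gaussians heatmap: each pixel takes the response of its nearest prompt point."""
    heat = [[0.0] * width for _ in range(height)]
    for py, px in points:
        for y in range(height):
            for x in range(width):
                d2 = (y - py) ** 2 + (x - px) ** 2
                heat[y][x] = max(heat[y][x], math.exp(-d2 / (2.0 * sigma ** 2)))
    return heat


def box_to_points(y0, x0, y1, x1):
    """Hypothetical choice: represent a box by its two opposite corners."""
    return [(y0, x0), (y1, x1)]


def encode_prompt(points, height, width, sigma=2.0):
    """Flatten the heatmap row by row into a one-dimensional vector."""
    heat = gaussian_map(points, height, width, sigma)
    return [value for row in heat for value in row]


# All three prompt types reduce to a list of (y, x) points, so they share one encoder.
click = encode_prompt([(2, 3)], height=8, width=8)
box = encode_prompt(box_to_points(1, 1, 6, 6), height=8, width=8)
scribble = encode_prompt([(4, x) for x in range(2, 6)], height=8, width=8)
```

In this sketch the three prompt types differ only in how they are turned into points. The resulting vectors have the same length, which is one plausible reading of a "unified" representation.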

VPUFormer model

Results

Contact

Feel free to contact me if you have additional questions or are interested in collaboration. Please email me at xuzhang1211@hnu.edu.cn. =)

Name	Last commit message	Last commit date
assets	feat: init repository	Jun 13, 2023
LICENSE	Initial commit	Jun 8, 2023
README.md	feat: init repository	Jun 13, 2023

Latest commit: XuZhang1211, feat: init repository, 46c310c · Jun 13, 2023 (7 commits)
