We propose FlexEdit, an end-to-end image editing method that leverages both free-shape masks and language instructions for Flexible Editing.

A-new-b/flex_edit

FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing

The following image shows the architecture of the FlexEdit framework. FlexEdit integrates visual prompts and human instructions for complex image editing. It uses a VLLM backbone for multi-modal instruction understanding, a Q-Former for refining the hidden states, and a Mask Enhanced Adapter (MEA) for merging image and language model outputs. A diffusion model then generates the final image.

[image: FlexEdit architecture]
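Since the official code is not yet released, the stage order described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names (`vllm_backbone`, `q_former`, `mask_enhanced_adapter`, `diffusion_model`, `flex_edit`), the list-based data types, and the toy arithmetic inside each stage are hypothetical stand-ins, not the FlexEdit implementation. In the real framework each stage is a neural network; here each one is a few lines of plain Python so that only the data flow is visible.

```python
from typing import List

Image = List[List[float]]   # toy grayscale image, values in [0, 1]
Mask = List[List[int]]      # free-shape binary mask, 1 = region to edit
Vector = List[float]


def vllm_backbone(instruction: str, mask: Mask) -> List[Vector]:
    """Toy stand-in for the VLLM: one small 'hidden state' per instruction word."""
    total = sum(len(row) for row in mask)
    covered = sum(sum(row) for row in mask) / total
    return [[float(len(word)), covered] for word in instruction.split()]


def q_former(hidden_states: List[Vector]) -> Vector:
    """Toy stand-in for the Q-Former: mean-pool the hidden states into one query."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]


def mask_enhanced_adapter(image: Image, mask: Mask, query: Vector) -> Vector:
    """Toy stand-in for the MEA: join mask-aware image statistics with the query."""
    pairs = [(p, m) for row, mrow in zip(image, mask) for p, m in zip(row, mrow)]
    inside = [p for p, m in pairs if m]
    outside = [p for p, m in pairs if not m]
    mean_in = sum(inside) / len(inside) if inside else 0.0
    mean_out = sum(outside) / len(outside) if outside else 0.0
    return [mean_in, mean_out] + query


def diffusion_model(image: Image, mask: Mask, conditioning: Vector) -> Image:
    """Toy stand-in for the diffusion model: repaint only the masked pixels."""
    fill = max(0.0, min(1.0, sum(conditioning) / len(conditioning)))
    return [[fill if m else p for p, m in zip(row, mrow)]
            for row, mrow in zip(image, mask)]


def flex_edit(image: Image, mask: Mask, instruction: str) -> Image:
    """Chain the four stages in the order the README describes."""
    hidden = vllm_backbone(instruction, mask)
    query = q_former(hidden)
    conditioning = mask_enhanced_adapter(image, mask, query)
    return diffusion_model(image, mask, conditioning)


image = [[0.1, 0.2], [0.3, 0.4]]
mask = [[0, 1], [0, 0]]
edited = flex_edit(image, mask, "make it red")
```

In this toy run only the single masked pixel changes; the three unmasked pixels pass through untouched, which mirrors the idea of a mask-guided edit without claiming anything about how FlexEdit actually achieves it.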

The following image compares editing results on multiple images across four mask types: rectangle mask, rectangle open mask, triangle mask, and triangle open mask.

[image: editing comparison across mask types]

Code and data will be available soon.
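Until the official code is available, two of the mask shapes named in the comparison can be illustrated with a minimal sketch that rasterizes a filled rectangle and a filled triangle into binary grids. The helper names `rectangle_mask` and `triangle_mask`, their parameters, and the convention that 1 marks the editable region are assumptions for illustration, not part of FlexEdit. The "open" variants are not defined in this README, so they are not sketched here.

```python
from typing import List, Tuple

Mask = List[List[int]]
Point = Tuple[float, float]  # (x, y) in pixel coordinates


def rectangle_mask(height: int, width: int,
                   top: int, left: int, bottom: int, right: int) -> Mask:
    """Binary mask with 1 inside the half-open box [top, bottom) x [left, right)."""
    return [[1 if top <= y < bottom and left <= x < right else 0
             for x in range(width)]
            for y in range(height)]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def triangle_mask(height: int, width: int,
                  a: Point, b: Point, c: Point) -> Mask:
    """Binary mask with 1 for pixels inside or on the edge of triangle abc."""
    def inside(p: Point) -> bool:
        d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        # The point is inside when it lies on the same side of all three edges.
        return not (has_neg and has_pos)

    return [[1 if inside((x, y)) else 0 for x in range(width)]
            for y in range(height)]


rect = rectangle_mask(4, 4, top=1, left=1, bottom=3, right=3)
tri = triangle_mask(4, 4, (0, 0), (3, 0), (0, 3))
```

Either grid could be paired with an image of the same size and a text instruction to form an editing request.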
