commit DrawBBoxMask node
chflame163 committed Sep 21, 2024
1 parent a7437cc commit c6083de
Showing 7 changed files with 360 additions and 6 deletions.
15 changes: 14 additions & 1 deletion README.MD
@@ -116,7 +116,7 @@ When this error has occurred, please check the network environment.
## Update
<font size="4">**If a dependency package error occurs after updating, double-click ```repair_dependency.bat``` (for Official ComfyUI Portable) or ```repair_dependency_aki.bat``` (for ComfyUI-aki-v1.x) in the plugin folder to reinstall the dependency packages.** </font><br />


* Commit [DrawBBoxMask](#DrawBBoxMask) node, which converts the BBoxes output by the Object Detector node into a mask.
* Commit [UserPromptGeneratorTxtImg](#UserPromptGeneratorTxtImg) and [UserPromptGeneratorReplaceWord](#UserPromptGeneratorReplaceWord) nodes, used to generate text-to-image prompts and replace prompt content.
* Commit [PhiPrompt](#PhiPrompt) node, which uses Microsoft Phi 3.5 text and vision models for local inference. It can generate prompts, process prompts, or infer prompts from images. Running this model requires at least 16GB of video memory.
Download model files from [BaiduNetdisk](https://pan.baidu.com/s/1BdTLdaeGC3trh1U3V-6XTA?pwd=29dh) or [huggingface.co/microsoft/Phi-3.5-vision-instruct](https://huggingface.co/microsoft/Phi-3.5-vision-instruct/tree/main) and [huggingface.co/microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct/tree/main) and copy to ```ComfyUI\models\LLM``` folder.
@@ -1836,6 +1836,19 @@ Node Options:
* bboxes_3: Optional input. The third set of identification boxes.
* bboxes_4: Optional input. The fourth set of identification boxes.

### <a id="table1">DrawBBoxMask</a>
Draws the BBoxes output by an Object Detector node as a mask.
![image](image/draw_bbox_mask_example.jpg)

Node Options:
![image](image/draw_bbox_mask_node.jpg)
* image: Image input. It must be the same image that the Object Detector node processed.
* bboxes: Input BBoxes data.
* grow_top: Moves the top edge of each BBox by a proportion of the box height (for example, 0.5 = 50%). Positive values expand upward; negative values move the edge downward.
* grow_bottom: Moves the bottom edge of each BBox by a proportion of the box height. Positive values expand downward; negative values move the edge upward.
* grow_left: Moves the left edge of each BBox by a proportion of the box width. Positive values expand to the left; negative values move the edge to the right.
* grow_right: Moves the right edge of each BBox by a proportion of the box width. Positive values expand to the right; negative values move the edge to the left.
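
The four grow options can be read as edge offsets scaled by the box size. The sketch below mirrors the arithmetic in `draw_bbox_mask` from this commit; `grow_bbox` is a hypothetical helper written for illustration and is not part of the plugin, and the coordinates are made up.

```python
def grow_bbox(bbox, grow_top=0.0, grow_bottom=0.0, grow_left=0.0, grow_right=0.0):
    """Return the resized (x1, y1, x2, y2), or None if the box becomes inverted.

    Hypothetical helper: each grow value is a proportion of the ORIGINAL
    box height (top/bottom) or width (left/right).
    """
    x1, y1, x2, y2 = bbox
    w = x2 - x1
    h = y2 - y1
    y1 = int(y1 - h * grow_top)      # positive grow_top moves the top edge up
    y2 = int(y2 + h * grow_bottom)   # positive grow_bottom moves the bottom edge down
    x1 = int(x1 - w * grow_left)     # positive grow_left moves the left edge left
    x2 = int(x2 + w * grow_right)    # positive grow_right moves the right edge right
    if y1 > y2 or x1 > x2:
        return None                  # the node logs a warning and skips such boxes
    return (x1, y1, x2, y2)


box = (100, 100, 200, 300)           # width 100, height 200
expanded = grow_bbox(box, grow_top=0.25, grow_right=0.5)
shrunk = grow_bbox(box, grow_left=-0.5)
inverted = grow_bbox(box, grow_right=-2.0)
print(expanded, shrunk, inverted)
```

Because every offset is computed from the original width and height, the four options do not compound each other.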

### <a id="table1">EVF-SAMUltra</a>
This node is an implementation of [EVF-SAM](https://github.com/hustvl/EVF-SAM) in ComfyUI.
*Please download model files from [BaiduNetdisk](https://pan.baidu.com/s/1EvaxgKcCxUpMbYKzLnEx9w?pwd=69bn) or [huggingface/EVF-SAM2](https://huggingface.co/YxZhang/evf-sam2/tree/main), [huggingface/EVF-SAM](https://huggingface.co/YxZhang/evf-sam/tree/main) to ```ComfyUI/models/EVF-SAM``` folder(save the models in their respective subdirectories).
15 changes: 15 additions & 0 deletions README_CN.MD
@@ -116,6 +116,7 @@ os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
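## Update
<font size="4">**If a dependency package error occurs after updating this plugin, double-click ```install_requirements.bat``` (official portable package) or ```install_requirements_aki.bat``` (Aki integrated package) in the plugin directory to reinstall the dependency packages.

* Add [DrawBBoxMask](#DrawBBoxMask) node, used to convert the BBoxes output by the ObjectDetector node into a mask.
* Add [UserPromptGeneratorTxtImg](#UserPromptGeneratorTxtImg) and [UserPromptGeneratorReplaceWord](#UserPromptGeneratorReplaceWord) nodes, used to generate text-to-image prompts and replace prompt content.
* Add [PhiPrompt](#PhiPrompt) node, which uses Microsoft Phi 3.5 text and vision models for local inference. It can be used to generate prompts, process prompts, or infer prompts from images. Running this model requires at least 16GB of video memory.
Download all model files from [BaiduNetdisk](https://pan.baidu.com/s/1BdTLdaeGC3trh1U3V-6XTA?pwd=29dh) or [huggingface.co/microsoft/Phi-3.5-vision-instruct](https://huggingface.co/microsoft/Phi-3.5-vision-instruct/tree/main) and [huggingface.co/microsoft/Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct/tree/main) and copy them to the ```ComfyUI\models\LLM``` folder.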
@@ -1807,6 +1808,20 @@ https://github.com/user-attachments/assets/b2a45c96-4be1-4470-8ceb-addaf301b0cb
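* bboxes_3: Optional input. The third set of identification boxes.
* bboxes_4: Optional input. The fourth set of identification boxes.

### <a id="table1">DrawBBoxMask</a>
Draws the identification box data output by the ObjectDetector node as a mask.
![image](image/draw_bbox_mask_example.jpg)

Node Options:
![image](image/draw_bbox_mask_node.jpg)
* image: Image input. It must be the same image that the ObjectDetector node processed.
* bboxes: Identification box data input.
* grow_top: How far each box expands upward, as a proportion of the box height. Positive values expand upward; negative values expand downward.
* grow_bottom: How far each box expands downward, as a proportion of the box height. Positive values expand downward; negative values expand upward.
* grow_left: How far each box expands to the left, as a proportion of the box width. Positive values expand to the left; negative values expand to the right.
* grow_right: How far each box expands to the right, as a proportion of the box width. Positive values expand to the right; negative values expand to the left.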


### <a id="table1">EVF-SAMUltra</a>
This node is an implementation of [EVF-SAM](https://github.com/hustvl/EVF-SAM) in ComfyUI.
*Please download all model files from [BaiduNetdisk](https://pan.baidu.com/s/1EvaxgKcCxUpMbYKzLnEx9w?pwd=69bn) or [huggingface/EVF-SAM2](https://huggingface.co/YxZhang/evf-sam2/tree/main), [huggingface/EVF-SAM](https://huggingface.co/YxZhang/evf-sam/tree/main) and copy them to the ```ComfyUI/models/EVF-SAM``` folder (save the models in their respective subdirectories).
Binary file added image/draw_bbox_mask_example.jpg
Binary file added image/draw_bbox_mask_node.jpg
80 changes: 76 additions & 4 deletions py/object_detector.py
@@ -4,6 +4,18 @@
select_list = ["all", "first", "by_index"]
sort_method_list = ["left_to_right", "top_to_bottom", "big_to_small"]


# Normalize bboxes so that x1 < x2 and y1 < y2, and return int coordinates
def standardize_bbox(bboxes:list) -> list:
    ret_bboxes = []
    for bbox in bboxes:
        x1 = int(min(bbox[0], bbox[2]))
        y1 = int(min(bbox[1], bbox[3]))
        x2 = int(max(bbox[0], bbox[2]))
        y2 = int(max(bbox[1], bbox[3]))
        ret_bboxes.append([x1, y1, x2, y2])
    return ret_bboxes
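
A short usage sketch for `standardize_bbox`, assuming made-up coordinates: the first box has its corners swapped and carries float values, which is the case the function is meant to normalize. The function body is copied from this commit so the example runs on its own.

```python
def standardize_bbox(bboxes: list) -> list:
    # Copied from the commit: order each corner pair, then truncate to int.
    ret_bboxes = []
    for bbox in bboxes:
        x1 = int(min(bbox[0], bbox[2]))
        y1 = int(min(bbox[1], bbox[3]))
        x2 = int(max(bbox[0], bbox[2]))
        y2 = int(max(bbox[1], bbox[3]))
        ret_bboxes.append([x1, y1, x2, y2])
    return ret_bboxes


# Illustrative input only: swapped corners with floats, then an already clean box.
raw = [[200.7, 50.2, 100.1, 10.9], [5, 6, 7, 8]]
clean = standardize_bbox(raw)
print(clean)
```

Note that `int()` truncates toward zero rather than rounding, so `200.7` becomes `200`.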

def sort_bboxes(bboxes:list, method:str) -> list:
    sorted_bboxes = []
    if method == "left_to_right":
@@ -121,7 +133,7 @@ def object_detector_fl2(self, image, prompt, florence2_model, sort_method, bbox_
log(f"{self.NODE_NAME} no object found", message_type='warning')
else:
log(f"{self.NODE_NAME} found {len(bboxes)} object(s)", message_type='info')
- return (bboxes, torch.cat(ret_previews, dim=0))
+ return (standardize_bbox(bboxes), torch.cat(ret_previews, dim=0))

def fbboxes_to_list(self, F_BBOXES) -> list:
if isinstance(F_BBOXES, str):
@@ -220,7 +232,7 @@ def object_detector_mask(self, object_mask, sort_method, bbox_select, select_ind
else:
log(f"{self.NODE_NAME} found {len(bboxes)} object(s)", message_type='info')

- return (bboxes, torch.cat(ret_previews, dim=0))
+ return (standardize_bbox(bboxes), torch.cat(ret_previews, dim=0))


class LS_OBJECT_DETECTOR_YOLO8:
@@ -281,7 +293,7 @@ def object_detector_yolo8(self, image, yolo_model, sort_method, bbox_select, sel
else:
log(f"{self.NODE_NAME} found {len(bboxes)} object(s)", message_type='info')

- return (bboxes, torch.cat(ret_previews, dim=0),)
+ return (standardize_bbox(bboxes), torch.cat(ret_previews, dim=0),)

class LS_OBJECT_DETECTOR_YOLOWORLD:

@@ -344,7 +356,7 @@ def object_detector_yoloworld(self, image, yolo_world_model,
else:
log(f"{self.NODE_NAME} found {len(bboxes)} object(s)", message_type='info')

- return (bboxes, torch.cat(ret_previews, dim=0))
+ return (standardize_bbox(bboxes), torch.cat(ret_previews, dim=0))

def process_categories(self, categories: str) -> List[str]:
return [category.strip().lower() for category in categories.split(',')]
@@ -357,8 +369,67 @@ def load_yolo_world_model(self,model_id: str, categories: str) -> List[torch.nn.
return model



class LS_DrawBBoxMask:

    def __init__(self):
        self.NODE_NAME = 'Draw BBOX Mask'
        pass

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "bboxes": ("BBOXES",),
                "grow_top": ("FLOAT", {"default": 0, "min": -10, "max": 10, "step": 0.01}),  # expand the bbox upward, as a ratio of its height
                "grow_bottom": ("FLOAT", {"default": 0, "min": -10, "max": 10, "step": 0.01}),
                "grow_left": ("FLOAT", {"default": 0, "min": -10, "max": 10, "step": 0.01}),
                "grow_right": ("FLOAT", {"default": 0, "min": -10, "max": 10, "step": 0.01}),
            },
            "optional": {
            }
        }

    RETURN_TYPES = ("MASK",)
    RETURN_NAMES = ("mask",)
    FUNCTION = 'draw_bbox_mask'
    CATEGORY = '😺dzNodes/LayerMask'

    def draw_bbox_mask(self, image, bboxes, grow_top, grow_bottom, grow_left, grow_right
                       ):

        ret_masks = []
        # One mask per image in the batch; every bbox is drawn onto each mask.
        for img in image:
            img = tensor2pil(img)
            mask = Image.new("L", img.size, color='black')
            for bbox in bboxes:
                x1, y1, x2, y2 = bbox
                # Offsets are ratios of the original box width and height.
                w = x2 - x1
                h = y2 - y1
                if grow_top:
                    y1 = int(y1 - h * grow_top)
                if grow_bottom:
                    y2 = int(y2 + h * grow_bottom)
                if grow_left:
                    x1 = int(x1 - w * grow_left)
                if grow_right:
                    x2 = int(x2 + w * grow_right)
                # Negative grow values can invert the box; skip it in that case.
                if y1 > y2 or x1 > x2:
                    log(f"{self.NODE_NAME} Invalid bbox after extend: ({x1},{y1},{x2},{y2})", message_type='warning')
                    continue
                draw = ImageDraw.Draw(mask)
                draw.rectangle([x1, y1, x2, y2], fill='white', outline='white', width=0)
                del draw
            ret_masks.append(pil2tensor(mask))

        log(f"{self.NODE_NAME} Processed {len(ret_masks)} mask(s).", message_type='finish')
        return (torch.cat(ret_masks, dim=0),)
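
To illustrate how several boxes combine into a single mask, here is a dependency-free sketch that fills a plain nested list instead of a Pillow image. `draw_boxes` is a hypothetical stand-in, not the node's code; it assumes inclusive corner coordinates and clips boxes that extend beyond the canvas.

```python
def draw_boxes(width, height, bboxes):
    """Return a height x width grid: 255 inside any bbox, 0 elsewhere.

    Hypothetical stand-in for drawing white rectangles on a black mask.
    Corners are treated as inclusive (an assumption of this sketch).
    """
    mask = [[0] * width for _ in range(height)]
    for x1, y1, x2, y2 in bboxes:
        # Clip to the canvas so out-of-range boxes are drawn partially.
        for y in range(max(y1, 0), min(y2, height - 1) + 1):
            for x in range(max(x1, 0), min(x2, width - 1) + 1):
                mask[y][x] = 255
    return mask


# Illustrative values: one box inside the canvas, one partly outside it.
mask = draw_boxes(6, 4, [[1, 1, 2, 2], [4, -3, 9, 0]])
for row in mask:
    print("".join("#" if v else "." for v in row))
```

All boxes land on the same grid, so overlapping boxes simply merge into one white region.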


NODE_CLASS_MAPPINGS = {
"LayerMask: BBoxJoin": LS_BBOXES_JOIN,
"LayerMask: DrawBBoxMask": LS_DrawBBoxMask,
"LayerMask: ObjectDetectorFL2": LS_OBJECT_DETECTOR_FL2,
"LayerMask: ObjectDetectorMask": LS_OBJECT_DETECTOR_MASK,
"LayerMask: ObjectDetectorYOLO8": LS_OBJECT_DETECTOR_YOLO8,
@@ -367,6 +438,7 @@ def load_yolo_world_model(self,model_id: str, categories: str) -> List[torch.nn.

NODE_DISPLAY_NAME_MAPPINGS = {
"LayerMask: BBoxJoin": "LayerMask: BBox Join",
"LayerMask: DrawBBoxMask": "LayerMask: Draw BBox Mask",
"LayerMask: ObjectDetectorFL2": "LayerMask: Object Detector Florence2",
"LayerMask: ObjectDetectorMask": "LayerMask: Object Detector Mask",
"LayerMask: ObjectDetectorYOLO8": "LayerMask: Object Detector YOLO8",
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,7 +1,7 @@
[project]
name = "comfyui_layerstyle"
description = "A set of nodes for ComfyUI it generate image like Adobe Photoshop's Layer Style. the Drop Shadow is first completed node, and follow-up work is in progress."
- version = "1.0.57"
+ version = "1.0.58"
license = "MIT"
dependencies = ["numpy", "pillow", "torch", "matplotlib", "Scipy", "scikit_image", "scikit_learn", "opencv-contrib-python", "pymatting", "segment_anything", "timm", "addict", "yapf", "colour-science", "wget", "mediapipe", "loguru", "typer_config", "fastapi", "rich", "google-generativeai", "diffusers", "omegaconf", "tqdm", "transformers", "kornia", "image-reward", "ultralytics", "blend_modes", "blind-watermark", "qrcode", "pyzbar", "transparent-background", "huggingface_hub", "accelerate", "bitsandbytes", "torchscale", "wandb", "hydra-core", "psd-tools", "inference-cli[yolo-world]", "inference-gpu[yolo-world]", "onnxruntime"]
