commit SAM2 and ObjectDetectors nodes
chflame163 committed Sep 2, 2024
1 parent c4368ba commit 0defae2
Showing 47 changed files with 8,530 additions and 22 deletions.
114 changes: 113 additions & 1 deletion README.MD
@@ -100,10 +100,13 @@ Solution:
### Requests.exceptions.ProxyError: HTTPSConnectionPool(xxxx...)
If this error occurs, check your network environment.


## Update
<font size="4">**If a dependency package error occurs after updating, double-click ```repair_dependency.bat``` (for the official ComfyUI Portable) or ```repair_dependency_aki.bat``` (for ComfyUI-aki-v1.x) in the plugin folder to reinstall the dependency packages.**</font><br />

* Commit [SAM2Ultra](#SAM2Ultra), [SAM2VideoUltra](#SAM2VideoUltra), [ObjectDetectorFL2](#ObjectDetectorFL2), [ObjectDetectorYOLOWorld](#ObjectDetectorYOLOWorld), [ObjectDetectorYOLO8](#ObjectDetectorYOLO8), [ObjectDetectorMask](#ObjectDetectorMask) and [BBoxJoin](#BBoxJoin) nodes.
Download the SAM2 models from [BaiduNetdisk](https://pan.baidu.com/s/1xaQYBA6ktxvAxm310HXweQ?pwd=auki) or [huggingface.co/Kijai/sam2-safetensors](https://huggingface.co/Kijai/sam2-safetensors/tree/main) and copy them to the ```ComfyUI/models/sam2``` folder.
Download the YOLO-World models from [BaiduNetdisk](https://pan.baidu.com/s/1QpjajeTA37vEAU2OQnbDcQ?pwd=nqsk) or [GoogleDrive](https://drive.google.com/drive/folders/1nrsfq4S-yk9ewJgwrhXAoNVqIFLZ1at7?usp=sharing) and copy them to the ```ComfyUI/models/yolo-world``` folder.
This update introduces new dependencies; please reinstall the dependency packages.
* Commit [RandomGenerator](#RandomGenerator) node, which generates random numbers within a specified range. It outputs int, float, and boolean values and supports generating a different random number for each image in a batch.
* Commit [EVF-SAMUltra](#EVFSAMUltra) node, an implementation of [EVF-SAM](https://github.com/hustvl/EVF-SAM) in ComfyUI. Please download the model files from [BaiduNetdisk](https://pan.baidu.com/s/1EvaxgKcCxUpMbYKzLnEx9w?pwd=69bn) or [huggingface/EVF-SAM2](https://huggingface.co/YxZhang/evf-sam2/tree/main), [huggingface/EVF-SAM](https://huggingface.co/YxZhang/evf-sam/tree/main) to the ```ComfyUI/models/EVF-SAM``` folder (save the models in their respective subdirectories).
This update introduces new dependencies; after upgrading the plugin, please reinstall the dependency packages.
@@ -1588,6 +1591,115 @@ On the basis of SegmentAnythingUltra, the following changes have been made:
* device: Set whether VitMatte uses CUDA.
* max_megapixels: Set the maximum size for VitMatte operations.


### <a id="table1">SAM2Ultra</a>
This node is modified from [kijai/ComfyUI-segment-anything-2](https://github.com/kijai/ComfyUI-segment-anything-2). Thanks to [kijai](https://github.com/kijai) for making significant contributions to the ComfyUI community.
The SAM2 Ultra node only supports a single image. If you need to process multiple images, first convert the image batch to an image list.
*Download the models from [BaiduNetdisk](https://pan.baidu.com/s/1xaQYBA6ktxvAxm310HXweQ?pwd=auki) or [huggingface.co/Kijai/sam2-safetensors](https://huggingface.co/Kijai/sam2-safetensors/tree/main) and copy them to the ```ComfyUI/models/sam2``` folder.

![image](image/sam2_example.jpg)

Node Options:
![image](image/sam2_ultra_node.jpg)

* image: The image to segment.
* bboxes: Input recognition box data.
* sam2_model: Select the SAM2 model.
* presicion: Model precision. Choose from fp16, bf16, and fp32.
* bbox_select: Select the input box data. There are three options: "all" to select all, "first" to select the box with the highest confidence, and "by_index" to specify the index of the box.
* select_index: This option is valid when bbox_select is 'by_index'. 0 is the first one. Multiple values can be entered, separated by any non-numeric character, including but not limited to commas, periods, semicolons, spaces or letters, and even Chinese characters.
* cache_model: Whether to cache the model. Caching saves model loading time on later runs.
* detail_method: Edge processing method. Options are VITMatte, VITMatte(local), PyMatting, and GuidedFilter. Once the model has been downloaded on the first use of VITMatte, you can use VITMatte(local) afterwards.
* detail_erode: The range of mask erosion inward from the edge. The larger the value, the larger the range of inward repair.
* detail_dilate: The range of mask expansion outward from the edge. The larger the value, the wider the range of outward repair.
* black_point: Edge black sampling threshold.
* white_point: Edge white sampling threshold.
* process_detail: Setting this to false skips edge processing to save runtime.
* device: Set whether VitMatte uses CUDA.
* max_megapixels: Set the maximum size for VitMatte operations.
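
One plausible reading of `black_point` and `white_point` is a levels-style remap of the soft values along the mask edge. The sketch below assumes that interpretation; it is not taken from the node's source, and the function name `remap_levels` and the sample values are hypothetical.

```python
def remap_levels(value, black_point, white_point):
    """Linearly stretch a mask value in [0, 1] between two thresholds.

    Assumption: values at or below black_point become 0, values at or
    above white_point become 1, and values in between are interpolated.
    """
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    scaled = (value - black_point) / (white_point - black_point)
    # Clamp so the result stays a valid mask value.
    return min(1.0, max(0.0, scaled))


# Soft edge values from a hypothetical mask row.
edge = [0.05, 0.2, 0.5, 0.9, 1.0]
print([round(remap_levels(v, 0.2, 0.9), 3) for v in edge])
```

Under this reading, raising `black_point` removes more of the faint fringe, and lowering `white_point` makes more of the edge fully opaque.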

### <a id="table1">SAM2VideoUltra</a>
The SAM2 Video Ultra node supports processing multiple frames of images or video sequences. Please define the recognition box data in the first frame of the sequence to ensure correct recognition.


https://github.com/user-attachments/assets/4726b8bf-9b98-4630-8f54-cb7ed7a3d2c5


Node Options:
![image](image/sam2_video_ultra_node.jpg)

* image: The image to segment.
* bboxes: Input recognition box data.
* pre_mask: Optional input mask that limits the area of focus and helps improve recognition accuracy.
* sam2_model: Select the SAM2 model.
* presicion: Model precision. Choose from fp16, bf16, and fp32.
* cache_model: Whether to cache the model. Caching saves model loading time on later runs.
* individual_object: When set to True, the node focuses on identifying a single object. When set to False, it attempts to generate recognition boxes for multiple objects.
* mask_preview_color: The color used to display non-masked areas in the preview output.
* detail_method: Edge processing method. Only VITMatte can be used.
* detail_erode: The range of mask erosion inward from the edge. The larger the value, the larger the range of inward repair.
* detail_dilate: The range of mask expansion outward from the edge. The larger the value, the wider the range of outward repair.
* black_point: Edge black sampling threshold.
* white_point: Edge white sampling threshold.
* process_detail: Setting this to false skips edge processing to save runtime.
* device: Only cuda can be used.
* max_megapixels: Set the maximum size for VitMatte operations. A larger size produces finer mask edges but significantly slows down computation.
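
To illustrate the trade-off behind `max_megapixels`, the sketch below shows one way a size limit in megapixels could be turned into a working resolution. It assumes that 1 megapixel means 1,000,000 pixels and that the aspect ratio is preserved; the helper `fit_to_megapixels` is hypothetical and is not the node's own code.

```python
import math


def fit_to_megapixels(width, height, max_megapixels):
    """Return a (width, height) whose area does not exceed the limit."""
    limit = max_megapixels * 1_000_000  # assumption: 1 MP = 1,000,000 pixels
    pixels = width * height
    if pixels <= limit:
        # Already small enough: keep the original size.
        return width, height
    # Scale both sides by the same factor so the aspect ratio is kept.
    scale = math.sqrt(limit / pixels)
    return max(1, int(width * scale)), max(1, int(height * scale))


# A 12 MP frame limited to 2 MP is processed at a reduced size.
print(fit_to_megapixels(4000, 3000, 2.0))
```

Because the area grows with the square of the side length, doubling `max_megapixels` increases each side by only about 41%, while the amount of data to process doubles.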

### <a id="table1">ObjectDetectorFL2</a>
Use the Florence2 model to identify objects in images and output recognition box data.
*Download the models from [BaiduNetdisk](https://pan.baidu.com/s/1hzw9-QiU1vB8pMbBgofZIA?pwd=mfl3) and copy them to the ```ComfyUI/models/florence2``` folder.

Node Options:
![image](image/object_detector_fl2_node.jpg)
* image: The image to segment.
* florence2_model: The Florence2 model input, provided by the [LoadFlorence2Model](#LoadFlorence2Model) node.
* prompt: Describe the object that needs to be identified.
* bbox_select: Select the input box data. There are three options: "all" to select all, "first" to select the box with the highest confidence, and "by_index" to specify the index of the box.
* select_index: This option is valid when bbox_select is 'by_index'. 0 is the first one. Multiple values can be entered, separated by any non-numeric character, including but not limited to commas, periods, semicolons, spaces or letters, and even Chinese characters.
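
The `bbox_select` and `select_index` options can be sketched as follows. This is a minimal illustration, assuming the boxes arrive sorted by confidence and that out-of-range indexes are simply skipped; `parse_select_index` and `select_bboxes` are hypothetical names, not the plugin's functions.

```python
import re


def parse_select_index(text):
    """Treat every run of digits as one index; anything else is a separator."""
    return [int(token) for token in re.findall(r"\d+", text)]


def select_bboxes(bboxes, bbox_select="all", select_index="0"):
    """Pick boxes from a list that is assumed to be sorted by confidence."""
    if bbox_select == "all":
        return list(bboxes)
    if bbox_select == "first":
        return list(bboxes[:1])
    if bbox_select == "by_index":
        # Assumption: indexes past the end of the list are ignored.
        indexes = parse_select_index(select_index)
        return [bboxes[i] for i in indexes if i < len(bboxes)]
    raise ValueError(f"unknown bbox_select: {bbox_select}")


boxes = [(10, 10, 50, 50), (60, 20, 90, 80), (5, 70, 40, 120)]
print(select_bboxes(boxes, "by_index", "0; 2"))
```

Matching digit runs rather than splitting on a fixed delimiter is what lets commas, spaces, letters, or Chinese characters all act as separators.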

### <a id="table1">ObjectDetectorYOLOWorld</a>
Use the YOLO-World model to identify objects in images and output recognition box data.
*Download the models from [BaiduNetdisk](https://pan.baidu.com/s/1QpjajeTA37vEAU2OQnbDcQ?pwd=nqsk) or [GoogleDrive](https://drive.google.com/drive/folders/1nrsfq4S-yk9ewJgwrhXAoNVqIFLZ1at7?usp=sharing) and copy them to the ```ComfyUI/models/yolo-world``` folder.

Node Options:
![image](image/object_detector_yolo_world_node.jpg)
* image: The image to segment.
* confidence_threshold: The confidence threshold. Detections that score below it are discarded.
* nms_iou_threshold: The IoU threshold for Non-Maximum Suppression.
* prompt: Describe the object that needs to be identified.
* bbox_select: Select the input box data. There are three options: "all" to select all, "first" to select the box with the highest confidence, and "by_index" to specify the index of the box.
* select_index: This option is valid when bbox_select is 'by_index'. 0 is the first one. Multiple values can be entered, separated by any non-numeric character, including but not limited to commas, periods, semicolons, spaces or letters, and even Chinese characters.
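
For readers unfamiliar with `confidence_threshold` and `nms_iou_threshold`, the sketch below shows the generic technique the two settings refer to: drop low-scoring detections, then apply greedy Non-Maximum Suppression. It is a plain-Python illustration, not the YOLO-World implementation; the `(x1, y1, x2, y2)` box format and the function names are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, confidence_threshold, nms_iou_threshold):
    """detections is a list of (box, score); returns the kept ones, best first."""
    candidates = sorted(
        (d for d in detections if d[1] >= confidence_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, k[0]) <= nms_iou_threshold for k in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 11, 11), 0.8),    # heavy overlap with the first box
    ((20, 20, 30, 30), 0.7),
    ((0, 0, 5, 5), 0.05),     # below the confidence threshold
]
print(filter_detections(detections, 0.1, 0.5))
```

A lower `nms_iou_threshold` suppresses more overlapping boxes; a higher `confidence_threshold` discards more weak detections before suppression runs.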

### <a id="table1">ObjectDetectorYOLO8</a>
Use the YOLOv8 model to identify objects in images and output recognition box data.
*Download the models from [GoogleDrive](https://drive.google.com/drive/folders/1I5TISO2G1ArSkKJu1O9b4Uvj3DVgn5d2) or [BaiduNetdisk](https://pan.baidu.com/s/1pEY6sjABQaPs6QtpK0q6XA?pwd=grqe) and copy them to the ```ComfyUI/models/yolo``` folder.

Node Options:
![image](image/object_detector_yolo8_node.jpg)
* image: The image to segment.
* yolo_model: Select the YOLO model.
* bbox_select: Select the input box data. There are three options: "all" to select all, "first" to select the box with the highest confidence, and "by_index" to specify the index of the box.
* select_index: This option is valid when bbox_select is 'by_index'. 0 is the first one. Multiple values can be entered, separated by any non-numeric character, including but not limited to commas, periods, semicolons, spaces or letters, and even Chinese characters.

### <a id="table1">ObjectDetectorMask</a>
Use a mask as recognition box data. Each enclosed white area on the mask is recognized as an object, and multiple enclosed areas are identified separately.

Node Options:
![image](image/object_detector_mask_node.jpg)
* object_mask: The mask input.
* bbox_select: Select the input box data. There are three options: "all" to select all, "first" to select the box with the highest confidence, and "by_index" to specify the index of the box.
* select_index: This option is valid when bbox_select is 'by_index'. 0 is the first one. Multiple values can be entered, separated by any non-numeric character, including but not limited to commas, periods, semicolons, spaces or letters, and even Chinese characters.
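
The idea of turning each separate white area into its own box can be sketched with a flood fill. The example below is a pure-Python illustration under stated assumptions: the mask is a list of rows, any non-zero pixel counts as white, regions are 4-connected, and boxes use inclusive pixel coordinates. The function `mask_to_bboxes` is hypothetical and is not the node's implementation.

```python
from collections import deque


def mask_to_bboxes(mask):
    """Return one (x1, y1, x2, y2) box per 4-connected white region."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from the first unseen white pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x1, y1, x2, y2 = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                x1, y1 = min(x1, cx), min(y1, cy)
                x2, y2 = max(x2, cx), max(y2, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x1, y1, x2, y2))
    return boxes


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
print(mask_to_bboxes(mask))
```

The two white areas in the sample mask do not touch, so they yield two boxes, which matches the "identified separately" behavior described above.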

### <a id="table1">BBoxJoin</a>
Merge recognition box data.

Node Options:
![image](image/bbox_join_node.jpg)
* bboxes_1: Required input. The first set of identification boxes.
* bboxes_2: Optional input. The second set of identification boxes.
* bboxes_3: Optional input. The third set of identification boxes.
* bboxes_4: Optional input. The fourth set of identification boxes.
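
As a rough mental model of this node, merging can be pictured as concatenating the connected inputs in order. The sketch below assumes exactly that (no deduplication or re-sorting), which the README does not confirm; `join_bboxes` is a hypothetical stand-in for the node.

```python
def join_bboxes(bboxes_1, bboxes_2=None, bboxes_3=None, bboxes_4=None):
    """Concatenate up to four box lists, skipping inputs that are not connected."""
    joined = list(bboxes_1)
    for extra in (bboxes_2, bboxes_3, bboxes_4):
        if extra is not None:
            joined.extend(extra)
    return joined


# Boxes from two hypothetical detectors, combined into one list.
yolo_boxes = [(10, 10, 50, 50)]
mask_boxes = [(60, 20, 90, 80), (5, 70, 40, 120)]
print(join_bboxes(yolo_boxes, mask_boxes))
```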

### <a id="table1">EVF-SAMUltra</a>
This node is an implementation of [EVF-SAM](https://github.com/hustvl/EVF-SAM) in ComfyUI.
*Please download the model files from [BaiduNetdisk](https://pan.baidu.com/s/1EvaxgKcCxUpMbYKzLnEx9w?pwd=69bn) or [huggingface/EVF-SAM2](https://huggingface.co/YxZhang/evf-sam2/tree/main), [huggingface/EVF-SAM](https://huggingface.co/YxZhang/evf-sam/tree/main) to the ```ComfyUI/models/EVF-SAM``` folder (save the models in their respective subdirectories).