Commit ImageTaggerSave and ImageAutoCropV3 nodes

chflame163 · Aug 25, 2024 · deedca4
1 parent 18e2103
commit deedca4
Showing 10 changed files with 958 additions and 1 deletion.
diff --git a/README.MD b/README.MD
@@ -102,6 +102,7 @@ When this error has occurred, please check the network environment.
 ## Update
 <font size="4">**If a dependency package error occurs after updating, please reinstall the relevant dependency packages. </font><br /> 
 
+* Commit [ImageTaggerSave](#ImageTaggerSave) and [ImageAutoCropV3](#ImageAutoCropV3) nodes, used to build an automatic cropping and tagging workflow for training sets (the workflow ```image_tagger_save.json``` is located in the workflow directory).
 * Commit [CheckMaskV2](#CheckMaskV2) node, which adds the ```simple``` method to detect masks more quickly.
 * Commit [ImageReel](#ImageReel) and [ImageReelComposite](#ImageReelComposite) nodes to composite multiple images on a canvas.
 * [NumberCalculatorV2](#NumberCalculatorV2) and [NumberCalculator](#NumberCalculator) add the ```min``` and ```max``` methods.
@@ -1178,6 +1179,26 @@ The V2 upgrad version of ```ImageAutoCrop```, it has made the following changes
 * scale_by_length: The value here is used as ```scale_by``` to specify the length of the edge.
 
 
+### <a id="table1">ImageAutoCropV3</a> 
+Automatically crops the image to the specified size. An optional mask can be supplied so that the masked area is preserved in the crop. This node is designed to generate image material for model training.
+
+Node Options:
+![image](image/image_auto_crop_v3_node.jpg) 
+* image: The input image.
+* mask: Optional input mask. The masked area is kept inside the crop as far as the target aspect ratio allows.
+* aspect_ratio: The aspect ratio of the output. Common frame ratios are provided; "custom" uses the custom ratio set below and "original" keeps the ratio of the input image.
+* proportional_width: The width term of the custom ratio. Ignored unless aspect_ratio is "custom".
+* proportional_height: The height term of the custom ratio. Ignored unless aspect_ratio is "custom".
+* method: The resampling method used for scaling: lanczos, bicubic, hamming, bilinear, box, or nearest.
+* scale_to_side: Which dimension the scaling is specified by: longest side, shortest side, width, height, or total pixels.
+* scale_to_length: The target value for the choice made in scale_to_side: the length of that side, or the total number of pixels (in kilopixels).
+* round_to_multiple: Round the output size to a multiple. For example, if set to 8, the width and height are forced to multiples of 8.
+
+Outputs:
+* cropped_image: The cropped image.
+* box_preview: Preview of the crop position.
+
+
 ### <a id="table1">HLFrequencyDetailRestore</a>
 Using low frequency filtering and retaining high frequency to recover image details. Compared to [kijai's DetailTransfer](https://github.com/kijai/ComfyUI-IC-Light), this node is better integrated with the environment while retaining details.
 ![image](image/hl_frequency_detail_restore_example.jpg) 
@@ -1367,6 +1388,23 @@ Node Options:
 
 <sup>*</sup> Enter ```%date``` for the current date (YY-mm-dd) and ```%time``` for the current time (HH-MM-SS). You can enter ```/``` for subdirectories. For example, ```%date/name_%time``` will output the image to the ```YY-mm-dd``` folder, with ```name_HH-MM-SS``` as the file name prefix.
 
+### <a id="table1">ImageTaggerSave</a> 
+![image](image/image_tagger_save_example.jpg) 
+This node saves training-set images together with their text labels; each image file and its label file share the same file name. You can customize the save directory, add a timestamp to the file name, choose the save format, and set the image compression quality.
+*The workflow ```image_tagger_save.json``` is located in the workflow directory.
+
+Node Options:
+![image](image/image_tagger_save_node.jpg) 
+* image: The input image.
+* tag_text: The text label for the image.
+* custom_path<sup>*</sup>: User-defined directory; enter the directory name in the correct format. If empty, files are saved in ComfyUI's default output directory.
+* filename_prefix<sup>*</sup>: The file name prefix.
+* timestamp: Adds a timestamp to the file name; choose date, time to the second, or time to the millisecond.
+* format: The image save format. Currently ```png``` and ```jpg``` are available. Note that RGBA images can only be saved as png.
+* quality: Image quality, in the range 10-100. Higher values give better image quality and a correspondingly larger file.
+* preview: Preview switch.
+
+<sup>*</sup> Enter ```%date``` for the current date (YY-mm-dd) and ```%time``` for the current time (HH-MM-SS). You can enter ```/``` for subdirectories. For example, ```%date/name_%time``` will output the image to the ```YY-mm-dd``` folder, with ```name_HH-MM-SS``` as the file name prefix.
 
 
 ### <a id="table1">AddBlindWaterMark</a>
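
The ```%date``` and ```%time``` placeholders described for ```custom_path``` and ```filename_prefix``` can be illustrated with a minimal sketch. This is not the node's actual code, which does not appear in this part of the diff. `expand_placeholders` is a hypothetical helper, and the four-digit year is an assumption, since the README only writes the pattern as ```YY-mm-dd```.

```python
from datetime import datetime


def expand_placeholders(template: str, now: datetime) -> str:
    """Replace %date and %time in a directory or file-name template."""
    # Assumption: four-digit year; the README gives the pattern as YY-mm-dd.
    date_text = now.strftime("%Y-%m-%d")
    time_text = now.strftime("%H-%M-%S")
    return template.replace("%date", date_text).replace("%time", time_text)


# "/" in the template simply becomes a subdirectory separator.
example = expand_placeholders("%date/name_%time", datetime(2024, 8, 25, 13, 5, 9))
print(example)  # 2024-08-25/name_13-05-09
```

Passing the clock value in as an argument, rather than calling `datetime.now()` inside the helper, keeps the behaviour reproducible when testing.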

diff --git a/README_CN.MD b/README_CN.MD
@@ -103,6 +103,7 @@ git clone https://github.com/chflame163/ComfyUI_LayerStyle.git
 ## Update
 <font size="4">**If a dependency package error occurs after updating this plugin, please reinstall the relevant dependency packages.
 
+* Add [ImageTaggerSave](#ImageTaggerSave) and [ImageAutoCropV3](#ImageAutoCropV3) nodes, used to build an automatic cropping and tagging workflow for training sets (the workflow ```image_tagger_save_example.json``` is in the workflow directory).
 * Add [CheckMaskV2](#CheckMaskV2) node, which adds the ```simple``` method to detect masks more quickly.
 * Add [ImageReel](#ImageReel) and [ImageReelComposite](#ImageReelComposite) nodes, which display multiple images together.
 * [NumberCalculatorV2](#NumberCalculatorV2) and [NumberCalculator](#NumberCalculator) nodes add the ```min``` and ```max``` methods.
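@@ -1158,6 +1159,26 @@ cropped_mask: The cropped mask.
 * scale_by: Allows scaling to be specified by longest side, shortest side, width, or height.
 * scale_by_length: The value here is the length of the side specified by scale_by.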
 
+### <a id="table1">ImageAutoCropV3</a> 
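+Automatically crops the image to the specified size. An optional mask can be supplied so that the masked area is preserved. This node is designed to generate image material for model training.
+
+
+Node Options:
+![image](image/image_auto_crop_v3_node.jpg) 
+* image: The input image.
+* mask: Optional input mask. The masked area is kept inside the crop as far as the target aspect ratio allows.
+* aspect_ratio: The aspect ratio of the output. Common frame ratios are provided; "custom" is a custom ratio and "original" is the ratio of the input image.
+* proportional_width: The width term of the ratio. Ignored unless aspect_ratio is "custom".
+* proportional_height: The height term of the ratio. Ignored unless aspect_ratio is "custom".
+* method: The resampling method used for scaling: lanczos, bicubic, hamming, bilinear, box, or nearest.
+* scale_to_side: Allows scaling to be specified by longest side, shortest side, width, height, or total pixels.
+* scale_to_length: The value here is the length of the side specified by scale_to_side, or the total number of pixels (kilo pixels).
+* round_to_multiple: Round to a multiple. For example, if set to 8, the width and height are forced to multiples of 8.
+
+Outputs:
+* cropped_image: The cropped image.
+* box_preview: Preview of the crop position.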
+
 ### <a id="table1">HLFrequencyDetailRestore</a>
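 Uses low-frequency filtering while retaining high frequencies to restore image details. Compared to [kijai's DetailTransfer](https://github.com/kijai/ComfyUI-IC-Light), this node blends better with the environment while retaining details.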
 ![image](image/hl_frequency_detail_restore_example.jpg) 
@@ -1350,6 +1371,24 @@ An upgraded version of BooleanOperator; adds in-node numeric input, adds greater than
 
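 <sup>*</sup> Enter ```%date``` for the current date (YY-mm-dd) and ```%time``` for the current time (HH-MM-SS). You can enter ```/``` for subdirectories. For example, ```%date/name_%time``` will output the image to the ```YY-mm-dd``` folder, with ```name_HH-MM-SS``` as the file name prefix.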
 
+### <a id="table1">ImageTaggerSave</a> 
+![image](image/image_tagger_save_example.jpg) 
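+This node saves training-set images together with their text labels; each image file and its label file share the same file name. You can customize the save directory, add a timestamp to the file name, choose the save format, and set the image compression quality.
+*The workflow image_tagger_save_example.json is in the workflow directory.
+
+Node Options:
+![image](image/image_tagger_save_node.jpg) 
+* image: The input image.
+* tag_text: The text label.
+* custom_path<sup>*</sup>: User-defined directory; enter the directory name in the correct format. If empty, files are saved in ComfyUI's default output directory.
+* filename_prefix<sup>*</sup>: The file name prefix.
+* timestamp: Adds a timestamp to the file name; choose date, time to the second, or time to the millisecond.
+* format: The image save format. Currently png and jpg are available.
+* quality: Image quality, in the range 10-100. Higher values give better image quality and a correspondingly larger file.
+* preview: Preview switch.
+
+<sup>*</sup> Enter ```%date``` for the current date (YY-mm-dd) and ```%time``` for the current time (HH-MM-SS). You can enter ```/``` for subdirectories. For example, ```%date/name_%time``` will output the image to the ```YY-mm-dd``` folder, with ```name_HH-MM-SS``` as the file name prefix.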
+
 ### <a id="table1">AddBlindWaterMark</a>
 ![image](image/watermark_example.jpg) 
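 Adds an invisible watermark to an image. The watermark image is embedded in a way that is imperceptible to the naked eye; use the ```ShowBlindWaterMark``` node to decode the watermark.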
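
The ImageTaggerSave description says that each image and its text label share one file name, optionally with a timestamp. A rough sketch of that naming rule follows. `paired_paths`, the `timestamp` option values (`"date"`, `"second"`, `"millisecond"`) and the exact stamp layout are assumptions made for illustration; they are not taken from the node's source.

```python
from datetime import datetime
from pathlib import Path


def paired_paths(directory, prefix, image_format, now, timestamp="None"):
    """Return (image_path, label_path); the two differ only in extension."""
    # Assumption: hypothetical option names and stamp layout.
    if timestamp == "date":
        stamp = now.strftime("%Y-%m-%d")
    elif timestamp == "second":
        stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    elif timestamp == "millisecond":
        millis = now.microsecond // 1000
        stamp = now.strftime("%Y-%m-%d_%H-%M-%S") + f"_{millis:03d}"
    else:
        stamp = ""
    stem = f"{prefix}_{stamp}" if stamp else prefix
    folder = Path(directory)
    return folder / f"{stem}.{image_format}", folder / f"{stem}.txt"


image_path, label_path = paired_paths(
    "output", "cat", "png", datetime(2024, 8, 25, 13, 5, 9, 123000), "millisecond"
)
print(image_path.name, label_path.name)
```

Deriving both paths from one stem is what lets a training tool pair each image with its caption file by name alone.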

diff --git a/image/image_auto_crop_v3_node.jpg b/image/image_auto_crop_v3_node.jpg
diff --git a/image/image_tagger_save_example.jpg b/image/image_tagger_save_example.jpg
diff --git a/image/image_tagger_save_node.jpg b/image/image_tagger_save_node.jpg
diff --git a/py/image_auto_crop_v3.py b/py/image_auto_crop_v3.py
@@ -0,0 +1,187 @@
+from .imagefunc import *
+
+NODE_NAME = 'ImageAutoCropV3'
+
+class ImageAutoCropV3:
+
+    def __init__(self):
+        pass
+
+    @classmethod
+    def INPUT_TYPES(self):
+        ratio_list = ['1:1', '3:2', '4:3', '16:9', '2:3', '3:4', '9:16', 'custom', 'original']
+        scale_to_side_list = ['None', 'longest', 'shortest', 'width', 'height', 'total_pixel(kilo pixel)']
+        multiple_list = ['8', '16', '32', '64', '128', '256', '512', 'None']
+        method_mode = ['lanczos', 'bicubic', 'hamming', 'bilinear', 'box', 'nearest']
+        return {
+            "required": {
+                "image": ("IMAGE", ),
+                "aspect_ratio": (ratio_list,),
+                "proportional_width": ("INT", {"default": 1, "min": 1, "max": 99999999, "step": 1}),
+                "proportional_height": ("INT", {"default": 1, "min": 1, "max": 99999999, "step": 1}),
+                "method": (method_mode,),
+                "scale_to_side": (scale_to_side_list,),
+                "scale_to_length": ("INT", {"default": 1024, "min": 4, "max": 999999, "step": 1}),
+                "round_to_multiple": (multiple_list,),
+            },
+            "optional": {
+                "mask": ("MASK",),
+            }
+        }
+
+    RETURN_TYPES = ("IMAGE", "IMAGE",)
+    RETURN_NAMES = ("cropped_image", "box_preview",)
+    FUNCTION = 'image_auto_crop_v3'
+    CATEGORY = '😺dzNodes/LayerUtility'
+
+    def image_auto_crop_v3(self, image, aspect_ratio,
+                           proportional_width, proportional_height, method,
+                           scale_to_side, scale_to_length, round_to_multiple,
+                           mask=None,
+                           ):
+
+        ret_images = []
+        ret_box_previews = []
+        ret_masks = []
+        input_images = []
+        input_masks = []
+        crop_boxs = []
+
+        for l in image:
+            input_images.append(torch.unsqueeze(l, 0))
+            m = tensor2pil(l)
+            if m.mode == 'RGBA':
+                input_masks.append(m.split()[-1])
+        if mask is not None:
+            if mask.dim() == 2:
+                mask = torch.unsqueeze(mask, 0)
+            input_masks = []
+            for m in mask:
+                input_masks.append(tensor2pil(torch.unsqueeze(m, 0)).convert('L'))
+
+        if len(input_masks) > 0 and len(input_masks) != len(input_images):
+            input_masks = []
+            log(f"Warning, {NODE_NAME} unable align alpha to image, drop it.", message_type='warning')
+
+        fit = 'crop'
+        _image = tensor2pil(input_images[0])
+        (orig_width, orig_height) = _image.size
+        if aspect_ratio == 'custom':
+            ratio = proportional_width / proportional_height
+        elif aspect_ratio == 'original':
+            ratio = orig_width / orig_height
+        else:
+            s = aspect_ratio.split(":")
+            ratio = int(s[0]) / int(s[1])
+
+        resize_sampler = Image.LANCZOS
+        if method == "bicubic":
+            resize_sampler = Image.BICUBIC
+        elif method == "hamming":
+            resize_sampler = Image.HAMMING
+        elif method == "bilinear":
+            resize_sampler = Image.BILINEAR
+        elif method == "box":
+            resize_sampler = Image.BOX
+        elif method == "nearest":
+            resize_sampler = Image.NEAREST
+
+        # calculate target width and height
+        if ratio > 1:
+            if scale_to_side == 'longest':
+                target_width = scale_to_length
+                target_height = int(target_width / ratio)
+            elif scale_to_side == 'shortest':
+                target_height = scale_to_length
+                target_width = int(target_height * ratio)
+            elif scale_to_side == 'width':
+                target_width = scale_to_length
+                target_height = int(target_width / ratio)
+            elif scale_to_side == 'height':
+                target_height = scale_to_length
+                target_width = int(target_height * ratio)
+            elif scale_to_side == 'total_pixel(kilo pixel)':
+                target_width = math.sqrt(ratio * scale_to_length * 1000)
+                target_height = target_width / ratio
+                target_width = int(target_width)
+                target_height = int(target_height)
+            else:
+                target_width = orig_width
+                target_height = int(target_width / ratio)
+        else:
+            if scale_to_side == 'longest':
+                target_height = scale_to_length
+                target_width = int(target_height * ratio)
+            elif scale_to_side == 'shortest':
+                target_width = scale_to_length
+                target_height = int(target_width / ratio)
+            elif scale_to_side == 'width':
+                target_width = scale_to_length
+                target_height = int(target_width / ratio)
+            elif scale_to_side == 'height':
+                target_height = scale_to_length
+                target_width = int(target_height * ratio)
+            elif scale_to_side == 'total_pixel(kilo pixel)':
+                target_width = math.sqrt(ratio * scale_to_length * 1000)
+                target_height = target_width / ratio
+                target_width = int(target_width)
+                target_height = int(target_height)
+            else:
+                target_height = orig_height
+                target_width = int(target_height * ratio)
+
+        if round_to_multiple != 'None':
+            multiple = int(round_to_multiple)
+            target_width = num_round_up_to_multiple(target_width, multiple)
+            target_height = num_round_up_to_multiple(target_height, multiple)
+
+        for i in range(len(input_images)):
+            _image = tensor2pil(input_images[i]).convert('RGB')
+
+            if len(input_masks) > 0:
+                _mask = input_masks[i]
+            else:
+                _mask = Image.new('L', _image.size, color='black')
+
+            bluredmask = gaussian_blur(_mask, 20).convert('L')
+            (mask_x, mask_y, mask_w, mask_h) = mask_area(bluredmask)
+            orig_ratio = _image.width / _image.height
+            target_ratio = target_width / target_height
+            # crop image to target ratio
+            if orig_ratio > target_ratio:  # crop left/right sides
+                crop_w = int(_image.height * target_ratio)
+                crop_h = _image.height
+            else:  # crop top/bottom sides
+                crop_w = _image.width
+                crop_h = int(_image.width / target_ratio)
+            crop_x = mask_w // 2 + mask_x - crop_w // 2
+            if crop_x < 0:
+                crop_x = 0
+            if crop_x + crop_w > _image.width:
+                crop_x = _image.width - crop_w
+            crop_y = mask_h // 2 + mask_y - crop_h // 2
+            if crop_y < 0:
+                crop_y = 0
+            if crop_y + crop_h > _image.height:
+                crop_y = _image.height - crop_h
+            crop_image = _image.crop((crop_x, crop_y, crop_x + crop_w, crop_y + crop_h))
+            line_width = (_image.width + _image.height) // 200
+            preview_image = draw_rect(_image, crop_x, crop_y,
+                                      crop_w, crop_h,
+                                      line_color="#F00000", line_width=line_width)
+            ret_image = crop_image.resize((target_width, target_height), resize_sampler)
+            ret_images.append(pil2tensor(ret_image))
+            ret_box_previews.append(pil2tensor(preview_image))
+
+        log(f"{NODE_NAME} Processed {len(ret_images)} image(s).", message_type='finish')
+        return (torch.cat(ret_images, dim=0),
+                torch.cat(ret_box_previews, dim=0),
+                )
+
+NODE_CLASS_MAPPINGS = {
+    "LayerUtility: ImageAutoCrop V3": ImageAutoCropV3
+}
+
+NODE_DISPLAY_NAME_MAPPINGS = {
+    "LayerUtility: ImageAutoCrop V3": "LayerUtility: ImageAutoCrop V3"
+}
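
The size and crop-box arithmetic in `image_auto_crop_v3` can be exercised without torch or Pillow. The sketch below is a condensed restatement of the two branches above, not a replacement for them. `target_size`, `crop_box` and `round_up_to_multiple` are hypothetical names; `round_up_to_multiple` assumes that `num_round_up_to_multiple` rounds upward as its name suggests, and `mask_box` is assumed to be the `(x, y, w, h)` tuple that `mask_area` returns.

```python
import math


def round_up_to_multiple(value: int, multiple: int) -> int:
    # Assumption: stands in for num_round_up_to_multiple from imagefunc.
    return ((value + multiple - 1) // multiple) * multiple


def target_size(ratio, scale_to_side, scale_to_length, orig_size, multiple=None):
    """Compute (width, height) from the ratio and the scale_to_side choice."""
    orig_w, orig_h = orig_size
    landscape = ratio > 1
    if scale_to_side == 'total_pixel(kilo pixel)':
        width = math.sqrt(ratio * scale_to_length * 1000)
        height = int(width / ratio)
        width = int(width)
    else:
        # Decide which side is fixed; 'None' falls back to the original size.
        if scale_to_side in ('longest', 'shortest'):
            fix_width = (scale_to_side == 'longest') == landscape
            length = scale_to_length
        elif scale_to_side in ('width', 'height'):
            fix_width = scale_to_side == 'width'
            length = scale_to_length
        else:
            fix_width = landscape
            length = orig_w if landscape else orig_h
        if fix_width:
            width, height = length, int(length / ratio)
        else:
            width, height = int(length * ratio), length
    if multiple is not None:
        width = round_up_to_multiple(width, multiple)
        height = round_up_to_multiple(height, multiple)
    return width, height


def crop_box(image_size, target_ratio, mask_box):
    """Centre a crop of the target ratio on the mask, clamped to the image."""
    img_w, img_h = image_size
    mask_x, mask_y, mask_w, mask_h = mask_box
    if img_w / img_h > target_ratio:   # image too wide: trim left/right
        crop_w, crop_h = int(img_h * target_ratio), img_h
    else:                              # image too tall: trim top/bottom
        crop_w, crop_h = img_w, int(img_w / target_ratio)
    crop_x = mask_x + mask_w // 2 - crop_w // 2
    crop_y = mask_y + mask_h // 2 - crop_h // 2
    crop_x = max(0, min(crop_x, img_w - crop_w))
    crop_y = max(0, min(crop_y, img_h - crop_h))
    return crop_x, crop_y, crop_w, crop_h


print(target_size(16 / 9, 'longest', 1024, (1920, 1080), multiple=8))  # (1024, 576)
print(crop_box((1000, 500), 1.0, (800, 100, 100, 100)))                # (500, 0, 500, 500)
```

In the second call the mask sits near the right edge, so the square crop is pushed back inside the image rather than centred exactly on the mask.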