Training looks normal but val mAP is all 0 #531

Open
huihui6666 opened this issue Nov 26, 2024 · 3 comments

@huihui6666

Hi, I fine-tuned on my own dataset. The training loss dropped from 150 to 59, but the validation mAP is 0 for every metric.


huihui6666 commented Nov 26, 2024

I am using the config file yolo_world_v2_l_efficient_neck_2e-4_80e_8gpus_mask-refine_finetune_coco.py:
`_base_ = (
    '../../third_party/mmyolo/configs/yolov8/'
    'yolov8_l_mask-refine_syncbn_fast_8xb16-500e_coco.py')
custom_imports = dict(
    imports=['yolo_world'],
    allow_failed_imports=False)

# hyper-parameters
num_classes = 57
num_training_classes = 57
max_epochs = 80  # Maximum training epochs
close_mosaic_epochs = 10
save_epoch_intervals = 1
text_channels = 512
neck_embed_channels = [128, 256, _base_.last_stage_out_channels // 2]
neck_num_heads = [4, 8, _base_.last_stage_out_channels // 2 // 32]
base_lr = 2e-4
weight_decay = 0.05
train_batch_size_per_gpu = 16
load_from = 'pretrained_models/yolo_world_l_clip_t2i_bn_2e-3adamw_32xb16-100e_obj365v1_goldg_cc3mlite_train-ca93cd1f.pth'
text_model_name = '../pretrained_models/clip-vit-base-patch32-projection'
text_model_name = 'openai/clip-vit-base-patch32'
persistent_workers = False

# model settings
model = dict(
    type='YOLOWorldDetector',
    mm_neck=True,
    num_train_classes=num_training_classes,
    num_test_classes=num_classes,
    data_preprocessor=dict(type='YOLOWDetDataPreprocessor'),
    backbone=dict(
        _delete_=True,
        type='MultiModalYOLOBackbone',
        image_model={{_base_.model.backbone}},
        text_model=dict(
            type='HuggingCLIPLanguageBackbone',
            model_name=text_model_name,
            frozen_modules=['all'])),
    neck=dict(type='YOLOWorldPAFPN',
              guide_channels=text_channels,
              embed_channels=neck_embed_channels,
              num_heads=neck_num_heads,
              block_cfg=dict(type='EfficientCSPLayerWithTwoConv')),
    bbox_head=dict(type='YOLOWorldHead',
                   head_module=dict(type='YOLOWorldHeadModule',
                                    use_bn_head=True,
                                    embed_dims=text_channels,
                                    num_classes=num_training_classes)),
    train_cfg=dict(assigner=dict(num_classes=num_training_classes)))

# dataset settings
text_transform = [
    dict(type='RandomLoadText',
         num_neg_samples=(num_classes, num_classes),
         max_num_samples=num_training_classes,
         padding_to_max=True,
         padding_value=''),
    dict(type='mmdet.PackDetInputs',
         meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
                    'flip_direction', 'texts'))
]
mosaic_affine_transform = [
    dict(
        type='MultiModalMosaic',
        img_scale=_base_.img_scale,
        pad_val=114.0,
        pre_transform=_base_.pre_transform),
    dict(type='YOLOv5CopyPaste', prob=_base_.copypaste_prob),
    dict(
        type='YOLOv5RandomAffine',
        max_rotate_degree=0.0,
        max_shear_degree=0.0,
        max_aspect_ratio=100.,
        scaling_ratio_range=(1 - _base_.affine_scale,
                             1 + _base_.affine_scale),
        # img_scale is (width, height)
        border=(-_base_.img_scale[0] // 2, -_base_.img_scale[1] // 2),
        border_val=(114, 114, 114),
        min_area_ratio=_base_.min_area_ratio,
        use_mask_refine=_base_.use_mask2refine)
]
train_pipeline = [
    *_base_.pre_transform,
    *mosaic_affine_transform,
    dict(
        type='YOLOv5MultiModalMixUp',
        prob=_base_.mixup_prob,
        pre_transform=[
            *_base_.pre_transform,
            *mosaic_affine_transform]),
    *_base_.last_transform[:-1],
    *text_transform
]
train_pipeline_stage2 = [
    *_base_.train_pipeline_stage2[:-1],
    *text_transform
]
coco_train_dataset = dict(
    _delete_=True,
    type='MultiModalDataset',
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/lihui/datasets/coco',
        ann_file='annotations/new_train2017_instances.json',
        data_prefix=dict(img='images/train2017/'),
        filter_cfg=dict(filter_empty_gt=False, min_size=32)),
    class_text_path='data/texts/coco_class_texts.json',
    pipeline=train_pipeline)

train_dataloader = dict(
    persistent_workers=persistent_workers,
    batch_size=train_batch_size_per_gpu,
    collate_fn=dict(type='yolow_collate'),
    dataset=coco_train_dataset)
test_pipeline = [
    *_base_.test_pipeline[:-1],
    dict(type='LoadText'),
    dict(
        type='mmdet.PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor', 'pad_param', 'texts'))
]
coco_val_dataset = dict(
    _delete_=True,
    type='MultiModalDataset',
    dataset=dict(
        type='YOLOv5CocoDataset',
        data_root='/home/lihui/datasets/coco',
        ann_file='annotations/new_val2017_instances.json',
        data_prefix=dict(img='images/val2017/'),
        filter_cfg=dict(filter_empty_gt=False, min_size=32)),
    class_text_path='data/texts/my_coco_class_texts.json',
    pipeline=test_pipeline)
val_dataloader = dict(dataset=coco_val_dataset)
test_dataloader = val_dataloader

# training settings
default_hooks = dict(
    param_scheduler=dict(
        scheduler_type='linear',
        lr_factor=0.01,
        max_epochs=max_epochs),
    checkpoint=dict(
        max_keep_ckpts=-1,
        save_best=None,
        interval=save_epoch_intervals))
custom_hooks = [
    dict(
        type='EMAHook',
        ema_type='ExpMomentumEMA',
        momentum=0.0001,
        update_buffers=True,
        strict_load=False,
        priority=49),
    dict(
        type='mmdet.PipelineSwitchHook',
        switch_epoch=max_epochs - close_mosaic_epochs,
        switch_pipeline=train_pipeline_stage2)
]
train_cfg = dict(
    max_epochs=max_epochs,
    val_interval=5,
    dynamic_intervals=[((max_epochs - close_mosaic_epochs),
                        _base_.val_interval_stage2)])
optim_wrapper = dict(
    optimizer=dict(
        _delete_=True,
        type='AdamW',
        lr=base_lr,
        weight_decay=weight_decay,
        batch_size_per_gpu=train_batch_size_per_gpu),
    paramwise_cfg=dict(
        custom_keys={'backbone.text_model': dict(lr_mult=0.01),
                     'logit_scale': dict(weight_decay=0.0)}),
    constructor='YOLOWv5OptimizerConstructor')

# evaluation settings
val_evaluator = dict(
    _delete_=True,
    type='mmdet.CocoMetric',
    proposal_nums=(100, 1, 10),
    ann_file='/home/lihui/datasets/coco/annotations/new_val2017_instances.json',
    metric='bbox')
`
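The config above sets `num_classes = 57` but points training at `data/texts/coco_class_texts.json` and validation at `data/texts/my_coco_class_texts.json`. It may be worth ruling out a mismatch between those class-text files, the annotation categories, and `num_classes`. The following is only a minimal sketch of such a check, not a confirmed cause of the zero mAP. It assumes each class-text file is a JSON list with one list of prompt strings per class, that the annotation files use the standard COCO `categories` layout, and that labels follow ascending category `id`; `check_class_texts` and `check_files` are hypothetical helpers, not part of YOLO-World.

```python
import json


def check_class_texts(class_texts, categories, num_classes):
    """Return a list of mismatch descriptions (empty if everything lines up).

    class_texts: list of lists of prompt strings, one inner list per class
                 (assumed layout of the class-text JSON).
    categories:  COCO-style category dicts, already in label order.
    """
    problems = []
    if len(class_texts) != num_classes:
        problems.append(
            f'class texts: {len(class_texts)} entries, expected {num_classes}')
    if len(categories) != num_classes:
        problems.append(
            f'annotations: {len(categories)} categories, expected {num_classes}')
    # Compare position by position; zip stops at the shorter sequence.
    for idx, (texts, cat) in enumerate(zip(class_texts, categories)):
        if cat['name'] not in texts:
            problems.append(
                f"index {idx}: category {cat['name']!r} not among {texts!r}")
    return problems


def check_files(class_text_path, ann_file, num_classes):
    """Load both files from disk and run the check above."""
    with open(class_text_path) as f:
        class_texts = json.load(f)
    with open(ann_file) as f:
        categories = json.load(f)['categories']
    # Assumption: label index follows ascending COCO category id.
    categories = sorted(categories, key=lambda c: c['id'])
    return check_class_texts(class_texts, categories, num_classes)
```

Running `check_files` once with the training paths and once with the validation paths from the config, both with `num_classes=57`, would show whether the two text files agree with the annotations.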

@huihui6666

Part of my training log follows:
2024/11/26 11:33:52 - mmengine - INFO - Exp name: yolo_world_v2_m_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco_my_20241126_091045
2024/11/26 11:34:01 - mmengine - INFO - Epoch(train) [35][7050/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:35:59 time: 0.2570 data_time: 0.0024 memory: 7823 grad_norm: 127.9921 loss: 59.7234 loss_cls: 22.0544 loss_bbox: 17.9890 loss_dfl: 19.6800
2024/11/26 11:34:14 - mmengine - INFO - Epoch(train) [35][7100/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:01 time: 0.2550 data_time: 0.0025 memory: 7570 grad_norm: 122.7781 loss: 58.5399 loss_cls: 21.2850 loss_bbox: 17.7463 loss_dfl: 19.5087
2024/11/26 11:34:27 - mmengine - INFO - Epoch(train) [35][7150/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:06 time: 0.2585 data_time: 0.0025 memory: 7476 grad_norm: 136.3488 loss: 59.4846 loss_cls: 22.1040 loss_bbox: 17.7713 loss_dfl: 19.6094
2024/11/26 11:34:40 - mmengine - INFO - Epoch(train) [35][7200/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:10 time: 0.2595 data_time: 0.0024 memory: 7890 grad_norm: 123.3398 loss: 58.9759 loss_cls: 21.3812 loss_bbox: 18.1015 loss_dfl: 19.4932
2024/11/26 11:34:53 - mmengine - INFO - Epoch(train) [35][7250/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:17 time: 0.2636 data_time: 0.0025 memory: 7543 grad_norm: 126.9679 loss: 58.8478 loss_cls: 21.6973 loss_bbox: 17.7954 loss_dfl: 19.3551
2024/11/26 11:35:06 - mmengine - INFO - Epoch(train) [35][7300/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:18 time: 0.2523 data_time: 0.0024 memory: 7516 grad_norm: 142.3983 loss: 59.2025 loss_cls: 21.8677 loss_bbox: 17.7598 loss_dfl: 19.5749
2024/11/26 11:35:18 - mmengine - INFO - Epoch(train) [35][7350/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:13 time: 0.2399 data_time: 0.0025 memory: 7743 grad_norm: 134.2802 loss: 59.7969 loss_cls: 22.3881 loss_bbox: 17.8545 loss_dfl: 19.5544
2024/11/26 11:35:29 - mmengine - INFO - Epoch(train) [35][7400/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:36:01 time: 0.2236 data_time: 0.0024 memory: 7504 grad_norm: 131.0194 loss: 58.4260 loss_cls: 21.4178 loss_bbox: 17.6953 loss_dfl: 19.3128
2024/11/26 11:35:40 - mmengine - INFO - Epoch(train) [35][7450/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:35:44 time: 0.2118 data_time: 0.0024 memory: 7690 grad_norm: 126.3277 loss: 59.0065 loss_cls: 21.8161 loss_bbox: 17.7280 loss_dfl: 19.4624
2024/11/26 11:35:51 - mmengine - INFO - Epoch(train) [35][7500/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:35:30 time: 0.2173 data_time: 0.0023 memory: 7410 grad_norm: 127.3711 loss: 59.7367 loss_cls: 22.2691 loss_bbox: 17.9271 loss_dfl: 19.5405
2024/11/26 11:36:02 - mmengine - INFO - Epoch(train) [35][7550/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:35:16 time: 0.2196 data_time: 0.0024 memory: 7543 grad_norm: 143.3271 loss: 59.3853 loss_cls: 22.1371 loss_bbox: 17.8032 loss_dfl: 19.4451
2024/11/26 11:36:13 - mmengine - INFO - Epoch(train) [35][7600/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:35:03 time: 0.2201 data_time: 0.0024 memory: 7664 grad_norm: 126.6539 loss: 58.3427 loss_cls: 21.2495 loss_bbox: 17.7076 loss_dfl: 19.3856
2024/11/26 11:36:24 - mmengine - INFO - Epoch(train) [35][7650/7676] base_lr: 2.0000e-04 lr: 1.1833e-04 eta: 21:34:48 time: 0.2168 data_time: 0.0024 memory: 7436 grad_norm: 132.4634 loss: 58.1278 loss_cls: 21.3929 loss_bbox: 17.3028 loss_dfl: 19.4322
2024/11/26 11:36:29 - mmengine - INFO - Exp name: yolo_world_v2_m_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco_my_20241126_091045
2024/11/26 11:36:29 - mmengine - INFO - Saving checkpoint at 35 epochs

2024/11/26 11:40:20 - mmengine - INFO - Epoch(val) [35][6100/6429] eta: 0:00:12 time: 0.0351 data_time: 0.0003 memory: 1123
2024/11/26 11:40:21 - mmengine - INFO - Epoch(val) [35][6150/6429] eta: 0:00:10 time: 0.0352 data_time: 0.0003 memory: 1123
2024/11/26 11:40:23 - mmengine - INFO - Epoch(val) [35][6200/6429] eta: 0:00:08 time: 0.0338 data_time: 0.0003 memory: 1123
2024/11/26 11:40:25 - mmengine - INFO - Epoch(val) [35][6250/6429] eta: 0:00:06 time: 0.0365 data_time: 0.0003 memory: 1123
2024/11/26 11:40:27 - mmengine - INFO - Epoch(val) [35][6300/6429] eta: 0:00:04 time: 0.0343 data_time: 0.0003 memory: 1123
2024/11/26 11:40:28 - mmengine - INFO - Epoch(val) [35][6350/6429] eta: 0:00:02 time: 0.0321 data_time: 0.0003 memory: 1123
2024/11/26 11:40:30 - mmengine - INFO - Epoch(val) [35][6400/6429] eta: 0:00:01 time: 0.0336 data_time: 0.0003 memory: 1123
2024/11/26 11:40:42 - mmengine - INFO - Evaluating bbox...
2024/11/26 11:41:52 - mmengine - INFO - bbox_mAP_copypaste: 0.000 0.001 0.000 0.000 0.001 0.000
2024/11/26 11:41:53 - mmengine - INFO - Epoch(val) [35][6429/6429] coco/bbox_mAP: 0.0000 coco/bbox_mAP_50: 0.0010 coco/bbox_mAP_75: 0.0000 coco/bbox_mAP_s: 0.0000 coco/bbox_mAP_m: 0.0010 coco/bbox_mAP_l: 0.0000 data_time: 0.0004 time: 0.0367
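For reading logs like the ones above, a small sketch that pulls the loss terms and the COCO bbox metrics out of a single log line may help. In the training lines shown, the reported `loss` equals the sum of `loss_cls`, `loss_bbox`, and `loss_dfl` (for the first line, 22.0544 + 17.9890 + 19.6800 = 59.7234). `parse_fields` is a hypothetical helper, and the regex assumes the `key: value` layout seen in this particular log rather than any guaranteed MMEngine format.

```python
import re

# Matches "loss: 59.7234", "loss_cls: 22.0544", "coco/bbox_mAP_50: 0.0010", ...
FIELD_RE = re.compile(r'((?:loss|coco/bbox_mAP)\w*): ([0-9.]+)')


def parse_fields(line):
    """Extract loss terms and coco/bbox_mAP* metrics from one log line."""
    return {key: float(value) for key, value in FIELD_RE.findall(line)}


# Excerpts copied from the log above (timestamps and timing fields omitted).
TRAIN_LINE = ('Epoch(train) [35][7050/7676] grad_norm: 127.9921 '
              'loss: 59.7234 loss_cls: 22.0544 loss_bbox: 17.9890 '
              'loss_dfl: 19.6800')
VAL_LINE = ('Epoch(val) [35][6429/6429] coco/bbox_mAP: 0.0000 '
            'coco/bbox_mAP_50: 0.0010 coco/bbox_mAP_75: 0.0000')

train = parse_fields(TRAIN_LINE)
val = parse_fields(VAL_LINE)

# Sum of the individual terms, to compare against the reported total.
component_sum = sum(v for k, v in train.items() if k != 'loss')
print(train['loss'], round(component_sum, 4), val['coco/bbox_mAP_50'])
```

Applied to every line of a log file, this gives loss and mAP curves that can be compared epoch by epoch.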


Vireakdara commented Dec 21, 2024

Have you found a solution yet?
