
Commands to MQ Training with VSGN #1

Closed
JunweiLiang opened this issue Jul 8, 2022 · 10 comments
@JunweiLiang

Hi, thanks for releasing the code!

Could you provide some instructions on how to run VSGN training with EgoVLP features (hyper-parameters, learning rate, etc.)? Thanks!

Junwei

@QinghongLin
Collaborator

Hello Junwei,

Thanks for your interest in our work.
I will update the instructions and related details for MQ soon.

Thank you for your patience!

@QinghongLin
Collaborator

Hi Junwei,

I have uploaded the video features for the MQ task to Google Drive (train&val / test), so you can download them directly.
What you need to do is replace the input features with our features.
I have also attached the config of our best VSGN model here: config.txt.

Please try it out and let us know if you have new results.

@JunweiLiang
Author

I have downloaded the features, but they seem to be a single file. Is it a single pickle binary with dictionary keys? How do I read the features and map them to the videos (for example, slowfast8x8_r101_k400/ has 9645 *.pt files, each of which corresponds to a video)?

Thanks.

@QinghongLin
Collaborator

It is a gz file; after unzipping it (I unzipped it on my Mac), you will see a directory that contains multiple *.pt files.
For example, 0a8f6747-7f79-4176-85ca-f5ec01a15435.pt corresponds to the video features of the clip 0a8f6747-7f79-4176-85ca-f5ec01a15435.

The clip information is provided by the MQ metadata, i.e., clip xxx comes from video yyy with start time t1 and end time t2.
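
For example, here is a minimal sketch of loading one clip feature and mapping it back to its source video via the MQ metadata. The directory name, the (T, D) tensor layout, and the JSON key names are illustrative assumptions and may differ from the actual release.

import json
import torch

clip_uid = "0a8f6747-7f79-4176-85ca-f5ec01a15435"

# Each *.pt file holds the pre-extracted features of one clip (assumed to be a (T, D) tensor).
feats = torch.load(f"egovlp_feats/{clip_uid}.pt", map_location="cpu")
print(clip_uid, tuple(feats.shape))

# The clip -> video mapping (video uid, start/end time) lives in the MQ metadata,
# e.g. the clip_annotations.json used by the VSGN codebase; the key names below
# are assumptions and may differ in your copy of the annotations.
with open("Evaluation/ego4d/annot/clip_annotations.json") as f:
    clip_annos = json.load(f)

info = clip_annos.get(clip_uid, {})
print(info.get("video_id"), info.get("clip_start_sec"), info.get("clip_end_sec"))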

@JunweiLiang
Author

I see. The file you provided on Google Drive is a .tar.gz file; I extracted it with tar -zxf and got 2034 *.pt files for the train/val part. I will try them.

@JunweiLiang
Author

So 0a8f6747-7f79-4176-85ca-f5ec01a15435 is the clip ID instead of the video ID? Could you provide the feature files for the whole videos, as in the VSGN baseline? It reads the features of the whole video and then cuts out the corresponding clip (see here). To follow your instructions, I would need these video-level features.

Thanks.

@QinghongLin
Collaborator

Yes, it is the clip ID. And sorry, I am currently unable to provide video-level features; a solution is to rewrite the data loader so that it supports clip features as input.
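
For anyone taking that route, here is a minimal sketch of what a clip-level loading step might look like in place of the original load-whole-video-then-crop logic. It is not the released code: the function name, the assumption that each *.pt stores a (T, D) tensor, and the linear resampling to a fixed temporal_scale are all illustrative choices.

import os
import torch
import torch.nn.functional as F

def load_clip_feature(feature_path, clip_uid, temporal_scale):
    # Each *.pt already covers exactly one clip, so no temporal cropping is needed.
    feats = torch.load(os.path.join(feature_path, f"{clip_uid}.pt"),
                       map_location="cpu").float()            # (T, D)
    # Resample along time to the model's fixed temporal_scale, mirroring what the
    # baseline does after cropping the clip out of the full video.
    feats = F.interpolate(feats.t().unsqueeze(0),              # (1, D, T)
                          size=temporal_scale,
                          mode="linear",
                          align_corners=False)
    return feats.squeeze(0).t()                                # (temporal_scale, D)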

@srama2512

@QinghongLin - Thanks for providing the clip features. I tried training the VSGN model using the Ego4D episodic-memory codebase instructions. But I'm not able to reproduce the val results from the paper. The numbers are quite a bit lower than the paper results (2nd row vs. 3rd row in the figure below).
[figure: table comparing the reproduced val results (2nd row) with the paper's results (3rd row)]

Here is the training command I used. Note: I modified the data loader to use clip features instead of video features.

 python Train.py \
     --use_xGPN \
     --is_train true \
     --dataset ego4d \
     --feature_path data/egovlp_feats_official \
     --checkpoint_path checkpoints/ \
     --tb_dir tb/ \
     --batch_size 24 \
     --train_lr 0.00005 \
     --use_clip_features true \
     --input_feat_dim 256 \
     --num_epoch 100

@QinghongLin
Collaborator

QinghongLin commented Sep 19, 2022

Hi @srama2512,
I released the codebase here: MQ.zip. You can check the data loader details regarding clip-level feature loading.
Besides, I was able to check the config parameters; could you try the following parameters?

{'dataset': 'ego4d',
 'is_train': 'true',
 'out_prop_map': 'true',
 'feature_path': '/mnt/sdb1/Datasets/Ego4d/action_feature_canonical',
 'clip_anno': 'Evaluation/ego4d/annot/clip_annotations.json',
 'moment_classes': 'Evaluation/ego4d/annot/moment_classes_idx.json',
 'checkpoint_path': 'checkpoint',
 'output_path': './outputs/hps_search_egovlp_egonce_features/23/',
 'prop_path': 'proposals',
 'prop_result_file': 'proposals_postNMS.json',
 'detect_result_file': 'detections_postNMS.json',
 'retrieval_result_file': 'retreival_postNMS.json',
 'detad_sensitivity_file': 'detad_sensitivity',
 'batch_size': 32,
 'train_lr': 5e-05,
 'weight_decay': 0.0001,
 'num_epoch': 50,
 'step_size': 15,
 'step_gamma': 0.1,
 'focal_alpha': 0.01,
 'nms_alpha_detect': 0.46,
 'nms_alpha_prop': 0.75,
 'nms_thr': 0.4,
 'temporal_scale': 928,
 'input_feat_dim': 2304,
 'bb_hidden_dim': 256,
 'decoder_num_classes': 111,
 'num_levels': 5,
 'num_head_layers': 4,
 'nfeat_mode': 'feat_ctr',
 'num_neigh': 12,
 'edge_weight': 'false',
 'agg_type': 'max',
 'gcn_insert': 'par',
 'iou_thr': [0.5, 0.5, 0.7],
 'anchor_scale': [1, 10],
 'base_stride': 1,
 'stitch_gap': 30,
 'short_ratio': 0.4,
 'clip_win_size': 0.38,
 'use_xGPN': False,
 'use_VSS': False,
 'num_props': 200,
 'tIoU_thr': [0.1, 0.2, 0.3, 0.4, 0.5],
 'eval_stage': 'all',
 'infer_datasplit': 'val'}
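
In case it is useful, here is a small sketch (mine, not from the released MQ.zip) for applying these values programmatically; it assumes config.txt contains exactly the Python-style dict pasted above.

import ast
from argparse import Namespace

# Assumes config.txt holds the Python-style dict pasted above.
with open("config.txt") as f:
    cfg = ast.literal_eval(f.read().strip())

opt = Namespace(**cfg)
print(opt.train_lr, opt.batch_size, opt.temporal_scale, opt.input_feat_dim)
# 5e-05 32 928 2304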

@srama2512

srama2512 commented Nov 6, 2022

@QinghongLin - Thanks for sharing your code and the hyperparameters. I was able to obtain a similar performance. It turns out that there was a bug in the test_mq.py feature-extraction code that I used. I modified test_mq.py to increase the batch size here to 128.

EgoVLP/run/test_mq.py

Lines 77 to 87 in dc4a60f

batch = 4
times = data['video'].shape[0] // batch
for j in range(times):
    start = j*batch
    if (j+1) * batch > data['video'].shape[0]:
        end = data['video'].shape[0]
    else:
        end = (j+1)*batch
    outs[start:end,] = \
        model.compute_video(data['video'][start:end,])

The calculation times = data['video'].shape[0] // batch does not work when data['video'].shape[0] is not a multiple of batch: the floor division silently drops the final partial batch. It gets much worse when we increase batch, leaving a larger residual set of all-zero features at the end. After changing that part of the code to the snippet below, it works as expected.

if data['video'].shape[0] % batch == 0:
    times = data['video'].shape[0] // batch
else:
    times = data['video'].shape[0] // batch + 1
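
Equivalently (my rewording, not part of the proposed fix), the whole loop can be written with ceil division, reusing data, outs, and model from the snippet above:

import math

batch = 128
n = data['video'].shape[0]
# Ceil division keeps the trailing partial batch instead of silently
# dropping it and leaving all-zero features at the end of outs.
times = math.ceil(n / batch)
for j in range(times):
    start = j * batch
    end = min((j + 1) * batch, n)
    outs[start:end, ] = model.compute_video(data['video'][start:end, ])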

Happy to send a PR if you'd like this bug fix to be part of the EgoVLP repo. The same pattern affects most of the test_*.py scripts and causes a significant issue if anyone increases batch.
