Question about the mAR #53

Open
ZachWong123 opened this issue Nov 27, 2023 · 10 comments
@ZachWong123

ZachWong123 commented Nov 27, 2023

I've noticed that video understanding papers generally do not report recall. My teacher recently assigned me a project that requires both accuracy (acc) and recall (AR), and when I explained that papers usually don't include recall, he suggested running the code and modifying it to output the recall rate. However, I'm not sure how to proceed with this. I would greatly appreciate a prompt response. Thank you!

@Andy1621
Collaborator

Andy1621 commented Nov 29, 2023

Here is the code for accuracy:

import torch


def topks_correct(preds, labels, ks):
    """
    Given the predictions, labels, and a list of top-k values, compute the
    number of correct predictions for each top-k value.
    Args:
        preds (array): array of predictions. Dimension is batchsize
            N x ClassNum.
        labels (array): array of labels. Dimension is batchsize N.
        ks (list): list of top-k values. For example, ks = [1, 5] corresponds
            to top-1 and top-5.
    Returns:
        topks_correct (list): list of numbers, where the `i`-th entry
            corresponds to the number of top-`ks[i]` correct predictions.
    """
    assert preds.size(0) == labels.size(
        0
    ), "Batch dim of predictions and labels must match"
    # Find the top max_k predictions for each sample
    _top_max_k_vals, top_max_k_inds = torch.topk(
        preds, max(ks), dim=1, largest=True, sorted=True
    )
    # (batch_size, max_k) -> (max_k, batch_size).
    top_max_k_inds = top_max_k_inds.t()
    # (batch_size, ) -> (max_k, batch_size).
    rep_max_k_labels = labels.view(1, -1).expand_as(top_max_k_inds)
    # (i, j) = 1 if top i-th prediction for the j-th sample is correct.
    top_max_k_correct = top_max_k_inds.eq(rep_max_k_labels)
    # Compute the number of topk correct predictions for each k.
    topks_correct = [top_max_k_correct[:k, :].float().sum() for k in ks]
    return topks_correct


def topk_errors(preds, labels, ks):
    """
    Computes the top-k error for each k.
    Args:
        preds (array): array of predictions. Dimension is N.
        labels (array): array of labels. Dimension is N.
        ks (list): list of ks to calculate the top errors.
    """
    num_topks_correct = topks_correct(preds, labels, ks)
    return [(1.0 - x / preds.size(0)) * 100.0 for x in num_topks_correct]


def topk_accuracies(preds, labels, ks):
    """
    Computes the top-k accuracy for each k.
    Args:
        preds (array): array of predictions. Dimension is N.
        labels (array): array of labels. Dimension is N.
        ks (list): list of ks to calculate the top accuracies.
    """
    num_topks_correct = topks_correct(preds, labels, ks)
    return [(x / preds.size(0)) * 100.0 for x in num_topks_correct]
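
As a quick illustration (the tensors below are made up for this sketch), top-k accuracy counts a sample as correct if its true label appears among the k highest-scoring classes:

# Hypothetical usage: 4 samples, 3 classes (values chosen to avoid ties).
preds = torch.tensor([
    [0.7, 0.2, 0.1],    # predicted class 0
    [0.1, 0.8, 0.1],    # predicted class 1
    [0.2, 0.35, 0.45],  # predicted class 2
    [0.6, 0.3, 0.1],    # predicted class 0
])
labels = torch.tensor([0, 1, 1, 2])
# Top-1: samples 0 and 1 are correct -> 50.0.
# Top-2: sample 2's label is also in its top two -> 75.0.
print(topk_accuracies(preds, labels, ks=[1, 2]))  # [tensor(50.), tensor(75.)]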

The code below was produced by modifying the above with GPT; please try it:

def mean_recall(preds, labels, num_classes):
    """
    Calculate the mean recall given predictions and labels.

    Args:
        preds (Tensor): Predictions from the model. Dimension is N x ClassNum.
        labels (Tensor): True labels. Dimension is N.
        num_classes (int): Number of classes.

    Returns:
        mean_recall (float): The mean recall over all classes.
    """
    assert preds.size(0) == labels.size(0), "Batch dim of predictions and labels must match"
    
    # Convert predictions to class indices
    _, predicted_classes = preds.max(dim=1)
    
    # Initialize TP and FN counters
    TP = [0] * num_classes
    FN = [0] * num_classes

    # Count TP and FN for each class
    for i in range(num_classes):
        TP[i] = ((predicted_classes == i) & (labels == i)).sum().item()
        FN[i] = ((predicted_classes != i) & (labels == i)).sum().item()

    # Calculate recall for each class
    recalls = [TP[i] / (TP[i] + FN[i]) if (TP[i] + FN[i]) > 0 else 0 for i in range(num_classes)]

    # Calculate mean recall
    mean_recall = sum(recalls) / num_classes

    return mean_recall
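
If scikit-learn is available, you can sanity-check this function against recall_score with macro averaging, which computes the same per-class recall and mean. This cross-check is my addition, not part of the repo:

# Hypothetical sanity check (assumes scikit-learn is installed).
from sklearn.metrics import recall_score

preds = torch.tensor([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.5, 0.4, 0.1],
])
labels = torch.tensor([0, 1, 0, 1])

ours = mean_recall(preds, labels, num_classes=3)
# Passing labels=range(3) makes sklearn average over all 3 classes,
# counting the unseen class 2 as recall 0, just like mean_recall does.
ref = recall_score(labels.numpy(), preds.argmax(dim=1).numpy(),
                   labels=list(range(3)), average="macro", zero_division=0)
print(ours, ref)  # both 0.333...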

@ZachWong123
Author

Thank you for your response! I will try it later.

@ZachWong123
Author

[screenshot: the previously shared BaiduNetdisk link, now expired]
Hello, could you provide the BaiduNetdisk link again? The dataset I obtained does not match the .csv annotations you provided. The link in the screenshot is no longer accessible, and its extraction code no longer works.

@Andy1621
Collaborator

Andy1621 commented Dec 5, 2023

Link: https://pan.baidu.com/s/1T8omLX_HE88CdbFpVupalA  Password: f4vw

@ZachWong123
Author

ZachWong123 commented Jan 23, 2024

Link: https://pan.baidu.com/s/1T8omLX_HE88CdbFpVupalA  Password: f4vw

Thank you very much for your reply. I have now tested your k400_b16_f8x224 model on the K400 dataset and got 83.38, which is essentially consistent with the number you report. Next I would like to test your k400+k710_l14_f64x336 model, but the test does not seem to start on my single RTX 3090. Could you give me some advice on modifying the configuration files? I am pasting the test.sh and config.yaml I am currently using.
test.sh
NUM_SHARDS=1
NUM_GPUS=1
BATCH_SIZE=16
BASE_LR=1.5e-5
PYTHONPATH=$PYTHONPATH:./slowfast \
python tools/run_net_multi_node.py \
  --init_method tcp://localhost:10125 \
  --cfg ./exp/k400/k400+k710_l14_f64x336/config.yaml \
  --num_shards $NUM_SHARDS \
  DATA.PATH_TO_DATA_DIR /root/data/UniformerV2/new_k400/kinetics_400/videos_320 \
  DATA.PATH_PREFIX /root/data/UniformerV2/new_k400/kinetics_400/videos_320 \
  DATA.PATH_LABEL_SEPARATOR "," \
  TRAIN.EVAL_PERIOD 1 \
  TRAIN.CHECKPOINT_PERIOD 100 \
  TRAIN.BATCH_SIZE $BATCH_SIZE \
  TRAIN.SAVE_LATEST False \
  NUM_GPUS $NUM_GPUS \
  NUM_SHARDS $NUM_SHARDS \
  SOLVER.MAX_EPOCH 5 \
  SOLVER.BASE_LR $BASE_LR \
  SOLVER.BASE_LR_SCALE_NUM_SHARDS False \
  SOLVER.WARMUP_EPOCHS 1. \
  TRAIN.ENABLE False \
  TEST.NUM_ENSEMBLE_VIEWS 2 \
  TEST.NUM_SPATIAL_CROPS 3 \
  TEST.TEST_BEST True \
  TEST.ADD_SOFTMAX True \
  TEST.BATCH_SIZE 4 \
  RNG_SEED 6666 \
  OUTPUT_DIR .

config.yaml
TRAIN:
  ENABLE: True
  DATASET: kinetics_sparse
  BATCH_SIZE: 256
  EVAL_PERIOD: 1
  CHECKPOINT_PERIOD: 5
  AUTO_RESUME: True
DATA:
  USE_OFFSET_SAMPLING: True
  DECODING_BACKEND: decord
  NUM_FRAMES: 64
  SAMPLING_RATE: 16
  TRAIN_JITTER_SCALES: [384, 480]
  TRAIN_CROP_SIZE: 336
  TEST_CROP_SIZE: 336
  INPUT_CHANNEL_NUM: [3]
  PATH_TO_DATA_DIR: path-to-imagenet-dir
  TRAIN_JITTER_SCALES_RELATIVE: [0.08, 1.0]
  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
UNIFORMERV2:
  BACKBONE: 'uniformerv2_l14_336'
  N_LAYERS: 4
  N_DIM: 1024
  N_HEAD: 16
  MLP_FACTOR: 4.0
  BACKBONE_DROP_PATH_RATE: 0.
  DROP_PATH_RATE: 0.
  MLP_DROPOUT: [0.5, 0.5, 0.5, 0.5]
  CLS_DROPOUT: 0.5
  RETURN_LIST: [20, 21, 22, 23]
  NO_LMHRA: True
  TEMPORAL_DOWNSAMPLE: False
  # PRETRAIN: './exp/k400/k400+k710_l14_f64x336/k710_uniformerv2_l14_8x336.pyth'
  DELETE_SPECIAL_HEAD: True
AUG:
  NUM_SAMPLE: 1
  ENABLE: True
  COLOR_JITTER: 0.4
  AA_TYPE: rand-m7-n4-mstd0.5-inc1
  INTERPOLATION: bicubic
  RE_PROB: 0.
  RE_MODE: pixel
  RE_COUNT: 1
  RE_SPLIT: False
BN:
  USE_PRECISE_STATS: False
  NUM_BATCHES_PRECISE: 200
SOLVER:
  ZERO_WD_1D_PARAM: True
  BASE_LR_SCALE_NUM_SHARDS: True
  BASE_LR: 4e-4
  COSINE_AFTER_WARMUP: True
  COSINE_END_LR: 1e-6
  WARMUP_START_LR: 1e-6
  WARMUP_EPOCHS: 0.
  LR_POLICY: cosine
  MAX_EPOCH: 50
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.05
  OPTIMIZING_METHOD: adamw
MODEL:
  NUM_CLASSES: 400
  ARCH: uniformerv2
  MODEL_NAME: Uniformerv2
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
  USE_CHECKPOINT: True
  CHECKPOINT_NUM: [24]
TEST:
  ENABLE: True
  DATASET: kinetics_sparse
  BATCH_SIZE: 256
  NUM_SPATIAL_CROPS: 1
  NUM_ENSEMBLE_VIEWS: 1
  CHECKPOINT_FILE_PATH: "./exp/k400/k400+k710_l14_f64x336/k400_k710_uniformerv2_l14_64x336.pyth"
DATA_LOADER:
  NUM_WORKERS: 2
  PIN_MEMORY: True
TENSORBOARD:
  ENABLE: False
NUM_GPUS: 8
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .

I sincerely hope to hear back from you. Thank you very much!

@Andy1621
Collaborator

On a single 3090, 64 frames will probably not fit in GPU memory. You can run 16 or 32 frames instead; the results are nearly the same. You could also try our new model, which uses less memory and gets better results.
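
For reference, a minimal sketch of that suggestion applied to the test.sh above; the 32-frame config path here is an assumption, so substitute whichever 16- or 32-frame config and matching checkpoint the repo actually provides:

# Hypothetical single-GPU test with a smaller-frame variant.
PYTHONPATH=$PYTHONPATH:./slowfast \
python tools/run_net_multi_node.py \
  --init_method tcp://localhost:10125 \
  --cfg ./exp/k400/k400+k710_l14_f32x336/config.yaml \
  --num_shards 1 \
  NUM_GPUS 1 \
  TRAIN.ENABLE False \
  TEST.BATCH_SIZE 4 \
  OUTPUT_DIR .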

@ZachWong123
Author

On a single 3090, 64 frames will probably not fit in GPU memory. You can run 16 or 32 frames instead; the results are nearly the same. You could also try our new model, which uses less memory and gets better results.

Wow, thank you so much! I will try your new model later!

@ZachWong123
Author

ZachWong123 commented Mar 6, 2024

On a single 3090, 64 frames will probably not fit in GPU memory. You can run 16 or 32 frames instead; the results are nearly the same. You could also try our new model, which uses less memory and gets better results.

Hello, I would now like to apply the model to other action labels, roughly 100 classes in total. For data preparation, can I follow the K400 dataset: build my own category list, training annotation file, and test annotation file in the same format as the annotation files you provide (kinetic_categories.txt, train.csv, test.csv), then change the number of classes in the config and run training and testing? Also, the actions I want to recognize are fairly common ones, such as typing, writing, making phone calls, cooking, holding an umbrella, and so on. If I annotate a small amount of data for fine-tuning and testing, should I be able to get reasonably good results? I sincerely hope to hear back from you, thank you very much.

@Andy1621
Collaborator

Andy1621 commented Mar 6, 2024

Yes, that works. You can start by annotating a small amount of data, split it into train and val, and try fine-tuning the K400 checkpoint.
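
For illustration, the train.csv/test.csv annotation files could then contain one "<video path>,<label id>" line per clip, matching DATA.PATH_LABEL_SEPARATOR "," in the test.sh above (paths and label ids below are hypothetical; MODEL.NUM_CLASSES in config.yaml would be changed to the number of your classes, e.g. 100):

videos/typing_0001.mp4,0
videos/writing_0001.mp4,1
videos/making_phone_call_0001.mp4,2
videos/cooking_0001.mp4,3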

@ZachWong123
Author

Hello, I noticed that in the K400 task neither LMHRA nor T-Down is used. From the paper, is this because the purely global design is already enough to reach the best performance on K400? If I want to modify the model to improve results on K400, should I start from the global design? I sincerely hope to hear back from you.
