Question about the mAR #53

Open
ZachWong123 opened this issue Nov 27, 2023 · 10 comments
@ZachWong123

ZachWong123 commented Nov 27, 2023

I've noticed that video understanding papers generally do not report recall. My teacher recently assigned me a project that requires both accuracy (acc) and recall (AR), and when I explained that papers usually don't include recall, he suggested running the code and modifying it to output the recall rate. However, I'm not sure how to proceed with this. I would greatly appreciate a prompt response. Thank you!

@Andy1621
Collaborator

Andy1621 commented Nov 29, 2023

Here is the code for accuracy:

import torch


def topks_correct(preds, labels, ks):
    """
    Given the predictions, labels, and a list of top-k values, compute the
    number of correct predictions for each top-k value.
    Args:
        preds (array): array of predictions. Dimension is batchsize
            N x ClassNum.
        labels (array): array of labels. Dimension is batchsize N.
        ks (list): list of top-k values. For example, ks = [1, 5] corresponds
            to top-1 and top-5.
    Returns:
        topks_correct (list): list of numbers, where the `i`-th entry
            corresponds to the number of top-`ks[i]` correct predictions.
    """
    assert preds.size(0) == labels.size(
        0
    ), "Batch dim of predictions and labels must match"
    # Find the top max_k predictions for each sample
    _top_max_k_vals, top_max_k_inds = torch.topk(
        preds, max(ks), dim=1, largest=True, sorted=True
    )
    # (batch_size, max_k) -> (max_k, batch_size).
    top_max_k_inds = top_max_k_inds.t()
    # (batch_size, ) -> (max_k, batch_size).
    rep_max_k_labels = labels.view(1, -1).expand_as(top_max_k_inds)
    # (i, j) = 1 if top i-th prediction for the j-th sample is correct.
    top_max_k_correct = top_max_k_inds.eq(rep_max_k_labels)
    # Compute the number of topk correct predictions for each k.
    topks_correct = [top_max_k_correct[:k, :].float().sum() for k in ks]
    return topks_correct


def topk_errors(preds, labels, ks):
    """
    Computes the top-k error for each k.
    Args:
        preds (array): array of predictions. Dimension is N.
        labels (array): array of labels. Dimension is N.
        ks (list): list of ks to calculate the top errors.
    """
    num_topks_correct = topks_correct(preds, labels, ks)
    return [(1.0 - x / preds.size(0)) * 100.0 for x in num_topks_correct]


def topk_accuracies(preds, labels, ks):
    """
    Computes the top-k accuracy for each k.
    Args:
        preds (array): array of predictions. Dimension is N.
        labels (array): array of labels. Dimension is N.
        ks (list): list of ks to calculate the top accuracies.
    """
    num_topks_correct = topks_correct(preds, labels, ks)
    return [(x / preds.size(0)) * 100.0 for x in num_topks_correct]
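
As a quick illustration (the tensors below are made up for this sketch), top-k accuracy counts a sample as correct if its true label appears among the k highest-scoring classes:

# Hypothetical usage: 4 samples, 3 classes (values chosen to avoid ties).
preds = torch.tensor([
    [0.7, 0.2, 0.1],    # predicted class 0
    [0.1, 0.8, 0.1],    # predicted class 1
    [0.2, 0.35, 0.45],  # predicted class 2
    [0.6, 0.3, 0.1],    # predicted class 0
])
labels = torch.tensor([0, 1, 1, 2])
# Top-1: samples 0 and 1 are correct -> 50.0.
# Top-2: sample 2's label is also in its top two -> 75.0.
print(topk_accuracies(preds, labels, ks=[1, 2]))  # [tensor(50.), tensor(75.)]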

The code below was produced by modifying the above with GPT; please try it:

def mean_recall(preds, labels, num_classes):
    """
    Calculate the mean recall given predictions and labels.

    Args:
        preds (Tensor): Predictions from the model. Dimension is N x ClassNum.
        labels (Tensor): True labels. Dimension is N.
        num_classes (int): Number of classes.

    Returns:
        mean_recall (float): The mean recall over all classes.
    """
    assert preds.size(0) == labels.size(0), "Batch dim of predictions and labels must match"
    
    # Convert predictions to class indices
    _, predicted_classes = preds.max(dim=1)
    
    # Initialize TP and FN counters
    TP = [0] * num_classes
    FN = [0] * num_classes

    # Count TP and FN for each class
    for i in range(num_classes):
        TP[i] = ((predicted_classes == i) & (labels == i)).sum().item()
        FN[i] = ((predicted_classes != i) & (labels == i)).sum().item()

    # Calculate recall for each class
    recalls = [TP[i] / (TP[i] + FN[i]) if (TP[i] + FN[i]) > 0 else 0 for i in range(num_classes)]

    # Calculate mean recall
    mean_recall = sum(recalls) / num_classes

    return mean_recall
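
If scikit-learn is available, you can sanity-check this function against recall_score with macro averaging, which computes the same per-class recall and mean. This cross-check is my addition, not part of the repo:

# Hypothetical sanity check (assumes scikit-learn is installed).
from sklearn.metrics import recall_score

preds = torch.tensor([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.5, 0.4, 0.1],
])
labels = torch.tensor([0, 1, 0, 1])

ours = mean_recall(preds, labels, num_classes=3)
# Passing labels=range(3) makes sklearn average over all 3 classes,
# counting the unseen class 2 as recall 0, just like mean_recall does.
ref = recall_score(labels.numpy(), preds.argmax(dim=1).numpy(),
                   labels=list(range(3)), average="macro", zero_division=0)
print(ours, ref)  # both 0.333...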

@ZachWong123
Author

Thank you for your response! I will try it later.

@ZachWong123
Author

[screenshot: the previously shared BaiduNetdisk link, now expired]
Hello, could you provide the BaiduNetdisk link again? The dataset I obtained does not match the .csv annotations you provided. The link in the screenshot is no longer accessible, and its extraction code no longer works.

@Andy1621
Collaborator

Andy1621 commented Dec 5, 2023

Link: https://pan.baidu.com/s/1T8omLX_HE88CdbFpVupalA  Password: f4vw

@ZachWong123
Author

ZachWong123 commented Jan 23, 2024

Link: https://pan.baidu.com/s/1T8omLX_HE88CdbFpVupalA  Password: f4vw

Thank you very much for your reply. I have now tested your k400_b16_f8x224 model on the K400 dataset and got 83.38, which is essentially consistent with the number you report. Next I would like to test your k400+k710_l14_f64x336 model, but the test does not seem to start on my single RTX 3090. Could you give me some advice on modifying the configuration files? I am pasting the test.sh and config.yaml I am currently using.
test.sh
NUM_SHARDS=1
NUM_GPUS=1
BATCH_SIZE=16
BASE_LR=1.5e-5
PYTHONPATH=$PYTHONPATH:./slowfast \
python tools/run_net_multi_node.py \
  --init_method tcp://localhost:10125 \
  --cfg ./exp/k400/k400+k710_l14_f64x336/config.yaml \
  --num_shards $NUM_SHARDS \
  DATA.PATH_TO_DATA_DIR /root/data/UniformerV2/new_k400/kinetics_400/videos_320 \
  DATA.PATH_PREFIX /root/data/UniformerV2/new_k400/kinetics_400/videos_320 \
  DATA.PATH_LABEL_SEPARATOR "," \
  TRAIN.EVAL_PERIOD 1 \
  TRAIN.CHECKPOINT_PERIOD 100 \
  TRAIN.BATCH_SIZE $BATCH_SIZE \
  TRAIN.SAVE_LATEST False \
  NUM_GPUS $NUM_GPUS \
  NUM_SHARDS $NUM_SHARDS \
  SOLVER.MAX_EPOCH 5 \
  SOLVER.BASE_LR $BASE_LR \
  SOLVER.BASE_LR_SCALE_NUM_SHARDS False \
  SOLVER.WARMUP_EPOCHS 1. \
  TRAIN.ENABLE False \
  TEST.NUM_ENSEMBLE_VIEWS 2 \
  TEST.NUM_SPATIAL_CROPS 3 \
  TEST.TEST_BEST True \
  TEST.ADD_SOFTMAX True \
  TEST.BATCH_SIZE 4 \
  RNG_SEED 6666 \
  OUTPUT_DIR .

config.yaml
TRAIN:
  ENABLE: True
  DATASET: kinetics_sparse
  BATCH_SIZE: 256
  EVAL_PERIOD: 1
  CHECKPOINT_PERIOD: 5
  AUTO_RESUME: True
DATA:
  USE_OFFSET_SAMPLING: True
  DECODING_BACKEND: decord
  NUM_FRAMES: 64
  SAMPLING_RATE: 16
  TRAIN_JITTER_SCALES: [384, 480]
  TRAIN_CROP_SIZE: 336
  TEST_CROP_SIZE: 336
  INPUT_CHANNEL_NUM: [3]
  PATH_TO_DATA_DIR: path-to-imagenet-dir
  TRAIN_JITTER_SCALES_RELATIVE: [0.08, 1.0]
  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
UNIFORMERV2:
  BACKBONE: 'uniformerv2_l14_336'
  N_LAYERS: 4
  N_DIM: 1024
  N_HEAD: 16
  MLP_FACTOR: 4.0
  BACKBONE_DROP_PATH_RATE: 0.
  DROP_PATH_RATE: 0.
  MLP_DROPOUT: [0.5, 0.5, 0.5, 0.5]
  CLS_DROPOUT: 0.5
  RETURN_LIST: [20, 21, 22, 23]
  NO_LMHRA: True
  TEMPORAL_DOWNSAMPLE: False
  # PRETRAIN: './exp/k400/k400+k710_l14_f64x336/k710_uniformerv2_l14_8x336.pyth'
  DELETE_SPECIAL_HEAD: True
AUG:
  NUM_SAMPLE: 1
  ENABLE: True
  COLOR_JITTER: 0.4
  AA_TYPE: rand-m7-n4-mstd0.5-inc1
  INTERPOLATION: bicubic
  RE_PROB: 0.
  RE_MODE: pixel
  RE_COUNT: 1
  RE_SPLIT: False
BN:
  USE_PRECISE_STATS: False
  NUM_BATCHES_PRECISE: 200
SOLVER:
  ZERO_WD_1D_PARAM: True
  BASE_LR_SCALE_NUM_SHARDS: True
  BASE_LR: 4e-4
  COSINE_AFTER_WARMUP: True
  COSINE_END_LR: 1e-6
  WARMUP_START_LR: 1e-6
  WARMUP_EPOCHS: 0.
  LR_POLICY: cosine
  MAX_EPOCH: 50
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.05
  OPTIMIZING_METHOD: adamw
MODEL:
  NUM_CLASSES: 400
  ARCH: uniformerv2
  MODEL_NAME: Uniformerv2
  LOSS_FUNC: cross_entropy
  DROPOUT_RATE: 0.5
  USE_CHECKPOINT: True
  CHECKPOINT_NUM: [24]
TEST:
  ENABLE: True
  DATASET: kinetics_sparse
  BATCH_SIZE: 256
  NUM_SPATIAL_CROPS: 1
  NUM_ENSEMBLE_VIEWS: 1
  CHECKPOINT_FILE_PATH: "./exp/k400/k400+k710_l14_f64x336/k400_k710_uniformerv2_l14_64x336.pyth"
DATA_LOADER:
  NUM_WORKERS: 2
  PIN_MEMORY: True
TENSORBOARD:
  ENABLE: False
NUM_GPUS: 8
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .

I sincerely hope to hear back from you. Thank you very much!

@Andy1621
Collaborator

On a single 3090, 64 frames will probably not fit in GPU memory. You can run 16 or 32 frames instead; the results are nearly the same. You could also try our new model, which uses less memory and gets better results.
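
For reference, a minimal sketch of that suggestion applied to the test.sh above; the 32-frame config path here is an assumption, so substitute whichever 16- or 32-frame config and matching checkpoint the repo actually provides:

# Hypothetical single-GPU test with a smaller-frame variant.
PYTHONPATH=$PYTHONPATH:./slowfast \
python tools/run_net_multi_node.py \
  --init_method tcp://localhost:10125 \
  --cfg ./exp/k400/k400+k710_l14_f32x336/config.yaml \
  --num_shards 1 \
  NUM_GPUS 1 \
  TRAIN.ENABLE False \
  TEST.BATCH_SIZE 4 \
  OUTPUT_DIR .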

@ZachWong123
Author

On a single 3090, 64 frames will probably not fit in GPU memory. You can run 16 or 32 frames instead; the results are nearly the same. You could also try our new model, which uses less memory and gets better results.

Wow, thank you so much! I will try your new model later!

@ZachWong123
Author

ZachWong123 commented Mar 6, 2024

On a single 3090, 64 frames will probably not fit in GPU memory. You can run 16 or 32 frames instead; the results are nearly the same. You could also try our new model, which uses less memory and gets better results.

Hello, I would now like to apply the model to other action labels, roughly 100 classes in total. For data preparation, can I follow the K400 dataset: build my own category list, training annotation file, and test annotation file in the same format as the annotation files you provide (kinetic_categories.txt, train.csv, test.csv), then change the number of classes in the config and run training and testing? Also, the actions I want to recognize are fairly common ones, such as typing, writing, making phone calls, cooking, holding an umbrella, and so on. If I annotate a small amount of data for fine-tuning and testing, should I be able to get reasonably good results? I sincerely hope to hear back from you, thank you very much.

@Andy1621
Collaborator

Andy1621 commented Mar 6, 2024

Yes, that works. You can start by annotating a small amount of data, split it into train and val, and try fine-tuning the K400 checkpoint.
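
For illustration, the train.csv/test.csv annotation files could then contain one "<video path>,<label id>" line per clip, matching DATA.PATH_LABEL_SEPARATOR "," in the test.sh above (paths and label ids below are hypothetical; MODEL.NUM_CLASSES in config.yaml would be changed to the number of your classes, e.g. 100):

videos/typing_0001.mp4,0
videos/writing_0001.mp4,1
videos/making_phone_call_0001.mp4,2
videos/cooking_0001.mp4,3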

@ZachWong123
Author

Hello, I noticed that in the K400 task neither LMHRA nor T-Down is used. From the paper, is this because the purely global design is already enough to reach the best performance on K400? If I want to modify the model to improve results on K400, should I start from the global design? I sincerely hope to hear back from you.
