
Questions on FELT Dataset and Baseline Trackers in Your Research #6

MR-Vico opened this issue Nov 7, 2024 · 8 comments
MR-Vico commented Nov 7, 2024

Dear FELT research team,

First, I would like to express my sincere gratitude for your outstanding work on the FELT dataset and your paper, which has become an invaluable reference for my current research. I am currently writing a paper that builds upon your methods, and I am very impressed by the comprehensive approach your team has taken to address the challenges of long-term frame-event tracking.

However, I have a few questions regarding the re-training of the baseline trackers on the FELT dataset. The paper mentions that 15 baseline trackers were re-trained, but I couldn’t find specific details about the re-training process. Could you clarify the methodology you used? Any additional information about the training configuration or settings would be very helpful for my research.

Additionally, I noticed that the evaluation tool for FELT is hosted on Baidu, which unfortunately I cannot access. Would it be possible to provide access to the tool through another platform or method?

Finally, I encountered sequences in the FELT dataset with bounding boxes initialized at (0, 0, 0, 0). Could you explain how the trackers handle such initializations and how these cases are evaluated within your framework, given my limited access to the evaluation tool?

Thank you very much for your time and assistance. I greatly appreciate any insights you could provide, and I look forward to your response.

@wangxiao5791509
Contributor

@MR-Vico Hi, thanks for your attention to our work.
1). We did not record tracker-specific re-training details; we simply followed each tracker's default settings for training.
2). You can check each tracker's source code for those details. The evaluation tool for FELT has been updated on Dropbox: FELT_eval_toolkit.zip
3). For the sequences in the FELT dataset with bounding boxes initialized at (0, 0, 0, 0), could you please tell us which videos they are?
Thanks for your feedback.

@MR-Vico
Author

MR-Vico commented Nov 10, 2024

@wangxiao5791509

Thank you very much for your prompt and helpful response. I am especially grateful for the updated link to the evaluation tool on Dropbox, which is extremely valuable for advancing my work.

I also appreciate the additional information regarding the training setup and default configurations used for the baseline trackers.

As for the sequences with an initial bounding box of (0, 0, 0, 0), here is a list of the specific videos where this was observed:

In /test – 10 sequences:

dvSave-2022_10_27_18_58_42
dvSave-2022_10_23_17_26_32
dvSave-2022_10_17_21_13_52
dvSave-2022_10_28_19_53_10
dvSave-2022_10_28_20_14_09
dvSave-2022_10_27_21_01_48
dvSave-2022_10_25_19_25_33
dvSave-2022_10_11_23_25_00
dvSave-2022_10_25_19_39_09
dvSave-2022_10_25_20_33_03

In /train1 – 1 sequence:

dvSave-2022_10_15_17_25_18

In /train2 – 10 sequences:

dvSave-2022_10_29_17_17_32
dvSave-2022_10_25_20_27_40
dvSave-2022_10_22_20_32_30
dvSave-2022_10_27_20_42_05
dvSave-2022_10_22_21_30_26
dvSave-2022_10_22_17_34_24
dvSave-2022_10_28_20_51_33
dvSave-2022_10_27_21_09_43
dvSave-2022_10_31_10_38_32
dvSave-2022_10_27_21_06_28

While analyzing the code of the evaluation tool, I noticed that frames without targets are filtered out, which resolves the problem of invalid initial bounding boxes. This insight was incredibly helpful.

Thank you once again for your time and assistance. Your support has significantly advanced my research efforts.

@wangxiao5791509
Contributor

@MR-Vico Glad it helped. The training samples are randomly sampled from the given training videos, so bounding boxes of (0, 0, 0, 0) do not influence training. For tracking, only a non-empty rectangle is used as the starting position (more details can be found in our source code). The evaluation filters out all-zero annotations.
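
For readers following along, a minimal sketch of what that filtering step could look like (assuming the per-frame ground-truth and predicted boxes are loaded as N×4 arrays; the actual toolkit code may differ):

```python
import numpy as np

def filter_zero_annotations(gt_boxes, pred_boxes):
    """Drop frames whose ground-truth box is all zeros (target absent).

    gt_boxes, pred_boxes: (N, 4) arrays of [x, y, w, h] per frame.
    Returns the filtered pairs used for metric computation.
    """
    valid = np.any(gt_boxes != 0, axis=1)  # keep frames with a non-zero gt box
    return gt_boxes[valid], pred_boxes[valid]
```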

@MR-Vico
Author

MR-Vico commented Nov 11, 2024

@wangxiao5791509

When analyzing the evaluation tool further, I noticed that the actual tracking does not take place within the evaluation tool itself; it purely evaluates tracking results that were produced beforehand. I would therefore like to ask about the specific tracking setup.

In particular, I would like to know whether a dedicated tracking framework was used or whether the trackers were applied directly to the FELT dataset. Since the evaluation tool reads results stored as bounding boxes in .txt files, I assume the trackers were run directly on the dataset without a separate tracking tool. Please correct me if I am wrong here.

How were initial bounding boxes with the coordinates (0, 0, 0, 0) handled during tracking? Do I understand correctly that the tracker starts tracking with the first image in which the object is present?

The following groundtruth.txt for the sequence dvSave-2022_10_27_18_58_42 illustrates my question:

Groundtruth:

0,0,0,0
0,0,0,0
0,0,0,0
0,0,0,0
0,0,0,0
231.6437,85.4265,18.2735,5.8811
209.75754,87.73694,17.64338,5.96512
187.87138,90.04738,17.01326,6.0491399999999995

Results for the DiMP tracker on this sequence:

0	0	0	0
0	0	0	0
0	0	0	0
0	0	0	0
0	0	0	0
231	85	18	5
231	85	18	5
231	85	18	5

The timestamps for the results are listed as follows:

5.536157
5.536164
5.536166
5.536167
5.536169
1.437487
0.171785
0.165860

In order to reproduce the tracking results, I would need this additional information about how the trackers were applied, especially for the DiMP tracker.

@wsasdsda

@MR-Vico You can take a closer look at the code under this path: AMTTrack/lib/test/evaluation/tracker.py. We first find the first ground-truth box that is not (0, 0, 0, 0) and use it as the template, and we directly fill the initial frames where the target is absent with (0, 0, 0, 0). This has nothing to do with the evaluation toolkit.
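
A minimal sketch of that initialization logic (the tracker interface here is hypothetical; tracker.py in the repository is the authoritative version):

```python
def run_sequence(gt_boxes, tracker):
    """Initialize on the first non-zero gt box; emit (0, 0, 0, 0) before it.

    gt_boxes: per-frame [x, y, w, h] lists, all-zero while the target is
    absent. tracker: object with initialize(frame_idx, box) and
    track(frame_idx) methods (hypothetical interface).
    """
    results = []
    init_idx = next(i for i, b in enumerate(gt_boxes) if any(b))
    for i, box in enumerate(gt_boxes):
        if i < init_idx:
            results.append([0, 0, 0, 0])      # target not yet visible
        elif i == init_idx:
            tracker.initialize(i, box)        # first valid gt box = template
            results.append(list(box))         # init frame reports the gt box
        else:
            results.append(tracker.track(i))  # normal tracking afterwards
    return results
```

This matches the DiMP output above: the first five rows are zeros, and tracking begins at the first frame with a valid annotation.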

@MR-Vico
Author

MR-Vico commented Nov 25, 2024

@wangxiao5791509, @wsasdsda

I would like to thank you for your earlier answers. I really appreciate your time and effort in answering my questions.

I’m reaching out once more as I find myself needing a bit more clarification to fully understand the process, particularly with respect to re-training the baseline trackers. I apologize for having to ask again, but this is vital for replicating the results in your study. I hope you won’t mind clarifying a few more points.

From your initial response, I understand that specific details about the re-training process are not provided, and you mentioned following the default settings to train the tracker. Based on this, I attempted to reproduce results using the DiMP tracker as an example. However, I encountered a few issues that I hope you can help clarify:

Which repository was used for the tracker?

  • The official DiMP implementation I am aware of is PyTracking on GitHub. Could you confirm whether this was the one you used, or whether another repository was used?
  • Was the LTR training module (included in PyTracking) used to re-train the baseline trackers, for example DiMP?

Was the FELT dataset actually used to train the baseline trackers?

  • The default settings for DiMP reference datasets like LaSOT or GOT-10k. If the FELT dataset was used instead, it would be helpful to know how this was configured.
  • Additionally, was the training performed separately on the APS (standard image frames) and DVS (event frames) or were these modalities combined during the training process of the baseline tracker?

Thank you again for your time and for sharing your excellent work. I greatly appreciate any further details you can provide. Please let me know if additional context or clarification is needed on my end.

@wangxiao5791509
Contributor

@MR-Vico PyTracking is a collection of trackers, not a repository for a single tracker. We merge the two modalities into one representation and re-train on the FELT SOT dataset. For details on how to adapt PyTracking to new datasets, please refer to their tutorial. If you want to understand the tracking process clearly, you may need to read their source code first and then run the basic RGB-based datasets (e.g., LaSOT or GOT-10k).
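
To make "merge the two modalities into one representation" concrete, here is one simple illustration, a pixel-level blend of the APS frame with the event stream rendered as an image; this is only an assumption about what such a merge could look like, not necessarily the fusion used in the paper:

```python
import numpy as np

def merge_modalities(aps_frame, dvs_frame, alpha=0.5):
    """Hypothetical example: blend an RGB frame with an event frame rendered
    as an image of the same shape, producing one 3-channel input an unmodified
    tracker can consume. The actual FELT representation may differ.
    """
    aps = aps_frame.astype(np.float32)
    dvs = dvs_frame.astype(np.float32)
    merged = alpha * aps + (1.0 - alpha) * dvs   # simple pixel-level blend
    return merged.clip(0, 255).astype(np.uint8)
```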

@wsasdsda

@MR-Vico Our tracker builds on OSTrack and CEUTrack. For a fair comparison with the other baselines, we retrain each of them with its own config, using APS, DVS, and APS+DVS inputs respectively. For example, DiMP uses the dimp50.py config. Simply put, we only replaced the input with our own dataset while keeping all other parameters unchanged.
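
For illustration, the kind of change involved in an LTR-style training-settings file might look like the sketch below; `Felt` is a hypothetical loader class you would implement against PyTracking/LTR's dataset interface following their tutorial, and `settings.env.felt_dir` is an assumed path entry:

```python
# Sketch of the dataset swap inside a dimp50.py-style training settings file.
# Felt is a hypothetical loader implementing LTR's video-dataset interface;
# settings.env.felt_dir is an assumed entry pointing at the FELT root.

def build_training_datasets(settings):
    # The stock dimp50.py constructs loaders for datasets such as LaSOT or
    # GOT-10k here; the only substantive change is replacing them with the
    # FELT loader while leaving every other training parameter at its default.
    felt_train = Felt(settings.env.felt_dir, split='train')  # hypothetical
    return [felt_train]
```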
