How to prevent models from cheating on the test set? #75

haixingDu · 2024-01-09T03:26:56Z

Hello, I'm intrigued by TGB and have a few questions for which I'm seeking answers.

1. In the edge prediction test set (tgbl-wiki-v2), are the data labeled?
2. If they are labeled, I'm interested in understanding the measures implemented to ensure the fairness of the testing process. Specifically, how are potential manipulations after model prediction - such as adjusting the position of positive samples in the MRR ranking test to artificially enhance performance - identified and prevented?

Your responses to these queries would be highly appreciated.

shenyangHuang · 2024-01-09T19:56:18Z

Hi Haixing,

Thanks for your interest in our work and the questions. Hope the following answers address your concerns.

1. Yes, the edge labels are provided, but only for evaluation purposes. To see how to use the TGB evaluator, see the example here.
2. The TGB datasets are designed for research purposes, and we trust users to use the test set only for evaluation when submitting to the TGB leaderboard. This is similar to many open benchmark initiatives such as OGB and TDC. When submitting to the leaderboard, please fill in the Google form; we ask authors to provide a link to their paper and public code repository, so that the results can be verified by the community.
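
To make the MRR discussion concrete, here is a minimal NumPy sketch of how a ranking-based MRR could be computed from model scores. It is an illustration only, not the TGB evaluator itself: the function names `reciprocal_ranks` and `mrr` are hypothetical, and the tie handling (averaging the optimistic and pessimistic rank) is an assumption made for this sketch.

```python
import numpy as np


def reciprocal_ranks(y_pred_pos, y_pred_neg):
    """Reciprocal rank of each positive edge among its negative candidates.

    y_pred_pos: shape (n,), model score of each positive edge.
    y_pred_neg: shape (n, k), model scores of the k negatives for each positive.
    Both names are hypothetical and chosen for this sketch only.
    """
    pos = np.asarray(y_pred_pos, dtype=float).reshape(-1, 1)
    neg = np.asarray(y_pred_neg, dtype=float)

    # Negatives scored strictly higher than the positive push its rank down.
    higher = (neg > pos).sum(axis=1)
    # Ties are split evenly (assumption), so constant scores cannot reach rank 1.
    ties = (neg == pos).sum(axis=1)

    rank = higher + 0.5 * ties + 1.0
    return 1.0 / rank


def mrr(y_pred_pos, y_pred_neg):
    """Mean reciprocal rank over all positive edges."""
    return float(reciprocal_ranks(y_pred_pos, y_pred_neg).mean())


if __name__ == "__main__":
    pos_scores = [0.9, 0.2]
    neg_scores = [[0.1, 0.5, 0.3],
                  [0.4, 0.6, 0.1]]
    # First positive outranks all negatives (rank 1); second is beaten by two (rank 3).
    print(mrr(pos_scores, neg_scores))
```

In a sketch like this, the metric depends only on the scores the model assigns, so the rank of a positive sample can only be improved by changing those scores. Whether the scores were produced without access to the test labels is not something the metric can check, which is why the public code repository matters for verification.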

Best,
Andy

shenyangHuang self-assigned this Jan 9, 2024

shenyangHuang added the question label (Further information is requested) Jan 9, 2024
