[Task Submission] Prompt ComVE #31
Hello! We are getting quite close to the deadline, which is why I wanted to remind you of the fact that your PR still needs some attention: please double-check the automated checks that failed, and ensure that the files in your submission match the desired template. Good luck finalising your PR and paper, feel free to tag us if you have questions.
Dear GenBench team,
I tried to pull my data, but I do not understand how I can check it.
I followed your tutorial and sent the request.
Can you help me understand the issues?
Thank you for your availability,
Leonardo Ranaldi.
On Fri, 1 Sep 2023, 13:16 Verna Dankers ***@***.***> wrote:
… Hello!
We are getting quite close to the deadline, which is why I wanted to
remind you of the fact that your PR still needs some attention: please
double-check the automated checks that failed, and ensure that the files in
your submission match the desired template.
You also modified the template for the PR's description above, which
should contain the task title, the authors, the implementation, the usage
and the checklist (see #5
<https://github.com/GenBench/genbench_cbt/pull/5> and #9
<https://github.com/GenBench/genbench_cbt/pull/9> for some examples).
Good luck finalising your PR and paper, feel free to tag us if you have
questions.
Cheers, Verna
*On behalf of the GenBench team*
Thanks for submitting a task to GenBench. I realized that in your PR, you deleted the [...]. Moreover, if I understand your task correctly, you wouldn't need a task_dict, yet I see that you defined one in the __init__.py. I'd suggest you start from scratch using our genbench-cli task creation pipeline.
#30 PromptComVE
This resource proposes a benchmark to probe generalization abilities on commonsense reasoning tasks. Inspired by Commonsense Validation and Explanation (ComVE), we systematically analyzed several It-LLMs, in particular whether they can distinguish statements that make sense from those that do not while providing comprehensive explanations. We show that It-LLMs have good generalization abilities and achieve good accuracy on commonsense reasoning tasks. However, despite the impressive performance and improvements, we found some weaknesses that cast doubt on whether the models have improved their understanding of task instructions in a way similar to how humans use task instructions.
Examples
match: {"input": "Which of the two statements is impossible? a)he put an elephant into the fridge. b)he put a turkey into the fridge", "target": "a)"}
Authors
Leonardo Ranaldi
Giulia Pucci
Fabio Massimo Zanzotto
Implementation
My submission customizes task.py, specifically the format_example() function.
In PromptComVE, each example consists of two parts: a prompt containing a question with two or three choices, and the target.
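As a rough illustration of the formatting described above, here is a minimal standalone sketch. It is not the code from this submission: the raw field names `statements` and `label` are hypothetical, and the function is written as a plain function rather than as a method on the GenBench task class. It only shows how a prompt and target like the one in the Examples section could be assembled.

```python
from typing import Any, Dict, List

# Choice letters for up to three statements.
LETTERS = "abc"
# Word used in the question, depending on how many statements there are.
COUNT_WORDS = {2: "two", 3: "three"}


def format_example(example: Dict[str, Any]) -> Dict[str, str]:
    """Build an {"input", "target"} pair from a raw example.

    Assumed (hypothetical) raw fields:
      - "statements": list of two or three candidate sentences
      - "label": zero-based index of the statement that makes no sense
    """
    statements: List[str] = example["statements"]
    label: int = example["label"]

    if len(statements) not in COUNT_WORDS:
        raise ValueError("expected two or three statements")
    if not 0 <= label < len(statements):
        raise ValueError("label must index one of the statements")

    # Prefix each statement with its letter, e.g. "a)he put ...".
    choices = ". ".join(
        f"{LETTERS[i]}){text}" for i, text in enumerate(statements)
    )
    question = f"Which of the {COUNT_WORDS[len(statements)]} statements is impossible?"

    return {"input": f"{question} {choices}", "target": f"{LETTERS[label]})"}


if __name__ == "__main__":
    raw = {
        "statements": [
            "he put an elephant into the fridge",
            "he put a turkey into the fridge",
        ],
        "label": 0,
    }
    print(format_example(raw))
```

Keeping the target as the bare choice letter (for example `a)`) means a simple string comparison is enough to score a prediction, at the cost of requiring the model output to be normalized to that form first.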