[Task Submission] Prompt ComVE #31

lranaldii · 2023-08-29T14:57:22Z

#30 PromptComVE

This resource proposes a benchmark to probe generalization abilities on commonsense reasoning tasks. Inspired by Commonsense Validation and Explanation (ComVE), we systematically analyze several It-LLMs, testing in particular whether they can distinguish statements that make sense from those that do not while providing comprehensive explanations. We show that It-LLMs generalize well and achieve good accuracy on commonsense reasoning tasks. However, despite the impressive performance and improvements, we found weaknesses that cast doubt on whether the models have improved their understanding of task instructions in a way similar to humans' use of task instructions.

Examples

match: {"input": "Which of the two statements is impossible? a)he put an elephant into the fridge. b)he put a turkey into the fridge", "target": "a)"}

Authors

Leonardo Ranaldi
Giulia Pucci
Fabio Massimo Zanzotto

Implementation

My submission has a custom modification of task.py, specifically the format_example() function.
In promptcomve, one example consists of two parts: a prompt containing a statement with two or three choices, and the target.
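
For readers who have not opened task.py, the kind of formatting described above might look roughly like the sketch below. It is not the code from this PR: the raw field names ("question", "choices", "answer_index") are hypothetical, and the function is written standalone rather than as a method of the framework's task class. It only illustrates how a record with two or three choices could be turned into the input/target pair shown under Examples.

```python
from typing import Any, Dict, List


def format_example(example: Dict[str, Any]) -> Dict[str, str]:
    """Turn one raw record into an input/target pair.

    Assumed (hypothetical) raw fields: "question", "choices", "answer_index".
    """
    labels = "abc"  # at most three choices, labelled a), b), c)
    choices: List[str] = example["choices"]
    if not 2 <= len(choices) <= 3:
        raise ValueError("expected two or three choices")

    # Render the choices as "a)... b)..." to match the example in the description.
    options = " ".join(f"{labels[i]}){text}" for i, text in enumerate(choices))
    return {
        "input": f"{example['question']} {options}",
        "target": f"{labels[example['answer_index']]})",
    }


raw = {
    "question": "Which of the two statements is impossible?",
    "choices": [
        "he put an elephant into the fridge.",
        "he put a turkey into the fridge",
    ],
    "answer_index": 0,
}
formatted = format_example(raw)
print(formatted["input"])
print(formatted["target"])
```

Keeping the target as the bare label ("a)") means an exact-match comparison against the model output is enough to score an example, which is presumably why the example above is listed under "match".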

vernadankers · 2023-09-01T10:16:07Z

Hello!

We are getting quite close to the deadline, which is why I wanted to remind you of the fact that your PR still needs some attention: please double-check the automated checks that failed, and ensure that the files in your submission match the desired template.
You also modified the template for the PR's description above, which should contain the task title, the authors, the implementation, the usage and the checklist (see https://github.com/GenBench/genbench_cbt/pull/5 and https://github.com/GenBench/genbench_cbt/pull/9 for some examples).

Good luck finalising your PR and paper, feel free to tag us if you have questions.
Cheers, Verna
On behalf of the GenBench team

lranaldii · 2023-09-01T11:14:08Z

Dear GenBench team,

I tried to pull my data, but I do not understand how I can check it. I followed your tutorial and sent the request. Could you help me understand the issues?

Thank you for your availability,
Leonardo Ranaldi

kazemnejad · 2023-09-04T15:28:54Z

Thanks for submitting a task to GenBench.

I realized that in your PR, you deleted the sample_task. This is not allowed. Please submit another PR that only makes changes to the submitted task, not to the other files of the framework!

Moreover, if I understand your task correctly, you wouldn't need a task_dict, yet I see that you defined one in __init__.py. I'd suggest you start from scratch using our genbench-cli task creation pipeline.

lranaldii added 5 commits August 29, 2023 16:49

Create doc.md

ea2dc1d

Create config.jsonnet

6b69df5

Delete src/genbench/tasks/sample_task directory

a6349d0

Create __init__.py

8522cc2

Create task.py

c0403ff

lranaldii closed this Aug 29, 2023

lranaldii reopened this Aug 31, 2023

kazemnejad added the task-submission label Aug 31, 2023

vernadankers added task-submission and removed task-submission labels Sep 1, 2023

Update __init__.py

5b8c121
