This repository has been archived by the owner on Jul 23, 2024. It is now read-only.

[Task Submission] Prompt ComVE #31

Open: wants to merge 6 commits into main

@lranaldii lranaldii commented Aug 29, 2023

#30 PromptComVE

This resource proposes a benchmark to probe generalization abilities on commonsense reasoning tasks. Inspired by Commonsense Validation and Explanation (ComVE), we systematically analyzed several instruction-tuned LLMs (It-LLMs), in particular whether they can distinguish statements that make sense from those that do not and provide comprehensive explanations. We show that It-LLMs have good generalization abilities and achieve good accuracy in commonsense reasoning tasks. However, despite the impressive performance and improvements, we found some weaknesses that cast doubt on whether the models have improved their understanding of task instructions in a way similar to how humans use them.

Examples

match: {"input": "Which of the two statements is impossible? a)he put an elephant into the fridge. b)he put a turkey into the fridge", "target": "a)"}

Authors

Leonardo Ranaldi
Giulia Pucci
Fabio Massimo Zanzotto

Implementation

My submission has a custom modification of task.py, and specifically the format_example() function.
In promptcomve, one example consists of two parts: a prompt with a statement and two or three choices, and the target.
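
As a rough illustration of the structure described above, here is a minimal, standalone sketch of how one raw record could be turned into an input/target pair. It is not the code from this PR: the raw field names (`question`, `statements`, `label`) are assumptions made for the example, and the function is written as a plain function rather than as a method of the framework's task class.

```python
from typing import Any, Dict, List


def format_example(example: Dict[str, Any]) -> Dict[str, str]:
    """Hypothetical sketch: build a prompt/target pair from one raw record.

    Assumed raw fields (illustrative only, not taken from the submission):
      - 'question':   the instruction shown before the choices
      - 'statements': a list of two or three candidate statements
      - 'label':      zero-based index of the correct choice
    """
    letters = "abc"
    statements: List[str] = example["statements"]
    if not 2 <= len(statements) <= 3:
        raise ValueError("expected two or three choices")

    # Render the choices as "a)... b)... [c)...]", as in the example above.
    choices = " ".join(f"{letters[i]}){s}" for i, s in enumerate(statements))

    return {
        "input": f"{example['question']} {choices}",
        "target": f"{letters[example['label']]})",
    }


raw = {
    "question": "Which of the two statements is impossible?",
    "statements": [
        "he put an elephant into the fridge.",
        "he put a turkey into the fridge",
    ],
    "label": 0,
}
formatted = format_example(raw)
```

With the record above, `formatted` reproduces the input and target strings shown in the Examples section.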

@vernadankers
Contributor

Hello!

We are getting quite close to the deadline, which is why I wanted to remind you of the fact that your PR still needs some attention: please double-check the automated checks that failed, and ensure that the files in your submission match the desired template.
You also modified the template for the PR's description above, which should contain the task title, the authors, the implementation, the usage and the checklist (see https://github.com/GenBench/genbench_cbt/pull/5 and https://github.com/GenBench/genbench_cbt/pull/9 for some examples).

Good luck finalising your PR and paper, feel free to tag us if you have questions.
Cheers, Verna
On behalf of the GenBench team

@lranaldii
Author

lranaldii commented Sep 1, 2023 via email

@kazemnejad
Contributor

Thanks for submitting a task to GenBench.

I realized that in your PR, you deleted the sample_task. This is not allowed. Please submit another PR that only makes changes to the submitted task, not the other files of the framework!

Moreover, if I understand your task correctly, you wouldn't need a task_dict, yet I see that you defined one in __init__.py. I'd suggest you start from scratch using our genbench-cli task creation pipeline.
