About the reproduction. #4
Comments
Hi @jumptoliujj, did you figure it out? I have also faced this issue; I am not sure whether, or where, I went wrong.
Just use text-davinci-003 instead of gpt-3.5-turbo, and you will get results similar to those in the paper.
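Note that swapping the model also changes which API endpoint is used: `text-davinci-003` is a legacy completions model, while `gpt-3.5-turbo` is chat-only. A minimal sketch of dispatching between the two, assuming the legacy `openai` (<1.0) Python client that was current at the time (function names here are illustrative, not from the repo):

```python
def is_chat_model(model: str) -> bool:
    """Chat models must go through the chat endpoint; completion models must not."""
    return model.startswith(("gpt-3.5", "gpt-4"))

def generate(model: str, prompt: str, max_tokens: int = 1024) -> str:
    """Dispatch to the endpoint matching the chosen model (sketch, not the repo's code)."""
    import openai  # legacy <1.0 client assumed

    if is_chat_model(model):
        resp = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.0,
            max_tokens=max_tokens,
        )
        return resp["choices"][0]["message"]["content"]
    resp = openai.Completion.create(
        model=model,
        prompt=prompt,
        temperature=0.0,
        max_tokens=max_tokens,
    )
    return resp["choices"][0]["text"]
```

If a codebase only ever calls the completions endpoint, passing `gpt-3.5-turbo` to it can fail silently or produce the "Error in generating example" messages mentioned below.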
Thank you!
I have the same issue. The logic program only reaches 51.4 on ProntoQA with GPT-3.5-turbo, which is far from the result reported in the paper (61).
I tried to run the models with the commands in the README, but it gives "Error in generating example".
I have problems with Pyke: it is not always deterministic, and there are also problems with its cache.
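One concrete source of the cache problem: Pyke compiles `.krb` rule files into a `compiled_krb/` directory, and stale entries there can make successive runs behave differently. A hedged sketch of clearing that cache before each inference run (the assumption that the caches live under the working tree is mine, not stated in the thread):

```python
import shutil
from pathlib import Path

def clear_pyke_cache(root: str = ".") -> int:
    """Delete every compiled_krb/ directory under root so Pyke recompiles rules
    from scratch on the next run. Returns the number of cache dirs removed."""
    removed = 0
    for cache_dir in Path(root).rglob("compiled_krb"):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed += 1
    return removed
```

Calling this between runs at least rules out stale compiled rules as the cause of non-deterministic results.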
We ran experiments on PrOntoQA and FOLIO, and the accuracy is only about 51%–53%. We ran logic_program.py, self_refinement.py, then logic_inference.py and evaluation.py. Is anything wrong with our steps? Please correct us.
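The four-step order described above can be sketched as a small driver. The script names come from the comment itself, but any CLI flags are omitted since the thread does not state them, and the runner is injectable so the ordering can be checked without the repo present:

```python
import subprocess
from typing import Callable, List, Sequence

PIPELINE: Sequence[str] = (
    "logic_program.py",    # 1. translate problems into symbolic logic programs
    "self_refinement.py",  # 2. repair programs that failed to parse or execute
    "logic_inference.py",  # 3. run the programs through the symbolic solver
    "evaluation.py",       # 4. score predictions against gold answers
)

def run_pipeline(
    run: Callable[[List[str]], None] = lambda cmd: subprocess.run(cmd, check=True),
) -> List[str]:
    """Execute the four stages in order; returns the script names that were run."""
    executed: List[str] = []
    for script in PIPELINE:
        run(["python", script])
        executed.append(script)
    return executed
```

With `check=True`, a failure in any stage (e.g. the "Error in generating example" above) stops the pipeline instead of silently scoring incomplete outputs.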