[Bug]: Get empty git diff after inference although the agent did some work #228
Comments
So why is the workspace empty? What can I do about that? Thanks |
Besides, I found that some diffs are in an invalid format. Do you know what's wrong with them? Thanks! |
Check
It shouldn't be. It is generated using |
Thanks for the reply. Could you please tell me more about how to solve the empty workspace problem? I don't know how to get inside the environment. I would also like to ask whether this empty workspace problem comes from the Docker build or from each run. Thanks! |
You could use Docker Desktop. Maybe there is a bug in the Docker image. Here, the files are copied. |
I am testing on a Linux server, so I may not be able to use Docker Desktop. If I have already finished the Docker build, is there anything I can do now to save the test? I have tested on two different Linux servers, and both have the empty git diff problem. Is there any temperate patch I can apply? I am in a bit of a hurry with the testing. Please let me know. Thanks! |
You could run I am debugging that instance now.
Which test?
u meant temporary? |
Yes, I know how to run Docker, but the container is deleted after running, so I cannot get into the environment now. When I docker run bash, I can see the testbed there.
Thanks a lot for your help!
I mean my running on swe-bench
Yes, temporary (damn writing completion tools) |
Is the |
Seems no.
|
Workaround: You can change the directory to |
I am using the commit: 6498204
This seems to work! I am running with that now. |
All-Hands-AI#5549 This change fixed that. |
I went through this bug fix! Before I merged it, all the git diffs were empty. After merging it, some of the git diffs are still empty. I am not sure whether they have the same cause. |
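To see which instances still come back with an empty diff, one could tally them from the inference output. This is only a minimal sketch: it assumes each line of `output.jsonl` is a JSON object with an `instance_id` and a patch stored under `test_result.git_patch`. Both field names are assumptions about the output layout, and `find_empty_patches` is a hypothetical helper, so adjust them to match your file.

```python
import json


def find_empty_patches(lines, patch_keys=("test_result", "git_patch")):
    """Return the instance ids whose patch is missing or whitespace-only.

    `patch_keys` is the (assumed) path to the patch inside each record.
    """
    empty = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the JSONL file
        record = json.loads(line)
        value = record
        for key in patch_keys:
            # Walk down the nested dicts; give up if a level is missing.
            value = value.get(key) if isinstance(value, dict) else None
        if not isinstance(value, str) or not value.strip():
            empty.append(record.get("instance_id"))
    return empty
```

For example, `with open("output.jsonl") as f: print(find_empty_patches(f))` would list the affected instances, which makes it easier to compare runs before and after a fix.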
All-Hands-AI#5659 Did you apply this one too? |
No I didn't try the 5659. I will try later. Thanks for your help! |
added 9262560 on top of that to not remove the container and run only that instance; @tangken333 Edit: |
Why did you choose this commit? Original traj uses |
I am following the commit ID from the metadata, and I checked the version of the CodeAct agent: it is V2.1 (correct). By the way, after I got the result, I ran into some new problems.
After I changed the model from GPT-4o to Claude-3.5-S and tested on three instances, I still get 1 invalid patch, but it is better. Is that related to the model? |
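When a patch is reported as invalid, a quick structural check can help tell a truncated or malformed diff apart from an evaluation-side problem. The sketch below is a heuristic, not the check that the benchmark harness performs: `patch_problems` is a hypothetical helper that only verifies that each hunk contains as many lines as its `@@ -a,b +c,d @@` header claims.

```python
import re

# Matches a unified-diff hunk header; the line counts are optional.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def patch_problems(patch):
    """Return a list of heuristic problems found in a unified diff string."""
    if not patch.strip():
        return ["patch is empty"]
    problems = []
    if not patch.endswith("\n"):
        problems.append("patch does not end with a newline")
    old_left = new_left = 0  # lines still expected in the current hunk
    seen_hunk = False
    for line in patch.splitlines():
        match = HUNK_RE.match(line)
        if match:
            if old_left > 0 or new_left > 0:
                problems.append("hunk shorter than its header claims")
            # A missing count in the header means 1.
            old_left = int(match.group(1)) if match.group(1) is not None else 1
            new_left = int(match.group(2)) if match.group(2) is not None else 1
            seen_hunk = True
        elif old_left > 0 or new_left > 0:
            if line.startswith("\\"):
                continue  # "\ No newline at end of file" marker
            if line.startswith("-"):
                old_left -= 1
            elif line.startswith("+"):
                new_left -= 1
            elif line.startswith(" ") or line == "":
                old_left -= 1
                new_left -= 1
            else:
                problems.append("unexpected line inside hunk: " + line[:40])
                old_left = new_left = 0
            if old_left < 0 or new_left < 0:
                problems.append("hunk longer than its header claims")
                old_left = new_left = 0
    if old_left > 0 or new_left > 0:
        problems.append("last hunk shorter than its header claims")
    if not seen_hunk:
        problems.append("no hunk header found")
    return problems
```

An empty list means the patch passed these checks only; it may still fail to apply to the repository.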
Would you provide the URL? Your Is there any difference between the two diffs? eval_infer.sh uses this script to convert to swe_bench format. |
I don't think there are differences because I directly get the value from the dict. I give you the output.jsonl. |
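For comparing the raw patch with what the evaluation receives, the conversion step can be reproduced by hand. The sketch below is not the repository's conversion script. It assumes the inference output keeps the patch under `test_result.git_patch` (an assumption about the layout) and writes records with the `instance_id`, `model_patch`, and `model_name_or_path` keys used by SWE-bench prediction files. `to_swebench_predictions` and the default model name are hypothetical.

```python
import json


def to_swebench_predictions(lines, model_name="my-model"):
    """Convert JSONL inference records into SWE-bench style prediction dicts."""
    predictions = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # Assumed location of the patch; fall back to an empty string.
        patch = (record.get("test_result") or {}).get("git_patch") or ""
        predictions.append(
            {
                "instance_id": record["instance_id"],
                "model_patch": patch,
                "model_name_or_path": model_name,
            }
        )
    return predictions
```

If the `model_patch` produced here is byte-for-byte identical to the value in the original dict, any invalid-patch error is more likely to come from the patch itself or from the evaluation version than from the conversion.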
Would you give the direct link to the |
I checked out this commit ID and ran this command using "HEAD" |
I get a valid patch now after changing the swebench version and running the script. Thanks! |
Interesting, I don't know. But they both belong to one pull request: All-Hands-AI#4537, which is strongly related to this task. I guess there is not much difference except for some small bug fixes? |
My bad, I will look at this as well. But I am using a newer version. Shouldn't that be better? |
Is the objective to just evaluate using GPT-4o? Why not verified-mini? |
I am only using 4o now for my test. poor qwq
Oh, I only just learned about that. Thanks, I will take a look. |
All-Hands-AI#6407 Issue
@tangken333, in matplotlib__matplotlib-25442, the agent got stuck in a loop because the workspace is empty due to the symlink. You can use this script to visualize the eval like this.
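One way to confirm a symptom like this from inside the container is to check whether the workspace path is a symlink and whether its target actually contains anything. This is a minimal sketch using only the standard library: `describe_workspace` is a hypothetical helper, and the path you pass to it depends on your setup.

```python
import os


def describe_workspace(path):
    """Report whether `path` is a symlink, where it resolves, and how many entries it holds."""
    info = {
        "is_symlink": os.path.islink(path),
        "resolved": os.path.realpath(path),
        "exists": os.path.exists(path),  # False for a dangling symlink
        "entries": 0,
    }
    if os.path.isdir(path):
        # listdir follows the symlink, so this counts the target's entries.
        info["entries"] = len(os.listdir(path))
    return info
```

A result with `is_symlink` true and `entries` equal to 0 (or `exists` false) would be consistent with the agent seeing an empty workspace.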