Update README.md #69

Merged: 2 commits, Apr 5, 2024
Changes from all commits
README.md: 4 additions & 4 deletions
@@ -24,7 +24,7 @@ Code and data for our ICLR 2024 paper <a href="http://swe-bench.github.io/paper.
</a>
</p>

-Please refer to our [website](http://swe-bench.github.io) for the public leaderboard and the [change log](https://github.com/princeton-nlp/SWE-bench/blob/master/CHANGELOG.md) for information on the latest updates to the SWE-bench benchmark.
+Please refer to our [website](http://swe-bench.github.io) for the public leaderboard and the [change log](https://github.com/princeton-nlp/SWE-bench/blob/main/CHANGELOG.md) for information on the latest updates to the SWE-bench benchmark.

## 👋 Overview
SWE-bench is a benchmark for evaluating large language models on real-world software issues collected from GitHub.
@@ -44,9 +44,9 @@ You can download the SWE-bench dataset directly ([dev](https://drive.google.com/

To use SWE-Bench, you can (see the sketch after this list):
* Train your own models on our pre-processed datasets
-* Run [inference](https://github.com/princeton-nlp/SWE-bench/blob/master/inference/) on existing models (either models you have on-disk like LLaMA, or models you have access to through an API like GPT-4). The inference step is where you get a repo and an issue and have the model try to generate a fix for it.
-* [Evaluate](https://github.com/princeton-nlp/SWE-bench/blob/master/harness/) models against SWE-bench. This is where you take a SWE-Bench task and a model-proposed solution and evaluate its correctness.
-* Run SWE-bench's [data collection procedure](https://github.com/princeton-nlp/SWE-bench/blob/master/collect/) on your own repositories, to make new SWE-Bench tasks.
+* Run [inference](https://github.com/princeton-nlp/SWE-bench/blob/main/inference/) on existing models (either models you have on-disk like LLaMA, or models you have access to through an API like GPT-4). The inference step is where you get a repo and an issue and have the model try to generate a fix for it.
+* [Evaluate](https://github.com/princeton-nlp/SWE-bench/blob/main/swebench/harness/) models against SWE-bench. This is where you take a SWE-Bench task and a model-proposed solution and evaluate its correctness.
+* Run SWE-bench's [data collection procedure](https://github.com/princeton-nlp/SWE-bench/blob/main/swebench/collect/) on your own repositories, to make new SWE-Bench tasks.
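
For orientation, here is a minimal sketch of the first step (getting task instances into Python). It is not part of the diff or an official snippet: it assumes the Hugging Face `datasets` package, the dataset ID `princeton-nlp/SWE-bench`, and field names taken from the SWE-bench data format; verify against the download links above.

```python
# Minimal sketch (assumptions noted above): load SWE-bench task instances
# and inspect one of them.
from datasets import load_dataset

# Assumed dataset ID; the README also links direct dev/test downloads.
swebench = load_dataset("princeton-nlp/SWE-bench", split="test")

task = swebench[0]
print(task["instance_id"])              # unique ID for this repo/issue pair
print(task["repo"])                     # source repository, formatted as owner/name
print(task["problem_statement"][:300])  # the GitHub issue text given to the model
print(task["patch"][:300])              # the reference (gold) patch used in evaluation
```

The inference and evaluation entry points linked in the bullets above then consume these task instances together with model-proposed patches.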

## ⬇️ Downloads
| Datasets | Models |
swebench/collect/README.md: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ We include a comprehensive [tutorial](https://github.com/princeton-nlp/SWE-bench

> SWE-bench's collection pipeline is currently designed to target PyPI packages. We hope to expand SWE-bench to more repositories and languages in the future.

<img src="../assets/collection.png">
<img src="../../assets/collection.png">

## Collection Procedure
To run collection on your own repositories, run the `run_get_tasks_pipeline.sh` script. Given a repository or list of repositories (formatted as `owner/name`), for each repository this command will generate...
swebench/harness/README.md: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ The `engine_evaluation.py` and `run_evaluation.py` code is used for evaluating m

The evaluation script generally performs the following steps:

-![evaluation](../assets/evaluation.png)
+![evaluation](../../assets/evaluation.png)

The `run_evaluation.py` script is invoked using the `./run_evaluation.sh` script with the following arguments:
```
@@ -35,7 +35,7 @@ In the context of the collection pipeline, you should use this script after

The validation script generally performs the following steps:

-![validation](../assets/validation.png)
+![validation](../../assets/validation.png)

The `engine_validation.py` script is invoked using the `./run_validation.sh` script with the following arguments:
```
tutorials/validation.ipynb: 2 additions & 2 deletions
@@ -9,13 +9,13 @@
"source": [
"import glob, json, os, sys\n",
"\n",
"sys.path.append('/path/to/metrics/') # TODO: Replace with path to `SWE-bench/metrics` folder\n",
"sys.path.append('/path/to/metrics/') # TODO: Replace with path to `SWE-bench/swebench/metrics` folder\n",
"from conversion import convert_log_to_ground_truth\n",
"from getters import get_logs_gold\n",
"from monitor import monitor_validation, monitor_logs_same_diff\n",
"sys.path = sys.path[:-1]\n",
"\n",
"sys.path.append('/path/to/harness/') # TODO: Replace with path to `SWE-bench/harness` folder\n",
"sys.path.append('/path/to/harness/') # TODO: Replace with path to `SWE-bench/swebench/harness` folder\n",
"from utils import has_attribute_or_import_error\n",
"sys.path = sys.path[:-1]"
]