Commit c7ece99 — Fix typos and reformat sentences
x389liu committed Nov 12, 2020 (1 parent c038312)
Showing 1 changed file with 12 additions and 17 deletions: docs/working-with-spacy.md

Then we have sentences:

## Entity Linking

Unfortunately, spaCy does not currently provide any pre-trained entity linking model.
However, we found another great entity linking package called [Radboud Entity Linker (REL)](https://github.com/informagi/REL#rel-radboud-entity-linker).

In this section, we introduce an entity linking [script](../scripts/entity_linking.py) which links texts to both Wikipedia and Wikidata entities, using spaCy NER and the REL entity linker.
The input should be a JSONL file with one JSON object per line, like [this](https://github.com/castorini/pyserini/blob/master/integrations/resources/sample_collection_jsonl/documents.jsonl), and the output is also a JSONL file, where each JSON object has the format:

```
{
  ...
}
```

For example, given the input file

### Input Prep

Let us take the MS MARCO passage dataset as an example.
We need to download the MS MARCO passage dataset and convert the TSV collection into JSONL files by following the detailed instructions [here](https://github.com/castorini/pyserini/blob/master/docs/experiments-msmarco-passage.md#data-prep).
Now we should have 9 JSONL files in `collections/msmarco-passage/collection_jsonl`, and each file path can be used as `input_path` in our scripts.
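As a rough sketch of what the script consumes, each input file can be read line by line as JSON objects; the field names below (`id`, `contents`) follow the Pyserini sample collection linked above:

```python
import json

def read_jsonl_lines(lines):
    """Yield (doc_id, text) pairs from an iterable of JSONL lines."""
    for line in lines:
        line = line.strip()
        if line:
            obj = json.loads(line)
            yield obj["id"], obj["contents"]

# A one-document example in the sample collection's format.
sample = ['{"id": "doc1", "contents": "Albert Einstein was born in Ulm."}']
print(list(read_jsonl_lines(sample)))
# → [('doc1', 'Albert Einstein was born in Ulm.')]
```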

### REL

First, we follow the GitHub [instructions](https://github.com/informagi/REL#installation-from-source) to install REL and download the required generic file, the appropriate Wikipedia corpus, and the corresponding ED model.
Then we set up the variable `base_url` as explained in this [tutorial](https://github.com/informagi/REL/blob/master/tutorials/01_How_to_get_started.md#how-to-get-started).

Note that the `base_url` and the ED model path are passed to our script as `rel_base_url` and `rel_ed_model_path` respectively.
Another parameter, `rel_wiki_version`, depends on the version of the Wikipedia corpus downloaded, e.g. `wiki_2019` for the 2019 Wikipedia corpus.
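Since all three REL parameters hang off the same download, it can help to derive them from one base directory and corpus year. The directory layout below (`<base_url>/<wiki_version>/generated/model`) is an assumption for illustration; check where your REL download actually placed the ED model:

```python
from pathlib import PurePosixPath

def rel_params(base_url, wiki_year):
    """Build the REL arguments the script expects from one base path and corpus year.

    The 'generated/model' suffix is an assumed layout, not mandated by REL.
    """
    wiki_version = f"wiki_{wiki_year}"
    ed_model_path = str(PurePosixPath(base_url) / wiki_version / "generated" / "model")
    return {
        "rel_base_url": base_url,
        "rel_wiki_version": wiki_version,
        "rel_ed_model_path": ed_model_path,
    }

print(rel_params("/data/rel", 2019))
```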

### wikimapper

```
python entity_linking.py --input_path [input_jsonl_file] --rel_base_url [base_url] \
    --spacy_model [en_core_web_sm, en_core_web_lg, etc.] --output_path [output_jsonl_file]
```

It should take about 5 to 10 minutes to run entity linking on 5,000 MS MARCO passages on Compute Canada.
See [this guide](https://github.com/castorini/onboarding/blob/master/docs/cc-guide.md#compute-canada) for instructions about running scripts on Compute Canada.
