Hi,
In the /data folder are the ten training sets (or training "topics").
Within each topic's /annotation folder, there is an annotation file ending with *ann.txt.
Every topic also contains a folder called Reference_XML. The sentence IDs in the *ann.txt files reference the XML version of the reference document, available in that Reference_XML folder.
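For orientation, each topic folder is laid out roughly as follows (a sketch based only on the folders mentioned in this FAQ; the actual corpus may contain additional folders and files):

    /data
      <topic-id>/
        annotation/        <- the *ann.txt annotation file for the topic
        Reference_XML/     <- XML version of the reference article, with sentence IDs
        Documents_TXT/     <- plain-text (OCR) versions of the documents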
This legend explains each row of the file:
a. Reference article means the "topic" article which all the other articles in this folder cite.
b. Citing article refers to the EXACT filename.
c. The citances are in no particular order.
d. Every row looks as follows; the Reference Offset and Citation Offset fields hold sentence IDs from the reference XML and the citing article's XML (reference_xml and citance_xml), respectively.
Citance Number: 1 | Reference Article: X96-1048.txt | Citing Article: A97-1028.txt | Citation Marker Offset: ['10'] | Citation Marker: Sundheim, 1995b | Citation Offset: ['10'] | Citation Text: Named Entity evaluation began as a part of recent Message Understanding Conferences (MUC), whose objective was to standardize the evaluation of IE tasks (Sundheim, 1995b). | Reference Offset: ['357'] | Reference Text: In addition, there are plans to put evaluations on line, with public access, starting with the NE evaluation; this is intended to make the NE task familiar to new sites and to give them a convenient and low-pressure way to try their hand at following a standardized test procedure | Discourse Facet: Implication_Citation | Annotator: Muthu Kumar Chandrasekaran, NUS |
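As an illustration, a row in this pipe-separated format can be split into a field dictionary with a few lines of Python (a minimal sketch, assuming the "Field Name: value | Field Name: value" layout shown above; real rows may need extra handling for whitespace or encoding):

    def parse_annotation_row(row):
        # Split one *ann.txt row on the " | " separators and then on the
        # first ":" of each part, yielding {"Citance Number": "1", ...}.
        fields = {}
        for part in row.strip().strip("|").split(" | "):
            name, sep, value = part.partition(":")
            if sep:
                fields[name.strip()] = value.strip()
        return fields

    # Example use:
    # ann = parse_annotation_row(line)
    # ann["Reference Offset"]  ->  "['357']"

Note that the offset fields are stored as stringified lists (e.g. "['357']"); ast.literal_eval can convert them into real Python lists if needed.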
Below, we have addressed some FAQs. Please feel free to contact us with follow-up questions.
We thank the participants from 2014 and 2016 for their feedback, which has helped us to improve our corpus.
Regards,
Kokil Jaidka
Coordinator, SciSumm
____________________________________________________________________
FAQ
Q. How is the 2017 corpus different from the 2016 corpus?
The 2017 corpus comprises the training, development and test corpora from the 2016 Shared Task after considerable cleaning and refinement. Based on participant feedback in 2016, we removed those citations for which no text in the reference article factually matched the information being cited. We also removed citations where several studies within a broad topic were mentioned in a single sentence, as they often denoted a fact too broad to be pinpointed to a single span of text in the reference article.
Q. I noticed "..." in the annotations, what does that mean?
"..." followed the BioMedSumm standard practice of indicating discontiguous texts.
In Citation Text and Reference Text fields, the "..." means that there is a gap between two text spans (citation spans or reference spans). They may be on different pages, so the gap might be a text. There might be a formula or a figure there, or some text encoding which is not a part of the annotation.
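For example, a text field containing "..." can be split into its separate spans before any matching, as in this small sketch:

    def split_spans(text):
        # Treat "..." as the gap marker between discontiguous spans and
        # return the list of spans; text without "..." yields one span.
        return [span.strip() for span in text.split("...") if span.strip()]

    # split_spans("first span ... second span")  ->  ["first span", "second span"]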
Q. I notice there are 30 training topics (folders), where is the evaluation set?
The test sets will be released in the coming months. In the meantime, you can pilot your system on this training set.
Q. There are errors in parsing the files, such as misspelt words, spaces within words, sentences in the wrong place, and so on.
Unfortunately, the errors you see are OCR parsing errors in the "txt" versions of the documents, available in the Documents_TXT folder, and in the XML versions in the Reference_XML folder. This is not in our control. We recommend that your string matching be lenient enough to handle such problems.
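One possible way to make the matching lenient (a sketch only, not part of the official evaluation) is to compare similarity ratios instead of exact strings, e.g. with Python's difflib:

    from difflib import SequenceMatcher

    def lenient_match(candidate, target, threshold=0.8):
        # Return True if the two strings are similar enough despite OCR noise.
        # The 0.8 threshold is an illustrative value, not an official setting.
        ratio = SequenceMatcher(None, candidate.lower(), target.lower()).ratio()
        return ratio >= threshold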
Q. The citation/reference offset numbers do not match the actual values in the txt files.
Please refer to the sentence IDs in the XML files, which precede every sentence. We will use those IDs to evaluate how accurate the citation and reference matching is.
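A minimal sketch for reading those IDs, assuming each sentence in the Reference_XML files is an element carrying a sentence-ID attribute (the tag and attribute names below, S and sid, are assumptions; adjust them to match your files):

    import xml.etree.ElementTree as ET

    def load_sentence_ids(xml_path, tag="S", id_attr="sid"):
        # Map each sentence ID to its sentence text.
        root = ET.parse(xml_path).getroot()
        return {elem.get(id_attr): (elem.text or "").strip()
                for elem in root.iter(tag)
                if elem.get(id_attr) is not None}

    # sentences = load_sentence_ids("Reference_XML/X96-1048.xml")
    # sentences.get("357")  ->  text of the sentence with ID 357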
Q. The sentences in the *ann.txt files are not complete.
In some cases the sentences are limited to the relevant snippets in either the citance or the reference span.
Q. How is Task1b evaluated? OR Why are there multiple discourse facets in the annotations?
There may be multiple discourse facets identified in the reference span, which means that the citing article is referencing more than one kind of information in its citation. Depending on the confidence scores of your classifier, you may include more than one discourse facet when you generate your system output, and we will evaluate the overlap against the gold standard. Since Task 1b is contingent on Task 1a, only those discourse facets will be evaluated for which Task 1a was performed correctly.
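As a rough illustration of what evaluating overlap against the gold standard can look like (a sketch, not the official scorer), the predicted and gold facet sets can be compared with a simple F1:

    def facet_overlap(predicted, gold):
        # Simple F1 over facet labels between a system's predicted facets
        # and the gold-standard facets; illustrative only.
        predicted, gold = set(predicted), set(gold)
        hits = len(predicted & gold)
        if hits == 0:
            return 0.0
        precision = hits / len(predicted)
        recall = hits / len(gold)
        return 2 * precision * recall / (precision + recall)

    # facet_overlap(["Implication_Citation", "Method_Citation"], ["Implication_Citation"])
    # -> 0.666...  (the facet labels here are just examples)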