Where does the score file in FastRerank come from? #2

Open
itscassie opened this issue Aug 5, 2019 · 3 comments

Comments

@itscassie

In config.py, on lines 26-28, it seems that the preprocessing step needs article.txt, title.txt, template, samples.index.json and _score.json to be prepared before the whole process can run.
But after running retrieve, I only got train/test/dev.sample.index in addition to the original article and title files.
So how can I get the other data I need, such as sample.index.json and _score.json?

@InitialBug
Owner

InitialBug commented Aug 6, 2019

You can use the index file to get the templates (the summaries of the corresponding training articles); each line of the index file contains the 30 template indices for one sample. The score is the ROUGE-1 of each template evaluated against the true summary. I'm sorry I didn't include this code: when I implemented this project, there was no suitable Python wrapper for ROUGE evaluation, so I had to run the Perl version separately for this job. That code, however, was later overwritten by other code.
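Since the original scoring code is not in the repository, the step described above can only be sketched under assumptions. The snippet below is a minimal illustration, not the author's implementation: it assumes each index line holds whitespace-separated, 0-based positions into the list of training titles, it uses a plain whitespace-tokenized unigram F1 in place of the Perl ROUGE script (which may stem or tokenize differently and may report recall rather than F1), and the names `rouge1_f1` and `build_score_entry` are hypothetical. The output keys mirror the `art_idx`, `scores` and `tp_idx` fields of the score file.

```python
import json
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings.

    Assumption: a simplified stand-in for ROUGE-1, not the Perl script.
    """
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def build_score_entry(art_idx, index_line, train_titles, true_summary):
    """Score every retrieved template of one article against its true summary.

    Assumption: index_line is whitespace-separated, 0-based indices.
    """
    tp_idx = [int(tok) for tok in index_line.split()]
    scores = [rouge1_f1(train_titles[i], true_summary) for i in tp_idx]
    return {"art_idx": str(art_idx), "scores": scores, "tp_idx": tp_idx}


# Tiny made-up data, only to show the shape of the result.
train_titles = [
    "stocks fall on rate fears",
    "team wins cup final",
    "stocks rise as rates hold",
]
entry = build_score_entry(0, "0 2 1", train_titles, "stocks rise on rate hopes")
print(json.dumps([entry]))
```

With the real data, each index line would carry 30 indices, so each entry would hold 30 scores and 30 template indices.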

@itscassie
Author

Could you give a quick example of what these two JSON files look like?
I'm now getting a text file after running the Retrieve part, and I'm not sure how the required .json files are supposed to be formatted.

@InitialBug
Owner

InitialBug commented Aug 6, 2019

The score.json looks like this:
[{"art_idx":"0","scores":[0.25,0.1333333333,0.25,0.25,0.1111111111,0.125,0.1428571429,0.2666666667,0.2857142857,0.2666666667,0.2666666667,0.2,0.25,0.125,0.2666666667,0.2666666667,0.2666666667,0.2,0.1538461538,0.375,0.625,0.25,0.5,0.2666666667,0.125,0.5333333333,0.2666666667,0.2666666667,0.25,0.125],"tp_idx":[280563,468740,2977802,2978740,1305283,810428,143628,3305902,96755,227145,227356,228893,2668569,2669230,2669605,2579854,2579826,86884,54116,88311,186211,342885,414963,558914,1305361,897042,2608945,2832328,554728,98514]}]

Actually, I didn't use sample.index.json in my code; it is left over from a previous version that I forgot to delete.
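To make the layout of the record above concrete, here is a small sketch that parses it and looks up the highest-scoring template. It assumes that `scores` and `tp_idx` are parallel lists (the i-th score belongs to the i-th template index), which the matching lengths suggest but the thread does not state outright; `best_template` is a hypothetical helper, not a function from the repository.

```python
import json

# The example record quoted in the thread, copied verbatim.
SAMPLE = (
    '[{"art_idx":"0","scores":[0.25,0.1333333333,0.25,0.25,0.1111111111,'
    '0.125,0.1428571429,0.2666666667,0.2857142857,0.2666666667,0.2666666667,'
    '0.2,0.25,0.125,0.2666666667,0.2666666667,0.2666666667,0.2,0.1538461538,'
    '0.375,0.625,0.25,0.5,0.2666666667,0.125,0.5333333333,0.2666666667,'
    '0.2666666667,0.25,0.125],"tp_idx":[280563,468740,2977802,2978740,'
    '1305283,810428,143628,3305902,96755,227145,227356,228893,2668569,'
    '2669230,2669605,2579854,2579826,86884,54116,88311,186211,342885,414963,'
    '558914,1305361,897042,2608945,2832328,554728,98514]}]'
)


def best_template(record):
    """Return (template index, score) of the highest-scoring template.

    Assumption: scores[i] is the score of the template at tp_idx[i].
    """
    scores, tp_idx = record["scores"], record["tp_idx"]
    if len(scores) != len(tp_idx):
        raise ValueError("scores and tp_idx must have the same length")
    best = max(range(len(scores)), key=scores.__getitem__)
    return tp_idx[best], scores[best]


records = json.loads(SAMPLE)  # the file is a JSON list, one object per article
best_idx, best_score = best_template(records[0])
print(len(records[0]["scores"]), best_idx, best_score)
```

Note that `art_idx` is stored as a string while the entries of `tp_idx` are integers, so a writer for this file would need to keep those types if the loading code depends on them.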
