-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Conversation
jaseweston
commented
Nov 22, 2021
•
edited
Loading
edited
- moved to mutators.py files
- moved "wow" and "woi" always to the front of mutator names, it was inconsistent sometimes at front or back
- new mutator to convert to "woi" format, so then you can e.g. use all the woi mutators for wow data
d = docs[ind] | ||
# Guarantees that checked sentences are not split in half (as we split by space). | ||
for i in range(len(checked_sentences)): | ||
d = d.replace(checked_sentences[i], "||CHECKED_SENTENCE_" + str(i) + "||") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: f"||CHECKED_SENTENCE_{i}||"
looks slightly better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in my opinion, asking for these kind of nit changes just increases the chance of already working code suddenly becoming buggy by accident for not much gain, + taking time to check if it still works when we're trying to like do some research, but i can try..
# last chunk | ||
for i in range(len(checked_sentences)): | ||
d = d.replace( | ||
"||CHECKED_SENTENCE_" + str(i) + "||", checked_sentences[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above for the string.
) | ||
|
||
def message_mutation(self, message: Message) -> Message: | ||
chunk_sz = self.opt.get('woi_doc_chunk_size', 500) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the mistakes that we had with setting flags before, may be we should have this as something like
if not message.get('woi_doc_chunk_size'):
logging.warning("The --woi-doc-chunk-size flag is received as None, using the default value of X")
It has happened to me that I set a flag without knowing that it was ignored. This at least warns about issue like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, what we actually need to do is fix the bug in parlai .. i was hoping kurt would do this...
if remove[i] == 0: | ||
new_docs.append(docs[i]) | ||
else: | ||
# We may still keep the doc if it contains the gold checked sentence(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case should we mark one of the other docs to be removed instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll just have one less, i dont think it's a big deal