Add CLARE Attack #356
Conversation
@jinyongyoo `flake8 textattack/`

@Hanyu-Liu-123 please review!

will do today!
```python
merged_words = self._get_merged_words(current_text, merge_indices)
transformed_texts = []
for i in range(len(merged_words)):
    index_to_modify = indices_to_modify[i]
```
I think this line should actually be `index_to_modify = merge_indices[i]`. We're trying to merge the indices specified by `merge_indices`, not `indices_to_modify`, and those two lists don't necessarily match. For example, suppose we have the sentence "I love to play soccer", `merge_indices = [3]`, `indices_to_modify = [0, 1, 2, 3, 4]`, and `merged_words = ["sleep"]`. Calling `index_to_modify = indices_to_modify[i]` would give us "sleep to play soccer", which is wrong. The correct output should be "I love to sleep".
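To make the failure concrete, here's a minimal standalone sketch of that example, using plain Python lists instead of TextAttack's `AttackedText`; the `apply_merge` helper is hypothetical, for illustration only:

```python
words = ["I", "love", "to", "play", "soccer"]
merge_indices = [3]                 # merge "play" + "soccer"
indices_to_modify = [0, 1, 2, 3, 4]
merged_words = ["sleep"]

def apply_merge(words, index, replacement):
    # Replace words[index] and words[index + 1] with a single word.
    return words[:index] + [replacement] + words[index + 2:]

# Buggy: indices_to_modify[0] == 0, so the merge happens at the wrong position.
print(" ".join(apply_merge(words, indices_to_modify[0], merged_words[0])))
# -> "sleep to play soccer"

# Fixed: merge_indices[0] == 3, the position the merge was actually chosen for.
print(" ".join(apply_merge(words, merge_indices[0], merged_words[0])))
# -> "I love to sleep"
```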
I agree.

Also, does `merge_indices` have an order? E.g., which index should be merged first? Some indices should have higher priority since they are more easily attacked.

Additionally, could you let me know which line of the code has the function that orders the three perturbations at the same index? E.g., at [3], the perturbation effect might be Insert > Replace > Merge; in that case, we would only use Insert to perturb the text and discard the other two perturbations.
`merge_indices` does not have a priority order, since it is just the output of `find_merge_index`. `find_merge_index` runs on the modifiable indices of the text (something like `[0, 1, 2, 5]`) and is in ascending order of numerical value.

We actually generate all merged sentences first, and then feed them along with all the perturbed sentences generated from `masked_insertion` and `masked_swap` to the search function. So we don't really order the three perturbations; we just generate every eligible perturbation and then find the best one using the search function.
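In other words, the three transformations contribute to one shared candidate pool. A schematic sketch with toy data and a stand-in scoring function (not the PR's actual code):

```python
# Candidates produced by the three transformations for "I love to play soccer".
masked_swap_texts = ["I adore to play soccer"]
masked_insertion_texts = ["I really love to play soccer"]
merged_texts = ["I love to sleep"]

def score_fn(text):
    # Hypothetical stand-in for the goal function's score of a candidate.
    return len(text)

# Pool everything, then let the greedy search pick the best candidate.
candidates = masked_swap_texts + masked_insertion_texts + merged_texts
best = max(candidates, key=score_fn)
print(best)  # -> "I really love to play soccer" (longest, under this toy score)
```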
@Hanyu-Liu-123 Thanks for catching this. I copied the code from word swap language model and forgot to change it.
Looks good to me except for that one line!
Also, was the weird character problem solved by just stripping away those characters?
Thanks!
I'm not familiar with the whole codebase and pipeline, so I'll just leave some high-level comments based on comparing against my original implementation.
method="bae", | ||
masked_language_model=shared_masked_lm, | ||
tokenizer=shared_tokenizer, | ||
max_candidates=20, |
What does `max_candidates = 20` do? Does it further constrain the replacement set after confidence thresholding? I remember `min_confidence=5e-4` roughly creates a candidate set of 37 words on average, so 20 may cut below that.

Actually, this constraint really isn't needed in most cases. If you are worried about a too-large candidate set, then I think 50 would be reasonable.
`max_candidates` is like taking top-k after filtering by min confidence. I set it to 50.
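For reference, here is a minimal sketch of that "filter by `min_confidence`, then take top-k" selection; the tensor shapes and variable names are illustrative, not the PR's exact code:

```python
import torch

torch.manual_seed(0)
vocab_size = 50265                  # distilroberta-base vocabulary size
logits = torch.randn(vocab_size)    # stand-in for masked-LM logits at [MASK]

min_confidence = 5e-4
max_candidates = 50

probs = torch.softmax(logits, dim=-1)
mask = probs >= min_confidence      # confidence thresholding first
n_kept = int(mask.sum())
k = min(max_candidates, n_kept)     # then top-k among the survivors
top_probs, top_ids = (probs * mask).topk(k)
print(n_kept, "pass the threshold;", k, "kept as candidates")
```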
return ["masked_lm_name", "max_length", "max_candidates", "min_confidence"] | ||
|
||
|
||
def find_merge_index(token_tags, indices=None): |
This was my developed code. You can try to relax the constraints or discard the function to see whether the generated texts look reasonable.
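For readers following along, here's an illustrative sketch of what a POS-based `find_merge_index` can look like; the allowed tag pairs below are examples only, not the exact constraints from this PR or the original CLARE code:

```python
# Merge position i is eligible if (tag[i], tag[i + 1]) is an allowed pair.
ALLOWED_PAIRS = {
    ("NOUN", "NOUN"),
    ("ADJ", "NOUN"),
    ("NUM", "NOUN"),
    ("ADV", "VERB"),
}

def find_merge_index(token_tags, indices=None):
    if indices is None:
        indices = range(len(token_tags) - 1)
    merge_indices = []
    for i in indices:
        if i + 1 < len(token_tags) and (token_tags[i], token_tags[i + 1]) in ALLOWED_PAIRS:
            merge_indices.append(i)
    return merge_indices

print(find_merge_index(["PRON", "VERB", "PART", "VERB", "NOUN"]))  # -> []
print(find_merge_index(["ADJ", "NOUN", "NOUN"]))                   # -> [0, 1]
```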
```python
# and calculate their cosine similarity in the embedding space (Jin et al., 2020)."
# The original implementation uses similarity of 0.7. Since the CLARE code is based
# on the TextFooler code, we need to adjust the threshold to account for the missing
# / pi in the cosine similarity comparison. So the final threshold is
# 1 - (1 - 0.7) / pi = 0.904507034.
```
So my original threshold doesn't have the pi in the cosine similarity. Not sure whether that has a significant impact.
I think the issue is whether you used cosine similarity or angular similarity. Did you use cosine similarity? Then it should just be 0.7.
Yes, I just used cosine similarity.
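So the `/ pi` adjustment quoted in the code comment wouldn't be needed if the constraint compares raw cosine similarity. A quick check of the arithmetic, plus the standard cosine-to-angular conversion for comparison (illustrative only):

```python
import math

cosine_threshold = 0.7

# The adjustment from the code comment above:
print(1 - (1 - cosine_threshold) / math.pi)       # 0.904507034...

# Standard angular similarity maps a cosine value s to 1 - acos(s) / pi,
# so a true cosine threshold of 0.7 would correspond to:
print(1 - math.acos(cosine_threshold) / math.pi)  # ~0.7468
```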
still need to add PSO constraint!
This PR adds the CLARE attack, which uses a distilled RoBERTa masked language model to perform word swaps, word insertions, and word merges (where we combine two adjacent words and replace them with another word) in a greedy manner.

New additions:
- `clare` recipe
- `WordInsertionMaskedLM`, which is similar to `WordSwapMaskedLM` but instead inserts new words
- `WordMergeMaskedLM`
- abstract transformation classes `WordInsertion` and `WordMerge`

Changes:
- `WordSwapMaskedLM` has been updated to have a minimum confidence score cutoff, and batching has been added for faster performance (a rough sketch of batched inference follows below).
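As a rough sketch of the batching change (not the PR's actual `WordSwapMaskedLM` code; the helper name and batch size are illustrative), batched masked-LM inference looks something like this:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

def batched_mask_probs(model, tokenizer, masked_texts, batch_size=16):
    """Run the masked LM over masked_texts in batches and return softmax probs."""
    all_probs = []
    model.eval()
    with torch.no_grad():
        for start in range(0, len(masked_texts), batch_size):
            batch = masked_texts[start : start + batch_size]
            inputs = tokenizer(batch, padding=True, return_tensors="pt")
            logits = model(**inputs).logits
            all_probs.append(torch.softmax(logits, dim=-1))
    return all_probs

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
probs = batched_mask_probs(model, tokenizer, ["I love to <mask> soccer."] * 32)
print(probs[0].shape)  # (batch, seq_len, vocab_size)
```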