
Add CLARE Attack #356

Merged
merged 18 commits into from
Dec 12, 2020

Conversation

jinyongyoo (Collaborator) commented Nov 23, 2020

This PR adds the CLARE attack, which uses a distilled RoBERTa masked language model to greedily perform word swaps, word insertions, and word merges (where two adjacent words are combined and replaced with a single new word); a rough sketch of the merge idea follows the lists below.

New additions:

  • clare recipe
  • WordInsertionMaskedLM, which is similar to WordSwapMaskedLM but inserts new words instead of replacing existing ones.
  • WordMergeMaskedLM
  • Base classes WordInsertion and WordMerge.

Changes:

  • WordSwapMaskedLM now supports a minimum confidence score cutoff, and batching has been added for faster performance.
  • Changes in directory structure for transformations.
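As a minimal sketch of the merge transformation (illustrative only, not this PR's exact implementation; the model name and the merge_candidates helper are assumptions), two adjacent words are replaced with a single mask token and the masked LM's predictions above a confidence cutoff become merge candidates:

```python
# Illustrative sketch: merge two adjacent words via a masked LM.
# "distilroberta-base" stands in for the PR's distilled RoBERTa model.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="distilroberta-base")

def merge_candidates(words, i, min_confidence=5e-4, max_candidates=50):
    # Replace words[i] and words[i+1] with one mask token, then query the MLM.
    masked = words[:i] + [fill_mask.tokenizer.mask_token] + words[i + 2:]
    preds = fill_mask(" ".join(masked), top_k=max_candidates)
    return [p["token_str"].strip() for p in preds if p["score"] >= min_confidence]

print(merge_candidates("I love to play soccer".split(), 3))  # candidates for "play soccer"
```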

qiyanjun (Member) commented:

@jinyongyoo flake8 textattack/
textattack/transformations/__init__.py:10:1: F403 'from .word_swaps import *' used; unable to detect undefined names
textattack/transformations/__init__.py:11:1: F403 'from .word_insertions import *' used; unable to detect undefined names
textattack/transformations/__init__.py:12:1: F403 'from .word_merges import *' used; unable to detect undefined names
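One standard way to silence F403 (a general flake8 pattern, not necessarily what this PR did; the exact name list is an assumption) is to replace the star imports with explicit ones:

```python
# textattack/transformations/__init__.py -- explicit imports instead of `import *`
from .word_swaps import WordSwapMaskedLM
from .word_insertions import WordInsertionMaskedLM
from .word_merges import WordMergeMaskedLM
```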

qiyanjun (Member) commented Dec 2, 2020:

@Hanyu-Liu-123 please review!

Hanyu-Liu-123 (Collaborator) replied:

> @Hanyu-Liu-123 please review!

will do today!

merged_words = self._get_merged_words(current_text, merge_indices)
transformed_texts = []
for i in range(len(merged_words)):
    index_to_modify = indices_to_modify[i]
Hanyu-Liu-123 (Collaborator) commented on this snippet:
I think this line should actually be index_to_modify = merge_indices[i]. We're trying to merge the indices specified by merge_indices, not indices_to_modify, and those two lists don't necessarily match.

For example, suppose we have the sentence "I love to play soccer", merge_indices = [3], indices_to_modify = [0, 1, 2, 3, 4], and merged_words = ["sleep"]. Calling index_to_modify = indices_to_modify[i] would give us "sleep to play soccer", which is wrong. The correct output should be "I love to sleep".
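A tiny self-contained illustration of the bug, using the example above (the merge_at helper is hypothetical, standing in for the transformation's replace step):

```python
# Hypothetical helper: merging at index j replaces words[j] and words[j+1].
def merge_at(words, j, new_word):
    return words[:j] + [new_word] + words[j + 2:]

words = "I love to play soccer".split()
merge_indices, indices_to_modify, merged_words = [3], [0, 1, 2, 3, 4], ["sleep"]

print(" ".join(merge_at(words, indices_to_modify[0], merged_words[0])))  # wrong: "sleep to play soccer"
print(" ".join(merge_at(words, merge_indices[0], merged_words[0])))      # right: "I love to sleep"
```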

cookielee77 commented:

I agree.

Also, does merge_indices have an order? e.g., which index should be merged first? Some indices should have higher priority since they are more easily attacked.

Additionally, could you let me know which line of code has the function that orders the three perturbations at the same index? e.g., at [3], the perturbation effect might rank Insert > Replace > Merge; in that case, we would only use Insert to perturb the text and discard the other two perturbations.

A collaborator replied:

merge_indices does not have a priority order, since it is just the output of find_merge_index. find_merge_index runs on the modifiable indices of the text (something like [0,1,2,5]) and is in ascending order of numerical value.

We actually generate all merged sentences first, and then feed them along with all the perturbed sentences generated from masked_insertion and masked_swap to the search function. So we don't really order the three perturbations and just generate every eligible perturbation, then find the best one using the search function.
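A sketch of how the three perturbation types could be exposed to the search method at once, assuming TextAttack's CompositeTransformation and this PR's new classes (the constructor arguments shown are assumptions):

```python
# Sketch: combine CLARE's three transformations so a greedy search sees
# every eligible candidate; exact argument names may differ from the PR.
from textattack.transformations import (
    CompositeTransformation,
    WordSwapMaskedLM,
    WordInsertionMaskedLM,
    WordMergeMaskedLM,
)

transformation = CompositeTransformation([
    WordSwapMaskedLM(method="bae", max_candidates=50, min_confidence=5e-4),
    WordInsertionMaskedLM(max_candidates=50, min_confidence=5e-4),
    WordMergeMaskedLM(max_candidates=50, min_confidence=5e-4),
])
# The search method then scores all generated texts and keeps the best one.
```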

jinyongyoo (Collaborator, Author) replied:
@Hanyu-Liu-123 Thanks for catching this. I copied the code from word swap language model and forgot to change it.

Hanyu-Liu-123 (Collaborator) left a review:

Looks good to me except for that one line!

Also, was the weird character problem solved by just stripping away those characters?

Thanks!

cookielee77 left a review:

I'm not familiar with the whole codebase and pipeline, so I'll just leave some high-level comments comparing against my original implementation.

method="bae",
masked_language_model=shared_masked_lm,
tokenizer=shared_tokenizer,
max_candidates=20,

cookielee77 commented on this snippet:

What does max_candidates = 20 do? Does it further constrain the replacement set after confidence thresholding?

I remember min_confidence=5e-4 roughly creates a candidate set of 37 words on average, so 20 may be smaller than that.

Actually, this constraint isn't really needed in most cases. If you are worried about a too-large candidate set, then I think 50 should be reasonable.

jinyongyoo (Collaborator, Author) replied:

max_candidates acts like a top-k cutoff applied after filtering by minimum confidence. I've set it to 50.
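A toy illustration of that ordering, filter by confidence first and then cap at top-k (the words and probabilities are made up):

```python
# Made-up scores: min_confidence filters first, max_candidates caps the rest.
probs = {"sleep": 0.30, "rest": 0.20, "doze": 0.01, "nap": 4e-4}
min_confidence, max_candidates = 5e-4, 2

kept = sorted((w for w, p in probs.items() if p >= min_confidence),
              key=probs.get, reverse=True)[:max_candidates]
print(kept)  # ['sleep', 'rest'] -- "nap" fails the filter, "doze" falls past top-2
```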


return ["masked_lm_name", "max_length", "max_candidates", "min_confidence"]


def find_merge_index(token_tags, indices=None):

cookielee77 commented on this snippet:

This was code I developed originally. You can try relaxing the constraints, or discarding the function entirely, to see whether the generated texts still look reasonable.
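Purely as a guess at the shape of such a POS-based merge filter (the PR's actual rules in find_merge_index may differ substantially; the tag pairs below are assumptions):

```python
# Hypothetical sketch: keep indices whose (tag, next tag) pair looks mergeable.
def find_merge_index(token_tags, indices=None):
    mergeable = {("ADJ", "NOUN"), ("NOUN", "NOUN"), ("VERB", "ADV")}  # assumed pairs
    candidates = indices if indices is not None else range(len(token_tags) - 1)
    merge_indices = []
    for i in candidates:
        if i + 1 >= len(token_tags):
            continue
        if (token_tags[i][1], token_tags[i + 1][1]) in mergeable:
            merge_indices.append(i)
    return merge_indices

# e.g., with (word, tag) pairs from any POS tagger:
print(find_merge_index([("red", "ADJ"), ("car", "NOUN"), ("runs", "VERB")]))  # [0]
```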

# and calculate their cosine similarity in the embedding space (Jin et al., 2020)."
# The original implementation uses similarity of 0.7. Since the CLARE code is based on the TextFooler code,
# we need to adjust the threshold to account for the missing / pi in the cosine similarity comparison
# So the final threshold is 1 - (1 - 0.7) / pi = 0.904507034.

cookielee77 commented on this snippet:

So my original threshold doesn't have the pi in the cosine similarity. Not sure whether it has a significant impact.

jinyongyoo (Collaborator, Author) replied:

I think the issue is whether you used cosine similarity or angular similarity. Did you use cosine similarity? Then it should just be 0.7.

cookielee77 replied:

Yes, I just used cosine similarity.
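For reference, the rescaling from the code comment above works out as follows (this just reproduces the PR's arithmetic, not a claim about which similarity measure is theoretically correct):

```python
# 0.7 is CLARE's original cosine-similarity threshold; this codebase's
# comparison rescales the (1 - similarity) gap by pi, per the comment above.
import math
print(1 - (1 - 0.7) / math.pi)  # 0.9045070341448549
```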

qiyanjun merged commit 04b7c6f into master on Dec 12, 2020.