Add CLARE Attack #356
Conversation
@jinyongyoo `flake8 textattack/`

@Hanyu-Liu-123 please review!

will do today!
```python
merged_words = self._get_merged_words(current_text, merge_indices)
transformed_texts = []
for i in range(len(merged_words)):
    index_to_modify = indices_to_modify[i]
```
I think this line should actually be `index_to_modify = merge_indices[i]`. We're trying to merge the indices specified by `merge_indices`, not `indices_to_modify`, and those two lists don't necessarily match. For example, suppose we have the sentence "I love to play soccer", `merge_indices = [3]`, `indices_to_modify = [0, 1, 2, 3, 4]`, and `merged_words = ["sleep"]`. Calling `index_to_modify = indices_to_modify[i]` would give us "sleep to play soccer", which is wrong. The correct output should be "I love to sleep".
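To make the failure concrete, here's a minimal standalone sketch of that example, using plain Python lists instead of TextAttack's `AttackedText`; the `apply_merge` helper is hypothetical, for illustration only:

```python
words = ["I", "love", "to", "play", "soccer"]
merge_indices = [3]                 # merge "play" + "soccer"
indices_to_modify = [0, 1, 2, 3, 4]
merged_words = ["sleep"]

def apply_merge(words, index, replacement):
    # Replace words[index] and words[index + 1] with a single word.
    return words[:index] + [replacement] + words[index + 2:]

# Buggy: indices_to_modify[0] == 0, so the merge happens at the wrong position.
print(" ".join(apply_merge(words, indices_to_modify[0], merged_words[0])))
# -> "sleep to play soccer"

# Fixed: merge_indices[0] == 3, the position the merge was actually chosen for.
print(" ".join(apply_merge(words, merge_indices[0], merged_words[0])))
# -> "I love to sleep"
```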
I agree.

Also, does `merge_indices` have an order? E.g., which index should be merged first? Some indices should have higher priority since they are more easily attacked.

Additionally, could you let me know which line of the code has the function that orders the three perturbations at the same index? E.g., at [3], the perturbation effect might be Insert > Replace > Merge; in that case, we would only use Insert to perturb the text and discard the other two perturbations.
`merge_indices` does not have a priority order, since it is just the output of `find_merge_index`. `find_merge_index` runs on the modifiable indices of the text (something like `[0, 1, 2, 5]`) and is in ascending order of numerical value.

We actually generate all merged sentences first, and then feed them along with all the perturbed sentences generated from `masked_insertion` and `masked_swap` to the search function. So we don't really order the three perturbations; we just generate every eligible perturbation and then find the best one using the search function.
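In other words, the three transformations contribute to one shared candidate pool. A schematic sketch with toy data and a stand-in scoring function (not the PR's actual code):

```python
# Candidates produced by the three transformations for "I love to play soccer".
masked_swap_texts = ["I adore to play soccer"]
masked_insertion_texts = ["I really love to play soccer"]
merged_texts = ["I love to sleep"]

def score_fn(text):
    # Hypothetical stand-in for the goal function's score of a candidate.
    return len(text)

# Pool everything, then let the greedy search pick the best candidate.
candidates = masked_swap_texts + masked_insertion_texts + merged_texts
best = max(candidates, key=score_fn)
print(best)  # -> "I really love to play soccer" (longest, under this toy score)
```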
@Hanyu-Liu-123 Thanks for catching this. I copied the code from word swap language model and forgot to change it.
Looks good to me except for that one line!
Also, was the weird character problem solved by just stripping away those characters?
Thanks!
I'm not familiar with the whole codebase and pipeline, so I'll just leave some high-level comments based on comparing against my original implementation.
method="bae", | ||
masked_language_model=shared_masked_lm, | ||
tokenizer=shared_tokenizer, | ||
max_candidates=20, |
What does `max_candidates = 20` do? Does it further constrain the replacement set after confidence thresholding? I remember `min_confidence=5e-4` roughly creates a candidate set of 37 words on average, so 20 may cut below that.

Actually, this constraint really isn't needed in most cases. If you are worried about a too-large candidate set, then I think 50 would be reasonable.
`max_candidates` is like taking top-k after filtering by min confidence. I set it to 50.
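For reference, here is a minimal sketch of that "filter by `min_confidence`, then take top-k" selection; the tensor shapes and variable names are illustrative, not the PR's exact code:

```python
import torch

torch.manual_seed(0)
vocab_size = 50265                  # distilroberta-base vocabulary size
logits = torch.randn(vocab_size)    # stand-in for masked-LM logits at [MASK]

min_confidence = 5e-4
max_candidates = 50

probs = torch.softmax(logits, dim=-1)
mask = probs >= min_confidence      # confidence thresholding first
n_kept = int(mask.sum())
k = min(max_candidates, n_kept)     # then top-k among the survivors
top_probs, top_ids = (probs * mask).topk(k)
print(n_kept, "pass the threshold;", k, "kept as candidates")
```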
return ["masked_lm_name", "max_length", "max_candidates", "min_confidence"] | ||
|
||
|
||
def find_merge_index(token_tags, indices=None): |
This was my developed code. You can try to relax the constraints or discard the function to see whether the generated texts look reasonable.
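For readers following along, here's an illustrative sketch of what a POS-based `find_merge_index` can look like; the allowed tag pairs below are examples only, not the exact constraints from this PR or the original CLARE code:

```python
# Merge position i is eligible if (tag[i], tag[i + 1]) is an allowed pair.
ALLOWED_PAIRS = {
    ("NOUN", "NOUN"),
    ("ADJ", "NOUN"),
    ("NUM", "NOUN"),
    ("ADV", "VERB"),
}

def find_merge_index(token_tags, indices=None):
    if indices is None:
        indices = range(len(token_tags) - 1)
    merge_indices = []
    for i in indices:
        if i + 1 < len(token_tags) and (token_tags[i], token_tags[i + 1]) in ALLOWED_PAIRS:
            merge_indices.append(i)
    return merge_indices

print(find_merge_index(["PRON", "VERB", "PART", "VERB", "NOUN"]))  # -> []
print(find_merge_index(["ADJ", "NOUN", "NOUN"]))                   # -> [0, 1]
```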
```python
# and calculate their cosine similarity in the embedding space (Jin et al., 2020)."
# The original implementation uses similarity of 0.7. Since the CLARE code is based
# on the TextFooler code, we need to adjust the threshold to account for the missing
# / pi in the cosine similarity comparison. So the final threshold is
# 1 - (1 - 0.7) / pi = 0.904507034.
```
So my original threshold doesn't have the pi in the cosine similarity. Not sure whether that has a significant impact.
I think the issue is whether you used cosine similarity or angular similarity. Did you use cosine similarity? Then it should just be 0.7.
Yes, I just used cosine similarity.
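So the `/ pi` adjustment quoted in the code comment wouldn't be needed if the constraint compares raw cosine similarity. A quick check of the arithmetic, plus the standard cosine-to-angular conversion for comparison (illustrative only):

```python
import math

cosine_threshold = 0.7

# The adjustment from the code comment above:
print(1 - (1 - cosine_threshold) / math.pi)       # 0.904507034...

# Standard angular similarity maps a cosine value s to 1 - acos(s) / pi,
# so a true cosine threshold of 0.7 would correspond to:
print(1 - math.acos(cosine_threshold) / math.pi)  # ~0.7468
```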
still need to add PSO constraint!
This PR adds the CLARE attack, which uses a distilled RoBERTa masked language model to perform word swaps, word insertions, and word merges (where we combine two adjacent words and replace them with another word) in a greedy manner.

New additions:
- `clare` recipe
- `WordInsertionMaskedLM`, which is similar to `WordSwapMaskedLM` but instead inserts new words
- `WordMergeMaskedLM`
- abstract transformation classes `WordInsertion` and `WordMerge`

Changes:
- `WordSwapMaskedLM` has been updated to have a minimum confidence score cutoff, and batching has been added for faster performance (a rough sketch of batched inference follows below).
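As a rough sketch of the batching change (not the PR's actual `WordSwapMaskedLM` code; the helper name and batch size are illustrative), batched masked-LM inference looks something like this:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

def batched_mask_probs(model, tokenizer, masked_texts, batch_size=16):
    """Run the masked LM over masked_texts in batches and return softmax probs."""
    all_probs = []
    model.eval()
    with torch.no_grad():
        for start in range(0, len(masked_texts), batch_size):
            batch = masked_texts[start : start + batch_size]
            inputs = tokenizer(batch, padding=True, return_tensors="pt")
            logits = model(**inputs).logits
            all_probs.append(torch.softmax(logits, dim=-1))
    return all_probs

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
probs = batched_mask_probs(model, tokenizer, ["I love to <mask> soccer."] * 32)
print(probs[0].shape)  # (batch, seq_len, vocab_size)
```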