
Commit

Added contrastive alternative
gsarti committed Aug 13, 2024
1 parent 00a504a commit e0b25d2
Showing 6 changed files with 33 additions and 25 deletions.
30 changes: 10 additions & 20 deletions README.md
@@ -240,30 +240,19 @@ All commands support the full range of parameters available for `attribute`, att
<details>
<summary><code>inseq attribute-context</code> example</summary>

The following example uses a GPT-2 model to generate a continuation of <code>input_current_text</code>, and uses the additional context provided by <code>input_context_text</code> to estimate its influence on the generation. In this case, the output <code>"to the hospital. He said he was fine"</code> is produced, and the generation of token <code>hospital</code> is found to be dependent on the context token <code>sick</code> according to the <code>contrast_prob_diff</code> step function.
The following example uses a small LM to generate a continuation of <code>input_current_text</code>, and uses the additional context provided by <code>input_context_text</code> to estimate its influence on the generation. In this case, the output <code>"to the hospital. He said he was fine"</code> is produced, and the generation of token <code>hospital</code> is found to be dependent on the context token <code>sick</code> according to the <code>contrast_prob_diff</code> step function.

```bash
inseq attribute-context \
--model_name_or_path gpt2 \
--model_name_or_path HuggingFaceTB/SmolLM-135M \
--input_context_text "George was sick yesterday." \
--input_current_text "His colleagues asked him to come" \
--attributed_fn "contrast_prob_diff"
```

**Result:**

```
Context with [contextual cues] (std λ=1.00) followed by output sentence with {context-sensitive target spans} (std λ=1.00)
(CTI = "kl_divergence", CCI = "saliency" w/ "contrast_prob_diff" target)
Input context: George was sick yesterday.
Input current: His colleagues asked him to come
Output current: to the hospital. He said he was fine
#1.
Generated output (CTI > 0.428): to the {hospital}(0.548). He said he was fine
Input context (CCI > 0.460): George was [sick](0.516) yesterday.
```
<img src="https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/attribute_context_hospital_output.png" style="width:300px">
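As a rough illustration of what the <code>contrast_prob_diff</code> target measures, the sketch below shows only the arithmetic: the probability of the generated token with the context present minus its probability without it. This is a hedged, standalone example, not inseq's implementation, and the probability values are made up:

```python
def contrast_prob_diff(p_contextual: float, p_contextless: float) -> float:
    """Probability of the generated token with context minus without it.

    A large positive value means the token depends heavily on the context.
    """
    return p_contextual - p_contextless

# Hypothetical probabilities for the token "hospital" in the example above:
# with "George was sick yesterday." as context vs. without it.
score = contrast_prob_diff(p_contextual=0.48, p_contextless=0.07)
print(f"{score:.2f}")  # 0.41
```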
</details>

## Planned Development
@@ -280,7 +269,7 @@ Our vision for Inseq is to create a centralized, comprehensive and robust set of

## Citing Inseq

If you use Inseq in your research we suggest to include a mention to the specific release (e.g. v0.6.0) and we kindly ask you to cite our reference paper as:
If you use Inseq in your research we suggest including a mention of the specific release (e.g. v0.6.0) and we kindly ask you to cite our reference paper as:

```bibtex
@inproceedings{sarti-etal-2023-inseq,
@@ -308,7 +297,7 @@ If you use Inseq in your research we suggest to include a mention to the specifi
Inseq has been used in various research projects. A list of known publications that use Inseq to conduct interpretability analyses of generative models is shown below.

> [!TIP]
> Last update: June 2024. Please open a pull request to add your publication to the list.
> Last update: August 2024. Please open a pull request to add your publication to the list.
<details>
<summary><b>2023</b></summary>
@@ -318,7 +307,6 @@ Inseq has been used in various research projects. A list of known publications t
<li> <a href="https://aclanthology.org/2023.nlp4convai-1.1/">Response Generation in Longitudinal Dialogues: Which Knowledge Representation Helps?</a> (Mousavi et al., 2023) </li>
<li> <a href="https://openreview.net/forum?id=XTHfNGI3zT">Quantifying the Plausibility of Context Reliance in Neural Machine Translation</a> (Sarti et al., 2023)</li>
<li> <a href="https://aclanthology.org/2023.emnlp-main.243/">A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation</a> (Attanasio et al., 2023)</li>
<li> <a href="https://arxiv.org/abs/2310.09820">Assessing the Reliability of Large Language Model Knowledge</a> (Wang et al., 2023)</li>
<li> <a href="https://aclanthology.org/2023.conll-1.18/">Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue</a> (Molnar et al., 2023)</li>
</ol>

@@ -327,13 +315,15 @@ Inseq has been used in various research projects. A list of known publications t
<details>
<summary><b>2024</b></summary>
<ol>
<li><a href="https://arxiv.org/abs/2401.12576">LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools</a> (Wang et al., 2024)</li>
<li> <a href="https://aclanthology.org/2024.naacl-long.46/">Assessing the Reliability of Large Language Model Knowledge</a> (Wang et al., 2024)</li>
<li><a href="https://aclanthology.org/2024.hcinlp-1.9">LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools</a> (Wang et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2402.00794">ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models</a> (Zhao et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2404.02421">Revisiting subword tokenization: A case study on affixal negation in large language models</a> (Truong et al., 2024)</li>
<li><a href="https://aclanthology.org/2024.naacl-long.284">Revisiting subword tokenization: A case study on affixal negation in large language models</a> (Truong et al., 2024)</li>
<li><a href="https://hal.science/hal-04581586">Exploring NMT Explainability for Translators Using NMT Visualising Tools</a> (Gonzalez-Saez et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2405.14899">DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning</a> (Zhou et al., 2024)</li>
<li><a href="https://openreview.net/forum?id=uILj5HPrag">DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning</a> (Zhou et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2406.06399">Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue</a> (Alghisi et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2406.13663">Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation</a> (Qi, Sarti et al., 2024)</li>
<li><a href="https://link.springer.com/chapter/10.1007/978-3-031-63787-2_14">NoNE Found: Explaining the Output of Sequence-to-Sequence Models When No Named Entity Is Recognized</a> (dela Cruz et al., 2024)</li>
</ol>

</details>
3 changes: 3 additions & 0 deletions inseq/commands/attribute_context/attribute_context.py
@@ -169,6 +169,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
)
cci_kwargs = {}
contextless_output = None
contrast_token = None
if args.attributed_fn is not None and is_contrastive_step_function(args.attributed_fn):
if not model.is_encoder_decoder:
formatted_input_current_text = concat_with_sep(
@@ -193,6 +194,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
contextless_output, skip_special_tokens=False, as_targets=model.is_encoder_decoder
)
tok_pos = -2 if model.is_encoder_decoder else -1
contrast_token = output_ctxless_tokens[tok_pos]
if args.attributed_fn == "kl_divergence" or output_ctx_tokens[tok_pos] == output_ctxless_tokens[tok_pos]:
cci_kwargs["contrast_force_inputs"] = True
bos_offset = int(model.is_encoder_decoder or output_ctx_tokens[0] == model.bos_token)
@@ -235,6 +237,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
cci_out = CCIOutput(
cti_idx=cti_idx,
cti_token=cti_tok,
contrast_token=contrast_token,
cti_score=cti_score,
contextual_output=contextual_output,
contextless_output=contextless_output,
@@ -24,7 +24,8 @@ class CCIOutput:
cti_token: str
cti_score: float
contextual_output: str
contextless_output: str
contrast_token: str | None = None
contextless_output: str | None = None
input_context_scores: list[float] | None = None
output_context_scores: list[float] | None = None

20 changes: 17 additions & 3 deletions inseq/commands/attribute_context/attribute_context_viz_helpers.py
@@ -288,7 +288,7 @@ def visualize_attribute_context_treescope(
replace_chars = {"Ġ": " ", "Ċ": "\n", "▁": " "}
cci_idx_map = {cci.cti_idx: cci for cci in output.cci_scores} if output.cci_scores is not None else {}
for curr_tok_idx, curr_tok in enumerate(output.output_current_tokens):
curr_tok_parts, highlighted_idx = get_single_token_heatmap_treescope(
curr_tok_parts, highlighted_idx, cleaned_curr_tok = get_single_token_heatmap_treescope(
curr_tok,
score=output.cti_scores[curr_tok_idx],
max_val=output.max_cti,
@@ -300,12 +300,26 @@
if curr_tok_idx in cci_idx_map:
cci_parts = [rp.text("\n")]
cci = cci_idx_map[curr_tok_idx]
if cci.contrast_token is not None:
contrast_token = cci.contrast_token
for char in replace_chars.keys():
contrast_token = contrast_token.strip(char)
if contrast_token != cleaned_curr_tok:
cci_parts += [
rp.custom_style(
rp.text("Contrastive alternative: "),
css_style="font-weight: bold; font-style: italic; color: #888888;",
),
rp.custom_style(
rp.text(contrast_token + "\n\n"), css_style="font-style: italic; color: #888888;"
),
]
if cci.input_context_scores is not None:
cci_parts.append(
get_tokens_heatmap_treescope(
tokens=output.input_context_tokens,
scores=cci.input_context_scores,
title=f'Input context CCI scores for "{cci.cti_token}"',
title=f'Input contextual cues for "{cleaned_curr_tok}"',
title_style="font-style: italic; color: #888888;",
min_val=output.min_cci,
max_val=output.max_cci,
@@ -320,7 +334,7 @@
get_tokens_heatmap_treescope(
tokens=output.output_context_tokens,
scores=cci.output_context_scores,
title=f'Output context CCI scores for "{cci.cti_token}"',
title=f'Output contextual cue for "{cleaned_curr_tok}"',
title_style="font-style: italic; color: #888888;",
min_val=output.min_cci,
max_val=output.max_cci,
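The token-cleanup loop added in this file strips tokenizer subword markers (`Ġ`, `Ċ`, `▁`) from the ends of the contrastive token before comparing it to the contextual one. A small standalone sketch of that logic (the example tokens are hypothetical):

```python
# Subword markers used by common tokenizers (GPT-2-style "Ġ"/"Ċ", SentencePiece "▁").
replace_chars = {"Ġ": " ", "Ċ": "\n", "▁": " "}

def clean_token(token: str) -> str:
    # str.strip(char) removes the marker only from the token's ends,
    # mirroring the loop in the diff above.
    for char in replace_chars:
        token = token.strip(char)
    return token

print(clean_token("Ġhospital"))  # hospital
# The "Contrastive alternative" line is shown only when the cleaned
# contrastive token differs from the cleaned contextual token:
print(clean_token("Ġhome") != clean_token("Ġhospital"))  # True
```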
2 changes: 1 addition & 1 deletion inseq/data/viz.py
@@ -628,7 +628,7 @@ def get_single_token_heatmap_treescope(
else:
parts.pop(idx_highlight)
if return_highlighted_idx:
return parts, idx_highlight
return parts, idx_highlight, show_token
return parts


