
Commit

Added contrastive alternative
gsarti committed Aug 13, 2024
1 parent 00a504a commit e0b25d2
Showing 6 changed files with 33 additions and 25 deletions.
30 changes: 10 additions & 20 deletions README.md
@@ -240,30 +240,19 @@ All commands support the full range of parameters available for `attribute`, att
<details>
<summary><code>inseq attribute-context</code> example</summary>

The following example uses a GPT-2 model to generate a continuation of <code>input_current_text</code>, and uses the additional context provided by <code>input_context_text</code> to estimate its influence on the generation. In this case, the output <code>"to the hospital. He said he was fine"</code> is produced, and the generation of token <code>hospital</code> is found to be dependent on the context token <code>sick</code> according to the <code>contrast_prob_diff</code> step function.
The following example uses a small LM to generate a continuation of <code>input_current_text</code>, and uses the additional context provided by <code>input_context_text</code> to estimate its influence on the generation. In this case, the output <code>"to the hospital. He said he was fine"</code> is produced, and the generation of token <code>hospital</code> is found to be dependent on the context token <code>sick</code> according to the <code>contrast_prob_diff</code> step function.

```bash
inseq attribute-context \
--model_name_or_path gpt2 \
--model_name_or_path HuggingFaceTB/SmolLM-135M \
--input_context_text "George was sick yesterday." \
--input_current_text "His colleagues asked him to come" \
--attributed_fn "contrast_prob_diff"
```

**Result:**

```
Context with [contextual cues] (std λ=1.00) followed by output sentence with {context-sensitive target spans} (std λ=1.00)
(CTI = "kl_divergence", CCI = "saliency" w/ "contrast_prob_diff" target)
Input context: George was sick yesterday.
Input current: His colleagues asked him to come
Output current: to the hospital. He said he was fine
#1.
Generated output (CTI > 0.428): to the {hospital}(0.548). He said he was fine
Input context (CCI > 0.460): George was [sick](0.516) yesterday.
```
<img src="https://raw.githubusercontent.com/inseq-team/inseq/main/docs/source/images/attribute_context_hospital_output.png" style="width:300px">
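As a rough illustration of what the <code>contrast_prob_diff</code> target measures, the sketch below shows only the arithmetic: the probability of the generated token with the context present minus its probability without it. This is a hedged, standalone example, not inseq's implementation, and the probability values are made up:

```python
def contrast_prob_diff(p_contextual: float, p_contextless: float) -> float:
    """Probability of the generated token with context minus without it.

    A large positive value means the token depends heavily on the context.
    """
    return p_contextual - p_contextless

# Hypothetical probabilities for the token "hospital" in the example above:
# with "George was sick yesterday." as context vs. without it.
score = contrast_prob_diff(p_contextual=0.48, p_contextless=0.07)
print(f"{score:.2f}")  # 0.41
```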
</details>

## Planned Development
@@ -280,7 +269,7 @@ Our vision for Inseq is to create a centralized, comprehensive and robust set of

## Citing Inseq

If you use Inseq in your research we suggest to include a mention to the specific release (e.g. v0.6.0) and we kindly ask you to cite our reference paper as:
If you use Inseq in your research we suggest including a mention of the specific release (e.g. v0.6.0) and we kindly ask you to cite our reference paper as:

```bibtex
@inproceedings{sarti-etal-2023-inseq,
@@ -308,7 +297,7 @@ If you use Inseq in your research we suggest to include a mention to the specifi
Inseq has been used in various research projects. A list of known publications that use Inseq to conduct interpretability analyses of generative models is shown below.

> [!TIP]
> Last update: June 2024. Please open a pull request to add your publication to the list.
> Last update: August 2024. Please open a pull request to add your publication to the list.
<details>
<summary><b>2023</b></summary>
@@ -318,7 +307,6 @@ Inseq has been used in various research projects. A list of known publications t
<li> <a href="https://aclanthology.org/2023.nlp4convai-1.1/">Response Generation in Longitudinal Dialogues: Which Knowledge Representation Helps?</a> (Mousavi et al., 2023) </li>
<li> <a href="https://openreview.net/forum?id=XTHfNGI3zT">Quantifying the Plausibility of Context Reliance in Neural Machine Translation</a> (Sarti et al., 2023)</li>
<li> <a href="https://aclanthology.org/2023.emnlp-main.243/">A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation</a> (Attanasio et al., 2023)</li>
<li> <a href="https://arxiv.org/abs/2310.09820">Assessing the Reliability of Large Language Model Knowledge</a> (Wang et al., 2023)</li>
<li> <a href="https://aclanthology.org/2023.conll-1.18/">Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue</a> (Molnar et al., 2023)</li>
</ol>

@@ -327,13 +315,15 @@ Inseq has been used in various research projects. A list of known publications t
<details>
<summary><b>2024</b></summary>
<ol>
<li><a href="https://arxiv.org/abs/2401.12576">LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools</a> (Wang et al., 2024)</li>
<li> <a href="https://aclanthology.org/2024.naacl-long.46/">Assessing the Reliability of Large Language Model Knowledge</a> (Wang et al., 2024)</li>
<li><a href="https://aclanthology.org/2024.hcinlp-1.9">LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools</a> (Wang et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2402.00794">ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models</a> (Zhao et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2404.02421">Revisiting subword tokenization: A case study on affixal negation in large language models</a> (Truong et al., 2024)</li>
<li><a href="https://aclanthology.org/2024.naacl-long.284">Revisiting subword tokenization: A case study on affixal negation in large language models</a> (Truong et al., 2024)</li>
<li><a href="https://hal.science/hal-04581586">Exploring NMT Explainability for Translators Using NMT Visualising Tools</a> (Gonzalez-Saez et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2405.14899">DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning</a> (Zhou et al., 2024)</li>
<li><a href="https://openreview.net/forum?id=uILj5HPrag">DETAIL: Task DEmonsTration Attribution for Interpretable In-context Learning</a> (Zhou et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2406.06399">Should We Fine-Tune or RAG? Evaluating Different Techniques to Adapt LLMs for Dialogue</a> (Alghisi et al., 2024)</li>
<li><a href="https://arxiv.org/abs/2406.13663">Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation</a> (Qi, Sarti et al., 2024)</li>
<li><a href="https://link.springer.com/chapter/10.1007/978-3-031-63787-2_14">NoNE Found: Explaining the Output of Sequence-to-Sequence Models When No Named Entity Is Recognized</a> (dela Cruz et al., 2024)</li>
</ol>

</details>
3 changes: 3 additions & 0 deletions inseq/commands/attribute_context/attribute_context.py
@@ -169,6 +169,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
)
cci_kwargs = {}
contextless_output = None
contrast_token = None
if args.attributed_fn is not None and is_contrastive_step_function(args.attributed_fn):
if not model.is_encoder_decoder:
formatted_input_current_text = concat_with_sep(
@@ -193,6 +194,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
contextless_output, skip_special_tokens=False, as_targets=model.is_encoder_decoder
)
tok_pos = -2 if model.is_encoder_decoder else -1
contrast_token = output_ctxless_tokens[tok_pos]
if args.attributed_fn == "kl_divergence" or output_ctx_tokens[tok_pos] == output_ctxless_tokens[tok_pos]:
cci_kwargs["contrast_force_inputs"] = True
bos_offset = int(model.is_encoder_decoder or output_ctx_tokens[0] == model.bos_token)
@@ -235,6 +237,7 @@ def attribute_context_with_model(args: AttributeContextArgs, model: HuggingfaceM
cci_out = CCIOutput(
cti_idx=cti_idx,
cti_token=cti_tok,
contrast_token=contrast_token,
cti_score=cti_score,
contextual_output=contextual_output,
contextless_output=contextless_output,
@@ -24,7 +24,8 @@ class CCIOutput:
cti_token: str
cti_score: float
contextual_output: str
contextless_output: str
contrast_token: str | None = None
contextless_output: str | None = None
input_context_scores: list[float] | None = None
output_context_scores: list[float] | None = None

20 changes: 17 additions & 3 deletions inseq/commands/attribute_context/attribute_context_viz_helpers.py
@@ -288,7 +288,7 @@ def visualize_attribute_context_treescope(
replace_chars = {"Ġ": " ", "Ċ": "\n", "▁": " "}
cci_idx_map = {cci.cti_idx: cci for cci in output.cci_scores} if output.cci_scores is not None else {}
for curr_tok_idx, curr_tok in enumerate(output.output_current_tokens):
curr_tok_parts, highlighted_idx = get_single_token_heatmap_treescope(
curr_tok_parts, highlighted_idx, cleaned_curr_tok = get_single_token_heatmap_treescope(
curr_tok,
score=output.cti_scores[curr_tok_idx],
max_val=output.max_cti,
@@ -300,12 +300,26 @@
if curr_tok_idx in cci_idx_map:
cci_parts = [rp.text("\n")]
cci = cci_idx_map[curr_tok_idx]
if cci.contrast_token is not None:
contrast_token = cci.contrast_token
for char in replace_chars.keys():
contrast_token = contrast_token.strip(char)
if contrast_token != cleaned_curr_tok:
cci_parts += [
rp.custom_style(
rp.text("Contrastive alternative: "),
css_style="font-weight: bold; font-style: italic; color: #888888;",
),
rp.custom_style(
rp.text(contrast_token + "\n\n"), css_style="font-style: italic; color: #888888;"
),
]
if cci.input_context_scores is not None:
cci_parts.append(
get_tokens_heatmap_treescope(
tokens=output.input_context_tokens,
scores=cci.input_context_scores,
title=f'Input context CCI scores for "{cci.cti_token}"',
title=f'Input contextual cues for "{cleaned_curr_tok}"',
title_style="font-style: italic; color: #888888;",
min_val=output.min_cci,
max_val=output.max_cci,
@@ -320,7 +334,7 @@
get_tokens_heatmap_treescope(
tokens=output.output_context_tokens,
scores=cci.output_context_scores,
title=f'Output context CCI scores for "{cci.cti_token}"',
title=f'Output contextual cue for "{cleaned_curr_tok}"',
title_style="font-style: italic; color: #888888;",
min_val=output.min_cci,
max_val=output.max_cci,
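The token-cleanup loop added in this file strips tokenizer subword markers (`Ġ`, `Ċ`, `▁`) from the ends of the contrastive token before comparing it to the contextual one. A small standalone sketch of that logic (the example tokens are hypothetical):

```python
# Subword markers used by common tokenizers (GPT-2-style "Ġ"/"Ċ", SentencePiece "▁").
replace_chars = {"Ġ": " ", "Ċ": "\n", "▁": " "}

def clean_token(token: str) -> str:
    # str.strip(char) removes the marker only from the token's ends,
    # mirroring the loop in the diff above.
    for char in replace_chars:
        token = token.strip(char)
    return token

print(clean_token("Ġhospital"))  # hospital
# The "Contrastive alternative" line is shown only when the cleaned
# contrastive token differs from the cleaned contextual token:
print(clean_token("Ġhome") != clean_token("Ġhospital"))  # True
```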
2 changes: 1 addition & 1 deletion inseq/data/viz.py
@@ -628,7 +628,7 @@ def get_single_token_heatmap_treescope(
else:
parts.pop(idx_highlight)
if return_highlighted_idx:
return parts, idx_highlight
return parts, idx_highlight, show_token
return parts


