Commit

change to markdown
dtch1997 committed Jul 15, 2024
1 parent 2a58409 commit 01f7eb5
Showing 3 changed files with 35 additions and 35 deletions.
22 changes: 12 additions & 10 deletions content/_index.md
@@ -4,16 +4,14 @@ date: 2023-10-24
type: landing

sections:

- block: hero
- block: markdown
content:
title: SV-Gen. Analyzing the Generalization and Reliability of Steering Vectors
title: Analyzing the Generalization and Reliability of Steering Vectors
text: We find that steering vectors can often fail to work in- and out-of-distribution. We propose "steerability", a new metric for steering vectors, and extensively evaluate it across 40 datasets. We find that steerability is highly variable across different inputs. Depending on the concept, spurious biases can substantially contribute to how effective steering is for each input. Overall, our findings show that while steering can work well in the right circumstances, there remain mnany technical difficulties of applying steering vectors to guide models' behaviour at scale, and higher standards of evidence are required when applying steering vectors to models on novel tasks.
design:
css_class: dark
background:
gradient_end: '#1976d2'
gradient_start: '#004ba0'
text_color_light: true
color: black
- block: cta-button-list
content:
# Need a custom icon?
@@ -22,14 +20,18 @@ sections:
- text: Read our paper
icon: academicons/arxiv
url: https://drive.google.com/file/d/10DDi0wPFlw9yItmTaY03LPJptuFyTG8P/view?usp=sharing
- text: Use our steering-vectors library
icon: academicons/github
url: https://github.com/steering-vectors/steering-vectors/
- text: View our poster
icon: brands/google
url: https://drive.google.com/file/d/1xCMGCExBfyGivAhTV3-piU8CxVVPkC_5/view?usp=sharing
- text: Use our steering-vectors library
icon: brands/github
url: https://github.com/steering-vectors/steering-vectors/
- text: Reproduce our experiments
icon: academicons/github
icon: brands/github
url: https://github.com/dtch1997/repepo
design:
css_class: dark
background:
color: black

---
5 changes: 5 additions & 0 deletions hugo_stats.json
@@ -32,6 +32,7 @@
"bg-primary-100",
"blox-cta-button-list",
"blox-hero",
"blox-markdown",
"container",
"dark",
"dark:bg-hb-dark",
@@ -49,6 +50,7 @@
"flex-wrap",
"font-bold",
"font-semibold",
"gap-3",
"gap-6",
"h-10",
"h-[24px]",
@@ -73,6 +75,7 @@
"max-w-prose",
"mb-3",
"mb-4",
"mb-6",
"mt-24",
"mt-4",
"mt-6",
@@ -100,6 +103,7 @@
"rounded-sm",
"sm:py-48",
"sm:text-6xl",
"text-3xl",
"text-4xl",
"text-center",
"text-gray-600",
@@ -122,6 +126,7 @@
"page-bg",
"section-cta-button-list",
"section-hero",
"section-markdown",
"sun",
"top"
]
43 changes: 18 additions & 25 deletions public/index.html
@@ -400,10 +400,9 @@











@@ -445,8 +444,8 @@



<section id="section-hero" class="relative hbb-section blox-hero dark " >
<div class="home-section-bg " style="background-image: linear-gradient(90deg, #004ba0, #1976d2);">
<section id="section-markdown" class="relative hbb-section blox-markdown dark" >
<div class="home-section-bg " style="background-color: black;">

</div>

@@ -460,25 +459,17 @@



<div class="relative isolate px-6 pt-14 lg:px-8">

<div class="mx-auto max-w-2xl py-32 sm:py-48 lg:py-56">



<div class="text-center">
<h1 class="text-4xl font-bold tracking-tight text-gray-900 dark:text-gray-100 sm:text-6xl">SV-Gen. Analyzing the Generalization and Reliability of Steering Vectors</h1>
<p class="mt-6 text-lg leading-8 text-gray-600 dark:text-gray-300">We find that steering vectors can often fail to work in- and out-of-distribution. We propose &ldquo;steerability&rdquo;, a new metric for steering vectors, and extensively evaluate it across 40 datasets. We find that steerability is highly variable across different inputs. Depending on the concept, spurious biases can substantially contribute to how effective steering is for each input. Overall, our findings show that while steering can work well in the right circumstances, there remain mnany technical difficulties of applying steering vectors to guide models&rsquo; behaviour at scale, and higher standards of evidence are required when applying steering vectors to models on novel tasks.</p>





</div>
<div class="flex flex-col items-center max-w-prose mx-auto gap-3 justify-center">

</div>
<div class="mb-6 text-3xl font-bold text-gray-900 dark:text-white">
Analyzing the Generalization and Reliability of Steering Vectors
</div>


<div class="prose prose-slate lg:prose-xl dark:prose-invert max-w-prose">We find that steering vectors can often fail to work in- and out-of-distribution. We propose &ldquo;steerability&rdquo;, a new metric for steering vectors, and extensively evaluate it across 40 datasets. We find that steerability is highly variable across different inputs. Depending on the concept, spurious biases can substantially contribute to how effective steering is for each input. Overall, our findings show that while steering can work well in the right circumstances, there remain mnany technical difficulties of applying steering vectors to guide models&rsquo; behaviour at scale, and higher standards of evidence are required when applying steering vectors to models on novel tasks.</div>
</div>


@@ -527,6 +518,8 @@ <h1 class="text-4xl font-bold tracking-tight text-gray-900 dark:text-gray-100 sm








@@ -569,8 +562,8 @@ <h1 class="text-4xl font-bold tracking-tight text-gray-900 dark:text-gray-100 sm



<section id="section-cta-button-list" class="relative hbb-section blox-cta-button-list " >
<div class="home-section-bg " >
<section id="section-cta-button-list" class="relative hbb-section blox-cta-button-list dark" >
<div class="home-section-bg " style="background-color: black;">

</div>

@@ -601,28 +594,28 @@ <h1 class="text-4xl font-bold tracking-tight text-gray-900 dark:text-gray-100 sm
</div>
</a>

<a href="https://github.com/steering-vectors/steering-vectors/" target="_blank" rel="noopener noreferrer" class="flex items-center p-1 w-full rounded-md hover:scale-105 transition-all bg-gray-100 mb-3 max-w-3xl">
<a href="https://drive.google.com/file/d/1xCMGCExBfyGivAhTV3-piU8CxVVPkC_5/view?usp=sharing" target="_blank" rel="noopener noreferrer" class="flex items-center p-1 w-full rounded-md hover:scale-105 transition-all bg-gray-100 mb-3 max-w-3xl">
<div class="flex text-center w-full">
<div class="w-10 h-10 text-gray-700">

<svg height="40" width="40" class="rounded-sm" viewBox="0 0 370 391" xmlns="http://www.w3.org/2000/svg"><g clip-rule="evenodd" fill-rule="evenodd"><path d="m207.5 22.4 114.4 66.6c13.5 7.9 21.9 22.4 21.9 38v136.4c0 17.3-9.3 33.3-24.5 41.8l-113.5 63.9a49.06 49.06 0 0 1 -48.5-.2l-104.5-60.1c-16.4-9.5-26.6-27-26.6-45.9v-129.5c0-19.1 9.9-36.8 26.1-46.8l102.8-63.5c16-9.9 36.2-10.1 52.4-.7z" fill="#ff4088" stroke="#c9177e" stroke-width="27" /><path d="m105.6 298.2v-207.2h43.4v75.5h71.9v-75.5h43.5v207.2h-43.5v-90.6h-71.9v90.6z" fill="#fff" /></g></svg>

</div>
<div class="flex justify-center items-center font-semibold w-full text-gray-700 -ml-10">
Use our steering-vectors library
View our poster
</div>
</div>
</a>

<a href="https://drive.google.com/file/d/1xCMGCExBfyGivAhTV3-piU8CxVVPkC_5/view?usp=sharing" target="_blank" rel="noopener noreferrer" class="flex items-center p-1 w-full rounded-md hover:scale-105 transition-all bg-gray-100 mb-3 max-w-3xl">
<a href="https://github.com/steering-vectors/steering-vectors/" target="_blank" rel="noopener noreferrer" class="flex items-center p-1 w-full rounded-md hover:scale-105 transition-all bg-gray-100 mb-3 max-w-3xl">
<div class="flex text-center w-full">
<div class="w-10 h-10 text-gray-700">

<svg height="40" width="40" class="rounded-sm" viewBox="0 0 370 391" xmlns="http://www.w3.org/2000/svg"><g clip-rule="evenodd" fill-rule="evenodd"><path d="m207.5 22.4 114.4 66.6c13.5 7.9 21.9 22.4 21.9 38v136.4c0 17.3-9.3 33.3-24.5 41.8l-113.5 63.9a49.06 49.06 0 0 1 -48.5-.2l-104.5-60.1c-16.4-9.5-26.6-27-26.6-45.9v-129.5c0-19.1 9.9-36.8 26.1-46.8l102.8-63.5c16-9.9 36.2-10.1 52.4-.7z" fill="#ff4088" stroke="#c9177e" stroke-width="27" /><path d="m105.6 298.2v-207.2h43.4v75.5h71.9v-75.5h43.5v207.2h-43.5v-90.6h-71.9v90.6z" fill="#fff" /></g></svg>
<svg height="40" width="40" class="rounded-sm" fill="currentColor" viewBox="3 3 18 18"><path d="M12 3C7.0275 3 3 7.12937 3 12.2276C3 16.3109 5.57625 19.7597 9.15374 20.9824C9.60374 21.0631 9.77249 20.7863 9.77249 20.5441C9.77249 20.3249 9.76125 19.5982 9.76125 18.8254C7.5 19.2522 6.915 18.2602 6.735 17.7412C6.63375 17.4759 6.19499 16.6569 5.8125 16.4378C5.4975 16.2647 5.0475 15.838 5.80124 15.8264C6.51 15.8149 7.01625 16.4954 7.18499 16.7723C7.99499 18.1679 9.28875 17.7758 9.80625 17.5335C9.885 16.9337 10.1212 16.53 10.38 16.2993C8.3775 16.0687 6.285 15.2728 6.285 11.7432C6.285 10.7397 6.63375 9.9092 7.20749 9.26326C7.1175 9.03257 6.8025 8.08674 7.2975 6.81794C7.2975 6.81794 8.05125 6.57571 9.77249 7.76377C10.4925 7.55615 11.2575 7.45234 12.0225 7.45234C12.7875 7.45234 13.5525 7.55615 14.2725 7.76377C15.9937 6.56418 16.7475 6.81794 16.7475 6.81794C17.2424 8.08674 16.9275 9.03257 16.8375 9.26326C17.4113 9.9092 17.76 10.7281 17.76 11.7432C17.76 15.2843 15.6563 16.0687 13.6537 16.2993C13.98 16.5877 14.2613 17.1414 14.2613 18.0065C14.2613 19.2407 14.25 20.2326 14.25 20.5441C14.25 20.7863 14.4188 21.0746 14.8688 20.9824C16.6554 20.364 18.2079 19.1866 19.3078 17.6162C20.4077 16.0457 20.9995 14.1611 21 12.2276C21 7.12937 16.9725 3 12 3Z"></path></svg>

</div>
<div class="flex justify-center items-center font-semibold w-full text-gray-700 -ml-10">
View our poster
Use our steering-vectors library
</div>
</div>
</a>
@@ -631,7 +624,7 @@ <h1 class="text-4xl font-bold tracking-tight text-gray-900 dark:text-gray-100 sm
<div class="flex text-center w-full">
<div class="w-10 h-10 text-gray-700">

<svg height="40" width="40" class="rounded-sm" viewBox="0 0 370 391" xmlns="http://www.w3.org/2000/svg"><g clip-rule="evenodd" fill-rule="evenodd"><path d="m207.5 22.4 114.4 66.6c13.5 7.9 21.9 22.4 21.9 38v136.4c0 17.3-9.3 33.3-24.5 41.8l-113.5 63.9a49.06 49.06 0 0 1 -48.5-.2l-104.5-60.1c-16.4-9.5-26.6-27-26.6-45.9v-129.5c0-19.1 9.9-36.8 26.1-46.8l102.8-63.5c16-9.9 36.2-10.1 52.4-.7z" fill="#ff4088" stroke="#c9177e" stroke-width="27" /><path d="m105.6 298.2v-207.2h43.4v75.5h71.9v-75.5h43.5v207.2h-43.5v-90.6h-71.9v90.6z" fill="#fff" /></g></svg>
<svg height="40" width="40" class="rounded-sm" fill="currentColor" viewBox="3 3 18 18"><path d="M12 3C7.0275 3 3 7.12937 3 12.2276C3 16.3109 5.57625 19.7597 9.15374 20.9824C9.60374 21.0631 9.77249 20.7863 9.77249 20.5441C9.77249 20.3249 9.76125 19.5982 9.76125 18.8254C7.5 19.2522 6.915 18.2602 6.735 17.7412C6.63375 17.4759 6.19499 16.6569 5.8125 16.4378C5.4975 16.2647 5.0475 15.838 5.80124 15.8264C6.51 15.8149 7.01625 16.4954 7.18499 16.7723C7.99499 18.1679 9.28875 17.7758 9.80625 17.5335C9.885 16.9337 10.1212 16.53 10.38 16.2993C8.3775 16.0687 6.285 15.2728 6.285 11.7432C6.285 10.7397 6.63375 9.9092 7.20749 9.26326C7.1175 9.03257 6.8025 8.08674 7.2975 6.81794C7.2975 6.81794 8.05125 6.57571 9.77249 7.76377C10.4925 7.55615 11.2575 7.45234 12.0225 7.45234C12.7875 7.45234 13.5525 7.55615 14.2725 7.76377C15.9937 6.56418 16.7475 6.81794 16.7475 6.81794C17.2424 8.08674 16.9275 9.03257 16.8375 9.26326C17.4113 9.9092 17.76 10.7281 17.76 11.7432C17.76 15.2843 15.6563 16.0687 13.6537 16.2993C13.98 16.5877 14.2613 17.1414 14.2613 18.0065C14.2613 19.2407 14.25 20.2326 14.25 20.5441C14.25 20.7863 14.4188 21.0746 14.8688 20.9824C16.6554 20.364 18.2079 19.1866 19.3078 17.6162C20.4077 16.0457 20.9995 14.1611 21 12.2276C21 7.12937 16.9725 3 12 3Z"></path></svg>

</div>
<div class="flex justify-center items-center font-semibold w-full text-gray-700 -ml-10">
