Commit

build pages from docs/
chenzizhao committed Oct 16, 2024
1 parent b703d7d commit 16d47b7
Showing 18 changed files with 3,249 additions and 114 deletions.
113 changes: 1 addition & 112 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -3,9 +3,6 @@ __pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
@@ -26,16 +23,6 @@ share/python-wheels/
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
@@ -51,112 +38,14 @@ coverage.xml
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.DS_store
7 changes: 5 additions & 2 deletions README.md
@@ -1,2 +1,5 @@
# respect
Home for the paper "Retrospective Learning from Interactions"
# Retrospective Learning from Interactions

Project page: <https://lil-lab.github.io/respect/>

Under construction
237 changes: 237 additions & 0 deletions docs/index.html
@@ -0,0 +1,237 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="description" content="A simple, annotation-free method to learn from human-AI interactions.">
<meta property="og:title" content="Retrospective Learning from Interactions"/>
<meta property="og:description" content="A simple, annotation-free method to learn from human-AI interactions."/>
<meta property="og:url" content="https://lil-lab.github.io/respect/"/>
<!-- Path to banner image, should be in the path listed below. Optimal dimensions are 1200x630 -->
<meta property="og:image" content="static/images/retrospect.png" />
<meta property="og:image:width" content="1916"/>
<meta property="og:image:height" content="694"/>

<meta name="twitter:title" content="Retrospective Learning from Interactions">
<meta name="twitter:description" content="A simple, annotation-free method to learn from human-AI interactions.">
<!-- Path to banner image, should be in the path listed below. Optimal dimensions are 1200x600 -->
<meta name="twitter:image" content="static/images/retrospect.png">
<meta name="twitter:card" content="summary_large_image">
<meta name="keywords" content="human-AI, interactive learning, natural language processing, artificial intelligence, self improve">
<meta name="viewport" content="width=device-width, initial-scale=1" />

<!-- MathJax CDN -->
<script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>


<title>Retrospective Learning from Interactions</title>
<link rel="icon" type="image/x-icon" href="static/images/favicon.ico" />
<link
href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
rel="stylesheet"
/>

<link rel="stylesheet" href="static/css/bulma.min.css" />
<link rel="stylesheet" href="static/css/bulma-carousel.min.css" />
<link rel="stylesheet" href="static/css/bulma-slider.min.css" />
<link rel="stylesheet" href="static/css/fontawesome.all.min.css" />
<link
rel="stylesheet"
href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css"
/>
<link rel="stylesheet" href="static/css/index.css" />

<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script src="https://documentcloud.adobe.com/view-sdk/main.js"></script>
<script defer src="static/js/fontawesome.all.min.js"></script>
<script src="static/js/bulma-carousel.min.js"></script>
<script src="static/js/bulma-slider.min.js"></script>
<script src="static/js/index.js"></script>
</head>
<body>
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">
Retrospective Learning from Interactions
</h1>
<div class="is-size-5 publication-authors">
<!-- Paper authors -->
<span class="author-block">
<a href="https://chenzizhao.github.io/" target="_blank"
>Zizhao Chen</a
>,
</span>
<span class="author-block">
<a href="https://momergul.github.io/" target="_blank"
>Mustafa Omer Gul</a
>,
</span>
<span class="author-block">
Yiwei Chen,
</span>
<span class="author-block">
Gloria Geng,
</span>
<span class="author-block">
<a href="https://annshin.github.io/" target="_blank"
>Anne Wu</a>,
</span>
<span class="author-block">
<a href="https://yoavartzi.com/" target="_blank"
>Yoav Artzi</a
>
</span>
</div>

<div class="is-size-5 publication-authors">
<span class="author-block">
Cornell Tech
<br />
October 2024
<!-- <br />Conference name and year -->
</span>
<!-- <span class="eql-cntrb">
<small><br /><sup>*</sup>Indicates Equal Contribution</small>
</span> -->
</div>

<div class="column has-text-centered">
<div class="publication-links">
<!-- Arxiv PDF link -->
<span class="link-block">
<a
href="https://arxiv.org/pdf/<ARXIV PAPER ID>.pdf"
target="_blank"
class="external-link button is-normal is-rounded is-dark"
>
<span class="icon">
<i class="fas fa-file-pdf"></i>
</span>
<span>Paper</span>
</a>
</span>

<!-- Github link -->
<span class="link-block">
<a
href="https://github.com/lil-lab/respect/"
target="_blank"
class="external-link button is-normal is-rounded is-dark"
>
<span class="icon">
<i class="fab fa-github"></i>
</span>
<span>Code</span>
</a>
</span>

<!-- ArXiv abstract Link -->
<span class="link-block">
<a
href="https://arxiv.org/abs/<ARXIV PAPER ID>"
target="_blank"
class="external-link button is-normal is-rounded is-dark"
>
<span class="icon">
<i class="ai ai-arxiv"></i>
</span>
<span>arXiv</span>
</a>
</span>
</div>
</div>
</div>
</div>
</div>
</div>
</section>

<!-- Paper abstract -->
<section class="section hero is-light">
<div class="container is-max-desktop">
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>
Multi-turn interactions between large language models (LLMs) and users naturally include implicit feedback signals. If an LLM responds in an unexpected way to an instruction, the user is likely to signal it by rephrasing the request, expressing frustration, or pivoting to an alternative task. Such signals are task-independent and occupy a relatively constrained subspace of language, allowing the LLM to identify them even if it fails on the actual task. This creates an avenue for continually learning from interactions without additional annotations. We introduce <i>ReSpect</i>, a method to learn from such signals in past interactions via retrospection. We deploy ReSpect in a new multimodal interaction scenario, where humans instruct an LLM to solve an abstract reasoning task with a combinatorial solution space. Through thousands of interactions with humans, we show how ReSpect gradually improves task completion rate from 31% to 82%, all without any external annotation.
</p>
</div>
</div>
</div>
</div>
</section>
<!-- End paper abstract -->

<!--Figure section -->
<section class="section hero">
<div class="container is-max-desktop">
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<div class="content has-text-justified">
<figure>
<img src="static/images/retrospect.png" width="100%">
<figcaption>Learning via ReSpect</figcaption>
</figure>
<p>
We deploy an LLM policy \(\pi_{\theta_{\rho}}(a \vert x) \) to interact with users in multi-turn interactions. Following each round, the LLM reasons retrospectively about each of its actions (highlighted in blue) to decode feedback given the interaction context, including follow-up utterances. After each round, the model is retrained using all data aggregated so far \(D_{\leq \rho}\).
The LLM improves over time without any external annotations. The plot on the right shows the performance curve in our experiments: the LLM improves from 31% to 82% task completion rate over six rounds.
</p>

<br>

<figure>
<img src="static/images/interaction.png" width="100%">
<figcaption>Multiref: The interaction scenario we use in our experiments.
</figcaption>
</figure>
<p>
Multiref is a multi-turn reference game. A speaker and a listener both observe a shared set of tangram shapes, but in a different order. The goal of the speaker is to describe a subset of targets for the listener to select. Because the target consists of multiple abstract shapes, humans often communicate the targets gradually over multiple turns. As an interaction progresses naturally, the speaker produces implicit feedback signals that validate or reject the listener's actions.
</p>
</div>
</div>
</div>
</div>
</section>
<!--End figure section -->

<!--BibTex citation -->
<section class="section" id="BibTeX">
<div class="container is-max-desktop content">
<h2 class="title">BibTeX</h2>
<pre><code>BibTex Code Here</code></pre>
</div>
</section>
<!--End BibTex citation -->

<footer class="footer">
<div class="container">
<div class="columns is-centered">
<div class="column is-8">
<div class="content">
<p>
This page was built using the
<a
href="https://github.com/eliahuhorwitz/Academic-project-page-template"
target="_blank"
>Academic Project Page Template
</a>.
This website is licensed under a
<a
rel="license"
href="http://creativecommons.org/licenses/by-sa/4.0/"
target="_blank"
>Creative Commons Attribution-ShareAlike 4.0 International
License</a
>.
</p>
</div>
</div>
</div>
</div>
</footer>

</body>
</html>
1 change: 1 addition & 0 deletions docs/static/css/bulma-carousel.min.css

Some generated files are not rendered by default.
