
Commit

Update abstract
Ziyan Jiang committed Jun 23, 2024
1 parent 9ad9953 commit 61acbd2
Showing 1 changed file with 52 additions and 29 deletions.
81 changes: 52 additions & 29 deletions index.html
@@ -177,47 +177,70 @@ <h1 class="title is-1 publication-title">LongRAG: Enhancing Retrieval-Augmented
</div>
</div>

<centering>
<div style="text-align: center;">
<img id="teaser" width="70%" src="static/images/teaser.png">
</div>
</centering>

</div>
</div>
</div>
</div>
</section>


<section class="section" style="background-color:#efeff081">
<div class="container is-max-desktop">
<!-- Abstract. -->
<div class="columns is-centered has-text-centered">
<div class="column is-six-fifths">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>

In the traditional RAG framework, the basic retrieval units are normally short. Common retrievers like DPR typically work with 100-word Wikipedia paragraphs.
Such a design forces the retriever to search over a large corpus to find the "needle" unit, while the reader only needs to extract answers from the
short retrieved units. This imbalanced design, with a heavy retriever and a light reader, can lead to sub-optimal performance. To alleviate the imbalance,
we propose a new framework, LongRAG, consisting of a "long retriever" and a "long reader". LongRAG processes the entire Wikipedia into 4K-token units, which is
30x longer than before. By increasing the unit size, we significantly reduce the total number of units from 22M to 700K. This significantly lowers the burden on the retriever,
which leads to remarkable retrieval scores: answer recall@1=71% on NQ (previously 52%) and answer recall@2=72% (previously 47%) on HotpotQA (full-wiki). We then
feed the top-k retrieved units (≈ 30K tokens) to an existing long-context LLM to perform zero-shot answer extraction. Without requiring any training, LongRAG
achieves an EM of 62.7% on NQ, which is the best known result. LongRAG also achieves 64.3% on HotpotQA (full-wiki), which is on par with the SoTA model. Our study
offers insights into the future roadmap for combining RAG with long-context LLMs. <br>
<!-- Paper abstract -->
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>

In the traditional RAG framework, the basic retrieval units are normally short. Common retrievers like DPR typically work with 100-word Wikipedia paragraphs.
Such a design forces the retriever to search over a large corpus to find the "needle" unit, while the reader only needs to extract answers from the
short retrieved units. This imbalanced design, with a heavy retriever and a light reader, can lead to sub-optimal performance. To alleviate the imbalance,
we propose a new framework, LongRAG, consisting of a "long retriever" and a "long reader". LongRAG processes the entire Wikipedia into 4K-token units, which is
30x longer than before. By increasing the unit size, we significantly reduce the total number of units from 22M to 600K. This significantly lowers the burden on the retriever,
which leads to remarkable retrieval scores: answer recall@1=71% on NQ (previously 52%) and answer recall@2=72% (previously 47%) on HotpotQA (full-wiki). We then
feed the top-k retrieved units (≈ 30K tokens) to an existing long-context LLM to perform zero-shot answer extraction. Without requiring any training, LongRAG
achieves an EM of 62.7% on NQ and 64.3% on HotpotQA (full-wiki), which is on par with the SoTA model. Our study offers insights into the future roadmap for combining
RAG with long-context LLMs. <br>


</p>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- End paper abstract -->


</p>

</div>
<!-- Image carousel -->
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column is-four-fifths">
<div class="item">
<!-- Your image here -->
<img src="static/images/teaser.png" alt="LongRAG framework" />
<h2 class="subtitle">
Figure 1: Traditional RAG vs. LongRAG. (Top) Traditional RAG operates on short retrieval units, where the retriever needs to scan over a massive number of units
to find the relevant piece. In contrast, LongRAG operates on long retrieval units (30x longer). The retriever has a much lighter workload, which significantly boosts
the recall score. LongRAG fully exploits the ability of long-context language models (the reader) to achieve strong performance. (Middle) LongRAG vs. other RAG
methods on NQ. Blue model names indicate models without fine-tuning; red model names indicate models with fine-tuning. (Bottom) LongRAG vs. other
RAG methods on HotpotQA.
</h2>
</div>
</div>
</div>
</div>
</div>

</div>
</section>
</section>
<!-- End image carousel -->



<section class="section">
<!-- Results. -->
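The updated abstract describes two mechanical steps: packing short passages into long (~4K-token) retrieval units, and scoring answer recall@k over the retrieved units. A minimal sketch of those two steps follows; the function names, the greedy packing strategy, and the whitespace token count are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the LongRAG idea from the abstract. Assumptions:
# a crude whitespace token count and greedy consecutive packing stand in
# for the paper's actual 4K-token unit construction.

def group_into_units(passages, max_tokens=4096):
    """Greedily pack consecutive passages into units of <= max_tokens."""
    units, current, current_len = [], [], 0
    for p in passages:
        n = len(p.split())  # crude token count, for illustration only
        if current and current_len + n > max_tokens:
            units.append(" ".join(current))
            current, current_len = [], 0
        current.append(p)
        current_len += n
    if current:
        units.append(" ".join(current))
    return units

def answer_recall_at_k(retrieved_units, answers, k):
    """1.0 if any gold answer string appears in the top-k units, else 0.0."""
    top_k = " ".join(retrieved_units[:k]).lower()
    return float(any(a.lower() in top_k for a in answers))
```

Averaging `answer_recall_at_k` over a dataset gives the answer recall@1 / recall@2 figures the abstract reports; fewer, longer units make a hit in the top-k far more likely, which is the imbalance-shifting effect the commit's abstract emphasizes.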
