
Commit

Update abstract
Ziyan Jiang committed Jun 23, 2024
1 parent 9ad9953 commit 61acbd2
Showing 1 changed file with 52 additions and 29 deletions.
81 changes: 52 additions & 29 deletions index.html
@@ -177,47 +177,70 @@ <h1 class="title is-1 publication-title">LongRAG: Enhancing Retrieval-Augmented
</div>
</div>

<centering>
<div style="text-align: center;">
<img id="teaser" width="70%" src="static/images/teaser.png">
</div>
</centering>

</div>
</div>
</div>
</div>
</section>


<section class="section" style="background-color:#efeff081">
<div class="container is-max-desktop">
<!-- Abstract. -->
<div class="columns is-centered has-text-centered">
<div class="column is-six-fifths">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>

In the traditional RAG framework, the basic retrieval units are normally short. Common retrievers like DPR typically work with 100-word Wikipedia paragraphs.
Such a design forces the retriever to search over a large corpus to find the "needle" unit, while the reader only needs to extract answers from the
short retrieved units. This imbalanced design, with a heavy retriever and a light reader, can lead to sub-optimal performance. To alleviate the imbalance,
we propose a new framework, LongRAG, consisting of a "long retriever" and a "long reader". LongRAG processes the entire Wikipedia into 4K-token units, which is
30x longer than before. By increasing the unit size, we significantly reduce the total number of units from 22M to 700K. This significantly lowers the burden on the retriever,
which leads to remarkable retrieval scores: answer recall@1=71% on NQ (previously 52%) and answer recall@2=72% (previously 47%) on HotpotQA (full-wiki). We then
feed the top-k retrieved units (≈ 30K tokens) to an existing long-context LLM to perform zero-shot answer extraction. Without requiring any training, LongRAG
achieves an EM of 62.7% on NQ, which is the best known result. LongRAG also achieves 64.3% on HotpotQA (full-wiki), which is on par with the SoTA model. Our study
offers insights into the future roadmap for combining RAG with long-context LLMs. <br>
<!-- Paper abstract -->
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>

In the traditional RAG framework, the basic retrieval units are normally short. Common retrievers like DPR typically work with 100-word Wikipedia paragraphs.
Such a design forces the retriever to search over a large corpus to find the "needle" unit, while the reader only needs to extract answers from the
short retrieved units. This imbalanced design, with a heavy retriever and a light reader, can lead to sub-optimal performance. To alleviate the imbalance,
we propose a new framework, LongRAG, consisting of a "long retriever" and a "long reader". LongRAG processes the entire Wikipedia into 4K-token units, which is
30x longer than before. By increasing the unit size, we significantly reduce the total number of units from 22M to 600K. This significantly lowers the burden on the retriever,
which leads to remarkable retrieval scores: answer recall@1=71% on NQ (previously 52%) and answer recall@2=72% (previously 47%) on HotpotQA (full-wiki). We then
feed the top-k retrieved units (≈ 30K tokens) to an existing long-context LLM to perform zero-shot answer extraction. Without requiring any training, LongRAG
achieves an EM of 62.7% on NQ and 64.3% on HotpotQA (full-wiki), which is on par with the SoTA model. Our study offers insights into the future roadmap for combining
RAG with long-context LLMs. <br>


</p>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- End paper abstract -->


</p>

</div>
<!-- Image carousel -->
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column is-four-fifths">
<div class="item">
<!-- Your image here -->
<img src="static/images/teaser.png" alt="LongRAG framework" />
<h2 class="subtitle">
Figure 1: Traditional RAG vs. LongRAG. (Top) Traditional RAG operates on short retrieval units, where the retriever needs to scan over a massive number of units
to find the relevant piece. In contrast, LongRAG operates on long retrieval units (30x longer). The retriever has a much lighter workload, which significantly boosts
the recall score. LongRAG fully exploits the ability of long-context language models (the reader) to achieve strong performance. (Middle) LongRAG vs. other RAG
methods on NQ. Blue model names indicate models without fine-tuning; red model names indicate models with fine-tuning. (Bottom) LongRAG vs. other
RAG methods on HotpotQA.
</h2>
</div>
</div>
</div>
</div>
</div>

</div>
</section>
</section>
<!-- End image carousel -->



<section class="section">
<!-- Results. -->
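The updated abstract describes two mechanical steps: packing short passages into long (~4K-token) retrieval units, and scoring answer recall@k over the retrieved units. A minimal sketch of those two steps follows; the function names, the greedy packing strategy, and the whitespace token count are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the LongRAG idea from the abstract. Assumptions:
# a crude whitespace token count and greedy consecutive packing stand in
# for the paper's actual 4K-token unit construction.

def group_into_units(passages, max_tokens=4096):
    """Greedily pack consecutive passages into units of <= max_tokens."""
    units, current, current_len = [], [], 0
    for p in passages:
        n = len(p.split())  # crude token count, for illustration only
        if current and current_len + n > max_tokens:
            units.append(" ".join(current))
            current, current_len = [], 0
        current.append(p)
        current_len += n
    if current:
        units.append(" ".join(current))
    return units

def answer_recall_at_k(retrieved_units, answers, k):
    """1.0 if any gold answer string appears in the top-k units, else 0.0."""
    top_k = " ".join(retrieved_units[:k]).lower()
    return float(any(a.lower() in top_k for a in answers))
```

Averaging `answer_recall_at_k` over a dataset gives the answer recall@1 / recall@2 figures the abstract reports; fewer, longer units make a hit in the top-k far more likely, which is the imbalance-shifting effect the commit's abstract emphasizes.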
