Docs fix about EAGLE and streaming output #3166
Conversation
"def remove_overlap(existing_text, new_chunk):\n", | ||
" \"\"\"\n", | ||
" Finds the largest suffix of 'existing_text' that is a prefix of 'new_chunk'\n", | ||
" and removes that overlap from the start of 'new_chunk'.\n", | ||
" \"\"\"\n", | ||
" max_overlap = 0\n", | ||
" max_possible = min(len(existing_text), len(new_chunk))\n", | ||
"\n", | ||
" for i in range(max_possible, 0, -1):\n", | ||
" if existing_text.endswith(new_chunk[:i]):\n", | ||
" max_overlap = i\n", | ||
" break\n", | ||
"\n", | ||
" return new_chunk[max_overlap:]\n", | ||
"\n", | ||
"\n", | ||
"def generate_text_no_repeats(llm, prompt, sampling_params):\n", | ||
" \"\"\"\n", | ||
" Example function that:\n", | ||
" 1) Streams the text,\n", | ||
" 2) Removes chunk overlaps,\n", | ||
" 3) Returns the merged text.\n", | ||
" \"\"\"\n", | ||
" final_text = \"\"\n", | ||
" for chunk in llm.generate(prompt, sampling_params, stream=True):\n", | ||
" chunk_text = chunk[\"text\"]\n", | ||
"\n", | ||
" cleaned_chunk = remove_overlap(final_text, chunk_text)\n", | ||
"\n", | ||
" final_text += cleaned_chunk\n", | ||
"\n", | ||
" return final_text\n", | ||
"\n", | ||
"\n", |
Move this to `python/sglang/utils.py` and import it at the beginning of the docs. Remember also to change the other streaming parts.
Imported it and also fixed the streaming part in the Streaming Asynchronous Generation section. Will check and change the other streaming parts tomorrow.
 **Note**">
> **Note**: To run the following tests or benchmarks, you also need to install the [**cutex**](https://pypi.org/project/cutex/) module.
> **Requirement**: Python 3.6+ on a Unix-like OS with **fcntl** support.
> **Installation**:
> ```bash
> pip install cutex
> ```
To run the following tests or benchmarks, you also need to install cutex: `pip install cutex`
LGTM
LGTM
Motivation
To improve the docs rendering for speculative_decoding and to resolve issue #3164.
Modifications
Fix the rendering issues in speculative_decoding.ipynb and help resolve the streaming output issue mentioned in #3164.
Checklist