Commit 2ecce30

Merge pull request #157 from uiuc-focal-lab/pypi
Add pypi publishing workflow
2 parents 2dbbaaa + db063e4 commit 2ecce30

6 files changed: +140 -16 lines changed
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
+name: Build and publish to PyPI
+
+on:
+  release:
+    types: [published]
+  workflow_dispatch: # Allows you to run this workflow manually from the Actions tab
+
+jobs:
+  build_wheels:
+    name: Build wheels on ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, windows-latest, macos-latest]
+        python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']
+        exclude:
+          # Add any exclusions if certain OS/Python combinations are problematic
+          # - os: macos-latest
+          #   python-version: '3.12'
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0 # Gets all history for proper versioning
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install build dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build wheel setuptools
+
+      - name: Build wheels
+        run: |
+          python -m build --wheel --outdir dist/
+
+      - name: Upload wheels
+        uses: actions/upload-artifact@v3
+        with:
+          name: wheels-${{ matrix.os }}-${{ matrix.python-version }}
+          path: dist/*.whl
+
+  build_sdist:
+    name: Build source distribution
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0 # Gets all history for proper versioning
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.12'
+
+      - name: Install build dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build twine
+
+      - name: Build sdist
+        run: |
+          python -m build --sdist --outdir dist/
+
+      - name: Check metadata
+        run: |
+          twine check dist/*.tar.gz
+
+      - name: Upload sdist
+        uses: actions/upload-artifact@v3
+        with:
+          name: sdist
+          path: dist/*.tar.gz
+
+  publish:
+    name: Publish to PyPI
+    needs: [build_wheels, build_sdist]
+    runs-on: ubuntu-latest
+    # Only publish on release
+    if: github.event_name == 'release' && github.event.action == 'published'
+    environment:
+      name: pypi
+      url: https://pypi.org/project/syncode/
+    permissions:
+      id-token: write # For PyPI trusted publishing
+
+    steps:
+      - name: Download all artifacts
+        uses: actions/download-artifact@v3
+        with:
+          path: dist
+
+      - name: Flatten dist directory
+        run: |
+          mkdir -p flat_dist
+          find dist -type f -name "*.whl" -o -name "*.tar.gz" -exec cp {} flat_dist \;
+          ls -la flat_dist
+
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: flat_dist
+          verbose: true
+          # skip-existing: true # Uncomment if you want to skip existing versions
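
Review note on the "Flatten dist directory" step: in `find`, `-o` binds more loosely than the implicit "and" between the other tests, so the `-exec cp` in the command above attaches only to the `-name "*.tar.gz"` branch. Wheels would be matched but not copied. Grouping the two name tests as `\( -name "*.whl" -o -name "*.tar.gz" \)` is the usual shell fix. The same step can also be sketched in Python, which sidesteps the precedence question; the function name `flatten_dist` is hypothetical and not part of this commit.

```python
import shutil
from pathlib import Path


def flatten_dist(src, dest):
    """Copy every wheel and sdist found anywhere under src into the flat directory dest.

    Hypothetical helper: a sketch of what the workflow step intends,
    not code from the repository.
    """
    src_dir, dest_dir = Path(src), Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_dir.rglob("*")):
        # Accept both artifact types explicitly, so neither is skipped.
        is_artifact = path.name.endswith(".whl") or path.name.endswith(".tar.gz")
        if path.is_file() and is_artifact:
            shutil.copy2(path, dest_dir / path.name)
            copied.append(path.name)
    return copied
```

In a workflow this could be called as `flatten_dist("dist", "flat_dist")` from a short `python -c` step, assuming the artifacts were downloaded into `dist` as in the job above.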

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -8,3 +8,5 @@ tmp*
 cache/
 .ipynb_checkpoints/
 *.prof
+dist/
+syncode.egg-info/

README.md

Lines changed: 24 additions & 13 deletions
@@ -48,27 +48,30 @@ Define your own grammar using simple EBNF syntax. Check out our [notebooks direc
 | 🎲 Sample with any existing decoding strategy (eg. greedy, beam search, nucleus sampling) |
 
 
-## 📖 More About **SynCode**
+## 🚀 Quick Start
+### Python Installation and Usage Instructions
 
-### How **SynCode** works?
+You can install SynCode via PyPI:
 
-<img width="750" alt="Screenshot 2024-03-21 at 2 22 15 AM" src="https://github.com/uiuc-focal-lab/syncode/assets/14147610/d9d73072-3c9b-47d4-a941-69d5cf8fb1bf">
+```bash
+pip install syncode
+```
 
-In the SynCode workflow, the LLM takes partial code _C<sub>k</sub>_ and generates a distribution for the next token _t<sub>k+1</sub>_. The incremental parser processes _C<sub>k</sub>_ to generate accept sequences _A_, the sequences of terminals that can follow partial code called accept sequences. Simultaneously, the incremental parser computes a remainder _r_ from the partial code, representing the suffix that may change its terminal type in subsequent generations. The backbone of SynCode is the offline construction of a DFA mask store, a lookup table derived from regular expressions representing the terminals of the language grammar. The DFA mask store facilitates efficient traversal of DFA states, enabling the retrieval of masks mapped to each state and accept sequence. SynCode walks over the DFA using the remainder and uses the mask store to compute the mask specific to each accept sequence. By unifying masks for each accept sequence SynCode gets the set of syntactically valid tokens. The LLM iteratively generates a token _t<sub>k+1</sub>_ using the distribution and the mask, appending it to _C<sub>k</sub>_ to create the updated code _C<sub>k+1</sub>_. The process continues until the LLM returns the final code _C<sub>n</sub>_ based on the defined stop condition.
+Alternatively, you can install the latest development version directly from GitHub:
 
-## 🚀 Quick Start
-### Python Installation and Usage Instructions
-Simply install SynCode via PyPi using the following command:
-``` bash
+```bash
 pip install git+https://github.com/uiuc-focal-lab/syncode.git
 ```
 
-Note: SynCode depends on HuggingFace [transformers](https://github.com/huggingface/transformers):
-| SynCode version | Recommended transformers version |
-| -------------- | -------------------------------- |
-| `v0.1.4` (latest) | `v4.44.0` |
-| `v0.1.2` | `v4.42.0` |
+#### Version Compatibility
+
+SynCode depends on HuggingFace [transformers](https://github.com/huggingface/transformers):
 
+| SynCode version | Required transformers version | Python version |
+| -------------- | ----------------------------- | -------------- |
+| `v0.4.1` (latest) | `v4.44.0` | 3.6 - 3.12 |
+
+**Note:** Python 3.13 is not currently supported due to dependency constraints.
 
 ### Usage option 1:
 SynCode can be used as a simple logit processor with HuggingFace [transformers](https://github.com/huggingface/transformers) library interface. Check this [notebook](./notebooks/example_logits_processor.ipynb) for example.
@@ -426,6 +429,14 @@ print(f"Syncode augmented LLM output:\n{output}")
 }
 ```
 
+## 📖 More About **SynCode**
+
+### How **SynCode** works?
+
+<img width="750" alt="Screenshot 2024-03-21 at 2 22 15 AM" src="https://github.com/uiuc-focal-lab/syncode/assets/14147610/d9d73072-3c9b-47d4-a941-69d5cf8fb1bf">
+
+In the SynCode workflow, the LLM takes partial code _C<sub>k</sub>_ and generates a distribution for the next token _t<sub>k+1</sub>_. The incremental parser processes _C<sub>k</sub>_ to generate accept sequences _A_, the sequences of terminals that can follow partial code called accept sequences. Simultaneously, the incremental parser computes a remainder _r_ from the partial code, representing the suffix that may change its terminal type in subsequent generations. The backbone of SynCode is the offline construction of a DFA mask store, a lookup table derived from regular expressions representing the terminals of the language grammar. The DFA mask store facilitates efficient traversal of DFA states, enabling the retrieval of masks mapped to each state and accept sequence. SynCode walks over the DFA using the remainder and uses the mask store to compute the mask specific to each accept sequence. By unifying masks for each accept sequence SynCode gets the set of syntactically valid tokens. The LLM iteratively generates a token _t<sub>k+1</sub>_ using the distribution and the mask, appending it to _C<sub>k</sub>_ to create the updated code _C<sub>k+1</sub>_. The process continues until the LLM returns the final code _C<sub>n</sub>_ based on the defined stop condition.
+
 ## Contact
 For questions, please contact [Shubham Ugare](mailto:shubhamdugare@gmail.com).
 

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "syncode"
-version = "0.4.0"
+version="0.4.1"
+requires-python = ">=3.6,<3.13"
 description = "Grammar-guided code generation tool"
 readme = "README.md"
 authors = [
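
Review note on `requires-python = ">=3.6,<3.13"`: installers such as pip read this field and refuse to install the package on interpreters outside the range. A rough runtime equivalent is sketched below, comparing only the major and minor version. The names `SUPPORTED_MIN`, `SUPPORTED_MAX`, and `is_supported` are hypothetical, and the sketch does not reproduce the full version-specifier rules (for example, pre-release handling).

```python
import sys

SUPPORTED_MIN = (3, 6)   # inclusive lower bound, from ">=3.6"
SUPPORTED_MAX = (3, 13)  # exclusive upper bound, from "<3.13"


def is_supported(version_info=None):
    """Return True if the (major, minor) version falls inside the declared range.

    Hypothetical helper: approximates the requires-python check by
    comparing (major, minor) tuples only.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return SUPPORTED_MIN <= major_minor < SUPPORTED_MAX
```

Note that the workflow's build matrix in this commit covers only Python 3.8 through 3.12, so 3.6 and 3.7 are declared as supported but not built against there.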

requirements.txt

Lines changed: 1 addition & 1 deletion
@@ -3,6 +3,6 @@ interegular
 regex==2023.8.8
 torch
 tqdm
-transformers==4.44.0
+transformers==4.44.0; python_version < "3.13"
 datasets
 jsonschema

setup.py

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,5 @@
 import setuptools
+python_requires=">=3.6,<3.13",
 
 with open("README.md", "r", encoding="utf-8") as fh:
     long_description = fh.read()
@@ -17,7 +18,7 @@
 
 setuptools.setup(
     name="syncode",
-    version="0.4.0",
+    version="0.4.1",
     author="Shubham Ugare",
     author_email="shubhamugare@gmail.com",
     description="This package provides the tool for grammar augmented LLM generation.",
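
Review note on the `python_requires` line added at the top of `setup.py`: at module level, the trailing comma makes it an assignment of a one-element tuple to a variable, and the hunks shown here do not pass that variable to `setuptools.setup()`. Judging by the visible diff alone, the constraint would reach the package metadata only through `pyproject.toml`. The sketch below illustrates the difference; `setup_kwargs` is a hypothetical name, and only the fields visible in this diff are included.

```python
# Module-level statement as added in the commit: the trailing comma
# makes the right-hand side a one-element tuple, not a string.
python_requires = ">=3.6,<3.13",

# Sketch of the keyword arguments setuptools.setup() would need for the
# constraint to take effect from setup.py (assumption: the remaining
# fields stay as in the existing file).
setup_kwargs = dict(
    name="syncode",
    version="0.4.1",
    python_requires=">=3.6,<3.13",  # plain string, passed as a keyword
)

# In setup.py this would be used as: setuptools.setup(**setup_kwargs)
```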
