Run-time structured generation benchmarks #549

Closed
lapp0 opened this issue Jan 17, 2024 · 1 comment · Fixed by #1067 · May be fixed by #925
@lapp0
Contributor

lapp0 commented Jan 17, 2024

Initialization benchmarks are introduced in #542

We should extend these benchmarks to measure the performance of inference.

Goal

Outlines shouldn't be a bottleneck for most inference. A reasonable goal can be set based on

Benchmarks will help us achieve and maintain that goal.

What must be benchmarked

Proposed method

It's annoying to need a GPU to run tests. We shouldn't do actual inference in performance benchmarks.

    1. Create a mock inference engine
    2. Add a simple benchmark to ensure the unguided mock inference engine takes negligible time
    3. Add guided benchmarks that show the true throughput of Outlines
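
A rough sketch of what the three steps above could look like, using only the standard library so no GPU is needed. Everything here is hypothetical and for illustration: `MockModel`, `generate`, `time_per_token` and the `allowed_tokens_fn` callback are made-up names, not Outlines APIs, and the "guide" is a toy constraint standing in for real guided generation.

```python
import random
import time


class MockModel:
    """Hypothetical stand-in for a model: returns the same fixed logits every step."""

    def __init__(self, vocab_size, seed=0):
        rng = random.Random(seed)
        self.vocab_size = vocab_size
        self.logits = [rng.random() for _ in range(vocab_size)]

    def __call__(self, token_ids):
        # No real forward pass: the cost of "inference" is close to zero.
        return self.logits


def generate(model, max_tokens, allowed_tokens_fn=None):
    """Greedy decoding loop; `allowed_tokens_fn` is an assumed guide callback."""
    token_ids = []
    for _ in range(max_tokens):
        logits = model(token_ids)
        if allowed_tokens_fn is None:
            candidates = range(model.vocab_size)  # unguided baseline
        else:
            candidates = allowed_tokens_fn(token_ids)  # guided: restricted set
        token_ids.append(max(candidates, key=logits.__getitem__))
    return token_ids


def time_per_token(fn, max_tokens, repeats=5):
    """Best-of-N wall-clock time per generated token, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best / max_tokens


if __name__ == "__main__":
    model = MockModel(vocab_size=1_000)
    n = 200

    def even_only(token_ids):
        # Toy constraint used in place of a real guide.
        return [t for t in range(model.vocab_size) if t % 2 == 0]

    unguided = time_per_token(lambda: generate(model, n), n)
    guided = time_per_token(lambda: generate(model, n, even_only), n)
    print(f"unguided: {unguided * 1e6:.1f} us/token")
    print(f"guided:   {guided * 1e6:.1f} us/token")
```

Because the mock model does no real work, the unguided number gives the baseline the second step asks for, and the difference between the two numbers is roughly the overhead added by the guide.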
@rlouf
Member

rlouf commented Jan 18, 2024

At this point I think that it would only make sense to benchmark the CFG-guided generation. Regex-guided generation is only a dictionary call at each step, so there really isn't anything we could do that would move the needle.
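
To illustrate the "dictionary call at each step" point, here is a toy sketch. It assumes the regex has already been compiled into a table mapping each state to its allowed tokens and their next states; the table below is written by hand for the pattern `ab*` over a three-token vocabulary and is not Outlines' actual data structure. `VOCAB`, `INDEX` and `decode_greedy` are hypothetical names.

```python
# Toy vocabulary: token id -> string.
VOCAB = {0: "a", 1: "b", 2: "<eos>"}

# Hand-written index for the regex `ab*`:
# state -> {allowed token id: next state}.
INDEX = {
    0: {0: 1},        # must start with "a"
    1: {1: 1, 2: 2},  # then any number of "b", or stop
}
FINAL_STATE = 2


def decode_greedy(preferences, max_tokens=10):
    """Pick, at each step, the first token in `preferences` that is allowed.

    `preferences` stands in for a model's ranking of token ids.
    """
    state, out = 0, []
    for _ in range(max_tokens):
        if state == FINAL_STATE:
            break
        allowed = INDEX[state]  # the per-step cost: one dict lookup
        token = next(t for t in preferences if t in allowed)
        out.append(token)
        state = allowed[token]  # second lookup gives the next state
    return "".join(VOCAB[t] for t in out if VOCAB[t] != "<eos>")
```

In this sketch the per-step work is two dictionary lookups regardless of the pattern, which is consistent with the argument that run-time benchmarks would have little to measure for regex-guided generation once the index is built.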
