What behavior of the library made you think about the improvement?
Hi @rlouf, from reading the documentation I am missing a few things:

1. Is infilling currently supported? The README says that "Interleave completions" is supported, but I cannot find a way to do this. I could be missing something here (How to use gen fields in outlines? #176), but the examples from that PR don't work even after refactoring the imports. The main goal is to get to something like https://github.com/guidance-ai/guidance/tree/main?tab=readme-ov-file#guidance-acceleration (a rough sketch of the workaround I have tried is at the end of this issue).
2. I was reading your blog post https://blog.dottxt.co/coalescence.html#org5a801d9 and saw that you mention structured generation being 5x faster than vanilla generation. Is that available in outlines now? In my tests, using the logits processor is a few times slower than not using it, as I would expect (tested with the transformers backend only). Also, is the 5x speed-up based on enabling sampling in `generate`?
3. Is there benchmark code that compares structured vs. unstructured generation for this speed claim? I am interested in using the provided logits processor, but it does seem slower in my experience, and I'm wondering whether I'm doing something wrong. Intuitively, structured generation should be slower, not faster, because of the extra processing needed at each step, right? (See the benchmark sketch below for what I am measuring.)
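For context, this is roughly the comparison I am trying to run. It is only a minimal sketch assuming the high-level `outlines.generate` API (`outlines.models.transformers`, `outlines.generate.text`, `outlines.generate.regex`); the model name, prompt, and regex are placeholders I picked, so please correct me if this is not the intended way to benchmark it:

```python
import time

import outlines

# Placeholder model, prompt, and pattern -- any HF model would do here.
model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")
prompt = "Give me a phone number in the format 212-555-0123:\n"
pattern = r"\d{3}-\d{3}-\d{4}"

unstructured = outlines.generate.text(model)
structured = outlines.generate.regex(model, pattern)

# Warm-up call so that building the regex index is not counted in the timing.
structured(prompt, max_tokens=20)

start = time.perf_counter()
out_plain = unstructured(prompt, max_tokens=20)
print("unstructured:", time.perf_counter() - start, repr(out_plain))

start = time.perf_counter()
out_struct = structured(prompt, max_tokens=20)
print("structured:", time.perf_counter() - start, repr(out_struct))
```

With a script like this, the structured path is the slower one in my tests, which is why I am asking whether the 5x number requires a different setup.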
Would appreciate any pointers here, thank you!
How would you like it to behave?
Example code and documentation with code pointers.
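For the interleaving question (point 1 above), this is roughly the workaround I have been trying: chaining separate constrained generators and growing the prompt by hand. Again just a sketch under the same `outlines.generate` assumptions, with placeholder patterns; a documented, supported way to do this is what I am hoping for:

```python
import outlines

# Same placeholder model as above; the patterns are arbitrary examples.
model = outlines.models.transformers("mistralai/Mistral-7B-v0.1")

name_gen = outlines.generate.regex(model, r"[A-Z][a-z]+")
age_gen = outlines.generate.regex(model, r"[0-9]{1,3}")

# Grow the prompt manually, alternating fixed text and constrained completions.
prompt = 'Fill in the profile.\n{"name": "'
name = name_gen(prompt, max_tokens=5)

prompt += name + '", "age": '
age = age_gen(prompt, max_tokens=3)

print(prompt + age + "}")
```

This works, but as far as I can tell each call re-processes the whole prompt, so it does not give the guidance-style acceleration linked above.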