Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am running the latest code. Development is very rapid so there are no tagged versions as of now.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new bug or useful enhancement to share.
Expected Behavior
Prompt eval takes around twice as long per token as eval (12 tokens/sec vs. 22 tokens/sec). Is there a way to make both run at the same speed?
Current Behavior
Prompt eval takes around twice as long per token as eval.
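To make the "twice as long" figure concrete, here is a minimal sketch that converts the two throughputs reported above into per-token latencies. It assumes that 12 tokens/sec is the prompt eval rate and 22 tokens/sec is the eval rate, following the order in which they are quoted. The variable and function names are illustrative only and do not come from any tool's output.

```python
def ms_per_token(tokens_per_sec: float) -> float:
    """Convert a throughput in tokens/sec to a latency in ms/token."""
    return 1000.0 / tokens_per_sec


# Assumption: the figures are quoted in the order (prompt eval, eval).
prompt_eval_tps = 12.0  # prompt eval throughput, tokens/sec
eval_tps = 22.0         # eval (generation) throughput, tokens/sec

prompt_ms = ms_per_token(prompt_eval_tps)
eval_ms = ms_per_token(eval_tps)
ratio = prompt_ms / eval_ms  # how much longer one prompt token takes

print(f"prompt eval: {prompt_ms:.1f} ms/token")
print(f"eval:        {eval_ms:.1f} ms/token")
print(f"ratio:       {ratio:.2f}x")
```

Under that assumption, prompt eval works out to roughly 83 ms per token against roughly 45 ms per token for eval, a ratio of about 1.8x.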