
Add streaming inference & fix stopping at EOS #180

Merged · 2 commits into axolotl-ai-cloud:main on Jun 10, 2023

Conversation

Glavin001 (Contributor) commented on Jun 10, 2023

What's new?

  • Add streaming inference
  • Fix stopping at eos_token

When I was testing inference with the Falcon config file, the tokenizer's EOS token was eos_token=<|endoftext|>, but do_inference() was previously hard-coding </s>. Because the two never matched, generation never stopped early and always ran through the full 1024 max new tokens.
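The failure mode above can be illustrated without a model: a generation loop only halts early when its stop check compares against the token id the model actually emits. This is a minimal model-free sketch of that check; `generate` and `fake_next_token` are illustrative names, not axolotl code, and the token ids are made up.

```python
def generate(next_token_fn, eos_token_id, max_new_tokens=1024):
    """Toy generation loop: stop as soon as the model emits eos_token_id."""
    tokens = []
    for _ in range(max_new_tokens):
        tok = next_token_fn(tokens)
        tokens.append(tok)
        if tok == eos_token_id:  # if this id is wrong, the loop runs to max_new_tokens
            break
    return tokens

# Toy "model": emits id 11 (standing in for <|endoftext|>) after three tokens.
def fake_next_token(tokens):
    return 11 if len(tokens) >= 3 else 5

print(len(generate(fake_next_token, eos_token_id=11)))                    # stops after 4 tokens
print(len(generate(fake_next_token, eos_token_id=2, max_new_tokens=10)))  # wrong id: runs all 10
```

With the matching id the loop stops at the fourth token; with a mismatched id (the `</s>` vs. `<|endoftext|>` situation) it always exhausts the budget.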

Demo

https://www.loom.com/share/03f8c8903b42484ea7aad72ab07a6340


```python
import importlib

# Resolve the configured prompter class from axolotl.prompters by name.
prompter_module = getattr(importlib.import_module("axolotl.prompters"), prompter)

while True:
    print("=" * 80)
```
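Streaming inference in this style is typically driven by emitting each token as soon as it is decoded, rather than waiting for the full completion. A minimal model-free sketch of that pattern follows; the names (`stream_generate`, the toy vocabulary) are illustrative and not the PR's actual implementation, which uses the model's real decoding loop.

```python
import sys

def stream_generate(next_token_fn, decode_fn, eos_token_id, max_new_tokens=1024):
    """Yield decoded text piece by piece instead of returning the full output at the end."""
    tokens = []
    for _ in range(max_new_tokens):
        tok = next_token_fn(tokens)
        if tok == eos_token_id:  # stop streaming at the EOS token
            break
        tokens.append(tok)
        yield decode_fn(tok)

# Toy vocabulary: decode each id to a text piece; id 0 plays the role of EOS.
vocab = {1: "Hello", 2: " world", 3: "!"}
script = iter([1, 2, 3, 0])

for piece in stream_generate(lambda toks: next(script), vocab.get, eos_token_id=0):
    sys.stdout.write(piece)  # pieces appear as they are produced
sys.stdout.write("\n")
```

The same generator shape is what lets a CLI print partial output immediately while generation is still running.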
Glavin001 (Contributor, Author) commented:

Let me know if you like or dislike the separators. Can remove.

NanoCode012 (Collaborator) commented on Jun 10, 2023

Thank you for the PR. I am still thinking about how to handle this. Since we might not use only the alpaca format for inference, I'm considering how to make the prompt dynamic/passable and how to set the appropriate default tokens.

Edit: as winglian has approved, I guess we can deal with the above at a later point. The streaming is a great addition!

Review thread on scripts/finetune.py (outdated, resolved)
winglian merged commit 215d775 into axolotl-ai-cloud:main on Jun 10, 2023
mkeoliya pushed a commit referencing this pull request to mkeoliya/axolotl on Dec 15, 2023: "Add streaming inference & fix stopping at EOS"
3 participants