prepared dataset caching, other misc fixes #665

Merged
winglian merged 2 commits into main from ds_cache-20231002
Oct 3, 2023

@winglian (Collaborator) commented Oct 2, 2023

This only explicitly caches the prepared dataset when dataset_prepared_path is defined in the yml.
Other small fixes: clearer wording for inference, and no extra spaces added when debugging tokenization.
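
For illustration, the caching rule described above could look roughly like the sketch below. This is not the code from this PR: load_or_prepare, the prepare callback, and the prepared.json file name are hypothetical, and JSON stands in for whatever on-disk format the project actually uses. It only shows the intended behavior, which is that an unset dataset_prepared_path means nothing is read from or written to disk.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def load_or_prepare(
    dataset_prepared_path: Optional[str],
    prepare: Callable[[], list],
    cache_name: str = "prepared.json",  # hypothetical file name
) -> list:
    """Hypothetical helper: use the disk cache only when a path is configured."""
    if not dataset_prepared_path:
        # No path in the config: never read from or write to disk.
        return prepare()

    cache_file = Path(dataset_prepared_path) / cache_name
    if cache_file.exists():
        # Explicit path and a cache hit: reuse the prepared data.
        return json.loads(cache_file.read_text())

    # Explicit path but no cache yet: prepare once, then store the result.
    data = prepare()
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(data))
    return data
```

With this shape, leaving the option empty always re-runs preparation, while setting it prepares once and reuses the stored result on later runs.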

@NanoCode012 (Collaborator) left a comment

One more thing: add a note in the README near that config option saying "leave empty for no cache".

@winglian merged commit e50a64e into main on Oct 3, 2023
4 checks passed
@winglian deleted the ds_cache-20231002 branch on October 3, 2023 at 01:07
mkeoliya pushed a commit to mkeoliya/axolotl that referenced this pull request Dec 15, 2023
* prepared dataset caching, other misc fixes

* also don't load from disk cache unless explicit