DPO cleanup #1126
Merged
+440
−106
23 commits
6a23720 cleanup dpo to be a little more extensible, add zephyr/nectar strategy (winglian)
2f7242d fix eos slash (winglian)
74cbe2a support for eval split (winglian)
1f28327 fix kwargs (winglian)
cb2f774 handle empty evals (winglian)
a91e0cb don't load peft model for dpo (winglian)
bcae290 ensure dpo training args gets bf16 for peft if applicable (winglian)
a24c756 fix duplicate kwargs for bf16 (winglian)
d5e12dd make sure to respect the configured lr scheduler (winglian)
9a746fa support trainer callback to push config to wandb (winglian)
60f566c set dataloader preload args (winglian)
c41391d ensure that we are loading the lora when merging (winglian)
1cfd179 Update src/axolotl/utils/data.py (winglian)
02fc8f9 support local datasets for dpo (winglian)
7141fd1 chore: lint (winglian)
44a6f2d dpo/kto/ipo smoke tests w lora, simplify dpo dataset type names (winglian)
7f3b7ce add split to dpo tests (winglian)
52a227d fix rebase/merging error (winglian)
064b20e handle edge case w logging (winglian)
c49315f use accelerator for dpo datasets so it doesn't break the logger (winglian)
72fb877 missing args (winglian)
29663d8 validate checkpoint is an adapter for now (winglian)
76b5c2d log warning when dataset strategy is not loadable (winglian)
@@ -0,0 +1,21 @@
+"""
+module for DPO style dataset transform strategies
+"""
+
+import importlib
+import logging
+
+LOG = logging.getLogger("axolotl")
+
+
+def load(strategy, cfg):
+    try:
+        load_fn = strategy.split(".")[-1]
+        strategy = ".".join(strategy.split(".")[:-1])
+        mod = importlib.import_module(f".{strategy}", "axolotl.prompt_strategies.dpo")
+        func = getattr(mod, load_fn)
+        load_kwargs = {}
+        return func(cfg, **load_kwargs)
+    except Exception:  # pylint: disable=broad-exception-caught
+        LOG.warning(f"unable to load strategy {strategy}")
+        return None
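
For readers following the review thread, here is a minimal sketch of a more defensive variant of the same lookup. It is not the code merged in this PR. The helper name `load_dotted`, the logger name, and the `package` parameter are hypothetical, and unlike the hunk above it only catches import and attribute errors, so exceptions raised by the strategy function itself would propagate. It also logs the original, unsplit name, whereas in the hunk above `strategy` has already been reassigned to the module part by the time the warning is emitted.

```python
import importlib
import logging

LOG = logging.getLogger("example")  # hypothetical logger name


def load_dotted(name, package, *args, **kwargs):
    """Resolve "module.function" relative to `package` and call the function.

    Returns None (with a warning) when the name has no module part or
    when the module or function cannot be found.
    """
    # rpartition splits on the LAST dot: "a.b.fn" -> ("a.b", ".", "fn")
    module_path, _, func_name = name.rpartition(".")
    if not module_path:
        LOG.warning("strategy %r has no module component", name)
        return None
    try:
        mod = importlib.import_module(f".{module_path}", package)
        func = getattr(mod, func_name)
    except (ImportError, AttributeError):
        # Log the full original name, not just the module part
        LOG.warning("unable to load strategy %s", name)
        return None
    return func(*args, **kwargs)
```

As a quick check against the standard library, `load_dotted("decoder.JSONDecoder", "json")` resolves `json.decoder.JSONDecoder` and returns an instance, while an undotted name returns `None`.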
winglian marked this conversation as resolved.
This is most likely not correct. The `strategy` includes underscores, not `.`, such as `intel_apply_chatml`.
This works for me.
The intention is that the setting is something like […], in which case it will load the `argilla` function from the `axolotl.prompt_strategies.dpo.chatml` module.
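
To make the two positions in this thread concrete, the sketch below reproduces only the string-splitting step from `load`. The helper name `split_strategy` is hypothetical, and the value `chatml.argilla` is inferred from the module and function named in the comment above rather than quoted from it.

```python
from typing import Tuple


def split_strategy(strategy: str) -> Tuple[str, str]:
    """Split a strategy string the same way the `load` function in this PR does."""
    parts = strategy.split(".")
    load_fn = parts[-1]            # last component: function name
    module = ".".join(parts[:-1])  # everything before it: module path
    return module, load_fn


# Dotted form: module "chatml", function "argilla"
print(split_strategy("chatml.argilla"))
# Underscore-only form: the module part is empty
print(split_strategy("intel_apply_chatml"))
```

With a dotted value the module part is `chatml` and the function is `argilla`. With an underscore-only name such as `intel_apply_chatml` the module part comes back empty, which is the case the earlier review comment was concerned about.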
Hi @winglian 👋🏻 thanks. That makes sense. I will test it later today 👍🏻