Language model block #372
LGTM!
# TODO: Prevent unsafe by default
yield from torch.load(path)
# TODO: Confirm that loading works with `weights_only=True`
yield from torch.load(path, weights_only=True)
thanks for making our sec folks happy
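For readers wondering why the unrestricted `torch.load(path)` call is a security concern: it relies on Python's `pickle`, which can execute arbitrary callables while deserializing. The sketch below shows the general idea behind a restricted load using only the standard library. It is not how `weights_only=True` is implemented in PyTorch; the allow-list, `RestrictedUnpickler`, `restricted_loads`, and `Malicious` are hypothetical names used for illustration.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list for illustration; PyTorch maintains its own, different list.
_ALLOWED = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only globals that are explicitly allow-listed.
        if (module, name) in _ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize `data`, refusing any global outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    def __reduce__(self):
        # A plain pickle.loads would call print(...) here;
        # a real attack could call os.system instead.
        return (print, ("arbitrary code ran during unpickling",))


# A state-dict-like payload made of plain containers and numbers.
safe_payload = pickle.dumps(OrderedDict(weight=[1.0, 2.0]))
# A payload that tries to smuggle in a function call.
unsafe_payload = pickle.dumps(Malicious())

state = restricted_loads(safe_payload)  # loads normally

try:
    restricted_loads(unsafe_payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the call to print was never made
```

The safe payload loads unchanged, while the payload that references `print` is rejected before anything runs.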
ModelTestingGroup.distributed: ModelTestingGroupAction.unimportant,
},
)
del MODEL_CONFIGS["starcoder_2"].config_dict["model"]["base_model"]["embeddings"]["num_position_embeddings"]
?
It's unused but leaving it in the dict caused errors when comparing with converted configs.
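A minimal sketch of the failure mode described in the reply, assuming the conversion only preserves fields the target format uses. The config values, `vocab_size`, and `convert_round_trip` are hypothetical; only the key path mirrors the `del` statement in the diff.

```python
import copy

# Hypothetical test config; only the key path mirrors the diff.
original = {
    "model": {
        "base_model": {
            "embeddings": {"vocab_size": 384, "num_position_embeddings": 512},
        }
    }
}


def convert_round_trip(config: dict) -> dict:
    """Stand-in for an export/import cycle that drops a field the target format does not use."""
    converted = copy.deepcopy(config)
    converted["model"]["base_model"]["embeddings"].pop("num_position_embeddings", None)
    return converted


converted = convert_round_trip(original)

# The unused key survives only in the original, so a strict comparison fails.
mismatch_before = original != converted

# Removing the unused key from the test config makes the two sides agree.
del original["model"]["base_model"]["embeddings"]["num_position_embeddings"]
match_after = original == converted
```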
# Set a dummy default user so we don't run in root by default.
# The image is still compatible with any user id.
RUN useradd user
USER user
thanks!
✨ Description
A small follow-up to #370 that turns the language model itself into a module like any other. It also moves the hidden size into the language model config, so that each module's input/output dimensions are set by its parent module.
Also adds small safety tweaks: `torch.load` is now called with `weights_only=True`, and the Docker image sets a non-root default user.
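The idea that module input/output dimensions are set by the parent can be sketched as follows. This is an illustration under assumptions, not the code from this PR: `BlockConfig`, `LanguageModelConfig`, `Block`, `LanguageModel`, and all field names other than the hidden size are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BlockConfig:
    # No hidden_size here: a block does not choose its own input/output width.
    intermediate_size: int = 64


@dataclass
class LanguageModelConfig:
    # The hidden size lives on the parent config and is passed down to children.
    hidden_size: int = 32
    num_blocks: int = 2
    block: BlockConfig = field(default_factory=BlockConfig)


class Block:
    def __init__(self, config: BlockConfig, hidden_size: int):
        # Input and output widths come from the parent, not from the block's config.
        self.input_dim = hidden_size
        self.output_dim = hidden_size
        self.intermediate_dim = config.intermediate_size


class LanguageModel:
    def __init__(self, config: LanguageModelConfig):
        self.hidden_size = config.hidden_size
        # Every child receives the same width, so dimensions cannot disagree.
        self.blocks = [
            Block(config.block, config.hidden_size) for _ in range(config.num_blocks)
        ]


model = LanguageModel(LanguageModelConfig(hidden_size=128))
```

Because the width is defined in one place, a child module cannot be configured with a size that conflicts with its siblings or its parent.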