Fix seed for random embeddings on CPU & MPS #592
LeoGrin
left a comment
woohoo
Pull Request Overview
This PR modifies the seed selection logic for positional embeddings to improve performance on datasets with a small number of features (1-3 features) when using CPU or MPS devices. The change addresses the issue that the default seed optimized for GPU training may be suboptimal for CPU/MPS inference on small-feature datasets.
- Device-specific seed overrides for CPU (819) and MPS (42) are introduced for datasets with few features
- The original `random_embedding_seed` from config is preserved for GPU devices
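The selection logic summarized in this overview could look roughly like the sketch below. It is not the code from the PR: the function name `select_embedding_seed`, the constant names, and the reading of "1-3 features" as `n_features <= 3` are assumptions made for illustration. Only the seed values 819 (CPU) and 42 (MPS) and the `random_embedding_seed` config name come from the overview.

```python
# Illustrative sketch only; the helper and constant names are hypothetical.
CPU_SEED_FEW_FEATURES = 819  # CPU override quoted in the overview
MPS_SEED_FEW_FEATURES = 42   # MPS override quoted in the overview
FEW_FEATURES_MAX = 3         # assumption: "1-3 features" means n_features <= 3


def select_embedding_seed(
    device_type: str, n_features: int, random_embedding_seed: int
) -> int:
    """Pick the seed for the random positional embeddings.

    device_type is a device string such as "cpu", "mps" or "cuda".
    """
    if n_features <= FEW_FEATURES_MAX:
        if device_type == "cpu":
            return CPU_SEED_FEW_FEATURES
        if device_type == "mps":
            return MPS_SEED_FEW_FEATURES
    # GPU devices, and datasets with more features, keep the configured seed.
    return random_embedding_seed
```

Under these assumptions, a two-feature dataset on CPU would get seed 819, while the same dataset on a CUDA device would keep whatever `random_embedding_seed` is set to in the config.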
…to ben/fix-seed-for-cpu
* Record copied public PR 592
* Fix seed for random embeddings on CPU & MPS (#592)

(cherry picked from commit 94ce690)

---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Benjamin Jaeger <benjamin@priorlabs.ai>
Co-authored-by: Benjamin Jaeger <jaeger.benjamin7@gmail.com>
We need to fix the seed for CPU & MPS inference to one that works for datasets with a small number of features.
This means we also had to update the consistency tests.