@bejaeger
Contributor

@bejaeger bejaeger commented Nov 4, 2025

We need to pin the random embedding seed for CPU & MPS inference to a value that works for datasets with a small number of features.
Since this changes the seed used on those devices, we also had to update the consistency tests.

@bejaeger bejaeger changed the title Fix seed for random embeddings on CPU Fix seed for random embeddings on CPU & MPS Nov 4, 2025
@bejaeger bejaeger requested a review from LeoGrin November 4, 2025 19:35
@bejaeger bejaeger marked this pull request as ready for review November 4, 2025 19:35
@bejaeger bejaeger requested a review from a team as a code owner November 4, 2025 19:35
Copilot AI review requested due to automatic review settings November 4, 2025 19:35
Collaborator

@LeoGrin LeoGrin left a comment

woohoo

Contributor

Copilot AI left a comment

Pull Request Overview

This PR modifies the seed selection logic for positional embeddings to improve performance on datasets with a small number of features (1-3 features) when using CPU or MPS devices. The change addresses the issue that the default seed optimized for GPU training may be suboptimal for CPU/MPS inference on small-feature datasets.

  • Device-specific seed overrides for CPU (819) and MPS (42) are introduced for datasets with few features
  • The original random_embedding_seed from config is preserved for GPU devices
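The selection logic summarized above could look roughly like the sketch below. This is not the code from the PR: the function name `select_embedding_seed`, the constant names, and the exact threshold of 3 features are assumptions taken from the "1-3 features" wording in this overview. Only the seed values 819 (CPU) and 42 (MPS) and the config field `random_embedding_seed` come from the summary itself.

```python
# Hypothetical sketch of device-specific seed selection; not the PR's actual code.

# Assumed cutoff, based on the "1-3 features" wording in the overview.
SMALL_FEATURE_THRESHOLD = 3

# Seed values quoted in the overview for devices that get an override.
DEVICE_SEED_OVERRIDES = {"cpu": 819, "mps": 42}


def select_embedding_seed(device_type: str, n_features: int, config_seed: int) -> int:
    """Return the seed to use for the random positional embeddings.

    device_type: device kind as a string, e.g. "cpu", "mps", or "cuda".
    n_features: number of features in the dataset.
    config_seed: the configured random_embedding_seed, used when no override applies.
    """
    if n_features <= SMALL_FEATURE_THRESHOLD:
        # Fall back to the configured seed for devices without an override (e.g. GPU).
        return DEVICE_SEED_OVERRIDES.get(device_type, config_seed)
    return config_seed
```

With this shape, a call on a GPU device or on a dataset with many features returns the configured seed unchanged, so only small-feature CPU and MPS runs are affected.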

@bejaeger bejaeger merged commit 94ce690 into main Nov 5, 2025
10 checks passed
oscarkey pushed a commit that referenced this pull request Nov 12, 2025
* Record copied public PR 592

* Fix seed for random embeddings on CPU & MPS (#592)

(cherry picked from commit 94ce690)

---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Benjamin Jaeger <benjamin@priorlabs.ai>
Co-authored-by: Benjamin Jaeger <jaeger.benjamin7@gmail.com>