Skip to content

Conversation

@z52527
Copy link
Collaborator

@z52527 z52527 commented Jul 2, 2025

Description

#73

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Copy link
Collaborator

@jiashuy jiashuy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @z52527 ,we can check the result in two ways, and you can select one after your evaluation:

  1. Generic method:
    develop a def lookup_scores(indices, offsets) API for BatchedDynamicEmbeddingTables. It's process is close to forward() but don't need to do deduplication. Just lookup the scores using find_pointers as you already used.
    But in the test script, you need to figure out how to get the BatchedDynamicEmbeddingTables from DistributedModelParallel.

So it may cost more.
2.Only checking the masking's result under debug mode in C++, so it will also make no effect to users.


// Frequency masking functions
void mask_embeddings_by_frequency(void *embeddings_ptr, void *scores_ptr,
int num_embeddings, int embedding_dim,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to num_keys to avoid confusion with the size of embedding table.

@z52527 z52527 force-pushed the fea-mask-low-frequence-emb-LFU branch from d01a4b3 to 2f9a776 Compare October 28, 2025 07:03
@shijieliu shijieliu force-pushed the fea-mask-low-frequence-emb-LFU branch from 429963d to d941891 Compare October 31, 2025 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants