[TASK] Add select_by_tag to ParallelBlock #694

Closed
marcromeyn opened this issue Aug 30, 2022 · 0 comments · Fixed by #701
marcromeyn commented Aug 30, 2022

We would like to give ParallelBlock the ability to select sub-graphs. This is useful, for instance, to select the item-id embedding table from an InputBlock. It would also enable shared embeddings in two-tower-style models: at the moment, we instantiate a separate InputBlock per tower, which does not allow embeddings to be shared across towers. One way to enable this is to select the relevant feature branches from a single InputBlock.

As an example, let's say we have the following schema:

  • User features: user-id, last-purchase (encoded with the same embedding table as item-id)
  • Item features: item-id
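# Assumed import paths; `schema` is the Schema holding the features listed above
from merlin.schema import Tags
from merlin.models.tf import InputBlockV2

all_inputs = InputBlockV2(schema)
# This would result in a parallel-block with 2 branches:
# user-id -> EmbeddingTable(user_id)
# item-id, last-purchase -> EmbeddingTable(item_id, last_purchase)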

user_inputs = all_inputs.select_by_tag(Tags.USER)
# Results in a parallel-block with 2 branches:
# user-id -> EmbeddingTable(user_id)
# last-purchase -> EmbeddingTable(item_id, last_purchase)

item_inputs = all_inputs.select_by_tag(Tags.ITEM)
# Results in a parallel-block with 1 branch:
# item-id -> EmbeddingTable(item_id, last_purchase)

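As the example shows, select_by_tag sub-selects the branches of a ParallelBlock by feature: each branch is narrowed to the features that carry the given tag, and a branch with no matching features is dropped. The embedding table behind a narrowed branch stays the same, which is what makes sharing possible.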
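To make the intended selection semantics concrete, here is a minimal, framework-free sketch. It is not the Merlin implementation: `ToyTable`, `ToyParallelBlock`, the `feature_tags` mapping, and the plain-string tags are all hypothetical stand-ins, and the only thing it models is which branches survive a selection and that the underlying table object is reused.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class ToyTable:
    """Hypothetical stand-in for an embedding table shared by one or more features."""
    name: str


@dataclass
class ToyParallelBlock:
    """Hypothetical stand-in for a ParallelBlock: one table per group of features."""
    branches: Dict[Tuple[str, ...], ToyTable]
    feature_tags: Dict[str, FrozenSet[str]]

    def select_by_tag(self, tag: str) -> "ToyParallelBlock":
        selected = {}
        for features, table in self.branches.items():
            # Narrow the branch to the features that carry the tag.
            kept = tuple(f for f in features if tag in self.feature_tags[f])
            if kept:
                # Reuse the same table object, so the weights stay shared.
                selected[kept] = table
        return ToyParallelBlock(selected, self.feature_tags)


shared_table = ToyTable("item_id+last_purchase")
all_inputs = ToyParallelBlock(
    branches={
        ("user_id",): ToyTable("user_id"),
        ("item_id", "last_purchase"): shared_table,
    },
    feature_tags={
        "user_id": frozenset({"user"}),
        "last_purchase": frozenset({"user"}),
        "item_id": frozenset({"item"}),
    },
)

user_inputs = all_inputs.select_by_tag("user")
item_inputs = all_inputs.select_by_tag("item")
print(list(user_inputs.branches))
print(list(item_inputs.branches))
```

In this sketch the user selection keeps two branches and the item selection keeps one, matching the example above, and both selections point at the same `shared_table` object. A real implementation would presumably read the tags from the schema rather than from a separate mapping.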

marcromeyn changed the title from "Add select_by_tag to ParallelBlock" to "[TASK] Add select_by_tag to ParallelBlock" on Aug 30, 2022
marcromeyn transferred this issue from NVIDIA-Merlin/Merlin on Aug 30, 2022
edknv self-assigned this on Aug 30, 2022
viswa-nvidia added this to the Merlin 22.09 milestone on Sep 8, 2022