Add support for using basic block context to GRANITE models. #290

Open
wants to merge 3 commits into base: main


virajbshah
Collaborator

 * Add fields to the `proto` specification to store context.
 * Add members to the Gematria `BasicBlock` data structure to store
   context and update methods on it and its Python binding accordingly.
 * Bonus: Remove dangling TODO.
 * Update the graph builder and its Python bindings to add context
   instructions to basic block graphs and store a context node mask for
   later use by models.
 * Add tests for the new graph builder functionality.
 * Update seq2seq models to only use non-context nodes as input to the
   readout network.
 * Add tests for training seq2seq models with context.
Collaborator

@ondrasej left a comment

LGTM overall. I'll approve once the parent ones are ready (merged).

- basic_block_was_added = self._batch_graph_builder.add_basic_block(block)
+ # Add context to the basic block graph only for seq2seq models.
+ basic_block_was_added = self._batch_graph_builder.add_basic_block(
+     block, add_context=self.use_deltas
Collaborator

No action needed in this PR: I think we should either completely disable the model where `use_deltas` is `False`, or add the context mask info somehow to the update functions. Otherwise, these models would have a hard time knowing which nodes belong to the context...

But given the model precision, I'd go with just disabling the model without deltas :)
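
One possible reading of the first option is a guard that rejects the unsupported combination up front. The sketch below is hypothetical: `check_context_supported` is an illustrative name that does not appear in the PR, and it assumes that only models with `use_deltas=True` exclude context nodes from the readout.

```python
def check_context_supported(use_deltas: bool, add_context: bool) -> None:
    """Raises if context is requested for a model that cannot mask it out.

    Assumption (not confirmed by the PR): only models with use_deltas=True
    restrict the readout to non-context nodes, so requesting context for any
    other model is treated as a configuration error.
    """
    if add_context and not use_deltas:
        raise ValueError(
            "Basic block context requires use_deltas=True; models without"
            " deltas cannot tell context nodes apart from block nodes."
        )


# Allowed: context with deltas, or no context at all.
check_context_supported(use_deltas=True, add_context=True)
check_context_supported(use_deltas=False, add_context=False)
```

Failing early like this would keep a model without deltas from silently training on graphs whose context nodes it cannot distinguish.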
