Modified masking before pooling - Fixes issue in ONNX conversion #92
Issue:
In `class INSTRUCTOR_Transformer`, inside `def forward()`, the attention mask entries corresponding to the instruction tokens are set to 0 by iterating over the batch in a Python loop. I want to draw attention to the line `n = len(attention_mask)`: this Python int is treated as a constant during ONNX conversion, which leads to incorrect inference whenever the instruction token length changes.
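For illustration, the pattern looks roughly like the sketch below (a simplified reconstruction, not the exact code in the repository; `context_masks` stands for the per-example instruction lengths):

```python
import torch

def mask_instruction_tokens_loop(attention_mask: torch.Tensor,
                                 context_masks: torch.Tensor) -> torch.Tensor:
    """Zero out instruction-token positions with a Python-level loop."""
    n = len(attention_mask)                 # batch size as a Python int -> baked in as a constant at export
    for i in range(n):
        local_len = context_masks[i].item()  # .item() also freezes the instruction length at trace time
        attention_mask[i][:local_len] = 0
    return attention_mask
```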
Solution:
Instead of getting the instruction token length and manually iterating over the `attention_mask` to set the values to 0, I have introduced a `def prepare_input_features()` function under `class Instructor` that carries out the same task using tensor manipulations. This way, inference with the ONNX model works as expected for any instruction.
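A minimal sketch of how such a vectorized masking can look (the function name follows the PR; the argument names, dict layout, and shapes are assumptions for illustration):

```python
import torch

def prepare_input_features(input_features: dict,
                           instruction_lengths: torch.Tensor) -> dict:
    """Zero attention-mask positions belonging to the instruction tokens
    using tensor operations only, so no Python ints are frozen into the
    graph at ONNX export time.
    """
    attention_mask = input_features["attention_mask"]                   # (batch, seq_len)
    seq_len = attention_mask.shape[1]
    positions = torch.arange(seq_len, device=attention_mask.device)     # (seq_len,)
    # True where the position index is past the instruction prefix; broadcasting
    # compares (1, seq_len) against (batch, 1) to produce a (batch, seq_len) mask.
    keep = positions.unsqueeze(0) >= instruction_lengths.unsqueeze(1)
    input_features["attention_mask"] = attention_mask * keep.to(attention_mask.dtype)
    return input_features
```

Because the per-example lengths stay inside tensors and are combined by broadcasting, the exported ONNX graph handles any instruction length at inference time.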
Other changes:
There are many other diffs in this pull request; they are the result of adhering to formatting/linting standards.