Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support list/tuple inputs for special tokens in MultiSegmentPacker layer #1046

Conversation

abheesht17
Copy link
Collaborator

No description provided.

@abheesht17 abheesht17 requested a review from mattdangerw May 17, 2023 03:44
Copy link
Member

@mattdangerw mattdangerw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! just minor comments

@@ -53,12 +53,16 @@ class MultiSegmentPacker(keras.layers.Layer):

Args:
sequence_length: The desired output length.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add the type notes you have in the other layer for consistency

dtype of the input tensors to the layer.
end_value: The id(s) or token(s) that is/are to be placed at the end of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is/are -> are

dtype of the input tensors to the layer.
end_value: The id(s) or token(s) that is/are to be placed at the end of
the last input segment (called "[SEP]" for BERT). The dtype much
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much -> must

sep_value: The id(s) or token(s) that is/are to be placed at the end of
every segment, except the last segment (called "[SEP]" for BERT).
If `None`, `end_value` is used. The dtype much match the dtype of
the input tensors to the layer.
pad_value: The id or token that is to be placed into the unused
positions after the last segment in the sequence
(called "[PAD]" for BERT).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add an example below? maybe roberta double sep?

start_column = tf.fill((batch_size, 1), start_value)
end_column = tf.fill((batch_size, 1), end_value)
ones_column = tf.ones_like(start_column, dtype=tf.int32)
start_values_tensor = tf.repeat(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these names are a little confusing start_value is already a tensor. should we co back to _column naming?

start_column, end_column, sep_column?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, but it isn't exactly a column :P. I'll call it start_columns

(
[
[
"[CLS]",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try to come up with a slightly shorter test case that will format the lists to one line

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops!

@abheesht17 abheesht17 requested a review from mattdangerw May 17, 2023 10:34
Copy link
Member

@mattdangerw mattdangerw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! Thank you.

@mattdangerw
Copy link
Member

/gcbrun

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants