[TASK] Implement PredictMasked (BERT-like masking) #692

marcromeyn · 2022-08-30T08:39:41Z

BERT-like masking

Let’s say we have a sequence ABCDE, BERT-like masking would result in the following:

Inputs	A	`MASKED`	C	`MASKED`	E
Targets	`MASKED`	B	`MASKED`	D	`MASKED`

Note that the number of target items may differ between samples in the batch. We should ensure that each sequence has at least one target and at most len(seq) - 1 targets, so that at least one item always remains visible in the inputs.
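The constraint above can be illustrated with a small pure-Python sketch. It is not the proposed implementation: the function name `bert_like_mask`, the `mask_rate` parameter and the `MASKED` placeholder string are assumptions made for illustration, and a real version would operate on (ragged) tensors rather than lists.

```python
import random

MASK = "MASKED"  # placeholder symbol, mirroring the example table (assumption)


def bert_like_mask(seq, mask_rate=0.3, rng=None):
    """Return (inputs, targets) for a single sequence.

    Each position is selected as a target with probability `mask_rate`.
    The selection is then adjusted so it holds at least one position
    and at most len(seq) - 1 positions.
    """
    if len(seq) < 2:
        raise ValueError("need at least two items: one input and one target")
    if rng is None:
        rng = random.Random()

    selected = [i for i in range(len(seq)) if rng.random() < mask_rate]
    if not selected:
        # Nothing was selected: force exactly one target.
        selected = [rng.randrange(len(seq))]
    elif len(selected) == len(seq):
        # Everything was selected: release one position back to the inputs.
        selected.remove(rng.choice(selected))

    chosen = set(selected)
    inputs = [MASK if i in chosen else item for i, item in enumerate(seq)]
    targets = [item if i in chosen else MASK for i, item in enumerate(seq)]
    return inputs, targets


inputs, targets = bert_like_mask(list("ABCDE"), rng=random.Random(0))
```

Because the selection is drawn independently per sequence, two samples in the same batch can end up with a different number of targets, which is why the bounds are enforced per sequence.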

The class that’s responsible for masking could roughly look like:

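from typing import Union

class PredictMasked(DataAugmentation):
	def __init__(
		self,
		schema: Schema,
		target: Union[str, Tag, ColSchema],
		# Required arguments must precede the ones with defaults.
		mask_selection_rate,
		mask_selection_length,
		prediction_block=None,
		# A tuple avoids sharing a mutable default between instances.
		unselectable_token_ids=(0,),
		mask_token_rate=0.8,
		random_token_rate=0.1,
	):
		...

	def compute_mask(self, ...):
		...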
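As a rough illustration of what `mask_token_rate` and `random_token_rate` could control, here is a sketch of the usual BERT replacement rule applied to positions that were already selected for masking. The helper name `corrupt_selected`, the `mask_token_id` and `vocab_size` arguments, and the choice to draw random tokens from ids 1 and above (treating id 0 as padding) are all assumptions, not part of the proposal.

```python
import random


def corrupt_selected(token_ids, selected, mask_token_id, vocab_size,
                     mask_token_rate=0.8, random_token_rate=0.1, rng=None):
    """Apply BERT-style replacement to the selected positions only.

    For each selected position:
      - with probability `mask_token_rate`, write the mask token;
      - with probability `random_token_rate`, write a random token;
      - otherwise keep the original token.
    """
    if mask_token_rate + random_token_rate > 1.0:
        raise ValueError("mask_token_rate + random_token_rate must be <= 1")
    if rng is None:
        rng = random.Random()

    out = list(token_ids)  # copy, so the caller's sequence is not modified
    for i in selected:
        draw = rng.random()
        if draw < mask_token_rate:
            out[i] = mask_token_id
        elif draw < mask_token_rate + random_token_rate:
            # Assumption: id 0 is reserved (e.g. padding), so skip it.
            out[i] = rng.randrange(1, vocab_size)
        # Otherwise the original token is left in place.
    return out


corrupted = corrupt_selected([5, 6, 7, 8, 9], selected=[1, 3],
                             mask_token_id=99, vocab_size=50,
                             rng=random.Random(0))
```

With the defaults from the signature above, roughly 80% of selected positions would receive the mask token, 10% a random token, and the remaining 10% would stay unchanged while still counting as targets.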

We want to make use of standard Keras functionality w.r.t. masking. Some useful links:

Mask propagation
MLMMaskGenerator in keras_nlp
- Works on normal + tf.RaggedTensors
- Uses tensorflow_text
mask_language_model in tensorflow_text


marcromeyn mentioned this issue Aug 30, 2022

[RMP] Tensorflow support for session based recommendations integration in Merlin NVIDIA-Merlin/Merlin#433

Closed

37 tasks

marcromeyn transferred this issue from NVIDIA-Merlin/Merlin Aug 30, 2022

marcromeyn changed the title ~~Implement PredictMasked~~ [TASK] Implement PredictMasked Aug 30, 2022

marcromeyn changed the title ~~[TASK] Implement PredictMasked~~ [TASK] Implement PredictMasked (BERT-like masking) Aug 30, 2022

marcromeyn assigned gabrielspmoreira Aug 30, 2022

marcromeyn added this to the Merlin 22.09 milestone Aug 30, 2022

viswa-nvidia modified the milestones: Merlin 22.09, Merlin 22.10 Sep 26, 2022

This was referenced Sep 29, 2022

Implemented Masked Language Modeling (alternative implementation) #775

Closed

Introduces Masked Language Modeling for Transformers #780

Merged

marcromeyn closed this as completed in #780 Oct 10, 2022
