
Have not implemented batched attention mask. #5

Open
RolianTan opened this issue Dec 8, 2024 · 1 comment

Comments


RolianTan commented Dec 8, 2024

Hello,
I got this error when running train_opt.py. I only changed the model type to "Victuna-Tiny-1B" and kept the other parameters the same as in the README.

====================
=== Step 0 ===

dln-fwd: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8335/8335 [22:26<00:00, 6.19it/s, acc: 0.618]
[Iter 0/40] generating prompt

Meta prompt #0:
[START]A student is completing a task that requires producing a text output from a text input. The student receives an instruction that describes how to produce the output given each input. The student has made some errors. Your task is to improve the instruction such that the student can fix the errors.
This was the instruction.

Instruction: Classify the input text as positive or negative.

Student successes

Input: it 's almost impossible not to be swept away by the sheer beauty of his images .
Correct Output: positive

Input: on a 10-year delay
Correct Output: positive

Input: an elegant film with often surprising twists and an intermingling of naiveté and sophistication
Correct Output: positive

Input: is trying to dupe the viewer into taking it all as very important simply because the movie is ugly to look at and not a hollywood product
Correct Output: negative

Student errors

Input: a sick , twisted sort of way
Student Output: positive
Correct Ouput: negative

Improve the instruction to fix the student errors. Clarify the instruction by adding few words or a short sentence. Be concise
Improved Instruction: [APE][END]
Traceback (most recent call last):
  File "train_opt.py", line 294, in <module>
    main()
  File "train_opt.py", line 209, in main
    generated_instructs, used_demos = instruct_generator.iterative_generate(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 242, in iterative_generate
    _generated_instructs, _used_demos = self.generate_instruct_bwd(cur_instruct, num_demos, dataset, rng, evaluator, num_prompt=1, num_meta_prompt=num_meta_prompt, **kwargs)
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 280, in generate_instruct_bwd
    return self._generate_instruct_bwd(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 409, in _generate_instruct_bwd
    instructs = self.forward_generate_prompt(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 128, in forward_generate_prompt
    smp_output_ids = ensemble_generate(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/ensemble.py", line 84, in ensemble_generate
    output_ids = greedy_search(model, input_ids, attention_mask, eos_token_id, pad_token_id,
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/ensemble.py", line 150, in greedy_search
    raise NotImplementedError('Have not implemented batched attention mask.')
NotImplementedError: Have not implemented batched attention mask.

jyhong836 (Collaborator) commented

Hi,
I have not tried Victuna-Tiny-1B.
It is likely that the model's attention_mask is not None, although the code requires it to be.
Could you check how the attention_mask is created for the model you are using, and make sure it is None?
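
One way to check this, as a minimal sketch only (the tokenizer name and prompt below are placeholders, not the repository's actual configuration): build the inputs without an attention mask and confirm that None is what reaches greedy_search.

```python
# Hypothetical sketch, not the repository's code: confirm that the inputs
# passed down to ensemble_generate / greedy_search carry no attention_mask.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-model-name")  # placeholder model id

prompt = "Classify the input text as positive or negative."

# return_attention_mask=False stops the tokenizer from emitting a mask;
# a single, unpadded prompt does not need one.
enc = tokenizer(prompt, return_tensors="pt", return_attention_mask=False)

input_ids = enc["input_ids"]
attention_mask = enc.get("attention_mask")  # expected to be None here

assert attention_mask is None, "greedy_search expects attention_mask to be None"
```

If the mask turns out to be a tensor (for example, because padding is enabled for the model you swapped in), that would explain the NotImplementedError raised in source/utils/ensemble.py.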
