
Have not implemented batched attention mask. #5

Open
RolianTan opened this issue Dec 8, 2024 · 1 comment

Comments


RolianTan commented Dec 8, 2024

Hello,
I got this error when running train_opt.py. I only changed the model type to "Victuna-Tiny-1B" and kept the other parameters the same as in the README.

====================
=== Step 0 ===

dln-fwd: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8335/8335 [22:26<00:00, 6.19it/s, acc: 0.618]
[Iter 0/40] generating prompt

Meta prompt #0:
[START]A student is completing a task that requires producing a text output from a text input. The student receives an instruction that describes how to produce the output given each input. The student has made some errors. Your task is to improve the instruction such that the student can fix the errors.
This was the instruction.

Instruction: Classify the input text as positive or negative.

Student successes

Input: it 's almost impossible not to be swept away by the sheer beauty of his images .
Correct Output: positive

Input: on a 10-year delay
Correct Output: positive

Input: an elegant film with often surprising twists and an intermingling of naiveté and sophistication
Correct Output: positive

Input: is trying to dupe the viewer into taking it all as very important simply because the movie is ugly to look at and not a hollywood product
Correct Output: negative

Student errors

Input: a sick , twisted sort of way
Student Output: positive
Correct Ouput: negative

Improve the instruction to fix the student errors. Clarify the instruction by adding few words or a short sentence. Be concise
Improved Instruction: [APE][END]
Traceback (most recent call last):
  File "train_opt.py", line 294, in <module>
    main()
  File "train_opt.py", line 209, in main
    generated_instructs, used_demos = instruct_generator.iterative_generate(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 242, in iterative_generate
    _generated_instructs, _used_demos = self.generate_instruct_bwd(cur_instruct, num_demos, dataset, rng, evaluator, num_prompt=1, num_meta_prompt=num_meta_prompt, **kwargs)
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 280, in generate_instruct_bwd
    return self._generate_instruct_bwd(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 409, in _generate_instruct_bwd
    instructs = self.forward_generate_prompt(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/dln.py", line 128, in forward_generate_prompt
    smp_output_ids = ensemble_generate(
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/ensemble.py", line 84, in ensemble_generate
    output_ids = greedy_search(model, input_ids, attention_mask, eos_token_id, pad_token_id,
  File "/home/ubuntu/dpopt/DP-OPT-main/source/utils/ensemble.py", line 150, in greedy_search
    raise NotImplementedError('Have not implemented batched attention mask.')
NotImplementedError: Have not implemented batched attention mask.

jyhong836 (Collaborator) commented

Hi,
I have not tried Victuna-Tiny-1B.
It is likely that the model's attention_mask is not None, although the code requires it to be.
Could you check how the attention_mask is created for the model you are using, and make sure it is None?
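
One way to check this, as a minimal sketch only (the tokenizer name and prompt below are placeholders, not the repository's actual configuration): build the inputs without an attention mask and confirm that None is what reaches greedy_search.

```python
# Hypothetical sketch, not the repository's code: confirm that the inputs
# passed down to ensemble_generate / greedy_search carry no attention_mask.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("your-model-name")  # placeholder model id

prompt = "Classify the input text as positive or negative."

# return_attention_mask=False stops the tokenizer from emitting a mask;
# a single, unpadded prompt does not need one.
enc = tokenizer(prompt, return_tensors="pt", return_attention_mask=False)

input_ids = enc["input_ids"]
attention_mask = enc.get("attention_mask")  # expected to be None here

assert attention_mask is None, "greedy_search expects attention_mask to be None"
```

If the mask turns out to be a tensor (for example, because padding is enabled for the model you swapped in), that would explain the NotImplementedError raised in source/utils/ensemble.py.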
