revisited the README and max token lengths #30
Conversation
shamanez commented Sep 17, 2023
- added `max_token_len` variables
- corrected the path
- verified the README commands for end-to-end training
```diff
     type=int,
     default=128,
     help=(
-        "The maximum total input sequence length after tokenization. Sequences longer than this will be truncated,"
+        "The maximum total passage sequence length after tokenization. Sequences longer than this will be truncated,"
         " sequences shorter will be padded if `--pad_to_max_length` is passed."
```
Is the parameter `--pad_to_max_length` defined? I can't see it in the file.
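If the flag really is missing, a minimal sketch of how it could be declared, assuming the script builds a standard `argparse` parser; the `store_true` action and help text are my assumptions, not code from this PR:

```python
import argparse

parser = argparse.ArgumentParser()
# Boolean flag: if passed, every sequence is padded to the maximum length;
# otherwise padding is left to the collator (dynamic padding).
parser.add_argument(
    "--pad_to_max_length",
    action="store_true",
    help="If passed, pad all samples to the maximum length instead of using dynamic padding.",
)
args = parser.parse_args()
```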
```python
sys.path.append(os.getcwd())  # This is needed to import modules with absolute paths

# ruff: noqa: E402
```
I would remove this empty line.
- There is a `--pad_to_max_length` parameter referenced in the argparser hints, but it is not defined.
- Should we also use `query_max_length` and `passage_max_length` parameters in this script, similar to the `train_rage2e.py` script?
yup
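For illustration, a sketch of what those two parameters might look like if they follow the same pattern as the argument in the diff above; it reuses the hypothetical `parser` from the earlier sketch, and the defaults and help strings are assumptions, since `train_rage2e.py` is not shown in this PR:

```python
# Hypothetical arguments mirroring the suggested query/passage split.
parser.add_argument(
    "--query_max_length",
    type=int,
    default=64,  # assumed default; queries are typically shorter than passages
    help="The maximum total query sequence length after tokenization. Longer sequences are truncated.",
)
parser.add_argument(
    "--passage_max_length",
    type=int,
    default=128,  # matches the default shown in the diff above
    help="The maximum total passage sequence length after tokenization. Longer sequences are truncated.",
)
```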