Complete pytorch transformers interface, deprecate old GPT implementation #881
Conversation
HaokunLiu commented on Aug 8, 2019
- Replace the old GPT implementation with the one from huggingface pytorch-transformers
- Add GPT2, Transformer-XL, and XLM to the pytorch transformers interface (see the sketch after this list)
- Refactor the pytorch transformers interface a bit to reduce duplicated / outdated code and comments
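For context, the huggingface pytorch-transformers package exposes each of these models through a common `from_pretrained` API. The sketch below only illustrates that pattern; it is not jiant's actual interface code, and the `MODEL_CLASSES` mapping and `load_encoder` helper are hypothetical names.

```python
# Minimal sketch (not jiant's actual code): loading the models this PR adds
# through the huggingface pytorch-transformers API.
import torch
from pytorch_transformers import (
    GPT2Model, GPT2Tokenizer,
    TransfoXLModel, TransfoXLTokenizer,
    XLMModel, XLMTokenizer,
)

# Hypothetical mapping from a pretrained-model name to its library classes.
MODEL_CLASSES = {
    "gpt2": (GPT2Model, GPT2Tokenizer),
    "transfo-xl-wt103": (TransfoXLModel, TransfoXLTokenizer),
    "xlm-mlm-en-2048": (XLMModel, XLMTokenizer),
}

def load_encoder(model_name):
    """Load a pretrained encoder and its matching tokenizer."""
    model_cls, tokenizer_cls = MODEL_CLASSES[model_name]
    model = model_cls.from_pretrained(model_name)
    tokenizer = tokenizer_cls.from_pretrained(model_name)
    return model, tokenizer

model, tokenizer = load_encoder("gpt2")
ids = torch.tensor([tokenizer.encode("jiant is a sentence understanding toolkit")])
with torch.no_grad():
    hidden_states = model(ids)[0]  # (batch, seq_len, hidden_size)
```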
I was also thinking about including the tokenizer and indexer inside model_preprocessing_interface, and changing the main pipeline from create_tasks -> preprocess -> create model to create_tasks -> create model -> preprocess, so that the members of model_preprocessing_interface could be passed from the model instead of being built from args once in preprocess and again in the model. But that felt unnecessarily radical for the sake of #881. Since you are one of the major developers of jiant, maybe you can consider this idea and, when the time comes, figure out a well-rounded overall architecture for jiant. (A rough sketch of the reordering follows.)
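To make the proposed reordering concrete, here is a hedged sketch under stated assumptions: the function and class names (create_tasks, preprocess_tasks, Model, ModelPreprocessingInterface) are placeholders and do not correspond to jiant's real entry points, and the stubs only stand in for the actual task/model machinery.

```python
# Hedged sketch of the pipeline reordering (placeholder names, not jiant's
# real entry points). Stubs stand in for the actual task/model machinery.

class ModelPreprocessingInterface:
    """Holds the tokenizer/indexer-like state that preprocessing needs."""
    def __init__(self, tokenizer_name):
        self.tokenizer_name = tokenizer_name

class Model:
    def __init__(self, args):
        # In the proposed flow, the model owns the preprocessing interface.
        self.model_preprocessing_interface = ModelPreprocessingInterface(args["tokenizer"])

def create_tasks(args):
    return ["task_a", "task_b"]

def preprocess_tasks(tasks, interface):
    return [(task, interface.tokenizer_name) for task in tasks]

# Current order: create_tasks -> preprocess -> create model.
# The interface is built from args here, and then again inside the model.
def run_current(args):
    tasks = create_tasks(args)
    tasks = preprocess_tasks(tasks, ModelPreprocessingInterface(args["tokenizer"]))
    model = Model(args)
    return model, tasks

# Proposed order: create_tasks -> create model -> preprocess.
# Preprocessing reuses the interface the model already built.
def run_proposed(args):
    tasks = create_tasks(args)
    model = Model(args)
    tasks = preprocess_tasks(tasks, model.model_preprocessing_interface)
    return model, tasks

model, tasks = run_proposed({"tokenizer": "gpt2"})
```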
Some results are different, but considering how the old and new GPT implementations differ, I think this meets expectations.
Thanks! That's not that informative of a comparison, but since you're getting numbers in the same ballpark as what OpenAI published, I think that's enough. I agree that it's okay to leave in some awkward abstractions for now—better to get this out there and refactor later than to put too much burden on you for doing it. There are some new merge conflicts (we moved the config dir), BTW.
LGTM!
Ready to merge?
Yes, it’s ready.
Great—I'll make a proper release tomorrow unless someone beats me to it.
…881)
* Rename namespaces to suppress warnings.
* Revert "Rename namespaces to suppress warnings." This reverts commit 0cf7b23.
* Initial working-ish attempt.
* Intermediate check-in...
* More partial progress.
* Another pass...
* Fix sep/cls handling, cleanup.
* Further cleanup.
* Keyword name fix.
* Another flag fix.
* Pull debug print.
* Line length cleanup.
* WiC fix.
* Two task setup bugs.
* BoolQ typo
* Improved segment handling.
* Delete unused is_pair_task, other cleanup/fixes.
* Fix deleted path from merge.
* Fix cache path.
* relocate tasks from seminar
* add linguistic phenomena benchmark tasks
* Address (spurious?) tokenization warning.
* Select pool_type automatically to match model. h/t Haokun Liu
* Config updates.
* Path fix
* add two prefix method and simple LM
* Fix XLNet UNK handling.
* Internal temporary MNLI alternate.
* Revert "Internal temporary MNLI alternate." This reverts commit 455792a.
* refacor tags in data loader
* Add helper fn tests
* Finish merge
* Remove unused argument.
* update task init
* Possible ReCoRD bug fix
* Cleanup
* Fix merge issues.
* Revert "Remove unused argument." This reverts commit 96a7c37.
* Assorted responses to Alex's commenst.
* Further ReCoRD fix.
* @iftenney's comments.
* Fix/simplify segment logic.
* @W4ngatang's comments
* Cleanup.
* add forward functinos
* bugfix
* merge pytorch transformer
* update old process split
* add gpt2
* add get_pretrained_lm_head for transformers
* update filename
* add config
* debug
* update config
* allow evaluate with raw parameter
* debug
* Cleanup
* Fix issues with alternative embeddings_mode settings, max_layer.
* More mix cleanup.
* Masking fix.
* cleanup
* simplify get_seg_ids
* debug
* related adjustments to add pytorch transformers
* pytorch transformer refactor
* formatting
* formatting
* debug
* TransformerXL fix
* update test script
* formatting again
* add note to transfo-xl
* debug
* update test script
* update test script
* tokenized_name change
* cleanup
* pool type fix
* config update
* Update defaults.conf
* rename use_pytorch_transformer
* cleanup
* Update test_preprocess.py
* Update test_checkpointing.py
* Update test_write_preds.py
* clean up
* debug
* name changes
* name changes
* update message
* name changes
* tokenizer name fix
* docstring changes
* name changes
* restore asserts
* add pair embedding for pytorch_transformers
* add max position embedding assert
* deal with gpt-like boundary fn
* roberta tokenizer support
* roberta model support
* roberta embedder
* fix roberta seg_id
* change unused_task_name message
* more test cases for pytorch_tranformers_interface
* gpt-style mirrored pair forward func for similarity tasks
* Update environment.yml
* adjust import location
* black
* move import location
* update test script
* add comments to test script
* update test script
* pool type fix
* tokenizer fix
* debug
* special tokens fix
* roberta vocab fix
* roberta tokenizer fix
* clean up
* Update test_pytorch_transformers_interface.py
* add_special_token fix
* black
* fix roberta message logic
* fix embedding extend bug
* black
* clean up
* simplify add_special_token fix
* add assert for lm task & pytorch_transformers
* black
* relocate task_modulator initialization
* minor changes
* rename task_modulator -> model_preprocessing_interface
* change lm_parsing process_split docstring
* black
* add gpt2-large
* update dependency
* update dependency for real
* clean up
* add a forgotten similarity task for gpt
* update setup
* update setup