-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CLOSED] XLNet support and overhaul/cleanup of BERT support #845
Comments
Comment by pep8speaks Hello @sleepinyourhat! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
You can repair most issues by installing black and running: Comment last updated at 2019-08-07 21:30:58 UTC |
Comment by sleepinyourhat Removing WIP tag—I think this is ready for review. Anyone have a moment? I'm still running some larger-scale tests on GLUE/SuperGLUE to make sure that BERT performance doesn't change, and that XLNet gets semi-sane results. |
Comment by sleepinyourhat From the GLUE tasks, I can confirm that BERT models do just as well after the refactor, but XLNet performance is fairly low:
Old vs. new refers to master vs. this branch. |
Comment by HaokunLiu
I found there is some differences in jiant implementation, I don't know if that's the cause. |
Comment by sleepinyourhat Good catch! |
Comment by sleepinyourhat Okay, I think this is ready for a real review. Thanks to everyone who commented so far!
|
Comment by W4ngatang Looked through most files (only skimmed
|
Comment by sleepinyourhat
|
Comment by sleepinyourhat ReCoRD seems to be working now. Here are results from a short half-epoch run:
|
Comment by sleepinyourhat @pruksmhc Thanks! I'm testing everything I can, but do make sure to have a close look at CCG. I'm not fully set up to test that. |
Comment by pruksmhc CCG has tokenization as a off-line preprocessing step (which should honestly be changed). I would say just put in the documentation that CCG is not yet set up for XLNet, and leave CCG for another PR. |
Comment by pruksmhc The off-line preprocessing step should honestly be made as part of load_data too. |
Comment by sleepinyourhat @pruksmhc - Mind making an issue? |
Comment by sleepinyourhat This should be ready to go after one last test. @iftenney Any last comments before I merge? |
Comment by sleepinyourhat
|
Issue by sleepinyourhat
Wednesday Jul 17, 2019 at 19:47 GMT
Originally opened as nyu-mll/jiant#845
There's a lot going on here, and I'm still debugging. Suggestions for tests to add are very welcome!
I'm adding a few semi-related changes that are meant to help with clarity/maintainability:
I caught a bug along the way:
Note to self:
sleepinyourhat included the following code: https://github.com/nyu-mll/jiant/pull/845/commits
The text was updated successfully, but these errors were encountered: