
Readme unclear about training only extension vs training whole model #11

Open
rubmz opened this issue Oct 29, 2021 · 0 comments

Comments

rubmz commented Oct 29, 2021

Hi :-)

Really love the work you did here, and I'm looking forward to giving it a try myself! However, I found two points in the documentation that are unclear:

  1. Does training the "extension only" produce a workable model, or do I need to train a "whole" exBERT model? It's confusing: in the paper you explain that training a whole model is a very lengthy operation, and it would not make much sense to provide a CLI for training exBERT from scratch (we already have BERT for that). So what does "only extension" vs. "whole" exBERT training mean? I tried to decipher this from the paper and the readme, and I am still unsure which one I should go with.
  2. The inputs to the CLI include a path_to_state_dict_of_the_OFF_THE_SHELF_MODEL variable, yet the BERT model I want to extend does not ship such a file: https://huggingface.co/onlplab/alephbert Is this input file mandatory? Is it expected that all BERT models on Hugging Face provide such a state file? (If the file can be generated locally, see my sketch below.)
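
In case it helps clarify what I mean, here is a minimal sketch of how I would expect such a state-dict file to be produced from a Hugging Face checkpoint. This assumes the CLI just wants a PyTorch state dict saved with torch.save; the repo id is taken from the link above and the output filename is my own guess, so neither is confirmed by the readme:

```python
# Minimal sketch: export a standalone state-dict file from a Hugging Face
# checkpoint. Assumes the exBERT CLI expects a torch.save'd state dict;
# the repo id and the output filename are my assumptions, not from the readme.
import torch
from transformers import AutoModel

# Download the off-the-shelf model (adjust the repo id if needed).
model = AutoModel.from_pretrained("onlplab/alephbert")

# Save only the weights; this file would then be passed as
# path_to_state_dict_of_the_OFF_THE_SHELF_MODEL.
torch.save(model.state_dict(), "alephbert_state_dict.pt")
```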