Finetune GPT2 model with TP=2 #8649
AnirudhVIyer started this conversation in General
Replies: 1 comment
-
Please try to convert a TP=1 …
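For reference, repartitioning a .nemo checkpoint from TP=1 to TP=2 is usually done with NeMo's checkpoint-partitioning script. Below is a minimal sketch only; it assumes the script path and flag names (megatron_change_num_partitions.py, --model_file, --target_file, and the (target_)tensor/pipeline parallel size flags) match your NeMo version, and the checkpoint filenames are hypothetical.

    # Minimal sketch: repartition a TP=1 .nemo checkpoint to TP=2 before finetuning.
    # Assumption: the NeMo repo ships examples/nlp/language_modeling/megatron_change_num_partitions.py
    # with the flags used below; verify the script and flag names in your NeMo version.
    import subprocess

    subprocess.run(
        [
            "python",
            "examples/nlp/language_modeling/megatron_change_num_partitions.py",
            "--model_file", "gpt_tp1.nemo",              # hypothetical input: TP=1 checkpoint
            "--target_file", "gpt_tp2.nemo",             # hypothetical output: TP=2 checkpoint
            "--tensor_model_parallel_size", "1",
            "--target_tensor_model_parallel_size", "2",
            "--pipeline_model_parallel_size", "1",
            "--target_pipeline_model_parallel_size", "1",
        ],
        check=True,
    )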
-
I am trying to finetune a GPT model with TP=2 and SP=True.
I have converted the TP=1 .nemo file to a TP=2 .nemo file. However, there is an issue when I run the finetuning script: an error occurs when I try to load the .nemo model.
File "/scratch/avi2011/nemo_proj/fine_tuning/megatron_gpt_finetuning.py", line 68, in main model = MegatronGPTSFTModel.restore_from(cfg.model.restore_from_path, model_cfg, trainer=trainer) File "/usr/local/lib/python3.10/dist-packages/nemo/collections/nlp/models/nlp_model.py", line 465, in restore_from return super().restore_from( File "/usr/local/lib/python3.10/dist-packages/nemo/core/classes/modelPT.py", line 442, in restore_from instance = cls._save_restore_connector.restore_from( File "/usr/local/lib/python3.10/dist-packages/nemo/collections/nlp/parts/nlp_overrides.py", line 700, in restore_from loaded_params = super().load_config_and_state_dict( File "/usr/local/lib/python3.10/dist-packages/nemo/core/connectors/save_restore_connector.py", line 169, in load_config_and_state_dict state_dict = self._load_state_dict_from_disk(model_weights, map_location=map_location) File "/usr/local/lib/python3.10/dist-packages/nemo/collections/nlp/parts/nlp_overrides.py", line 663, in _load_state_dict_from_disk raise ValueError(f'Expected {model_weights} to be a file or directory.') ValueError: Expected /state/partition1/job-44049844/tmp0m2inrp1/model_weights.ckpt to be a file or directory.
Any suggestions?
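One way to narrow this down (a diagnostic sketch, not from the thread): a .nemo file is a tar archive, so listing its members shows whether the TP=2 conversion actually produced per-rank weight folders (e.g. mp_rank_00/model_weights.ckpt and mp_rank_01/model_weights.ckpt) rather than a single flat model_weights.ckpt. The archive name below is hypothetical.

    # Diagnostic sketch: inspect the converted .nemo archive (a plain tar file).
    # Assumption: for TP>1, NeMo stores weights under per-rank folders such as
    # mp_rank_00/ and mp_rank_01/; the archive path below is hypothetical.
    import tarfile

    with tarfile.open("gpt_tp2.nemo", "r:*") as archive:
        for name in archive.getnames():
            print(name)

If the weights do sit under mp_rank_00/ and mp_rank_01/, the finetuning run also has to be launched with tensor_model_parallel_size=2 and two GPUs so that each rank resolves its own model_weights.ckpt path; a mismatch between the checkpoint layout and the launch configuration can surface as exactly this "Expected … model_weights.ckpt to be a file or directory" error.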