Using rdrop shows a tf.split error. Could someone please take a look? #112
Comments
OK, I'll take a look next week.
…On Thu, Sep 22, 2022, 8:35 PM Edward Chan ***@***.***> wrote:
code from https://github.com/EdwardChan5000/m3tl_run
The error is shown below:
2022-09-21 16:17:06.321 | INFO | m3tl.utils:set_phase:478 - Setting phase to infer
2022-09-21 16:17:06.345 | CRITICAL | m3tl.model_fn:compile:271 - Initial lr: 2e-05
2022-09-21 16:17:06.345 | CRITICAL | m3tl.model_fn:compile:272 - Train steps: 408675
2022-09-21 16:17:06.345 | CRITICAL | m3tl.model_fn:compile:273 - Warmup steps: 40867
2022-09-21 16:17:06.361554: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2022-09-21 16:17:06.361588: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2022-09-21 16:17:06.361613: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1365] Profiler found 1 GPUs
2022-09-21 16:17:06.369724: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcupti.so.11.0
2022-09-21 16:17:06.655982: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2022-09-21 16:17:06.656157: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
2022-09-21 16:17:07.017 | INFO | m3tl.utils:set_phase:478 - Setting phase to train
WARNING:tensorflow:The parameters output_attentions, output_hidden_states and use_cache cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: config=XConfig.from_pretrained('name', output_attentions=True)).
WARNING:tensorflow:The parameter return_dict cannot be set in graph mode and will always be set to True.
2022-09-21 16:17:19.175 | INFO | m3tl.utils:set_phase:478 - Setting phase to train
Traceback (most recent call last):
File "m3tl_4room_rdrop.py", line 195, in <module>
main(args)
File "m3tl_4room_rdrop.py", line 149, in main
create_tf_record_only=False, model_dir=model_dir, mirrored_strategy=mirrored_strategy)
File "/usr/local/lib/python3.6/site-packages/m3tl/run_bert_multitask.py", line 319, in train_bert_multitask
verbose=verbose
File "/usr/local/lib/python3.6/site-packages/m3tl/run_bert_multitask.py", line 163, in _train_bert_multitask_keras_model
validation_steps=validation_steps
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
tmp_logs = self.train_function(iterator)
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
result = self._call(*args, **kwds)
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 855, in _call
return self._stateless_fn(*args, **kwds) # pylint: disable=not-callable
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/eager/function.py", line 2943, in __call__
filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/eager/function.py", line 1919, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/eager/function.py", line 560, in call
ctx=ctx)
File "/usr/local/lib64/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Number of ways to split should evenly divide the split dimension, but got split_dim 0 (size = 7) and num_split 2
[[node BertMultiTask/BertMultiTaskTop/rdrop_preprocess/rdrop_preprocess/split_1 (defined at data/yard/workspace/vega/m3tl_run/custom_top.py:250) ]]
(1) Invalid argument: Number of ways to split should evenly divide the split dimension, but got split_dim 0 (size = 7) and num_split 2
[[node BertMultiTask/BertMultiTaskTop/rdrop_preprocess/rdrop_preprocess/split_1 (defined at data/yard/workspace/vega/m3tl_run/custom_top.py:250) ]]
[[BertMultiTask/BertMultiTaskTop/rdrop_preprocess/rdrop_preprocess/split_1/_44]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_423171]
Errors may have originated from an input operation.
Input Source operations connected to node BertMultiTask/BertMultiTaskTop/rdrop_preprocess/rdrop_preprocess/split_1:
BertMultiTask/basic_mtl/GatherNd (defined at usr/local/lib/python3.6/site-packages/m3tl/utils.py:412)
Input Source operations connected to node BertMultiTask/BertMultiTaskTop/rdrop_preprocess/rdrop_preprocess/split_1:
BertMultiTask/basic_mtl/GatherNd (defined at usr/local/lib/python3.6/site-packages/m3tl/utils.py:412)
Function call stack:
train_function -> train_function
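For anyone reading the traceback: the failing check is a divisibility rule. The split node asks for 2 equal parts along dimension 0, and the tensor it received had 7 rows. The snippet below is only a plain-Python sketch of that rule, not TensorFlow or m3tl code; `split_evenly` and `trim_to_multiple` are hypothetical helpers, and trimming the batch is an assumed workaround that may not suit rdrop if the rows are meant to stay paired.

```python
def split_evenly(rows, num_split):
    """Split rows into num_split equal chunks along the first dimension.

    Hypothetical helper: it mirrors the divisibility rule stated in the
    error message, not the actual TensorFlow or m3tl implementation.
    """
    if len(rows) % num_split != 0:
        raise ValueError(
            f"cannot split size {len(rows)} into {num_split} equal parts"
        )
    chunk = len(rows) // num_split
    return [rows[i * chunk:(i + 1) * chunk] for i in range(num_split)]


def trim_to_multiple(rows, num_split):
    """Drop trailing rows so the length is a multiple of num_split.

    Assumed workaround for illustration only.
    """
    usable = len(rows) - len(rows) % num_split
    return rows[:usable]


batch = list(range(7))  # size 7, as reported in the traceback

try:
    split_evenly(batch, 2)
    failed = False
except ValueError:
    failed = True  # 7 is not divisible by 2, so the split is rejected

# After trimming to 6 rows, the same split goes through.
halves = split_evenly(trim_to_multiple(batch, 2), 2)
```

The log lists `BertMultiTask/basic_mtl/GatherNd` (m3tl/utils.py:412) as the input to the split node, so the row count reaching the split may differ from the configured batch size. Whether that is what produced the odd size here is not confirmed by the log.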