Error message:
W NPUStream.cpp:409] Warning: NPU warning, error code is 507046[Error]:
[Error]: In the specified timeout waiting event, all tasks in the specified stream are not completed.
Rectify the fault based on the error information in the ascend log.
EE1002: 2024-09-03-16:50:00.041.231 Stream synchronize timeout. rtDeviceSynchronize execute failed, reason=[stream sync timeout]
Possible Cause: 1. The timeout interval may be improperly set.
Solution: 1. Check whether the timeout interval is properly set. 2. Check whether the network is normal.
TraceBack (most recent call last):
wait for compute device to finish failed, runtime result = 507046.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
(function npuSynchronizeUsedDevices)
The training data volume is 1.2M, and training runs on 16 cards.
The following is already set:
export HCCL_EXEC_TIMEOUT=17340
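Since the log's first suggestion is to check the timeout setting, one quick sanity check is to confirm that the exported value is actually visible inside each training process. The snippet below is only a minimal sketch: the helper name `read_timeout_seconds` is hypothetical, treating non-positive or non-numeric values as invalid is an assumption of this sketch rather than documented behaviour, and `RANK` is assumed to be set by the launcher (as torchrun does).

```python
import os


def read_timeout_seconds(env=None, name="HCCL_EXEC_TIMEOUT"):
    """Return the variable as a positive int, or None if unset or malformed.

    Hypothetical helper for illustration; rejecting non-positive values
    is an assumption of this sketch, not a documented rule.
    """
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    # Accept plain ASCII digits only, so int() cannot raise.
    if not (raw.isascii() and raw.isdigit()):
        return None
    value = int(raw)
    return value if value > 0 else None


if __name__ == "__main__":
    # RANK is assumed to come from the launcher; "?" is printed if absent.
    rank = os.environ.get("RANK", "?")
    print(f"rank={rank} HCCL_EXEC_TIMEOUT={read_timeout_seconds()}")
```

Calling this at the top of the training script on all 16 cards would show whether any process started without the exported value, for example because the variable was set in a shell that the launcher does not inherit from.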