Training fails with PreconditionNotMetError: The predict data must less or equal 1. #23807

Closed
liangzhenduo opened this issue Apr 13, 2020 · 3 comments
@liangzhenduo

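To get your problem resolved quickly, please search for similar issues before opening a new one: [search issue keywords] [filter by labels] [official documentation]

If you do not find a similar issue, please provide the following details when opening one:

  • Title: a concise, accurate summary of the problem, e.g. "Insufficient Memory xxx"
  • Version and environment:
       1) PaddlePaddle version: 1.7.1
       2) CPU: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
       3) GPU: if a GPU is used for inference, provide the GPU model and the CUDA and cuDNN versions
       4) System environment: CentOS6, Python 3.6.4
  • Training information
       1) Single machine
  • Reproduction: for an error, provide the environment and the steps to reproduce
  • Problem description:

Error log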

/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/executor.py:782: UserWarning: The following exception is not an EOF exception.
"The following exception is not an EOF exception.")
Traceback (most recent call last):
File "train.py", line 717, in <module>
main(use_cuda)
File "train.py", line 711, in main
train(use_cuda)
File "train.py", line 686, in train
fetch_list=[avg_cost.name, auc_batch.name]
File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/executor.py", line 783, in run
six.reraise(*sys.exc_info())
File "/home/map/.jumbo/lib/python3.6/site-packages/six.py", line 703, in reraise
raise value
File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/executor.py", line 778, in run
use_program_cache=use_program_cache)
File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/executor.py", line 831, in _run_impl
use_program_cache=use_program_cache)
File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/executor.py", line 905, in _run_program
fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet:


C++ Call Stacks (More useful to developers):

0 std::string paddle::platform::GetTraceBackString<std::string const&>(std::string const&&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::string const&, char const*, int)
2 paddle::operators::AucKernel<paddle::platform::CPUPlace, float>::statAuc(paddle::framework::Tensor const*, paddle::framework::Tensor const*, int, int, long*, long*)
3 paddle::operators::AucKernel<paddle::platform::CPUPlace, float>::Compute(paddle::framework::ExecutionContext const&) const
4 ZNSt17_Function_handlerIFvRKN6paddle9framework16ExecutionContextEEZNKS1_24OpKernelRegistrarFunctorINS0_8platform8CPUPlaceELb0ELm0EINS0_9operators9AucKernelIS8_fEEEEclEPKcSE_iEUlS4_E_E9_M_invokeERKSt9_Any_dataS4
5 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::framework::RuntimeContext*) const
6 paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
7 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
8 paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool)
9 paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool, bool)


Python Call Stacks (More useful to users):

File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2525, in append_op
attrs=kwargs.get("attrs", None))
File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/map/.jumbo/lib/python3.6/site-packages/paddle/fluid/layers/metric_op.py", line 235, in auc
"StatNegOut": [batch_stat_neg]
File "train.py", line 587, in get_model
auc = layers.auc(input=predict, label=label, num_thresholds=2 ** 12)
File "train.py", line 646, in train
model_args = get_model()
File "train.py", line 711, in main
train(use_cuda)
File "train.py", line 717, in <module>
main(use_cuda)


Error Message Summary:

PreconditionNotMetError: The predict data must less or equal 1.
[Hint: Expected predict_data <= 1, but received predict_data:nan > 1:1.] at (/paddle/paddle/fluid/operators/metrics/auc_op.h:139)
[operator < auc > error]
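
The hint above can look contradictory, since it reports `nan > 1` for a value that is not a number. A minimal sketch of why the check fails, assuming only what the hint states (that the operator expects `predict_data <= 1`): every ordered comparison with NaN is false, so a NaN prediction can never satisfy `<= 1`. The function name `passes_auc_check` is hypothetical and is not part of Paddle.

```python
import math


def passes_auc_check(p):
    # Hypothetical stand-in for the condition quoted in the hint:
    # "Expected predict_data <= 1".
    return p <= 1


nan = float("nan")

print(passes_auc_check(0.946299))  # True: an ordinary probability
print(passes_auc_check(nan))       # False: NaN <= 1 is never true
print(nan > 1)                     # False: NaN > 1 is not true either
print(math.isnan(nan))             # True: the reliable way to detect NaN
```

So the message signals that the prediction tensor already contained NaN when it reached the auc operator, not that a probability slightly exceeded 1.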

Debug output from paddle.fluid.layers.Print

BATCH 77
1586773000 The content of input layer: The place is:CPUPlace
Tensor[fc_4.tmp_2]
shape: [1024,2,]
dtype: f
data: 0.946299,0.0537008,0.710008,0.289992,0.706162,0.293838,0.97014,0.0298601,0.944264,0.0557359,0.995318,0.00468204,0.997226,0.00277375,0.89308,0.10692,0.390375,0.609625,0.915708,0.0842917,
BATCH 78
1586773000 The content of input layer: The place is:CPUPlace
Tensor[fc_4.tmp_2]
shape: [1024,2,]
dtype: f
data: 0.993649,0.00635145,0.89751,0.10249,1,1.30941e-07,0.717667,0.282333,0.98908,0.0109196,0.987229,0.0127711,0.97767,0.0223299,0.638005,0.361995,0.926301,0.0736985,0.99981,0.000189584,
BATCH 79
1586773000 The content of input layer: The place is:CPUPlace
Tensor[fc_4.tmp_2]
shape: [1024,2,]
dtype: f
data: 0.819731,0.180269,0.372814,0.627186,0.972738,0.0272616,0.821131,0.178869,0.9855,0.0144998,0.992382,0.00761771,0.994886,0.00511405,0.894112,0.105888,0.881455,0.118545,0.928715,0.0712853,
BATCH 80
1586773001 The content of input layer: The place is:CPUPlace
Tensor[fc_4.tmp_2]
shape: [1024,2,]
dtype: f
data: 0.999675,0.000325095,0.994081,0.00591934,0.974014,0.0259864,0.935169,0.0648307,0.996004,0.00399586,0.999818,0.000181847,0.999429,0.000571389,0.939129,0.0608707,0.734311,0.265689,0.99998,2.02711e-05,
BATCH 81
1586773002 The content of input layer: The place is:CPUPlace
Tensor[fc_4.tmp_2]
shape: [1024,2,]
dtype: f
data: nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,
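
The printed tensors show valid softmax outputs through batch 80 and NaN everywhere from batch 81. When the values are available as plain Python floats (for example, fetched from the executor and flattened), a small helper can report the first batch where a prediction stops being a finite value in [0, 1]. This is a sketch under that assumption; `first_bad_batch` and the `(batch_id, values)` layout are hypothetical, and the sample numbers are copied from the output above.

```python
import math


def first_bad_batch(batches):
    """Return (batch_id, index) of the first invalid prediction, or None.

    `batches` is an iterable of (batch_id, flat list of floats).
    A value is invalid if it is NaN, infinite, or outside [0, 1].
    """
    for batch_id, values in batches:
        for i, v in enumerate(values):
            if not math.isfinite(v) or not (0.0 <= v <= 1.0):
                return batch_id, i
    return None


batches = [
    (80, [0.999675, 0.000325095, 0.994081, 0.00591934]),
    (81, [float("nan"), float("nan"), float("nan"), float("nan")]),
]

print(first_bad_batch(batches))  # (81, 0)
```

Knowing the first failing batch narrows the search to the input rows fed in that batch or the one just before it.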

@zhhsplendid
Member

The computation went wrong and produced NaN. Could you tell us the steps to reproduce?

zhhsplendid self-assigned this Apr 14, 2020
@liangzhenduo
Author

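@zhhsplendid The error appears after adding one feature column (value range [0.0, 1.0]); removing that column makes it go away.
I inspected this column in the training data and found no dirty data. I have contacted you on Hi.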

@liangzhenduo
Author

Duplicate of #3146
Resolved by tracking down and removing dirty data; see PaddlePaddle/models#3146 (comment).
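
Since the fix was to find dirty data in a feature that should lie in [0.0, 1.0], a scan of the raw training file along these lines may help. It is a sketch only: the tab-separated layout, the column index, and the name `find_dirty_rows` are assumptions, not the format used in this issue. Note that `float("nan")` and `float("inf")` parse without error in Python, so a literal `nan` in the file passes a naive "is it a number" check and has to be tested for explicitly.

```python
import io
import math


def find_dirty_rows(lines, column, sep="\t", low=0.0, high=1.0):
    """Return a list of (line_no, reason) for rows whose feature is invalid."""
    dirty = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(sep)
        if column >= len(fields):
            dirty.append((line_no, "missing column"))
            continue
        raw = fields[column]
        try:
            value = float(raw)
        except ValueError:
            dirty.append((line_no, "not a number: %r" % raw))
            continue
        if not math.isfinite(value):
            # Catches "nan", "inf", "-inf", which float() accepts silently.
            dirty.append((line_no, "non-finite: %r" % raw))
        elif not (low <= value <= high):
            dirty.append((line_no, "out of range: %r" % raw))
    return dirty


# Hypothetical sample: label in column 0, the new feature in column 1.
sample = io.StringIO("1\t0.25\n0\tnan\n1\t\n0\t1.5\n1\t0.9\n")
for line_no, reason in find_dirty_rows(sample, column=1):
    print(line_no, reason)
```

On the sample above, rows 2, 3, and 4 are reported (a literal `nan`, an empty field, and a value above 1), while rows 1 and 5 pass.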
