Only 1,000 test samples during training, but the tester reports 1.24 million #448

Closed
linrongyi opened this issue Nov 12, 2016 · 4 comments

@linrongyi

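I split off 1,000 samples as a dev set, but during training the tester output reports more than 1 million samples. The test phase also stalled for a long time, so it presumably did evaluate that many samples.

My training command is: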

${TRAINER_BIN} \
	--job=train \
	--config=trainer_config.conf \
	--save_dir=output \
	--trainer_count=11 \
	--use_gpu=0 \
	--save_dir=./output.${TAG} \
	--dot_period=100 \
	--log_period=1000 \
	--test_period=10000 \
	--num_passes=15 \
	--init_model_path=./model.init \
	--load_missing_parameter_strategy=rand \
	--test_wait=1  --show_parameter_stats_period=1000

The log output:

I1112 16:08:26.652833  8142 TrainerInternal.cpp:182]  Pass=0 Batch=15091 samples=1931613 AvgCost=0.481067 Eval: err_rate_out1=0.0968464 
I1112 16:09:07.795814  8142 Tester.cpp:127]  Test samples=1248750 cost=0.488171 Eval: err_rate_out1=0.0605288 

I added debug output to the dataprovider and confirmed that data_provider loaded only about 1,000 samples for the test set:

0 insts loaded
0 insts loaded
[999] instances loaded from /home/aladdin/paddle_tasks/tagging/data/caipu.refined.test [log line printed when loading finishes]
100000 insts loaded
200000 insts loaded
300000 insts loaded
400000 insts loaded
500000 insts loaded
I1112 16:04:13.903460  8165 ThreadLocal.cpp:37] thread use undeterministic rand seed:8166
...600000 insts loaded
....700000 insts loaded

@backyes
Contributor

backyes commented Nov 13, 2016

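@linrongyi
Could you provide:

  • at least the data section of the model configuration
  • the test part of the dataprovider

We need to be able to trace each step; the information above is not complete enough.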

@linrongyi
Author

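I tried again later, and the cause is the argument --test_period=10000. After removing it, the behavior is normal.
However, according to the paddle_trainer help, this argument means "run a test every 10000 batches", so the observed behavior does not seem to match what is documented.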

@qingqing01
Contributor

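@linrongyi Add --test_all_data_in_one_period=true to the training command; with it, each test run evaluates the full test set. Without it, testing is driven by testPeriod; this behavior is currently being fixed in #411. Most users set --test_all_data_in_one_period=true.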
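
As a rough sanity check of the "driven by testPeriod" explanation, the numbers in the log can be reproduced with simple arithmetic. This is only a sketch, not Paddle code: the batch size of 128 is an assumption inferred from the training log (1931613 samples / 15091 batches ≈ 128), and the wrap-around behavior of the tester is a guess that happens to fit the reported count.

```shell
# Rough arithmetic check, not Paddle code. Assumed values are marked.
test_samples=999     # from the data provider debug output in this issue
batch_size=128       # ASSUMPTION: inferred from 1931613 samples / 15091 batches
test_period=10000    # from --test_period=10000 in the training command

# Batches needed for one sweep over the test set (ceiling division).
batches_per_sweep=$(( (test_samples + batch_size - 1) / batch_size ))

# ASSUMPTION: the tester consumes test_period batches per test run,
# wrapping around the test set as often as needed.
sweeps=$(( test_period / batches_per_sweep ))
total_samples=$(( sweeps * test_samples ))

echo "batches_per_sweep=${batches_per_sweep} sweeps=${sweeps} total_samples=${total_samples}"
# prints: batches_per_sweep=8 sweeps=1250 total_samples=1248750
```

Under these assumptions the result matches the `Test samples=1248750` line in the log, which suggests the 999 test samples were evaluated about 1,250 times in each test run.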

@linrongyi
Author

OK. Thanks, everyone.
