Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to assign a float weight to every training sample? #136

Closed
lqniunjunlper opened this issue Sep 28, 2016 · 5 comments
Closed

How to assign a float weight to every training sample? #136

lqniunjunlper opened this issue Sep 28, 2016 · 5 comments
Labels

Comments

@lqniunjunlper
Copy link

Is the following right?
input_types: add a dense_vector(1)
network config: add a data_layer weight=data_layer( , size=1)
cost layer: coeff=weight?

@lqniunjunlper
Copy link
Author

Or using scaling_layer?

@lqniunjunlper
Copy link
Author

hi @reyoung
After adding a weight layer, i got the following error:
So strange, input order is not right!

815e272f-5b5b-48b0-9fa6-1b1d414ac3bb
5435f8c2-29fa-45c5-9cc6-7ea1cb85232e
fb1a6f37-ae16-40e4-b622-4ba64c4f138e

@reyoung
Copy link
Collaborator

reyoung commented Sep 29, 2016

@NIULQfromNJU In dataprovider, we calculate the input order by reference order of data_layer in TrainerConfig. But user also can yield a dictionary in dataprovider in latest code. See http://www.paddlepaddle.org/doc_cn/ui/data_provider/pydataprovider2.html#id1 's 显式指定返回的数据对应关系.

Sample weight should be added to cost layer. It is ok to add ascaling_layer to cost. And we should add a weight parameter to cost layer later.

The sample config should be

feas = data_layer(name='feas', size=100)

hidden = fc_layer(input=feas, size=100)

predict = fc_layer(input=hidden, size=10, act=SoftmaxActivation())

cost = classification_cost(input=predict, label=data_layer(name='label', size=10))

cost_after_scale = scaling_layer(input=cost, weight=data_layer(name='weight', size=1))

outputs(cost_after_scale)

and the data provider could be

def process(settings, filename):
     with open(filename) as f:
           ...
           yield { 'feas': feas, 'label': label, 'weight': weight}

@lqniunjunlper
Copy link
Author

hi @reyoung
you mean that previous paddle (version 0.8.0b0) does not support this explicit way but the latest version?
Another question: which input_types can be used to represent the float weight? Is the following right?

initializer

input_types=[dense_vector(1)]

process

weight = []
weight.append(1.0)
yield weight

config

weight=data_layer(name='weight', size=1)

thx~

@reyoung
Copy link
Collaborator

reyoung commented Sep 29, 2016

@NIULQfromNJU It seems that the usage above should be ok. And explicit provide data is supported since the version 0.8.0b1 .

@reyoung reyoung closed this as completed Oct 10, 2016
zhhsplendid pushed a commit to zhhsplendid/Paddle that referenced this issue Sep 25, 2019
thisjiang pushed a commit to thisjiang/Paddle that referenced this issue Oct 28, 2021
* add primitive layer. add function primitive::add

* optimize the primitive::add function

* change the name and struct of add function

* change opfunction to pe.

* fix codestyle and debug

* fix codestyle

* add op module. test ci

* fix bug

* fix bug

* fix CMakelist
wangxicoding pushed a commit to wangxicoding/Paddle that referenced this issue Dec 9, 2021
zhoutianzi666 pushed a commit to zhoutianzi666/Paddle that referenced this issue May 23, 2022
update doc, tmp remove go and r demo
AnnaTrainingG pushed a commit to AnnaTrainingG/Paddle that referenced this issue Sep 19, 2022
Thunderbrook added a commit to Thunderbrook/Paddle that referenced this issue Oct 18, 2022
…addlePaddle#136)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>
qingshui pushed a commit to qingshui/Paddle that referenced this issue Nov 14, 2022
* Optimizing the zero key problem in the push phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Performance optimization, segment gradient merging

* Performance optimization, segment gradient merging

* Optimize pullsparse and increase keys aggregation

* sync gpugraph to gpugraph_v2 (#86)

* change load node and edge from local to cpu (#83)

* change load node and edge

* remove useless code

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* extract pull sparse as single stage(#85)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* [GPUGraph] graph sample v2 (#87)

* change load node and edge from local to cpu (#83)

* change load node and edge

* remove useless code

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* extract pull sparse as single stage(#85)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* support ssdsparsetable;test=develop (#81)

* graph sample v2

* remove log

Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>
Co-authored-by: danleifeng <52735331+danleifeng@users.noreply.github.com>

* Release cpu graph

* uniq nodeid (#89)

* compatible whole HBM mode (#91)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* Gpugraph v2 (#93)

* compatible whole HBM mode

* unify flag for graph emd storage mode and graph struct storage mode

* format

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* split generate batch into multi stage (#92)

* split generate batch into multi stage

* fix conflict

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* [GpuGraph] Uniq feature (#95)

* uniq feature

* uniq feature

* uniq feature

* [GpuGraph]  global startid (#98)

* uniq feature

* uniq feature

* uniq feature

* global startid

* load node edge seperately and release graph (#99)

* load node edge seperately and release graph

* load node edge seperately and release graph

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* v2 infer (#102)

* optimize begin pass and end pass (#106)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* fix ins no (#104)

* [GPUGraph] fix FillOneStep args (#107)

* fix ins no

* fix FillOnestep args

* fix bug for whole hbm mode (#110)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* [GPUGraph] fix infer && add infer_table_cap (#108)

* fix ins no

* fix FillOnestep args

* fix infer && add infer table cap

* fix infer

* 【PSCORE】perform ssd sparse table  (#111)

* perform ssd sparsetable;test=develop

Conflicts:
	paddle/fluid/framework/fleet/ps_gpu_wrapper.cc

* perform ssd sparsetable;test=develop

* remove debug code;

* remove debug code;

* add jemalloc cmake;test=develop

* fix wrapper;test=develop

* fix sample core (#114)

* [GpuGraph] optimize shuffle batch (#115)

* fix sample core

* optimize shuffle batch

* release gpu mem when sample end (#116)

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* fix class not found err (PaddlePaddle#118)

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* optimize sample (PaddlePaddle#117)

* optimize sample

* optimize sample

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* fix clear gpu mem (PaddlePaddle#119)

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* fix sample core (PaddlePaddle#121)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* add ssd cache (PaddlePaddle#123)

* add ssd cache;test=develop

* add ssd cache;test=develop

* add ssd cache;test=develop

* add multi epoch train & fix train table change ins & save infer embeding  (PaddlePaddle#129)

* add multi epoch train & fix train table change ins & save infer embedding

* change epoch finish judge

* change epoch finish change

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* Add debug log (PaddlePaddle#131)

* Add debug log

* Add debug log

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0008.yq01.baidu.com>

* optimize mem in  uniq slot feature (PaddlePaddle#130)

* [GpuGraph] cherry pick var slot feature && fix load multi path node (PaddlePaddle#136)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>

* [GpuGraph] fix kernel overflow (PaddlePaddle#138)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

* fix kernel overflow && add max feature num flag

Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>

* fix ssd cache;test=develop (PaddlePaddle#139)

* slot feature secondary storage (PaddlePaddle#140)

* slot feature secondary storage

* slot feature secondary storage

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0008.yq01.baidu.com>
Co-authored-by: xuewujiao <105861147+xuewujiao@users.noreply.github.com>
Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>
Co-authored-by: Thunderbrook <52529258+Thunderbrook@users.noreply.github.com>
Co-authored-by: danleifeng <52735331+danleifeng@users.noreply.github.com>
Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>
zmxdream pushed a commit to zmxdream/Paddle that referenced this issue Dec 7, 2022
* Optimizing the zero key problem in the push phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Optimize CUDA thread parallelism in MergeGrad phase

* Performance optimization, segment gradient merging

* Performance optimization, segment gradient merging

* Optimize pullsparse and increase keys aggregation

* sync gpugraph to gpugraph_v2 (PaddlePaddle#86)

* change load node and edge from local to cpu (PaddlePaddle#83)

* change load node and edge

* remove useless code

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* extract pull sparse as single stage(PaddlePaddle#85)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* [GPUGraph] graph sample v2 (PaddlePaddle#87)

* change load node and edge from local to cpu (PaddlePaddle#83)

* change load node and edge

* remove useless code

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* extract pull sparse as single stage(PaddlePaddle#85)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* support ssdsparsetable;test=develop (PaddlePaddle#81)

* graph sample v2

* remove log

Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>
Co-authored-by: danleifeng <52735331+danleifeng@users.noreply.github.com>

* Release cpu graph

* uniq nodeid (PaddlePaddle#89)

* compatible whole HBM mode (PaddlePaddle#91)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* Gpugraph v2 (PaddlePaddle#93)

* compatible whole HBM mode

* unify flag for graph emd storage mode and graph struct storage mode

* format

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* split generate batch into multi stage (PaddlePaddle#92)

* split generate batch into multi stage

* fix conflict

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* [GpuGraph] Uniq feature (PaddlePaddle#95)

* uniq feature

* uniq feature

* uniq feature

* [GpuGraph]  global startid (PaddlePaddle#98)

* uniq feature

* uniq feature

* uniq feature

* global startid

* load node edge seperately and release graph (PaddlePaddle#99)

* load node edge seperately and release graph

* load node edge seperately and release graph

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* v2 infer (PaddlePaddle#102)

* optimize begin pass and end pass (PaddlePaddle#106)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* fix ins no (PaddlePaddle#104)

* [GPUGraph] fix FillOneStep args (PaddlePaddle#107)

* fix ins no

* fix FillOnestep args

* fix bug for whole hbm mode (PaddlePaddle#110)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* [GPUGraph] fix infer && add infer_table_cap (PaddlePaddle#108)

* fix ins no

* fix FillOnestep args

* fix infer && add infer table cap

* fix infer

* 【PSCORE】perform ssd sparse table  (PaddlePaddle#111)

* perform ssd sparsetable;test=develop

Conflicts:
	paddle/fluid/framework/fleet/ps_gpu_wrapper.cc

* perform ssd sparsetable;test=develop

* remove debug code;

* remove debug code;

* add jemalloc cmake;test=develop

* fix wrapper;test=develop

* fix sample core (PaddlePaddle#114)

* [GpuGraph] optimize shuffle batch (PaddlePaddle#115)

* fix sample core

* optimize shuffle batch

* release gpu mem when sample end (PaddlePaddle#116)

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* fix class not found err (PaddlePaddle#118)

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* optimize sample (PaddlePaddle#117)

* optimize sample

* optimize sample

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* fix clear gpu mem (PaddlePaddle#119)

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* fix sample core (PaddlePaddle#121)

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

* add ssd cache (PaddlePaddle#123)

* add ssd cache;test=develop

* add ssd cache;test=develop

* add ssd cache;test=develop

* add multi epoch train & fix train table change ins & save infer embeding  (PaddlePaddle#129)

* add multi epoch train & fix train table change ins & save infer embedding

* change epoch finish judge

* change epoch finish change

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>

* Add debug log (PaddlePaddle#131)

* Add debug log

* Add debug log

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0008.yq01.baidu.com>

* optimize mem in  uniq slot feature (PaddlePaddle#130)

* [GpuGraph] cherry pick var slot feature && fix load multi path node (PaddlePaddle#136)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>

* [GpuGraph] fix kernel overflow (PaddlePaddle#138)

* optimize mem in  uniq slot feature

* cherry-pick var slot_feature

* fix kernel overflow && add max feature num flag

Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>

* fix ssd cache;test=develop (PaddlePaddle#139)

* slot feature secondary storage (PaddlePaddle#140)

* slot feature secondary storage

* slot feature secondary storage

Co-authored-by: yangjunchao <yangjunchao@baidu.com>

Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0008.yq01.baidu.com>
Co-authored-by: xuewujiao <105861147+xuewujiao@users.noreply.github.com>
Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com>
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com>
Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com>
Co-authored-by: yangjunchao <yangjunchao@baidu.com>
Co-authored-by: Thunderbrook <52529258+Thunderbrook@users.noreply.github.com>
Co-authored-by: danleifeng <52735331+danleifeng@users.noreply.github.com>
Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants