Some usage questions about the high-level API #11246

Closed
qingqing01 opened this issue Jun 6, 2018 · 6 comments

@qingqing01 (Contributor) commented Jun 6, 2018

Background: while upgrading some models in the models repository, I ran into several issues:

  1. How can test_program differ from train_program during training?

    • Question: given a definition like the following, how can ops be added only to test_program and not to train_program?
       loss = fluid.layers.ssd_loss(locs, confs, gt_box, gt_label, box, box_var)
       loss = fluid.layers.reduce_sum(loss)
       test_program = fluid.default_main_program().clone(for_test=True)
       with fluid.program_guard(test_program):
           map_eval = fluid.evaluator.DetectionMAP(...)
    • Problem in the current internal implementation: test_program is simply cloned from train_program, without for_test=True. Is that behavior correct at test time for networks containing BatchNorm or Dropout?
      self.test_program = self.train_program.clone()
  2. How can only part of the parameters be loaded for initialization?

    • Low-level API approach: some examples in the models repository use the following; a predicate function must be defined and passed to load_vars in order to load only a subset of the parameters.
    place = fluid.CUDAPlace(0)  # or fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())
    def if_exist(var):
        return os.path.exists(os.path.join(pretrained_model_dir, var.name))
    fluid.io.load_vars(exe, pretrained_model_dir, predicate=if_exist)
  3. How is memory optimization done?

    • With the low-level API it is currently done as follows:
       # network definition omitted
       opts = optimizer.minimize(avg_cost)
       fluid.memory_optimize(fluid.default_main_program())  # optimize memory

    • Question: how is this done with the high-level API?
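The predicate pattern in item 2 is plain Python, so the filtering logic can be illustrated without the framework. A minimal, framework-free sketch (the variable names and directory layout are hypothetical stand-ins for what load_vars would see):

```python
import os
import tempfile

def make_if_exist(pretrained_model_dir):
    """Return a predicate that keeps only variables whose
    parameter file exists in the pretrained directory."""
    def if_exist(var_name):
        return os.path.exists(os.path.join(pretrained_model_dir, var_name))
    return if_exist

# Demo with a temporary directory standing in for a pretrained model dir.
with tempfile.TemporaryDirectory() as model_dir:
    # Pretend only "conv1_w" was saved during pretraining.
    open(os.path.join(model_dir, "conv1_w"), "w").close()
    if_exist = make_if_exist(model_dir)
    all_vars = ["conv1_w", "fc_w", "fc_b"]
    loadable = [v for v in all_vars if if_exist(v)]
    print(loadable)  # only the variable with a matching file
```

load_vars applies exactly this kind of predicate to each variable in the program, which is why partial loading currently requires the user to write it by hand.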
@dzhwinter (Contributor) commented:

I can answer the memory-optimization question: we will apply memory optimization to the program by default, so there seems to be no need to export a high-level API for it.

@BigFishMaster (Contributor) commented:

@dzhwinter enabling memory optimization causes a performance problem that has not been solved yet, so it is necessary to export an interface so that it can be set to False.

@BigFishMaster (Contributor) commented Jun 6, 2018

  1. How do we print training information every fixed number of iterations?
    The demo for the new API shows that some information is printed after fluid.EndStepEvent, but there are no further examples of flexible logging.

  2. The train_network() function in the demo takes no parameters, but many real use cases need parameters (such as class_number). This interface is inflexible.

def train_network():
    predict = inference_network()
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)
    accuracy = fluid.layers.accuracy(input=predict, label=label)
    return [avg_cost, accuracy]
  3. The introduction of event_handler (including the isinstance checks) is hard to understand, and users cannot tell which kinds of events exist; machine-learning terminology should be used instead.

  4. Suggestion: expose both the training loop and the test loop to the user instead of hiding them. That makes it easy for users to add debugging or visualization code later.

  5. Some of the code is too obscure to understand. Could the compatibility check below be replaced with something like fluid.cuda(), which is simple and could log whether it succeeded?

    if use_cuda and not fluid.core.is_compiled_with_cuda():
        return

In the Trainer usage below, some arguments go to the Trainer constructor while others go to the train() member function; could these be merged or simplified?

    trainer = fluid.Trainer(
        train_func=train_program, place=place, optimizer_func=optimizer_func)

    trainer.train(
        reader=train_reader,
        num_epochs=1,
        event_handler=event_handler,
        feed_order=['pixel', 'label'])
  6. The optimization target of the optimizer is never stated explicitly; it is only passed in via fluid.Trainer. The backend code silently picks the first element of the returned list; leaving that choice to the user would be more flexible. Also, if train_network() returns a single value it must be written as return [avg_cost] to avoid an error, and requiring the brackets feels very odd.
    trainer = fluid.Trainer(
        train_func=train_program, place=place, optimizer_func=optimizer_func)

  7. optimizer_func() also cannot be assumed to take no parameters.
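The isinstance-based dispatch criticized in item 3 can be sketched in a few lines. The event class name follows the fluid.EndStepEvent naming mentioned above; everything else (the fields, the simulated loop) is a hypothetical illustration of the pattern, not the real Trainer internals:

```python
class EndStepEvent:
    """Stand-in for a trainer event fired at the end of each step."""
    def __init__(self, epoch, step, metrics):
        self.epoch = epoch
        self.step = step
        self.metrics = metrics

logged = []

def event_handler(event):
    # The user must know every event type and test for it explicitly
    # with isinstance, which is the usability complaint raised here.
    if isinstance(event, EndStepEvent):
        if event.step % 100 == 0:  # log every 100 steps
            logged.append((event.epoch, event.step, event.metrics))

# Simulated trainer loop firing events at the handler.
for step in range(201):
    event_handler(EndStepEvent(epoch=0, step=step, metrics=[0.5]))

print(len(logged))  # steps 0, 100, 200 were logged
```

Because the loop itself is hidden inside the trainer, this one global callback is the only place where logging, debugging, or visualization can happen.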

@jacquesqiao (Member) commented:

"The train_network() function in the demo takes no parameters, but many real use cases need parameters (such as class_number). This interface is inflexible."

The function-parameter issue can be solved with Python's partial functions.
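The partial-function workaround can be sketched as follows. train_network and class_number here are illustrative stand-ins for the demo's functions, not the real fluid signatures:

```python
from functools import partial

def train_network(class_number, hidden_dim=128):
    # Stand-in for a network-building function that needs parameters.
    return {"classes": class_number, "hidden": hidden_dim}

# Bind the arguments up front; the result is a zero-argument callable,
# which is what a Trainer-style API that invokes train_func() expects.
train_func = partial(train_network, class_number=10)

config = train_func()
print(config["classes"])  # 10
```

The trainer never needs to know about class_number; the user bakes it in before handing the callable over.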

@lgone2000 (Contributor) commented:

Hiding the training iteration is not how frameworks usually work; I suggest exposing the main loop that feeds training data. This also makes it convenient to write debugging code. The current event-callback approach is very inflexible: all you can do is print logs in one global function, nothing else.
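An exposed main loop of the kind being requested might look like this framework-free sketch (all names are hypothetical); the point is simply that the user owns the loop and can insert debugging or visualization code at any step:

```python
def train(model_step, data_reader, num_epochs):
    """User-owned training loop: the feed loop is visible, so debugging
    or visualization code can go anywhere inside it."""
    history = []
    for epoch in range(num_epochs):
        for step, batch in enumerate(data_reader()):
            loss = model_step(batch)
            history.append(loss)
            # Debugging hook: the user can print, profile, or plot here.
    return history

# Toy stand-ins for a model step and a data reader.
def fake_step(batch):
    return sum(batch) / len(batch)

def fake_reader():
    yield [1.0, 2.0]
    yield [3.0, 4.0]

history = train(fake_step, fake_reader, num_epochs=2)
print(history)
```

Compared with the event-callback design, nothing here requires knowing a fixed set of event types: the loop body is ordinary code the user can edit.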

@shanyi15 (Collaborator) commented:

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up after it is closed, please feel free to reopen it and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!
