Some usage questions about the high-level API #11246

Closed
qingqing01 opened this issue Jun 6, 2018 · 6 comments

@qingqing01 (Contributor) commented Jun 6, 2018

Background: while upgrading some models in the models repository, I ran into several issues:

  1. How can test_program differ from train_program during training?

    • Question: given a definition like the following, how can ops be added only to test_program and not to train_program?
       loss = fluid.layers.ssd_loss(locs, confs, gt_box, gt_label, box, box_var)
       loss = fluid.layers.reduce_sum(loss)
       test_program = fluid.default_main_program().clone(for_test=True)
       with fluid.program_guard(test_program):
           map_eval = fluid.evaluator.DetectionMAP(...)
    • Problem in the current internal implementation: test_program is simply cloned from train_program, without for_test=True. Is that behavior correct at test time for networks containing BatchNorm or Dropout?
      self.test_program = self.train_program.clone()
  2. How can only part of the parameters be loaded for initialization?

    • Low-level API approach: some examples in the models repository use the following; a predicate function must be defined and passed to load_vars in order to load only a subset of the parameters.
    place = fluid.CUDAPlace(0)  # or fluid.CPUPlace()
    exe = fluid.Executor(place)
    exe.run(fluid.default_startup_program())
    def if_exist(var):
        return os.path.exists(os.path.join(pretrained_model_dir, var.name))
    fluid.io.load_vars(exe, pretrained_model_dir, predicate=if_exist)
  3. How is memory optimization done?

    • With the low-level API it is currently done as follows:
       # network definition omitted
       opts = optimizer.minimize(avg_cost)
       fluid.memory_optimize(fluid.default_main_program())  # optimize memory

    • Question: how is this done with the high-level API?
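The predicate pattern in item 2 is plain Python, so the filtering logic can be illustrated without the framework. A minimal, framework-free sketch (the variable names and directory layout are hypothetical stand-ins for what load_vars would see):

```python
import os
import tempfile

def make_if_exist(pretrained_model_dir):
    """Return a predicate that keeps only variables whose
    parameter file exists in the pretrained directory."""
    def if_exist(var_name):
        return os.path.exists(os.path.join(pretrained_model_dir, var_name))
    return if_exist

# Demo with a temporary directory standing in for a pretrained model dir.
with tempfile.TemporaryDirectory() as model_dir:
    # Pretend only "conv1_w" was saved during pretraining.
    open(os.path.join(model_dir, "conv1_w"), "w").close()
    if_exist = make_if_exist(model_dir)
    all_vars = ["conv1_w", "fc_w", "fc_b"]
    loadable = [v for v in all_vars if if_exist(v)]
    print(loadable)  # only the variable with a matching file
```

load_vars applies exactly this kind of predicate to each variable in the program, which is why partial loading currently requires the user to write it by hand.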
@dzhwinter (Contributor) commented:

I can answer the memory-optimization question: we will apply memory optimization to the program by default, so there seems to be no need to export a high-level API for it.

@BigFishMaster (Contributor) commented:

@dzhwinter enabling memory optimization causes a performance problem that has not been solved yet, so it is necessary to export an interface so that it can be set to False.

@BigFishMaster (Contributor) commented Jun 6, 2018

  1. How do we print training information every fixed number of iterations?
    The demo for the new API shows that some information is printed after fluid.EndStepEvent, but there are no further examples of flexible logging.

  2. The train_network() function in the demo takes no parameters, but many real use cases need parameters (such as class_number). This interface is inflexible.

def train_network():
    predict = inference_network()
    label = fluid.layers.data(name='label', shape=[1], dtype='int64')
    cost = fluid.layers.cross_entropy(input=predict, label=label)
    avg_cost = fluid.layers.mean(cost)
    accuracy = fluid.layers.accuracy(input=predict, label=label)
    return [avg_cost, accuracy]
  3. The introduction of event_handler (including the isinstance checks) is hard to understand, and users cannot tell which kinds of events exist; machine-learning terminology should be used instead.

  4. Suggestion: expose both the training loop and the test loop to the user instead of hiding them. That makes it easy for users to add debugging or visualization code later.

  5. Some of the code is too obscure to understand. Could the compatibility check below be replaced with something like fluid.cuda(), which is simple and could log whether it succeeded?

    if use_cuda and not fluid.core.is_compiled_with_cuda():
        return

In the Trainer usage below, some arguments go to the Trainer constructor while others go to the train() member function; could these be merged or simplified?

    trainer = fluid.Trainer(
        train_func=train_program, place=place, optimizer_func=optimizer_func)

    trainer.train(
        reader=train_reader,
        num_epochs=1,
        event_handler=event_handler,
        feed_order=['pixel', 'label'])
  6. The optimization target of the optimizer is never stated explicitly; it is only passed in via fluid.Trainer. The backend code silently picks the first element of the returned list; leaving that choice to the user would be more flexible. Also, if train_network() returns a single value it must be written as return [avg_cost] to avoid an error, and requiring the brackets feels very odd.
    trainer = fluid.Trainer(
        train_func=train_program, place=place, optimizer_func=optimizer_func)

  7. optimizer_func() also cannot be assumed to take no parameters.
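The isinstance-based dispatch criticized in item 3 can be sketched in a few lines. The event class name follows the fluid.EndStepEvent naming mentioned above; everything else (the fields, the simulated loop) is a hypothetical illustration of the pattern, not the real Trainer internals:

```python
class EndStepEvent:
    """Stand-in for a trainer event fired at the end of each step."""
    def __init__(self, epoch, step, metrics):
        self.epoch = epoch
        self.step = step
        self.metrics = metrics

logged = []

def event_handler(event):
    # The user must know every event type and test for it explicitly
    # with isinstance, which is the usability complaint raised here.
    if isinstance(event, EndStepEvent):
        if event.step % 100 == 0:  # log every 100 steps
            logged.append((event.epoch, event.step, event.metrics))

# Simulated trainer loop firing events at the handler.
for step in range(201):
    event_handler(EndStepEvent(epoch=0, step=step, metrics=[0.5]))

print(len(logged))  # steps 0, 100, 200 were logged
```

Because the loop itself is hidden inside the trainer, this one global callback is the only place where logging, debugging, or visualization can happen.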

@jacquesqiao (Member) commented:

"The train_network() function in the demo takes no parameters, but many real use cases need parameters (such as class_number). This interface is inflexible."

The function-parameter issue can be solved with Python's partial functions.
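The partial-function workaround can be sketched as follows. train_network and class_number here are illustrative stand-ins for the demo's functions, not the real fluid signatures:

```python
from functools import partial

def train_network(class_number, hidden_dim=128):
    # Stand-in for a network-building function that needs parameters.
    return {"classes": class_number, "hidden": hidden_dim}

# Bind the arguments up front; the result is a zero-argument callable,
# which is what a Trainer-style API that invokes train_func() expects.
train_func = partial(train_network, class_number=10)

config = train_func()
print(config["classes"])  # 10
```

The trainer never needs to know about class_number; the user bakes it in before handing the callable over.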

@lgone2000 (Contributor) commented:

Hiding the training iteration is not how frameworks usually work; I suggest exposing the main loop that feeds training data. This also makes it convenient to write debugging code. The current event-callback approach is very inflexible: all you can do is print logs in one global function, nothing else.
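An exposed main loop of the kind being requested might look like this framework-free sketch (all names are hypothetical); the point is simply that the user owns the loop and can insert debugging or visualization code at any step:

```python
def train(model_step, data_reader, num_epochs):
    """User-owned training loop: the feed loop is visible, so debugging
    or visualization code can go anywhere inside it."""
    history = []
    for epoch in range(num_epochs):
        for step, batch in enumerate(data_reader()):
            loss = model_step(batch)
            history.append(loss)
            # Debugging hook: the user can print, profile, or plot here.
    return history

# Toy stand-ins for a model step and a data reader.
def fake_step(batch):
    return sum(batch) / len(batch)

def fake_reader():
    yield [1.0, 2.0]
    yield [3.0, 4.0]

history = train(fake_step, fake_reader, num_epochs=2)
print(history)
```

Compared with the event-callback design, nothing here requires knowing a fixed set of event types: the loop body is ordinary code the user can edit.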

@shanyi15 (Collaborator) commented:

Hello, this issue has not been updated in the past month, so we will close it today for the sake of other users' experience. If you still need to follow up after it is closed, please feel free to reopen it and we will get back to you within 24 hours. We apologize for any inconvenience caused by the closure, and thank you for your support of PaddlePaddle!
