
Using go to implement parameter server side optimization #265

Open
typhoonzero opened this issue Aug 3, 2017 · 6 comments

typhoonzero (Collaborator) commented Aug 3, 2017

There are two ways to accomplish this:

  1. Port the Eigen library to Go, then do the optimization in Go.
    • This provides a general tensor library in Go that can be reused elsewhere.
    • Since the pserver barely uses the GPU, ops would not need to provide both CPU and GPU implementations.
  2. Port Paddle "operators" to Go, then do the optimization in Go.
    • Ops could be used in both the pserver and the trainers, for both remote and local optimization.
    • This is a higher-level implementation.
helinwang (Collaborator) commented Aug 3, 2017

Here are some initial thoughts:

> Provide a general tensor library in Go that can be used elsewhere.

If we want a tensor library in Go, this native library could be an option: https://github.com/gonum/gonum

For 2, the current implementation has the optimizer in C++; I think it would be quite simple to compile the C++ operator implementation into the optimizer.

If we want to call Eigen from Go directly (and frequently), we need to be careful about the cost of cgo: https://www.cockroachlabs.com/blog/the-cost-and-complexity-of-cgo/

I will take a closer look tomorrow and discuss it tomorrow night, US time.

typhoonzero (Collaborator, Author) commented Aug 3, 2017

It seems that https://golang.org/pkg/reflect/#MakeFunc is able to build multi-typed tensor operations.

Edit:

It seems MakeFunc cannot perform value arithmetic directly, because two reflect.Value instances cannot be added together; I will continue to try later.

typhoonzero (Collaborator, Author) commented:

> For 2, the current implementation has the optimizer in C++; I think it would be quite simple to compile the C++ operator implementation into the optimizer.

No. We would have to re-implement the optimizers as "Ops", which adds more complexity.

typhoonzero (Collaborator, Author) commented:

Recently I tried a few ways to port Eigen:

  1. Using reflect, which requires forced type conversions; this hurts performance quite badly.
  2. Using "text/template" with go generate to generate code for multiple types; the templates are unreadable to write.

On cgo performance:
Frequent cgo calls hurt performance badly, and for a computation library frequent calls are unavoidable. Unlike Python C extensions, cgo does not bring a performance gain; it actually lowers performance. So porting Eigen through a cgo wrapper cannot get performance close to native Eigen.

helinwang (Collaborator) commented Aug 9, 2017

@typhoonzero

> Using reflect requires forced type conversions; this hurts performance quite badly.
> Using "text/template" with go generate to generate code for multiple types; the templates are unreadable to write.

I am very curious how both of these were implemented; could you paste some sample code? (I feel this could become a highlight of https://github.com/PaddlePaddle/blog/issues/1)

> For a computation library, frequent calls are unavoidable.

I think it depends on the kind of computation library: a low-level one like Eigen may indeed be called very frequently. But if a layer (or even an Op) is implemented in C++ and called from Go, the cgo overhead may be negligible compared to the computation time?

typhoonzero (Collaborator, Author) commented:

> I am very curious how both of these were implemented; could you paste some sample code? (I feel this could become a highlight of PaddlePaddle/blog#1)

Sure. The implementation is half done; I will finish a demo case later and post it in the blog.

> I think it depends on the kind of computation library: a low-level one like Eigen may indeed be called very frequently. But if a layer (or even an Op) is implemented in C++ and called from Go, the cgo overhead may be negligible compared to the computation time?

Right. If Ops or layers are implemented in C++, most of the code still lives on the C++ side. That approach is similar to TensorFlow's, with Go only describing the network configuration.

dzhwinter self-assigned this Aug 11, 2017
3 participants