Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement pserver RPC part, and simple parameter partition. #2243

Merged
merged 5 commits into from
May 31, 2017

Conversation

helinwang
Copy link
Contributor

@helinwang helinwang commented May 23, 2017

Easier to view using github split diff :)

Please use the following command to build, because there is some problem with the go cmake (the cmake checks out a new Paddle from github for Go dependency, so the changes in local filesystem are not reflected. Fix is in another PR: #2294).

cd Paddle/paddle/go/cclient
mkdir build && cd build
mkdir -p go/src/github.com/PaddlePaddle; ln -s $GOPATH/src/github.com/PaddlePaddle/Paddle go/src/github.com/PaddlePaddle/
cmake .. && make
go run ../../cmd/pserver/pserver.go -p 3000 # run pserver

open new teminal and run:

./test/main # run the example program we just build

type selector bool

func (s selector) Select() bool {
return bool(s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type switch for clearly?

Copy link
Contributor Author

@helinwang helinwang May 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this is clear because the underlying type for selector is bool, but I could be wrong.
I don't fully understand how to do type switch here, could you paste the code here?

type lister []pserver.Server

func (l lister) List() []pserver.Server {
return l
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should implement this interface bases on ectd? for example, list pserver names from etcd value for fault tolerant pserver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will write a module who implements this interface that talks to etcd.
However, this PR only includes a module with static pserver addresses. To be used for MPI cluster.
I will send a separate PR for the etcd module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get it

return nil
}

type result struct {
idx int
p Parameter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe name with "param" better here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done.

ticker := time.NewTicker(10 * time.Second)
monitor := func() {
curServers := make([]Server, pserverNum)
list := l.List()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to check the pserverNum == len(list) here to ensure that alive pserver instance equal the desired number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok if number of pservers still did not reach the desired number. We will only connect to pservers that are alive. For pservers not alive yet, we will not connect to them, and all calls to them will be blocked until connected. (See: https://github.com/PaddlePaddle/Paddle/pull/2243/files#diff-b1a8f38187aa603e0efc16791f282d87R61)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the explain. I got it. : )

Copy link
Contributor

@dzhwinter dzhwinter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@helinwang helinwang merged commit 2c1cffd into PaddlePaddle:develop May 31, 2017
@helinwang helinwang deleted the pserver_rpc branch May 31, 2017 02:07
heavengate pushed a commit to heavengate/Paddle that referenced this pull request Aug 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants