-
Notifications
You must be signed in to change notification settings - Fork 836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
adding multiple grpc connections for python server and executor client #3356
Conversation
Hi @mwm5945. Thanks for your PR. I'm waiting for a SeldonIO or todo member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the jenkins-x/lighthouse repository. |
executor/api/grpc/seldon/client.go
Outdated
s.RLock() | ||
randNum := rand.Intn(numConns) | ||
k := fmt.Sprintf("%s:%d", host, port) | ||
if conn, ok := s.conns[k]; ok { | ||
return conn, nil | ||
if nodeConns, ok := s.conns[k]; ok { | ||
if c := nodeConns[randNum]; c != nil { | ||
defer s.RUnlock() | ||
return c, nil | ||
} | ||
s.RUnlock() | ||
|
||
c, err := s.createNewConn(modelName, host, port) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
s.Lock() | ||
s.conns[k][randNum] = c | ||
s.Unlock() | ||
|
||
return s.conns[k][randNum], nil | ||
} else { | ||
opts := []grpc.DialOption{ | ||
grpc.WithInsecure(), | ||
} | ||
opts = append(opts, grpc2.AddClientInterceptors(s.Predictor, s.DeploymentName, modelName, s.annotations, s.Log)) | ||
conn, err := grpc.Dial(fmt.Sprintf("%s:%d", host, port), opts...) | ||
if err != nil { | ||
return nil, err | ||
} | ||
s.conns[k] = conn | ||
return conn, nil | ||
s.RUnlock() | ||
connList := make([]*grpc.ClientConn, numConns) | ||
|
||
c, err := s.createNewConn(modelName, host, port) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
connList[randNum] = c | ||
|
||
s.Lock() | ||
s.conns[k] = connList | ||
s.Unlock() | ||
|
||
return s.conns[k][randNum], nil | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've done this complicated stuff because it appears to better spread out the connections amongst the available processes on the server side. There's still no guarantee that each client will connect to a unique process though.
i think it'd also be a good idea to add a new arg to the python server to match the new |
/ok-to-test |
@mwm5945 Can you run "make fmt" from python folder to solve lint issue? |
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Looking great @mwm5945 ! It seems the Python tests are failing for GRPC (on the CI) - conscious that this woudl run the grpc service locally it may require some tweaks to work with the new implementation |
(The docs test failing is due to an issue which is now fixed in master so rebasing should address it) |
@axsaucedo do you know what version of python the
Appears this might be an issue with Conda or even an issue with a specific version of Python 3 |
(also the classic "works on my machine" lol) |
@mwm5945 Can you fix lint with |
756bf39
to
d625585
Compare
@cliveseldon running make fmt in python directory doesn't change anything. looks like something else:
|
/test notebooks |
Benchmark results - Testing Workers PerformanceResults table
|
Benchmark Results - Python Wrapper V1 vs V2
Python V1 Wrapper Results table
Python V2 MLServer Results table
|
Benchmark results - Testing Seldon V1 Data TypesResults for NDArray
Results for Tensor
Results for TFTensor
|
/test integration |
/test notebooks |
/test notebooks |
1 similar comment
/test notebooks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment as seems custom metrics test is failing
@@ -60,7 +63,7 @@ def start_servers( | |||
""" | |||
p2 = None | |||
if target2: | |||
p2 = mp.Process(target=target2, daemon=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems the custom metrics test is failing, it's not yet clear what may be causing this issue - trying to see whether it could be that the changes are somehow blocking the metrics port and the prometheus metrics are not being scraped, however I can't spot explicitly a reason why this could be the case. Not sure if you have any thouhts @mwm5945 ? I will try to run locally to validate |
/test notebooks |
/test integration |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: axsaucedo The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it: Support multiple GRPC connections on the Python Server to allow for pre-fork multi-processing, and multiplex requests over multiple GRPC Client connections in the executor.
Which issue(s) this PR fixes:
Fixes #3334
Special notes for your reviewer:
Does this PR introduce a user-facing change?: