fix grpc flakiness in python unit tests #1816

Merged
merged 3 commits into from
May 11, 2020


RafalSkolasinski
Contributor

@RafalSkolasinski RafalSkolasinski commented May 11, 2020

Closes #1745.
The problem seems to be that the gRPC server and the stub start too close to each other in time.

In principle the wait_for_ready flag should solve this, but it is still experimental and does not seem to cover this specific case. It did not fix the issue for me, but it did help me reproduce the problem locally: I added wait_for_ready=True to stub.Predict and commented out

         for q in range(10):
             s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
             r1 = s1.connect_ex(("127.0.0.1", 5000))
             s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
             r2 = s2.connect_ex(("127.0.0.1", 6005))
             if r1 == 0 and r2 == 0:
                 break
             time.sleep(5)
         else:
             raise RuntimeError("Server did not bind to 127.0.0.1:5000")
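
For reference, the readiness check quoted above can be sketched as a standalone helper. This is only an illustration: the name `wait_for_ports` and its parameters are assumptions and are not part of this PR. Unlike the quoted loop, it closes each probe socket and reports which ports are still closed. Note that a successful TCP connect only shows that the port is bound, not that the gRPC service behind it is ready to answer calls.

```python
import socket
import time


def wait_for_ports(ports, host="127.0.0.1", attempts=10, delay=5.0):
    """Block until every port in `ports` accepts TCP connections.

    Hypothetical helper (not from the PR). Raises RuntimeError naming
    the ports that are still closed after the last attempt.
    """
    pending = list(ports)
    for attempt in range(attempts):
        still_closed = []
        for port in pending:
            # The context manager closes the probe socket after each check.
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                # connect_ex returns 0 on success, an errno value otherwise.
                if s.connect_ex((host, port)) != 0:
                    still_closed.append(port)
        pending = still_closed
        if not pending:
            return
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    raise RuntimeError(f"Server did not bind to {host} ports {pending}")
```

In a test like the one above this would be called as `wait_for_ports([5000, 6005])` before creating the stub.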

The retry_method helper that I introduced to fix the problem uses tenacity to call a given function up to n times, sleeping for an exponentially growing interval between attempts. If no attempt succeeds, it raises the last exception.
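
The behaviour described above can be sketched roughly as follows. This is not the code from the PR: the actual helper is built on tenacity, while this version uses only the standard library so that it runs without extra dependencies. The signature, the default values, and the injectable `sleep` parameter are all assumptions made for illustration.

```python
import time


def retry_method(fn, args=(), kwargs=None, attempts=3, base_delay=0.1,
                 sleep=time.sleep):
    """Call fn(*args, **kwargs) up to `attempts` times.

    Sleeps base_delay * 2**i seconds after the i-th failed attempt
    (exponential backoff). If every attempt fails, the exception from
    the last attempt is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    kwargs = kwargs or {}
    for attempt in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                # Out of attempts: propagate the last exception unchanged.
                raise
            sleep(base_delay * 2 ** attempt)
```

A test could then wrap the flaky call, for example `retry_method(stub.Predict, kwargs=dict(request=request))`, so that a stub created slightly before the server is ready gets a few more chances instead of failing immediately.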

@seldondev
Collaborator

Mon May 11 11:26:34 UTC 2020
The logs for [lint] [2] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/2.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=2

@seldondev
Collaborator

Mon May 11 11:26:35 UTC 2020
The logs for [pr-build] [1] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/1.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=1

@RafalSkolasinski
Contributor Author

/retest

@RafalSkolasinski
Contributor Author

/test pr-build

@RafalSkolasinski
Contributor Author

/test

@RafalSkolasinski
Contributor Author

/test ?

@RafalSkolasinski
Contributor Author

RafalSkolasinski commented May 11, 2020

/test all

edit... not exactly what I wanted...

@seldondev
Collaborator

Mon May 11 11:50:52 UTC 2020
The logs for [pr-build] [3] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/3.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=3

@seldondev
Collaborator

Mon May 11 11:52:14 UTC 2020
The logs for [lint] [6] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/6.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=6

@seldondev
Collaborator

Mon May 11 11:52:20 UTC 2020
The logs for [integration] [5] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/5.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=5

@seldondev
Collaborator

Mon May 11 11:52:33 UTC 2020
The logs for [notebooks] [4] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/4.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=4

@RafalSkolasinski
Contributor Author

/test this

@seldondev
Collaborator

Mon May 11 12:43:40 UTC 2020
The logs for [pr-build] [7] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/7.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=7

@RafalSkolasinski
Contributor Author

/test this

@seldondev
Collaborator

Mon May 11 13:03:49 UTC 2020
The logs for [pr-build] [8] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/8.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=8

@seldondev seldondev added size/S and removed size/XS labels May 11, 2020
@seldondev
Collaborator

Mon May 11 15:02:22 UTC 2020
The logs for [pr-build] [9] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/9.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=9

@seldondev
Collaborator

Mon May 11 15:02:26 UTC 2020
The logs for [lint] [10] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/10.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=10

@RafalSkolasinski RafalSkolasinski changed the title WIP: debug grpc test flakiness debug grpc test flakiness May 11, 2020
@RafalSkolasinski RafalSkolasinski changed the title debug grpc test flakiness fix grpc flakiness test in python unit tests May 11, 2020
@RafalSkolasinski RafalSkolasinski changed the title fix grpc flakiness test in python unit tests fix grpc flakiness in python unit tests May 11, 2020
@seldondev
Collaborator

Mon May 11 16:38:40 UTC 2020
The logs for [pr-build] [11] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/11.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=11

@seldondev
Collaborator

Mon May 11 16:40:14 UTC 2020
The logs for [lint] [12] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/12.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=12

@seldondev
Collaborator

Mon May 11 16:50:18 UTC 2020
The logs for [pr-build] [13] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/13.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=13

@seldondev
Collaborator

Mon May 11 16:52:41 UTC 2020
The logs for [lint] [14] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/14.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=14

@seldondev
Collaborator

Mon May 11 16:59:32 UTC 2020
The logs for [lint] [16] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/16.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=16

@seldondev
Collaborator

Mon May 11 16:59:33 UTC 2020
The logs for [pr-build] [15] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/15.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=15

@RafalSkolasinski
Contributor Author

Merging as discussed with @adriangonz
/approve

@seldondev
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RafalSkolasinski

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@seldondev
Collaborator

Mon May 11 20:31:36 UTC 2020
The logs for [lint] [18] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/18.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=18

@seldondev
Collaborator

Mon May 11 20:31:37 UTC 2020
The logs for [pr-build] [17] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/17.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=17

@seldondev
Collaborator

seldondev commented May 11, 2020

@RafalSkolasinski: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| notebooks | 9bf0bf3 | link | /test notebooks |

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@RafalSkolasinski
Contributor Author

/retest

@RafalSkolasinski
Contributor Author

Interesting... the Java Engine tests failed with

[ERROR] PROTOC FAILED: [libprotobuf WARNING ../../../../../src/google/protobuf/compiler/parser.cc:646] No syntax specified for the proto file: tensorflow/core/framework/types.proto. Please use 'syntax = "proto2";' or 'syntax = "proto3";' to specify a syntax version. (Defaulted to proto2 syntax.)

@seldondev
Collaborator

Mon May 11 20:38:36 UTC 2020
The logs for [pr-build] [19] will show after the pipeline context has finished.
https://github.com/SeldonIO/seldon-core/blob/gh-pages/jenkins-x/logs/SeldonIO/seldon-core/PR-1816/19.log

impatient try
jx get build logs SeldonIO/seldon-core/PR-1816 --build=19


Successfully merging this pull request may close these issues.

Investigate test_model_template_app_grpc_metrics flakiness