[Issue 4175] [pulsar-function-go] Add Go Function heartbeat (and gRPC service) for production usage #6031
Let's not merge this until I can get the Go tests to pass. I just wanted to get this upstream to share the fix with anyone else who might be working on this feature.
FYI, I ran the Go tests from master (without my changes), and they failed with the same results as the tests with the changes from this PR.
Thanks @devinbost. Can you provide the error output, or tell me how I can reproduce this problem?
@wolfstudy Thanks for taking a look. You should get this result:
ping @wolfstudy ?
I ran each of the Go tests individually. Here are the results:
So, most of them passed. It’s only context_test.go that failed. Regarding the TestContext failure, here's what I got from my breakpoint: Note that the inputTopics value at the bottom is nil.
It turns out that the issue in the I also added a new test file We probably still need to test the entire thing end-to-end to ensure that concurrency issues weren't introduced anywhere and that I'm starting the servicer in the correct location.
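To illustrate the kind of bug discussed here (a nil inputTopics value because a cached field was never populated), the following is a minimal sketch of deriving the topic list from the source configuration on demand. The `SourceSpec` and `FunctionContext` types below are hypothetical, simplified stand-ins and are not the types used in pulsar-function-go.

```go
package main

import (
	"fmt"
	"sort"
)

// SourceSpec is a hypothetical stand-in for the function's source
// configuration; the keys of InputSpecs are topic names.
type SourceSpec struct {
	InputSpecs map[string]string
}

// FunctionContext is a simplified, hypothetical context. It has no
// separately cached inputTopics field that could be left nil.
type FunctionContext struct {
	source *SourceSpec
}

// GetInputTopics derives the topic list from the source configuration
// each time it is called.
func (c *FunctionContext) GetInputTopics() []string {
	if c.source == nil {
		return nil
	}
	topics := make([]string, 0, len(c.source.InputSpecs))
	for topic := range c.source.InputSpecs {
		topics = append(topics, topic)
	}
	// Map iteration order is not deterministic, so sort for stable output.
	sort.Strings(topics)
	return topics
}

func main() {
	ctx := &FunctionContext{source: &SourceSpec{InputSpecs: map[string]string{
		"topic-b": "",
		"topic-a": "",
	}}}
	fmt.Println(ctx.GetInputTopics())
}
```

Reading from a single source of truth avoids the case where a test asserts on a field that nothing ever fills in.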
Thanks @devinbost, I agree with you. Maybe we can remove
Done.
Instrumenting Prometheus might be more work than anticipated. I ran into a blocking issue due to missing features in the Go client of Prometheus and got some pushback from one of their maintainers when I asked about getting those features.
So, we probably need to have the Prometheus wiring in a different PR due to its complexity and the larger changes that will be required.
The changes LGTM. I left just a few small comments; please address them.
  "math"
  "time"

  "github.com/apache/pulsar/pulsar-client-go/pulsar"
  log "github.com/apache/pulsar/pulsar-function-go/logutil"
- "github.com/apache/pulsar/pulsar-function-go/pb"
+ pb "github.com/apache/pulsar/pulsar-function-go/pb"
Suggested change:
- pb "github.com/apache/pulsar/pulsar-function-go/pb"
+ "github.com/apache/pulsar/pulsar-function-go/pb"
When I make this change, I can't get it to build. I get unresolved variables, even if I remove pb from the references in the code.
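One possible explanation for this build failure (an assumption; the thread does not confirm it) is that the generated files declare a package name that differs from the directory name `pb`. In Go, an unaliased import binds the name from the package clause, not the last element of the import path, so dropping the alias changes the identifier the rest of the code must use. The sketch below uses the standard `go/parser` package to read a package clause; the file name and the package name `api` are hypothetical.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// declaredPackageName returns the name in a Go source file's package clause.
func declaredPackageName(filename, src string) (string, error) {
	fset := token.NewFileSet()
	// PackageClauseOnly stops parsing after the "package X" line.
	f, err := parser.ParseFile(fset, filename, src, parser.PackageClauseOnly)
	if err != nil {
		return "", err
	}
	return f.Name.Name, nil
}

func main() {
	// Hypothetical generated file that lives in a directory named "pb"
	// but declares a different package name.
	src := "package api\n"
	name, err := declaredPackageName("pb/Function.pb.go", src)
	if err != nil {
		panic(err)
	}
	// With an unaliased import, code would refer to this name, not "pb".
	fmt.Println(name)
}
```

If that is the situation, keeping the explicit `pb` alias is the simplest way to keep existing `pb.` references compiling.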
@@ -24,7 +24,7 @@ import (
  "time"

  "github.com/apache/pulsar/pulsar-function-go/conf"
- "github.com/apache/pulsar/pulsar-function-go/pb"
+ pb "github.com/apache/pulsar/pulsar-function-go/pb"
Suggested change:
- pb "github.com/apache/pulsar/pulsar-function-go/pb"
+ "github.com/apache/pulsar/pulsar-function-go/pb"
Again, when I make this change, I can't get it to build.
"net"
"google.golang.org/grpc"
log "github.com/apache/pulsar/pulsar-function-go/logutil"
pb "github.com/apache/pulsar/pulsar-function-go/pb"
Suggested change:
- pb "github.com/apache/pulsar/pulsar-function-go/pb"
+ "github.com/apache/pulsar/pulsar-function-go/pb"
Again, when I make this change, I can't get it to build.
)

func testProcessSpawnerHealthCheckTimer(tkr *time.Ticker, lastHealthCheckTs int64, expectedHealthCheckInterval int32, counter *int ){
Suggested change:
- func testProcessSpawnerHealthCheckTimer(tkr *time.Ticker, lastHealthCheckTs int64, expectedHealthCheckInterval int32, counter *int ){
+ func testProcessSpawnerHealthCheckTimer(tkr *time.Ticker, lastHealthCheckTs int64, expectedHealthCheckInterval int32, counter int ){
This change breaks the test because the counter variable doesn't propagate as needed for the assertion.
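The reply above comes down to Go's pass-by-value semantics: an `int` parameter is a copy, so increments inside the callee are invisible to the caller, while a `*int` parameter lets the callee update the caller's variable. A minimal sketch, with function names invented for illustration:

```go
package main

import "fmt"

// incrementByValue receives a copy of the counter; the caller's
// variable is left unchanged.
func incrementByValue(counter int) {
	counter++
}

// incrementByPointer receives the address of the counter and updates
// the caller's variable through it.
func incrementByPointer(counter *int) {
	*counter++
}

func main() {
	byValue, byPointer := 0, 0
	incrementByValue(byValue)
	incrementByPointer(&byPointer)
	fmt.Println(byValue, byPointer) // prints "0 1"
}
```

A test that asserts on the counter after the helper returns therefore needs the pointer form, or a return value.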
}
}

func testStartScheduler(counter *int){
Suggested change:
- func testStartScheduler(counter *int){
+ func testStartScheduler(counter int){
Again, this change breaks the test because the counter variable doesn't propagate as needed for the assertion.
@devinbost I'm adding GitHub Actions CI for Go Functions to make sure that the test cases pass and the code style is good.
I made the changes that I could. Regarding everything that wasn't modified, please see my replies to the comments.
LGTM +1
retest this please
@devinbost can you rebase this pull request?
…omission of methods for gRPC server registration in generated gRPC files for Go. (apache#4175)
Generated updated gRPC files that contain service registration methods for creating gRPC service in Go. Also, upgraded proto version to 3. (apache#4175)
Fixed build errors by prefixing pulsar-function-go/pb with pb alias. (apache#4175).
Added instanceControlServicer.go as the servicer responsible for serving the gRPC service for the Go Function instances (apache#4175).
Rough draft right now. Added changes to show intent behind passing port value to Start in function.go. Also, added some code to support healthcheck and added methods to support instanceControlServicer. Just needed to commit changes to allow reproducible test errors. (apache#4175).
Updated function.go Start method to make it more clear where we need to provide a port value (apache#4175).
Added port and expectedHealthCheckInterval to use of function context. Updated all references. (apache#4175)
Added Apache license to gRPC-generated files in attempt to get license check test to pass (apache#4175).
Created instanceControlServicer_test.go to test gRPC server and validate that HealthCheck method returns true as expected (apache#4175).
Fixed bug in FunctionContext (and context_test.go) where the inputTopics field was being referenced when it wasn't getting populated. Updated GetInputTopics method to get input topics from the source location (apache#4175).
Fixed bug in FunctionContext (and context_test.go) where the inputTopics field was being referenced when it wasn't getting populated. Updated GetInputTopics method to get input topics from the source location. (Should have been part of previous commit.) Also, added expectedHealthCheckInterval to conf.yaml for testing. (apache#4175).
Fixed license formatting by running mvn license:format (apache#4175).
Added logic and tests to allow healthCheck to kill instances that aren't receiving their regular health checks. Still needs an end-to-end test involving FunctionManager to check for possible issues that could kill instances incorrectly (apache#4175).
Removed inputTopics field from FunctionContext (apache#4175).
Adding the progress I've made so far on migrating the Prometheus code to Go... currently blocked due to missing methods from the Go client. Waiting for information from the Prometheus maintainers to find a workaround. (apache#4175).
Fixed license check. (apache#4175)
Reverting the last two commits since they should go into a separate PR. (apache#4174).
Re-added test file that was accidentally deleted (apache#4175).
Added a few comments to make review easier (apache#4175).
Made minor (non-functional) changes as per PR review (apache#4175).
Fixed print statements (apache#4175).
Re-added comment after getting maven license formatting correct (apache#4175).
Force-pushed from 6df8690 to 6e2174d.
@sijie It's rebased and ready to go.
run java8 tests
run integration tests
run java8 tests
run integration tests
run cpp tests
run java8 tests
…t intermittent test failures to pass. (apache#6031)
…to get intermittent test failures to pass. (apache#6031)
…to get intermittent test failures to pass. Attempt 3. (apache#6031)
…to get intermittent test failures to pass. Attempt 4. (apache#6031)
…ittent test failures to pass. Attempt 5. (apache#6031)
…ittent test failures to pass. Attempt 6. (apache#6031)
@sijie Do we need all of the GitHub Actions tests to pass before this can be merged?
@merlimat @jerrypeng How do we get this PR merged? We've tried re-running the tests at least 12 times, and the intermittent test failures continue to prevent the tests from all passing.
@devinbost The actions are marked as required, so we can't merge it right now. I'm working on how to get around the GitHub Actions issues.
@sijie Thanks for the help. I really appreciate it.
… service) for production usage (apache#6031)

Partial fix required for apache#4175.
Addresses issue documented here: grpc/grpc-go#3310
Part of ongoing work to add the gRPC service for heartbeat functionality and related support for running Go functions.

### Motivation

Progress was blocked by the issue documented above.

### Modifications

1. Updated ./generate.sh script in pulsar-function-go module to use the grpc plugin.
2. Rebuilt Go gRPC files using the grpc plugin. This updated the proto version and was mostly an additive change that included adding the new registration methods required for creating the Go gRPC service.

### Verifying this change

- Unable to test changes because Go tests were already failing. Currently investigating.
- More testing will be done in subsequent commits because the Go gRPC functionality is not yet complete. (Should be low impact.)