Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add healthcheck endpoint #177

Merged
merged 9 commits into from
Jul 2, 2020
Merged

Add healthcheck endpoint #177

merged 9 commits into from
Jul 2, 2020

Conversation

kmakiela
Copy link
Contributor

@kmakiela kmakiela commented Jun 22, 2020

  • Added healthcheck controller
  • Added tests
  • Created documentation for this endpoint
  • Updated Sparrow

@michalwski michalwski marked this pull request as draft June 24, 2020 08:29
@kmakiela kmakiela marked this pull request as ready for review June 25, 2020 11:04
@kmakiela kmakiela changed the title Add healtheck endpoint Add healthcheck endpoint Jun 25, 2020
@kmakiela kmakiela removed the wip label Jun 25, 2020
@janciesla8818 janciesla8818 requested a review from leszke June 26, 2020 08:54
Copy link
Contributor

@janciesla8818 janciesla8818 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, looks good to me.

Copy link
Contributor

@michalwski michalwski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing the health check. I have 2 points to discuss:

  1. Documentation of the /healthcheck. I think it's important to say about the endpoint and explain that it's not recommended to do it frequently due to extra work added to the worker processes.
  2. Is it possible, does it make sense to add spec of this endpoint to our OpenAPI spec?
  3. Open question. Is status 200 a good fit for a situation when all connections are dead? Software like load balancers decides if a service is a health based on the HTTP status code.

end

test "Successful connection to services" do
{_, _, body} = API.get("/healthcheck")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd worth to check the status of the response as well. If I understand it correctly this is 200.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3. Open question. Is status 200 a good fit for a situation when all connections are dead? Software like load balancers decides if a service is a health based on the HTTP status code.

I agree, I put 503 status response in that case. However, I'm not completely sure if this "service unavailable" would be about APNS and FCM or the MPush itself

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be used as a guideline: https://tools.ietf.org/id/draft-inadarei-api-health-check-01.html :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good reading. Thanks for that @rslota. @kmakiela could you take this document into account?

@kmakiela kmakiela force-pushed the km/healthcheck branch 2 times, most recently from 9577b3b to de603a3 Compare June 29, 2020 13:45
* Added healthcheck controller
* Added tests
* Created documentation for this endpoint
* Updated Sparrow dependency
Copy link
Contributor

@NelsonVides NelsonVides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an article check and some small questions 🙂

mix.exs Outdated Show resolved Hide resolved
test/support/api.ex Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
kmakiela and others added 2 commits June 30, 2020 14:02
Co-authored-by: Nelson Vides <nelson.vides@erlang-solutions.com>
README.md Outdated
"connection_status":
[
"connected",
"connected"
Copy link
Contributor

@aleklisi aleklisi Jul 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I get 100 workers does it mean that this list is 100 "connected" strings?
How about:

"connection_status": {
  "connected": 2,
  "disconnected": 0
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought we wanted to possibly give more information about the workers but since we stick with the connection status only I agree it's cleaner that way. However, the whole response is going to be restructured to match the draft RFC mentioned earlier

Copy link
Contributor

@michalwski michalwski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the response compatible with the proposed RFC and adding documentation. I had few additional, minor comments. Other than that it looks good.

case is_everything_disconnected do
true ->
503

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this empty line is redundant and make the code less readable.

Suggested change

Comment on lines 57 to 63
case pool_info[:connection_status][:connected] do
0 ->
true

_ ->
false
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just this:

Suggested change
case pool_info[:connection_status][:connected] do
0 ->
true
_ ->
false
end
not (pool_info[:connection_status][:connected] > 0)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or could it be even pool_info[:connection_status][:connected] == 0?

case status do
503 ->
"fail"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

case is_pool_disconnected?(pool_info) do
true ->
"fail"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@kmakiela
Copy link
Contributor Author

kmakiela commented Jul 1, 2020

Thanks for making the response compatible with the proposed RFC and adding documentation. I had few additional, minor comments. Other than that it looks good.

These additional lines are all added and required by mix.format.

@NelsonVides
Copy link
Contributor

Thanks for making the response compatible with the proposed RFC and adding documentation. I had few additional, minor comments. Other than that it looks good.

These additional lines are all added and required by mix.format.

Yeah, often mix.format forced me to do unexpected stuff 🤷‍♂️


connections =
stats
|> Enum.map(&extract_connection_info_from_pool/1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coding style remark is that it is considered a bad practise to use pipe only once. Please follow the style guide in future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, there were more pipes earlier that got removed


{_workers, connection_status} =
workers
|> Enum.map_reduce(%{connected: 0, disconnected: 0}, fn worker_info, acc ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here as well :)

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@NelsonVides NelsonVides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent 😄

@NelsonVides NelsonVides merged commit ee74ff3 into master Jul 2, 2020
@NelsonVides NelsonVides deleted the km/healthcheck branch July 2, 2020 09:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants