Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Fix HealthManager #3819

Merged
merged 9 commits into from
Apr 27, 2022
Merged

Conversation

thinker0
Copy link
Member

@thinker0 thinker0 commented Apr 18, 2022

  • Fix legacy of HealthManager
  • Fix Sfl4jBridge (Unsupported)
  • Add healthcheck API for heron-tracker, heron-ui
  • Fix heron-tracker, heron-ui

Fix HealthManager
@thinker0 thinker0 force-pushed the feature/fix-healthmanager branch from 94d4d01 to b6133fe Compare April 18, 2022 01:39
@nicknezis
Copy link
Contributor

What was the error encountered that required the change from float to string in the timeline object? I ask because there were various issues with the intermingling of the types. I was hoping we could keep it consistent, but recognize there might be use cases I didn't account for with the prior testing. So I'm interested to better understand what this PR fixes, and also does the timeline in the UI still work? If so, then I think I'm ok with this PR.

@surahman
Copy link
Member

surahman commented Apr 18, 2022

I ask because there were various issues with the intermingling of the types. I was hoping we could keep it consistent, but recognize there might be use cases I didn't account for with the prior testing.

I approved because Python typing will, rather annoyingly, allow you to achieve what follows. I agree with the need for consistency.

float_Val: float = 9.19
print(float_Val, type(float_Val))

str_Val: str = float_Val
print(str_Val, type(str_Val))

str_Val = 919
print(str_Val, type(str_Val))

"""
9.19 <class 'float'>
9.19 <class 'float'>
919 <class 'int'>
"""

@thinker0
Copy link
Member Author

thinker0 commented Apr 19, 2022

return MetricsTimeline(
starttime = 40,
endtime = 360,
component = "a",
timeline = {
"c": {
"b": {
40: "1.0",
100: "1.0",
160: "1.0",
220: "1.0",
280: "1.0",
340: "1.0"
}
}
},
)

In the internal UnitTest, it seems to be a String.

@nicknezis
Copy link
Contributor

nicknezis commented Apr 19, 2022

Should the unit test be updated to be a float instead of string? I did not catch that.

When I started working on the last PR, it was originally an int which I think was incorrect. str might be valid option. I think I tried to make it work, but we had enough other issues in the PR. I trust your opinion on str vs float.

We should make sure we are setting the type consistently. Should this line of code also be updated to be a str?

Edit: Also, does this other line, which was unchanged in the prior PR, help the choice to use str? It seems the code was converting to float which wouldn't make sense with my use of float.

@thinker0
Copy link
Member Author

humm. Conflit is occurring in several places ㅠ.ㅠ

@thinker0 thinker0 closed this Apr 19, 2022
@thinker0 thinker0 reopened this Apr 20, 2022
@thinker0
Copy link
Member Author

I fixed it on my CentOS 7.

@thinker0
Copy link
Member Author

%  ./docker/scripts/test-unittest.sh centos7 0.20.5
================================================================================
INFO: Elapsed time: 1889.214s, Critical Path: 207.25s
INFO: 3826 processes: 1175 internal, 2651 local.
INFO: Build completed successfully, 3826 total actions
Test cases: finished with 1189 passing and 0 failing out of 1189 test cases

Executed 247 out of 247 tests: 247 tests pass.
INFO: Build completed successfully, 3826 total actions
Cleaning up scratch dir

@thinker0
Copy link
Member Author

% ./docker/scripts/test-unittest.sh rocky8 0.20.5
================================================================================
INFO: Elapsed time: 1835.725s, Critical Path: 246.64s
INFO: 3826 processes: 1175 internal, 2651 local.
INFO: Build completed successfully, 3826 total actions
Test cases: finished with 1189 passing and 0 failing out of 1189 test cases

Executed 247 out of 247 tests: 247 tests pass.
INFO: Build completed successfully, 3826 total actions
Cleaning up scratch dir

@thinker0
Copy link
Member Author

Should the unit test be updated to be a float instead of string? I did not catch that.

Nothing

We should make sure we are setting the type consistently. Should this line of code also be updated to be a str?

I don't think you need to change this part.
When 0.20.5 centos binary was applied to both tracker and ui in production, it worked in the UI regardless of this part.

@surahman
Copy link
Member

There was a failing test for killing topologies but I have seen that fail intermittently before. I did restart the CI pipeline.

Is the downgrade of the dev-toolset related to an unsupported library causing issues?

I do not see any issues with using the C++1y flag instead of C++14 because the other OSs have their version of GCC and G++ kept up to date. This means the flag will indicate compiling against the final C++14 standard on those OSs. It will, however, compile using the draft standard on Centos7 (GCC 4.8.x) but given the build completes successfully I believe this should be okay.

I will request @nicknezis to have a look at the tracker change because he has spent a lot of time in it recently.

@surahman surahman requested a review from nicknezis April 20, 2022 16:33
@thinker0 thinker0 marked this pull request as draft April 24, 2022 00:38
@thinker0
Copy link
Member Author

thinker0 commented Apr 24, 2022

Fixed some functions.

Received response from Tmanager: Unknown component: __stmgr__
Received response from Tmanager: Metrics not available for component 'xxxxxx'

@thinker0 thinker0 marked this pull request as ready for review April 25, 2022 05:31
@thinker0
Copy link
Member Author

There was a failing test for killing topologies but I have seen that fail intermittently before. I did restart the CI pipeline.

Is the downgrade of the dev-toolset related to an unsupported library causing issues?

I do not see any issues with using the C++1y flag instead of C++14 because the other OSs have their version of GCC and G++ kept up to date. This means the flag will indicate compiling against the final C++14 standard on those OSs. It will, however, compile using the draft standard on Centos7 (GCC 4.8.x) but given the build completes successfully I believe this should be okay.

I will request @nicknezis to have a look at the tracker change because he has spent a lot of time in it recently.

This was judged to be a cause other than the cause of gcc, so it was restored to its original state.

@thinker0 thinker0 marked this pull request as draft April 25, 2022 10:33
@thinker0 thinker0 marked this pull request as ready for review April 27, 2022 12:08
@surahman surahman merged commit 3fdf1f8 into apache:master Apr 27, 2022
@thinker0 thinker0 deleted the feature/fix-healthmanager branch April 27, 2022 23:25
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants