-
Notifications
You must be signed in to change notification settings - Fork 7k
[core] Make RPC Chaos Configurations More Readable as JSON #58886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a great improvement, refactoring the testing_rpc_failure configuration from a custom string format to JSON. This significantly enhances readability, maintainability, and extensibility of the RPC chaos testing framework. The changes in both the Python test files and the C++ core logic are well-executed.
I have one suggestion regarding code duplication in the Python test files. A helper function, create_failure_json, has been introduced in multiple test files. Consolidating this into a shared test utility would further improve the codebase.
| def create_failure_json(method, num_failures, failure_str): | ||
| parts = failure_str.split(":") | ||
| return json.dumps( | ||
| { | ||
| method: { | ||
| "num_failures": num_failures, | ||
| "req_failure_prob": int(parts[0]), | ||
| "resp_failure_prob": int(parts[1]), | ||
| "in_flight_failure_prob": int(parts[2]), | ||
| } | ||
| } | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This helper function create_failure_json is very useful for creating the RPC failure configuration. I noticed it's also defined in test_object_manager_fault_tolerance.py and test_raylet_fault_tolerance.py. To avoid code duplication and improve maintainability, would you consider moving this function to a shared test utility module, such as ray/_private/test_utils.py? This would allow all tests to import and use a single implementation.
|
@dayshah Sorry for the delay. I am currently trying building Ray from source to test this modification. Would you mind waiting for a moment? |
9caf2d3 to
20fd244
Compare
|
Hi @dayshah, bazel test
pytest
|
lint failure |
|
Fixed lint failure, thanks! |
jjyao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG. Thanks for the contribution!
src/ray/rpc/rpc_chaos.cc
Outdated
| value.value("num_failures", 0L), | ||
| value.value("req_failure_prob", 0UL), | ||
| value.value("resp_failure_prob", 0UL), | ||
| value.value("in_flight_failure_prob", 0UL), | ||
| value.value("num_lower_bound_req_failures", 0UL), | ||
| value.value("num_lower_bound_resp_failures", 0UL), | ||
| value.value("num_lower_bound_in_flight_failures", 0UL), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also LOG(FATAL) if user specifies some value that's unknown: e.g. a typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in e555826, and passed the test
bazel test //src/ray/rpc/tests:rpc_chaos_test
INFO: Build completed successfully, 9 total actions
//src/ray/rpc/tests:rpc_chaos_test PASSED in 0.3s
Executed 1 out of 1 test: 1 test passes.
e555826 to
923994a
Compare
|
lint failures |
a7e1592 to
993b2a2
Compare
Signed-off-by: dancingactor <s990346@gmail.com>
- modified variable names Signed-off-by: dancingactor <s990346@gmail.com>
Signed-off-by: dancingactor <s990346@gmail.com>
Description
This PR refactors the
testing_rpc_failureconfiguration to use JSON instead of a custom delimited string format.As more options were being added to the RPC chaos testing framework, the old string format (e.g.,
method1=3:12:12:50) became difficult to read and maintain. By switching to JSON, the configuration is now self-describing, much more readable, and easier to extend in the future.Related issues
Closes #58686
Additional information
Before:
After:
{ "*": { "num_failures": -1, "req_failure_prob": 25, "resp_failure_prob": 50, "in_flight_failure_prob": 10, "num_lower_bound_req_failures": 2, "num_lower_bound_resp_failures": 3, "num_lower_bound_in_flight_failures": 1 } }