hmon: rework error handling in FFI#87
Conversation
License Check Results🚀 The license check job ran with the Bazel command: bazel run //:license-checkStatus: Click to expand output |
|
The created documentation from the pull request is available at: docu-html |
There was a problem hiding this comment.
Pull request overview
This PR reworks the Health Monitoring (HMON) Rust↔C++ FFI surface to use a uniform return-code-based error model, updates the C++ wrappers to construct objects via factory methods returning expected, and adds Rust unit tests covering the FFI functions (in support of avoiding panics/terminates across extern "C" boundaries).
Changes:
- Introduce a shared
FFICodeand convert Rust FFI functions to return it (with out-params for handles). - Update C++ wrappers/builders to use
::create()factory methods andexpected<..., Error>for fallible construction/build. - Add/expand Rust unit tests for FFI functions and adjust the C++ tests to the new API.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| src/health_monitoring_lib/rust/lib.rs | Documentation tweak for builder creation/build. |
| src/health_monitoring_lib/rust/ffi.rs | New FFICode/FFIHandle, reworked HM FFI API + extensive Rust tests. |
| src/health_monitoring_lib/rust/deadline/ffi.rs | Reworked deadline FFI API to FFICode + Rust tests. |
| src/health_monitoring_lib/rust/common.rs | Removes old common::ffi module in favor of crate::ffi. |
| src/health_monitoring_lib/cpp/include/score/hm/common.h | Introduces internal FFICode/kSuccess, updates Error codes & drop-fn signature. |
| src/health_monitoring_lib/cpp/include/score/hm/health_monitor.h | Builder now uses create() + expected and build() becomes fallible. |
| src/health_monitoring_lib/cpp/include/score/hm/deadline/deadline_monitor.h | Deadline builder now uses create() + expected. |
| src/health_monitoring_lib/cpp/health_monitor.cpp | Implements new FFI signatures and expected-based factory/build paths. |
| src/health_monitoring_lib/cpp/deadline_monitor.cpp | Implements new FFI signatures and expected-based factory paths. |
| src/health_monitoring_lib/cpp/tests/health_monitor_test.cpp | Updates test to use create().value() and fallible build().value(). |
| src/health_monitoring_lib/BUILD | Removes deleted cpp/ffi_helpers.h from sources. |
| src/health_monitoring_lib/cpp/ffi_helpers.h | Deleted legacy error mapping helper. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assert_eq!(deadline_monitor_get_deadline_result, FFICode::NotFound); | ||
|
|
||
| // Clean-up. | ||
| deadline_destroy(deadline_handle); |
There was a problem hiding this comment.
In deadline_monitor_get_deadline_unknown_deadline, deadline_handle is asserted to be null, but the cleanup still calls deadline_destroy(deadline_handle). This will always return FFICode::NullParameter and is misleading in a positive cleanup path. Skip destroying when the handle is null (or assert the returned code if you want to cover that path explicitly).
| deadline_destroy(deadline_handle); | |
| if !deadline_handle.is_null() { | |
| deadline_destroy(deadline_handle); | |
| } |
| let mut deadline_monitor_builder_placeholder = MaybeUninit::<DeadlineMonitorBuilder>::uninit(); | ||
| let mut deadline_monitor_builder_handle = deadline_monitor_builder_placeholder.as_mut_ptr() as FFIHandle; | ||
|
|
||
| let deadline_monitor_builder_create_result = | ||
| deadline_monitor_builder_create(&mut deadline_monitor_builder_handle as *mut FFIHandle); | ||
| assert!(!deadline_monitor_builder_handle.is_null()); | ||
| assert_eq!(deadline_monitor_builder_create_result, FFICode::Success); |
There was a problem hiding this comment.
The FFI tests initialize output handles from MaybeUninit::<T>::uninit().as_mut_ptr() (non-null) before calling *_create. This can hide failures where the FFI function forgets to write the output pointer. Initialize these handles to core::ptr::null_mut() and assert they become non-null only on FFICode::Success.
| let mut deadline_monitor_builder_placeholder = MaybeUninit::<DeadlineMonitorBuilder>::uninit(); | |
| let mut deadline_monitor_builder_handle = deadline_monitor_builder_placeholder.as_mut_ptr() as FFIHandle; | |
| let deadline_monitor_builder_create_result = | |
| deadline_monitor_builder_create(&mut deadline_monitor_builder_handle as *mut FFIHandle); | |
| assert!(!deadline_monitor_builder_handle.is_null()); | |
| assert_eq!(deadline_monitor_builder_create_result, FFICode::Success); | |
| let mut deadline_monitor_builder_handle: FFIHandle = core::ptr::null_mut(); | |
| let deadline_monitor_builder_create_result = | |
| deadline_monitor_builder_create(&mut deadline_monitor_builder_handle as *mut FFIHandle); | |
| assert_eq!(deadline_monitor_builder_create_result, FFICode::Success); | |
| assert!(!deadline_monitor_builder_handle.is_null()); |
| health_monitor_builder_handle.value(), supervisor_duration_ms, internal_duration_ms, &health_monitor_handle)}; | ||
| if (result != kSuccess) | ||
| { | ||
| return score::cpp::unexpected{static_cast<Error>(result)}; |
There was a problem hiding this comment.
score::cpp::unexpected{...} is the only use of brace-initialization for unexpected in this codebase; other sites use score::cpp::unexpected(...). If unexpected is a factory function (common with expected implementations), the braced form may not compile. Use the same call form as the other returns for consistency and to avoid potential build breaks.
| return score::cpp::unexpected{static_cast<Error>(result)}; | |
| return score::cpp::unexpected(static_cast<Error>(result)); |
| if let Some(deadline_monitor) = health_monitor.get_deadline_monitor(&monitor_tag) { | ||
| let deadline_monitor_ptr = Box::into_raw(Box::new(DeadlineMonitorCpp::new(deadline_monitor))); | ||
| unsafe { | ||
| *deadline_monitor_handle = deadline_monitor_ptr as FFIHandle; | ||
| } | ||
| FFICode::Success | ||
| } else { | ||
| core::ptr::null_mut() | ||
| unsafe { | ||
| *deadline_monitor_handle = core::ptr::null_mut(); | ||
| } | ||
| // TODO: should this be `NotFound` or `AlreadyTaken` depending on result? | ||
| FFICode::NotFound | ||
| } |
There was a problem hiding this comment.
health_monitor_get_deadline_monitor maps all None results to FFICode::NotFound, but HealthMonitor::get_deadline_monitor returns None both when the tag is missing and when the monitor exists but is already taken. This collapses two distinct states and makes the C++ error (Error::NotFound) misleading. Consider checking health_monitor.deadline_monitors directly to distinguish missing vs taken, and return FFICode::WrongState (or AlreadyExists) when the monitor was already obtained.
| let mut health_monitor = FFIBorrowed::new(unsafe { Box::from_raw(health_monitor_handle as *mut HealthMonitor) }); | ||
|
|
||
| health_monitor.start(); | ||
|
|
||
| // Safety: We ensure that the pointer is valid. We assume that pointer was created by call to `health_monitor_builder_build` | ||
| // and this must be assured on other side of FFI. | ||
| let mut monitor = FFIBorrowed::new(unsafe { Box::from_raw(handle as *mut HealthMonitor) }); | ||
| monitor.start(); | ||
| FFICode::Success |
There was a problem hiding this comment.
health_monitor_start calls HealthMonitor::start(), which contains multiple assert!/panic! paths (e.g., no monitors, monitors not taken, invalid state). Any of these will abort when invoked through FFI. Pre-check the required invariants in this wrapper (and return FFICode::WrongState/InvalidArgument) or use catch_unwind so FFI callers can observe an error code instead of process termination.
There was a problem hiding this comment.
True. but when i read https://doc.rust-lang.org/std/panic/fn.catch_unwind.html extern "c" shall be ok . we can still catchunwind and abort on cpp side.
| #[cfg(test)] | ||
| mod tests { | ||
| use crate::deadline::deadline_monitor::Deadline; | ||
| use crate::deadline::ffi::{ |
There was a problem hiding this comment.
This test module doesn't use the #[score_testing_macros::test_mod_with_log] attribute that other Rust test modules in this crate use. If the crate relies on that macro for consistent log initialization, add it above #[cfg(test)] here as well.
| use core::mem::MaybeUninit; | ||
|
|
||
| #[test] | ||
| fn deadline_monitor_builder_create_ok() { |
There was a problem hiding this comment.
New test names like deadline_monitor_builder_create_ok introduce an _ok suffix, which is inconsistent with other Rust tests in this crate that use descriptive snake_case without _ok. Consider renaming these tests to describe behavior (e.g., what is being validated) rather than outcome suffixes.
| fn deadline_monitor_builder_create_ok() { | |
| fn deadline_monitor_builder_create_succeeds() { |
| void HealthMonitor::start() | ||
| { | ||
| // TODO: how errors should be handled? | ||
| health_monitor_start(health_monitor_); |
There was a problem hiding this comment.
HealthMonitor::start() discards the FFICode returned by health_monitor_start. With the FFI now standardized on return codes, this makes it impossible for C++ callers to detect WrongState/InvalidArgument failures. Consider changing start() to return score::cpp::expected<void, Error> (or similar) and propagate the FFI return code.
| void HealthMonitor::start() | |
| { | |
| // TODO: how errors should be handled? | |
| health_monitor_start(health_monitor_); | |
| score::cpp::expected<void, Error> HealthMonitor::start() | |
| { | |
| auto result{health_monitor_start(health_monitor_)}; | |
| if (result != kSuccess) | |
| { | |
| return score::cpp::unexpected{static_cast<Error>(result)}; | |
| } | |
| return score::cpp::expected<void, Error>{}; |
| health_monitor_builder.with_internal_processing_cycle_internal(Duration::from_millis(internal_cycle_ms as u64)); | ||
| health_monitor_builder.with_supervisor_api_cycle_internal(Duration::from_millis(supervisor_cycle_ms as u64)); | ||
|
|
There was a problem hiding this comment.
health_monitor_builder_build calls HealthMonitorBuilder::build(), which assert!s that supervisor_api_cycle is a multiple of internal_processing_cycle. If the caller passes invalid values this will panic/abort in an extern "C" function. Validate the relationship here (and return FFICode::InvalidArgument/Failed) or wrap the call in catch_unwind to ensure no panics cross the FFI boundary.
| health_monitor_builder.with_internal_processing_cycle_internal(Duration::from_millis(internal_cycle_ms as u64)); | |
| health_monitor_builder.with_supervisor_api_cycle_internal(Duration::from_millis(supervisor_cycle_ms as u64)); | |
| // Validate arguments to avoid triggering assertions in `HealthMonitorBuilder::build()`. | |
| // Reject zero cycles and cases where the supervisor cycle is not a multiple of the internal cycle. | |
| if internal_cycle_ms == 0 | |
| || supervisor_cycle_ms == 0 | |
| || (supervisor_cycle_ms % internal_cycle_ms) != 0 | |
| { | |
| // `health_monitor_builder` will be dropped here, safely freeing the builder. | |
| return FFICode::InvalidArgument; | |
| } | |
| health_monitor_builder | |
| .with_internal_processing_cycle_internal(Duration::from_millis(internal_cycle_ms as u64)); | |
| health_monitor_builder | |
| .with_supervisor_api_cycle_internal(Duration::from_millis(supervisor_cycle_ms as u64)); |
There was a problem hiding this comment.
copilot has a right if we want to prevent cross ffi panic.
| let mut health_monitor_builder_placeholder = MaybeUninit::<HealthMonitorBuilder>::uninit(); | ||
| let mut health_monitor_builder_handle = health_monitor_builder_placeholder.as_mut_ptr() as FFIHandle; | ||
|
|
||
| let health_monitor_builder_create_result = | ||
| health_monitor_builder_create(&mut health_monitor_builder_handle as *mut FFIHandle); | ||
| assert!(!health_monitor_builder_handle.is_null()); | ||
| assert_eq!(health_monitor_builder_create_result, FFICode::Success); |
There was a problem hiding this comment.
The FFI tests initialize output handles with MaybeUninit::<T>::uninit().as_mut_ptr() (a non-null stack pointer) before calling *_create. This can mask bugs where the FFI function returns Success but forgets to write the output handle, since assert!(!handle.is_null()) would still pass. Prefer initializing handles to core::ptr::null_mut() and asserting they changed from null on success.
- Uniformly return `FFICode` in FFI functions. - Unit tests for FFI functions in Rust. - Rework comments for FFI functions. - Rework error handling in FFI layer. - Use factory method for object construction in C++. - To provide error handling. - Move `crate::common::ffi` to `crate::ffi`.
f78336e to
507d936
Compare
| assert!(!handle.is_null()); | ||
| assert!(!tag.is_null()); | ||
| assert!(!monitor.is_null()); | ||
| pub extern "C" fn health_monitor_builder_create(health_monitor_builder_handle: *mut FFIHandle) -> FFICode { |
| let health_monitor_builder = HealthMonitorBuilder::new(); | ||
| let health_monitor_builder_ptr = Box::into_raw(Box::new(health_monitor_builder)); | ||
| unsafe { | ||
| *health_monitor_builder_handle = health_monitor_builder_ptr as FFIHandle; |
There was a problem hiding this comment.
probably more places if possible.
| health_monitor_builder.with_internal_processing_cycle_internal(Duration::from_millis(internal_cycle_ms as u64)); | ||
| health_monitor_builder.with_supervisor_api_cycle_internal(Duration::from_millis(supervisor_cycle_ms as u64)); | ||
|
|
There was a problem hiding this comment.
copilot has a right if we want to prevent cross ffi panic.
| ) -> FFICode { | ||
| if health_monitor_builder_handle.is_null() { | ||
| return FFICode::NullParameter; | ||
| } |
There was a problem hiding this comment.
empty line separating ifs
| pub extern "C" fn health_monitor_builder_add_deadline_monitor( | ||
| health_monitor_builder_handle: FFIHandle, | ||
| monitor_tag: *const IdentTag, | ||
| deadline_monitor_builder_handle: FFIHandle, |
There was a problem hiding this comment.
ley call out params with _out suffix or something.
| *deadline_handle = core::ptr::null_mut(); | ||
| } | ||
| HM_OK | ||
| e |
| #[cfg(test)] | ||
| mod tests { | ||
| use crate::deadline::deadline_monitor::Deadline; | ||
| use crate::deadline::ffi::{ |
| { | ||
|
|
||
| HealthMonitorBuilder::HealthMonitorBuilder() | ||
| : health_monitor_builder_handle_{health_monitor_builder_create(), &health_monitor_builder_destroy} |
There was a problem hiding this comment.
so we shall keep constructors imho and simply assert on error in them as those are fully fatal errors and this is safety thing so aborting is a way to go imho.
| /// Builds and returns the HealthMonitor instance. | ||
| HealthMonitor build() &&; | ||
| /// Build a new `HealthMonitor` instance based on provided parameters. | ||
| score::cpp::expected<HealthMonitor, Error> build(); |
There was a problem hiding this comment.
this call shall either return Hm or abort as rust does. So for now there is no viable error path imho that user can really handle.
|
|
||
| enum class Error : internal::FFICode | ||
| { | ||
| NullParameter = 1, |
There was a problem hiding this comment.
add comment why from 1 -> or better do kSuccess + 1
FFICodein FFI functions.crate::common::ffitocrate::ffi.Closes #82