fix(rpc): Improve input validation and error handling #13069
base: master
In my opinion this may affect performance. Maybe put it behind a feature flag (via CMake)?
bef194d to 604f0a0
I believe it would be interesting to see what the performance impact of this change is. I'm new to the project, so pointers are welcome if there's a test suite available that would show that.

Slightly off-topic but related: I think there are plenty of opportunities for similar improvements in the RPC server, from invalid tensor operations to crashing via deep recursion in

I think putting multiple critical fixes behind a feature flag would be counterintuitive. I'd rather build bench tooling (if needed) and iterate on the fixes so there's a minimal performance hit.
We use
The best investment of effort in this direction would be creating a script/job for coverage-guided fuzzing. This way we can automatically test for security issues when we make RPC changes and even integrate it into the CI.
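A coverage-guided fuzzing job of the kind suggested here could start from a libFuzzer harness along these lines. This is only a minimal sketch: `handle_message` and its toy wire format (a 32-bit item count followed by 8-byte items) are hypothetical stand-ins, not the actual `rpc-server` code. The only real interface used is libFuzzer's `LLVMFuzzerTestOneInput` entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a server-side message handler. It must reject
// malformed input by returning false, never by aborting the process.
static bool handle_message(const uint8_t * data, size_t size) {
    uint32_t n_items = 0;
    if (size < sizeof(n_items)) {
        return false; // too short to hold the item count
    }
    std::memcpy(&n_items, data, sizeof(n_items));
    const size_t payload = size - sizeof(n_items);
    // Each item is 8 bytes in this toy format. Dividing the payload size
    // avoids multiplying an attacker-controlled count.
    return n_items <= payload / 8;
}

// libFuzzer entry point: called repeatedly with mutated inputs. Any crash,
// abort, or sanitizer report inside handle_message is recorded as a finding.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size) {
    (void) handle_message(data, size);
    return 0;
}
```

With Clang, a harness like this is typically built with `-fsanitize=fuzzer,address`, which links the fuzzing driver and enables AddressSanitizer.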
llama.cpp/ggml/src/ggml-rpc/ggml-rpc.cpp Line 1470 in 604f0a0
Sounds good, I can look into that after this PR 👍
The `rpc-server` was vulnerable to Denial of Service attacks via several RPC commands (`SET_TENSOR`, `GRAPH_COMPUTE`, etc.). Malformed messages could trigger failed assertions (e.g., invalid `ggml_type`) or out-of-bounds reads/writes leading to `GGML_ABORT` calls, crashing the server process.

This PR introduces robust input validation and replaces `abort()` calls with graceful error handling:

- **Type Validation:** `deserialize_tensor` now checks if the `tensor->type` is within the valid `GGML_TYPE_COUNT` range *before* calling `ggml_new_tensor_4d`. Returns `nullptr` on invalid type.
- **Bounds Checks:** Replaced `GGML_ABORT` in `set_tensor`, `set_tensor_hash`, and `get_tensor` handlers with error logging and returning `false` when data/offset parameters are out of buffer bounds.
- **Size Checks:** Added safe arithmetic checks (for overflow) in `graph_compute` when calculating required message sizes based on client-provided `n_nodes` and `n_tensors`. Returns early if the reported sizes conflict with the actual message size or would lead to overflow.
- **Error Propagation:**
  - `create_node` now checks for `nullptr` return values from `deserialize_tensor` and its recursive calls, propagating `nullptr` upwards on failure. Uses `find` instead of `at` for safer map access.
  - `copy_tensor` now checks for `nullptr` from `deserialize_tensor` and sets the response status to failure if deserialization or bounds checks fail.
  - `graph_compute` now checks for `nullptr` return from `create_node` and returns failure status correctly. The final return value now reflects the actual computation status.

These changes improve the RPC server's resilience against malformed client requests, preventing crashes and ensuring errors are handled more gracefully.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
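The type and bounds checks described in this commit message can be illustrated roughly as follows. This is a sketch under stated assumptions, not the code from the PR: `EXAMPLE_TYPE_COUNT` stands in for `GGML_TYPE_COUNT` with an arbitrary value, and `type_is_valid`, `range_in_buffer`, and `check_write` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for GGML_TYPE_COUNT; the value is arbitrary.
constexpr uint32_t EXAMPLE_TYPE_COUNT = 40;

// Reject type ids outside [0, EXAMPLE_TYPE_COUNT) before they reach any
// code that would assert on them.
static bool type_is_valid(uint32_t type) {
    return type < EXAMPLE_TYPE_COUNT;
}

// True if [offset, offset + size) lies inside a buffer of buf_size bytes.
// The sum offset + size is never computed, so the check cannot wrap around.
static bool range_in_buffer(uint64_t offset, uint64_t size, uint64_t buf_size) {
    return offset <= buf_size && size <= buf_size - offset;
}

// Handler-style wrapper: log and return false instead of aborting.
static bool check_write(uint32_t type, uint64_t offset, uint64_t size, uint64_t buf_size) {
    if (!type_is_valid(type)) {
        std::fprintf(stderr, "invalid tensor type %u\n", (unsigned) type);
        return false;
    }
    if (!range_in_buffer(offset, size, buf_size)) {
        std::fprintf(stderr, "write of %llu bytes at offset %llu is out of bounds\n",
                     (unsigned long long) size, (unsigned long long) offset);
        return false;
    }
    return true;
}
```

Comparing `size` against `buf_size - offset` rather than comparing `offset + size` against `buf_size` is what keeps the check safe when a client sends values near the top of the 64-bit range.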
e6dd976 to 359e38e
Removed comments and unnecessary returns

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
`rpc_server::create_node` could previously return `nullptr` if the input ID was 0 (valid) or if an internal error (deserialization, recursion failure) occurred (invalid). This ambiguity made error handling difficult for the caller (`graph_compute`).

This commit clarifies the meaning of `nullptr`:

- `graph_compute` now checks if the input `id` was non-zero when `create_node` returns `nullptr`, correctly identifying failures versus intentional null links.
- `create_node` avoids recursive calls for zero IDs and propagates `nullptr` unambiguously on failure during recursion.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
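The caller-side disambiguation can be sketched like this. `Node`, `lookup_node`, and `resolve_link` are hypothetical names used only for illustration, and the sketch assumes that 0 is never used as a key in the map, so an ID of 0 always means "no tensor".

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a deserialized tensor.
struct Node {
    uint64_t id;
};

// Returns nullptr when the id is not in the map. find() is used because it
// reports a missing key through end(), where at() would throw.
static Node * lookup_node(uint64_t id, std::unordered_map<uint64_t, Node> & nodes) {
    auto it = nodes.find(id);
    return it == nodes.end() ? nullptr : &it->second;
}

// Caller-side check: a nullptr result is an error only if the requested id
// was non-zero. An id of 0 is an intentional null link.
static bool resolve_link(uint64_t id, std::unordered_map<uint64_t, Node> & nodes, Node *& out) {
    out = lookup_node(id, nodes);
    return !(out == nullptr && id != 0);
}
```

Keeping the `id != 0` test in the caller means the lookup function needs only one failure value, while the caller still distinguishes "nothing to link" from "link could not be resolved".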
The caller (`graph_compute`) already checks `id != 0` when handling a `nullptr` return from `create_node`, correctly distinguishing intentional null links from actual errors. This makes the initial `if (id == 0)` check redundant.

Also removes the log message, added earlier in this branch, that was emitted when a tensor ID is not found in the provided map.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
Check the return value of `server.get_alloc_size` in the RPC server loop. If the call fails, return early to close the connection.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
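The pattern of stopping the server loop on a failed handler can be sketched as below. Everything here is a simplified assumption: `AllocSizeRequest`, its `valid` flag, and `serve_connection` are hypothetical, and the real loop reads commands from a socket rather than from a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical request. `valid` models whether the tensor carried by the
// request could be deserialized.
struct AllocSizeRequest {
    uint64_t n_bytes;
    bool     valid;
};

// Handler that reports failure through its return value.
static bool get_alloc_size(const AllocSizeRequest & req, uint64_t & out_size) {
    if (!req.valid) {
        return false;
    }
    out_size = req.n_bytes;
    return true;
}

// Serves requests in order and returns how many were answered before the
// loop stopped. Leaving the loop models closing the connection.
static size_t serve_connection(const std::vector<AllocSizeRequest> & requests) {
    size_t answered = 0;
    for (const AllocSizeRequest & req : requests) {
        uint64_t size = 0;
        if (!get_alloc_size(req, size)) {
            break; // handler failed: stop serving this client
        }
        ++answered; // a real server would send `size` back here
    }
    return answered;
}
```

In this sketch, requests queued after a failed one are never processed, which matches the idea of closing the connection on the first malformed message.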
Removes detailed, step-by-step size calculations and overflow checks in favor of simpler direct comparisons, assuming 64-bit overflow is unlikely.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
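A direct size comparison of the kind this commit describes might look like the sketch below. The message layout (a 32-bit `n_nodes`, then `n_nodes` 64-bit IDs, then a 32-bit `n_tensors`, then `n_tensors` fixed-size records) and the record size of 64 bytes are assumptions made for illustration. `graph_message_size_ok` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical size of one serialized tensor record.
constexpr uint64_t EXAMPLE_TENSOR_RECORD_SIZE = 64;

// Assumed layout:
// | n_nodes (u32) | n_nodes * u64 ids | n_tensors (u32) | n_tensors * records |
static bool graph_message_size_ok(const std::vector<uint8_t> & input) {
    uint64_t required = sizeof(uint32_t);
    if (input.size() < required) {
        return false;
    }
    uint32_t n_nodes = 0;
    std::memcpy(&n_nodes, input.data(), sizeof(n_nodes));

    // A 32-bit count times a small constant fits comfortably in 64 bits,
    // so a plain comparison is enough here.
    required += (uint64_t) n_nodes * sizeof(uint64_t) + sizeof(uint32_t);
    if (input.size() < required) {
        return false;
    }
    uint32_t n_tensors = 0;
    const size_t n_tensors_offset = (size_t) (required - sizeof(uint32_t));
    std::memcpy(&n_tensors, input.data() + n_tensors_offset, sizeof(n_tensors));

    required += (uint64_t) n_tensors * EXAMPLE_TENSOR_RECORD_SIZE;
    return input.size() >= required;
}
```

Each count is read only after the message has been confirmed long enough to contain it, so a truncated message is rejected before any out-of-range read.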
359e38e to 72c447a
Fixes #13067
The `rpc-server` was vulnerable to Denial of Service attacks via several RPC commands (`SET_TENSOR`, `GRAPH_COMPUTE`, etc.). Malformed messages could trigger failed assertions (e.g., invalid `ggml_type`) or out-of-bounds reads/writes leading to `GGML_ABORT` calls, crashing the server process.

This PR introduces robust input validation and replaces `abort()` calls with graceful error handling:

- `deserialize_tensor` now checks if the `tensor->type` is within the valid `GGML_TYPE_COUNT` range before calling `ggml_new_tensor_4d`. Returns `nullptr` on invalid type.
- Replaced `GGML_ABORT` in `set_tensor`, `set_tensor_hash`, and `get_tensor` handlers with error logging and returning `false` when data/offset parameters are out of buffer bounds.
- `create_node` now checks for `nullptr` return values from `deserialize_tensor` and its recursive calls, propagating `nullptr` upwards on failure. Uses `find` instead of `at` for safer map access.
- `copy_tensor` now checks for `nullptr` from `deserialize_tensor` and sets the response status to failure if deserialization or bounds checks fail.
- `graph_compute` now checks for `nullptr` return from `create_node` and returns failure status correctly. The final return value now reflects the actual computation status.
- `RPC_CMD_GET_ALLOC_SIZE` now checks the return value of `server.get_alloc_size` in the RPC server loop. If the call fails, return early to close the connection.