Fix error handling for GPU tensors #249
Conversation
Can you summarize the bug and fix in the description?
@rmccorm4 added.
src/gpu_buffers.h
Outdated
  uint32_t buffer_count;
};

class GPUBufferTransporter {
Is the term "Transporter" common? I'm not familiar with it; maybe a short one- or two-line class description would help.
Added comments.
void
GPUBufferTransporter::Complete(std::unique_ptr<SharedMemoryManager>& shm_pool)
{
  if (completed_) {
Should this be an error case as above with adding a buffer to a completed transaction?
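The question above is about whether a repeated `Complete` should be an error, the way adding a buffer to a completed transaction is. A minimal hypothetical sketch (not the actual Triton code; the class and member names are invented for illustration) of the two possible behaviors:

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical collector that gathers buffer handles and is finalized once.
// AddBuffer on a completed transaction throws; Complete on a completed
// transaction is currently a silent no-op, which is exactly the asymmetry
// the review comment questions.
class BufferCollector {
 public:
  void AddBuffer(std::uintptr_t handle)
  {
    if (completed_) {
      throw std::logic_error("AddBuffer called on a completed transaction");
    }
    handles_.push_back(handle);
  }

  void Complete()
  {
    if (completed_) {
      // Alternative, mirroring AddBuffer: throw here as well.
      return;  // this sketch silently ignores repeated completion
    }
    completed_ = true;
  }

  std::size_t Count() const { return handles_.size(); }

 private:
  std::vector<std::uintptr_t> handles_;
  bool completed_ = false;
};
```

Making both paths throw keeps the API symmetric; the silent no-op is more forgiving but can hide caller bugs.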
src/pb_utils.h
Outdated
@@ -212,23 +212,17 @@ struct ResponseSenderBase {
struct ResponseSendMessage : ResponseSenderBase {
  bi::managed_external_buffer::handle_t response;

  // GPU Buffers handle
  // A pointer to GPUBuffersShm object.
Seems like we could keep this the same if the structure was renamed to GPUBuffers. Is it still a handle, or is that separate from a pointer?
By pointer I meant a handle to a GPUBuffersShm. Will update the comment.
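The handle-versus-pointer distinction matters because each process may map the same shared-memory region at a different base address, so objects are referenced by an offset (the handle) that each process converts to a local pointer. A self-contained illustration of that idea, with invented names rather than the actual Triton/Boost API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shared-memory region: objects are addressed by their byte
// offset from the start of the region (the "handle"), which is valid across
// processes, unlike a raw pointer, which is only valid in one address space.
using Handle = std::size_t;

struct Region {
  std::vector<std::byte> bytes;

  // Convert a local pointer into a process-independent handle (offset).
  Handle ToHandle(const std::byte* p) const
  {
    return static_cast<Handle>(p - bytes.data());
  }

  // Convert a handle back into a pointer valid in this process.
  std::byte* ToPointer(Handle h) { return bytes.data() + h; }
};
```

This is why the struct comment says "a handle to a GPUBuffersShm object" rather than "a pointer": the handle survives being passed between the backend and the stub process, while a raw pointer would not.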
src/python_be.cc
Outdated
@@ -661,46 +660,33 @@ ModelInstanceState::ExecuteBLSRequest(
  lbackend_memory.reset(backend_memory);
  input_tensor->SetMemory(std::move(PbMemory::Create(
      Stub()->ShmPool(), std::move(lbackend_memory))));
  gpu_buffer_transporter.AddBuffer(
Instead of transporter, would response work? GPUBuffers and GPUBuffersResponse instead of GPUBuffersShm and GPUBufferTransporter?
I like transporter more, as I think "response" may imply that it can only be used with backend responses, which is not true. Let me know if you have any other suggestions.
I don't quite follow the logic of when we use the GPUBufferTransporter and when we use the GPUBuffersShm structure directly, but I think that's from my lack of familiarity with the code.
LGTM other than minor comment
LGTM!
Bug
The problem was that the inference_response->Send method did not properly handle the error case: GUARDED_RESPOND_IF_ERROR after Send would cause a double free by sending the response twice. Another issue existed in the GPU tensor error handling, where the error was not properly caught and returned to the stub process.
Fix
For the first issue, the fix was to simplify the error handling and send an error only through the inference response.
For the second issue, new data structures were added to properly catch the error and return it to the stub process.
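The double-free pattern described in the Bug section can be sketched in isolation. This is a hypothetical reduction, not the actual Triton code: a response object that releases its resources on Send, so a second Send (e.g. triggered by an error-handling macro firing after the first) would touch freed state unless guarded.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical response object: Send() releases the payload exactly once.
// Without the sent_ guard, a second Send would release payload_ again:
// the double free the PR description mentions. The fix in spirit is to
// ensure the error path sends at most once.
class Response {
 public:
  void Send()
  {
    if (sent_) {
      throw std::logic_error("response already sent");
    }
    sent_ = true;
    payload_.reset();  // resources released exactly once
  }

  bool Sent() const { return sent_; }

 private:
  bool sent_ = false;
  std::unique_ptr<int> payload_ = std::make_unique<int>(42);
};
```

In the sketch the guard turns the second Send into a detectable error; the actual fix went further and restructured the error path so only a single send can occur.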
triton-inference-server/server#5871