[StatusCode.UNAVAILABLE] Received http2 header with status: 502 #7760

Open
furkanc opened this issue Nov 3, 2024 · 0 comments

furkanc commented Nov 3, 2024

Hello, I have two loaded and ready models. When I send a request with the gRPC client to the vision model, it works. However, when I send a request to the text model, it returns the following error:

tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Received http2 header with status: 502
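
The failing call looks roughly like this (a minimal sketch; the server URL, the sequence length of 77, and the dummy token IDs are placeholders, not my real values):

```python
import numpy as np
import tritonclient.grpc as grpcclient

# Placeholder URL; in my setup the client goes through the KServe ingress.
client = grpcclient.InferenceServerClient(url="localhost:8001")

# Shape [batch, seq_len]; 77 is just an example value for the dynamic -1 dim.
input_ids = grpcclient.InferInput("input_ids", [1, 77], "INT64")
input_ids.set_data_from_numpy(np.ones((1, 77), dtype=np.int64))

# This call raises InferenceServerException: [StatusCode.UNAVAILABLE] ... 502
result = client.infer(
    model_name="mediacenter-clip-text",
    inputs=[input_ids],
    outputs=[grpcclient.InferRequestedOutput("text_embeds")],
)
```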

The tritonserver logs are shown below. As far as I can tell, the model itself executes the request fine.

kserve-container I1103 09:46:33.043301 1 infer_handler.h:1384] "Returning from ModelInferHandler, 0, ISSUED"
kserve-container I1103 09:46:33.044603 1 infer_request.cc:132] "[request id: <id_unknown>] Setting state from PENDING to EXECUTING"
kserve-container I1103 09:46:33.044636 1 libtorch.cc:2660] "model mediacenter-clip-text, instance mediacenter-clip-text_0_0, executing 1 requests"
kserve-container I1103 09:46:33.044645 1 libtorch.cc:1268] "TRITONBACKEND_ModelExecute: Running mediacenter-clip-text_0_0 with 1 requests"
kserve-container I1103 09:46:33.044700 1 pinned_memory_manager.cc:198] "pinned memory allocation: size 56, addr 0x7f00ae000090"
kserve-container I1103 09:46:33.045526 1 infer_handler.cc:1029] "ModelInferHandler::InferResponseComplete, 0 step ISSUED"
kserve-container I1103 09:46:33.045670 1 infer_request.cc:132] "[request id: <id_unknown>] Setting state from EXECUTING to RELEASED"
kserve-container I1103 09:46:33.045681 1 infer_handler.cc:642] "ModelInferHandler::InferRequestComplete"
kserve-container I1103 09:46:33.045693 1 pinned_memory_manager.cc:226] "pinned memory deallocation: addr 0x7f00ae000090"
kserve-container I1103 09:46:33.045765 1 infer_handler.h:1374] "Received notification for ModelInferHandler, 0"
kserve-container I1103 09:46:33.045784 1 infer_handler.h:1377] "Grpc::CQ::Next() Running state_id 0\n\tContext step 0 id 0\n\t\t State id 0: State step 1\n"
kserve-container I1103 09:46:33.045794 1 infer_handler.cc:723] "Process for ModelInferHandler, rpc_ok=1, 0 step COMPLETE"
kserve-container I1103 09:46:33.045802 1 infer_handler.h:1384] "Returning from ModelInferHandler, 0, FINISH"
kserve-container I1103 09:46:33.045811 1 infer_handler.h:1377] "Grpc::CQ::Next() Running state_id 0\n\tContext step 0 id 0\n\t\t State id 0: State step 2\n"
kserve-container I1103 09:46:33.045826 1 infer_handler.cc:723] "Process for ModelInferHandler, rpc_ok=1, 0 step FINISH"
kserve-container I1103 09:46:33.045836 1 infer_handler.h:1380] "Done for ModelInferHandler, 0"
kserve-container I1103 09:46:33.045847 1 infer_handler.h:1258] "StateRelease, 0 Step FINISH"

-- text model config.pbtxt

name: "mediacenter-clip-text"
platform: "pytorch_libtorch"
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 1000
}
input [
  {
    name: "input_ids"
    data_type: TYPE_INT64
    dims: [ -1 ]
  }
]
output [
  {
    name: "text_embeds"
    data_type: TYPE_FP32
    dims: [ 768 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
    count: 1
  }
]

-- vision model config.pbtxt

name: "mediacenter-clip-vision"
platform: "pytorch_libtorch"
max_batch_size: 8
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 1000
}
input [
  {
    name: "pixel_values"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "image_embeds"
    data_type: TYPE_FP32
    dims: [ 768 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
    count: 1
  }
]
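
For comparison, the request to the vision model, which succeeds, is built the same way (again a sketch; the random pixel values stand in for real preprocessed images, and the URL is a placeholder):

```python
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")  # placeholder URL

# Shape [batch, 3, 224, 224], matching the vision config above.
pixel_values = grpcclient.InferInput("pixel_values", [1, 3, 224, 224], "FP32")
pixel_values.set_data_from_numpy(np.random.rand(1, 3, 224, 224).astype(np.float32))

# This call succeeds and returns a [1, 768] embedding.
result = client.infer(
    model_name="mediacenter-clip-vision",
    inputs=[pixel_values],
    outputs=[grpcclient.InferRequestedOutput("image_embeds")],
)
print(result.as_numpy("image_embeds").shape)  # (1, 768)
```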

I haven't been able to figure out what's wrong. When I load both models in a notebook, they both work. I can provide more info if needed.
