[TensorRT-LLM][ERROR] Assertion failed: hasValues == configValue.has_value() (/app/tensorrt_llm/cpp/include/tensorrt_llm/runtime/samplingConfig.h:46 #1447

NikolaBorisov · 2024-04-13T04:21:17Z

System Info

8x H100 SXM 80 GB, 2 TB RAM, x86, main branch of TensorRT-LLM

Who can help?

@byshiue

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Run Triton with TensorRT-LLM on an 8x H100 server with the Mistral 8x22B model.

Expected behavior

No crashes

actual behavior

At some point the server prints the error below. The assertion fails in samplingConfig.h:46.

[TensorRT-LLM][ERROR] Encountered an error in forward function: [TensorRT-LLM][ERROR] Assertion failed: hasValues == configValue.has_value() (/app/tensorrt_llm/cpp/include/tensorrt_llm/runtime/samplingConfig.h:46)
1       0x7f2f2005df31 tensorrt_llm::common::throwRuntimeError(char const*, int, std::string const&) + 102
2       0x7f2db5ceffcc std::optional<std::vector<int, std::allocator<int> > > tensorrt_llm::runtime::SamplingConfig::fuseValues<int>(std::vector<tensorrt_llm::runtime::SamplingConfig, std::allocator<tensorrt_llm::runtime::SamplingConfig> > const&, std::function<std::optional<std::vector<int, std::allocator<int> > > (int)>) + 476
3       0x7f2db5cf0ae9 tensorrt_llm::runtime::SamplingConfig::SamplingConfig(std::vector<tensorrt_llm::runtime::SamplingConfig, std::allocator<tensorrt_llm::runtime::SamplingConfig> > const&) + 873
4       0x7f2db5ce4344 tensorrt_llm::runtime::GptDecoderBatch::newRequests(std::vector<int, std::allocator<int> > const&, std::vector<tensorrt_llm::runtime::decoder_batch::Request, std::allocator<tensorrt_llm::runtime::decoder_batch::Request> > const&, std::vector<tensorrt_llm::runtime::SamplingConfig, std::allocator<tensorrt_llm::runtime::SamplingConfig> > const&) + 404
5       0x7f2db5dfb5f3 tensorrt_llm::batch_manager::TrtGptModelInflightBatching::setupDecoderStep(std::map<unsigned long, std::shared_ptr<tensorrt_llm::batch_manager::LlmRequest>, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, std::shared_ptr<tensorrt_llm::batch_manager::LlmRequest> > > >&, std::vector<unsigned long, std::allocator<unsigned long> > const&) + 851
6       0x7f2db5dfd8d7 tensorrt_llm::batch_manager::TrtGptModelInflightBatching::forward(std::list<std::shared_ptr<tensorrt_llm::batch_manager::LlmRequest>, std::allocator<std::shared_ptr<tensorrt_llm::batch_manager::LlmRequest> > >&) + 5495
7       0x7f2db5dafd34 tensorrt_llm::batch_manager::GptManager::step(std::list<std::shared_ptr<tensorrt_llm::batch_manager::LlmRequest>, std::allocator<std::shared_ptr<tensorrt_llm::batch_manager::LlmRequest> > >&, std::set<unsigned long, std::less<unsigned long>, std::allocator<unsigned long> >&) + 36
8       0x7f2db5db7e64 tensorrt_llm::batch_manager::GptManager::decoupled_execution_loop() + 404
9       0x7f31359f2253 /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253) [0x7f31359f2253]
10      0x7f3135781ac3 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7f3135781ac3]
11      0x7f3135813850 /usr/lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7f3135813850]
[TensorRT-LLM][WARNING] Step function failed, continuing.

After this error the server continues to work, but the batch size is limited to 2. It prints the above error a number of times.

additional notes

Causes the server to be unusable.


juney-nvidia · 2024-04-13T12:09:24Z

@nekorobov Can you help take a look at this issue? Thanks

June

nekorobov · 2024-04-18T12:21:31Z

Hi @NikolaBorisov, thank you for reporting the issue. It likely happens because one request has some value set in the samplingConfig while another request does not. Could you please confirm or deny that this is the case in your setup? If it is, a temporary fix on the caller side is to ensure that either all requests or none of them specify a given sampling config parameter.
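To illustrate the kind of mismatch described above, here is a toy Python model of such a consistency check. It is not the TensorRT-LLM implementation: the function name `fuse_values` and its logic are assumptions inferred only from the assertion text `hasValues == configValue.has_value()` and the `fuseValues` frame in the stack trace.

```python
from typing import List, Optional, Sequence


def fuse_values(per_request: Sequence[Optional[float]]) -> Optional[List[float]]:
    """Toy model: fuse one optional sampling parameter across several requests.

    Hypothetical sketch, not the real C++ code. Either every request supplies
    the value or none does; a mix of set and unset values is rejected.
    """
    if not per_request:
        return None
    # The first request decides whether the parameter counts as "present".
    has_values = per_request[0] is not None
    fused: List[float] = []
    for value in per_request:
        # Every other request must agree with the first one.
        if (value is not None) != has_values:
            raise RuntimeError(
                "Assertion failed: hasValues == configValue.has_value()"
            )
        if value is not None:
            fused.append(value)
    return fused if has_values else None
```

In this model, `fuse_values([0.7, 0.9])` and `fuse_values([None, None])` both succeed, while `fuse_values([0.7, None])` raises, which matches the scenario of one request setting a parameter and another leaving it unset.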

Meanwhile, we'll work on the fix from our side, thanks
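A minimal sketch of the caller-side workaround, assuming requests are built as plain dictionaries before being sent to the server. The parameter names and fallback values in `ASSUMED_DEFAULTS` are placeholders for illustration only; check them against the parameters and defaults your deployment actually uses.

```python
from typing import Any, Dict, Optional

# Hypothetical parameter names and fallback values (assumptions, not taken
# from the TensorRT-LLM or Triton documentation).
ASSUMED_DEFAULTS: Dict[str, Any] = {
    "temperature": 1.0,
    "top_k": 1,
    "top_p": 1.0,
}


def with_explicit_sampling_params(
    request_params: Dict[str, Optional[Any]],
) -> Dict[str, Any]:
    """Return a copy in which every known sampling parameter is set.

    Because every outgoing request then carries the same set of parameters,
    the server never sees a mix of "set" and "unset" for any of them.
    """
    filled = dict(ASSUMED_DEFAULTS)
    # Keep the caller's explicit values; treat None as "not specified".
    filled.update({k: v for k, v in request_params.items() if v is not None})
    return filled
```

Applying `with_explicit_sampling_params` to every request before sending it means the caller does not need to know which requests the server will batch together.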

nekorobov · 2024-04-24T12:14:18Z

The issue should be solved in the latest main branch. Could you try it and reopen if it does not work for you? Thank you!

NikolaBorisov added the bug (Something isn't working) label Apr 13, 2024

juney-nvidia assigned nekorobov Apr 13, 2024

juney-nvidia added the triaged (Issue has been triaged by maintainers) label Apr 13, 2024

kaiyux mentioned this issue Apr 24, 2024

Update TensorRT-LLM #1492

Merged

nekorobov closed this as completed Apr 24, 2024

kaiyux mentioned this issue Jun 5, 2024

TensorRT-LLM v0.10 update #1734

Merged
