
Error configuring NeMo Guardrails with a TensorRT-LLM model served on a TRT-LLM server #657

Open
w8jie opened this issue Jul 31, 2024 · 0 comments

w8jie commented Jul 31, 2024

Has anyone successfully integrated TensorRT-LLM with NeMo Guardrails? There is little documentation on using TensorRT-LLM with NeMo Guardrails, so I am hoping for some guidance here.

I get the following error when the self check input flow executes:

```
Error while execution self_check_input: [StatusCode.UNAVAILABLE] DNS resolution failed for http://triton-models:8000/v2/models/ensemble/generate: C-ares status is not ARES_SUCCESS qtype=AAAA name=http://triton-models:8000/v2/models/ensemble/generate is_balancer=0: Could not contact DNS servers
```
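For reference, a minimal way to test whether the Triton endpoint is reachable at all, independently of NeMo Guardrails (a sketch only; the gRPC client expects a bare host:port with no scheme or path, and 8001 is just Triton's default gRPC port, so both values here are assumptions):

```python
# Hedged connectivity check: ask the Triton server for its readiness status
# over gRPC. The client takes host:port, not an HTTP URL.
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="triton-models:8001")  # assumed gRPC port
print(client.is_server_ready())
```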

My config.yml:

```yaml
models:
  - type: main
    engine: trt_llm
    parameters:
      server_url: http://triton-models:8000/v2/models/ensemble/generate

# These instructions configure the bot to answer questions about the employee handbook and the company's policies.
instructions:
  - type: general
    content: |
      Below is a conversation between a user and a bot called X Virtual Assistant.
      The bot is talkative and provides lots of specific details from its context.
      If the bot does not know the answer to a question, it truthfully says it does not know.

rails:
  # Input rails are invoked when a new message from the user is received.
  input:
    flows:
      - self check input

  # Output rails are triggered after a bot message has been generated.
  output:
    flows:
      - self check output
```
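
For reference, the same config can be exercised directly through LLMRails, which triggers the same self check input flow without any LangChain wrapping (a minimal sketch; the message content is illustrative):

```python
from nemoguardrails import LLMRails, RailsConfig

# Load the same config directory and run one message through the rails;
# the "self check input" flow fires before the request reaches TRT-LLM.
config = RailsConfig.from_path("./guardrails_config")
rails = LLMRails(config)

response = rails.generate(messages=[{"role": "user", "content": "Hello!"}])
print(response["content"])
```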

My LangChain integration:

```python
# Imports for the snippet below; chat_template and get_message_history
# are helpers defined elsewhere in my application.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables.history import RunnableWithMessageHistory
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

# Load NeMo Guardrails from the directory containing config.yml.
config = RailsConfig.from_path("./guardrails_config")
guardrails = RunnableRails(config, passthrough=False)


class ChatChain:
    """Builds an LCEL chain with message history, wrapped by guardrails."""

    def __init__(self, llm, examples):
        prompt = chat_template.format_chat_prompt(examples)
        custom_chain = (
            prompt
            | llm
            | StrOutputParser()
        )

        # Add per-session message history around the base chain.
        mem_chain = RunnableWithMessageHistory(
            custom_chain,
            get_message_history,
            input_messages_key="input",
            history_messages_key="history",
            output_messages_key="output",
        )

        # Apply the guardrails in front of the history-aware chain.
        self.mem_chain_with_guardrails = guardrails | mem_chain

    def get_chain(self):
        return self.mem_chain_with_guardrails
```
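
For context, the chain returned by get_chain() is invoked roughly like this (a sketch; llm, examples, and the session id are placeholders, and the session id is whatever key get_message_history expects):

```python
chain = ChatChain(llm, examples).get_chain()
result = chain.invoke(
    {"input": "What does the handbook say about leave?"},
    config={"configurable": {"session_id": "demo-session"}},
)
```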

Pip versions:

```
nemoguardrails==0.9.0
tritonclient[all]
```
