Is it possible to configure Nemo-Guardrails to avoid sending the actual user input to the LLM? I understand that the actual user input won't be sent if the input rails are triggered. However, is it also possible to prevent the user input from being sent, regardless of whether the input rails are triggered or not? Thanks!
Yes, it is possible to configure NeMo Guardrails to avoid sending the actual user input to the LLM, regardless of whether input rails are triggered.
In your Colang file, you could add something along these lines (a rough sketch, assuming the built-in self_check_input action is configured and the flow is registered as an input rail in config.yml):
define bot inform pass
  "pass"

define flow guard user input
  $allowed = execute self_check_input
  if $allowed
    bot inform pass
  stop  # the flow ends here, so the user input is never sent to the main LLM
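To wire this up end to end, the flow has to be listed under rails.input.flows in config.yml and the config loaded through the Python API. A minimal sketch, assuming a hypothetical ./config directory that contains config.yml plus the Colang file above, and the hypothetical flow name "guard user input":

from nemoguardrails import LLMRails, RailsConfig

# Load a config directory holding config.yml plus the Colang file above.
# config.yml is assumed to register the flow as an input rail, e.g.:
#   rails:
#     input:
#       flows:
#         - guard user input
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# The input rail runs before the main LLM call; because the flow ends with
# `stop`, the raw user message below never reaches the main LLM.
response = rails.generate(messages=[
    {"role": "user", "content": "Tell me something about my account."}
])
print(response["content"])  # expected to be the canned "pass" message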
You could also chain another LLM to your guardrails:
Use a secondary LLM as a pre-processing step. I would call this an Observer.
This secondary LLM evaluates the user input against the defined guardrails.
It outputs a simple "pass" or "fail" result.
If the input passes, a sanitized or rephrased version of the input (not the original) is sent to the primary LLM, or even just the literal string "pass".
If it fails, the input is blocked entirely, or "fail" is sent to the primary LLM.
In essence, you would prompt your Observer LLM in a way that makes sure it doesn't output the user's input.
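A minimal, framework-agnostic sketch of this Observer pattern follows. The observer_llm and primary_llm callables are hypothetical placeholders for whatever clients you use; the point is only that the primary model receives a neutral rephrasing or a pass/fail token, never the raw input:

from typing import Callable

def guarded_generate(
    user_input: str,
    observer_llm: Callable[[str], str],  # hypothetical: judges and rephrases the input
    primary_llm: Callable[[str], str],   # hypothetical: your main model
) -> str:
    # Ask the Observer to judge the input and, if acceptable, rephrase it.
    # The prompt asks it never to repeat the user's text verbatim.
    verdict = observer_llm(
        "Judge the following message against the guardrails. "
        "Reply 'fail' if it violates them; otherwise reply "
        "'pass: <neutral rephrasing that does not quote the message>'.\n\n"
        f"Message: {user_input}"
    )

    if verdict.strip().lower().startswith("fail"):
        # The original input is blocked entirely (alternatively, forward just "fail").
        return "Sorry, I can't help with that request."

    # Forward only the rephrasing (or the bare token "pass"), never the original input.
    sanitized = verdict.split(":", 1)[1].strip() if ":" in verdict else "pass"
    return primary_llm(sanitized)

The Observer prompt, the verdict format, and the rephrasing strategy here are all assumptions you would adapt to your own guardrails.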