docs: update cleanlab integration description (#1080)

jwmueller · web-flow · commit d2734ac59659 · 2025-04-14T12:03:38.000+02:00
* Update cleanlab integration description

Signed-off-by: Jonas Mueller <1390638+jwmueller@users.noreply.github.com>

* address feedback

Signed-off-by: Jonas Mueller <1390638+jwmueller@users.noreply.github.com>

---------

Signed-off-by: Jonas Mueller <1390638+jwmueller@users.noreply.github.com>
diff --git a/docs/user-guides/community/cleanlab.md b/docs/user-guides/community/cleanlab.md
@@ -1,12 +1,12 @@
 # Cleanlab Integration
 
-The `cleanlab trustworthiness` flow uses trustworthiness score with a default threshold of 0.6 to determine if the output should be allowed or not (i.e., if the trustworthiness score is below the threshold, the response is considered "untrustworthy").
+Cleanlab's state-of-the-art [LLM uncertainty estimator](https://cleanlab.ai/blog/trustworthy-language-model/) scores the _trustworthiness_ of any LLM response to detect incorrect outputs and hallucinations in real time.
 
-A high trustworthiness score generally correlates with high-quality responses. In a question-answering application, high trustworthiness is indicative of correct responses, while in general open-ended applications, a high score corresponds to the response being helpful and informative. Trustworthiness scores are less useful for creative or open-ended requests.
+In question-answering or RAG applications, high trustworthiness is indicative of a correct response. In open-ended chat applications, a high score corresponds to the response being helpful and informative. Low trustworthiness scores indicate outputs that are likely bad or incorrect, or complex prompts where the LLM might have produced the right response this time but might produce the wrong response when run on the same prompt again (so it cannot be trusted).
 
-The mathematical derivation of the score is explained in [Cleanlab's documentation](https://help.cleanlab.ai/tutorials/tlm/#how-does-the-tlm-trustworthiness-score-work), and you can also access [trustworthiness score benchmarks](https://cleanlab.ai/blog/trustworthy-language-model/).
+The trustworthiness score is further explained and comprehensively benchmarked in [Cleanlab's documentation](https://help.cleanlab.ai/tlm/).
 
-You can easily change the cutoff value for the trustworthiness score by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, you can add the following flow to your config:
+The `cleanlab trustworthiness` guardrail flow uses a default trustworthiness score threshold of 0.6 to determine if your LLM output should be allowed or not. When the trustworthiness score falls below the threshold, the corresponding LLM response is flagged as _untrustworthy_. You can easily change the cutoff value for the trustworthiness score by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, add the following flow to your config:
 
 ```colang
 define subflow cleanlab trustworthiness
@@ -23,12 +23,12 @@ define bot response untrustworthy
 
 ## Setup
 
-Install `cleanlab-studio` to use Cleanlab's trustworthiness score:
+Install the Python client to use Cleanlab's trustworthiness score:
 
 ```
 pip install cleanlab-studio
 ```
 
-Then, you can get an API key for free by [creating a Cleanlab account](https://app.cleanlab.ai/?signup_origin=TLM) or experiment with TLM in the [playground](https://tlm.cleanlab.ai/). You can also [email Cleanlab](mailto:sales@cleanlab.ai) for any special requests or support.
+You can get an API key for free by [creating a Cleanlab account](https://tlm.cleanlab.ai/) or experiment with the trustworthiness scores in the [playground](https://chat.cleanlab.ai/chat). Feel free to [email Cleanlab](mailto:support@cleanlab.ai) with any questions.
 
 Lastly, set the `CLEANLAB_API_KEY` environment variable with the API key.
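
Note on the documented behavior: the threshold rule and the environment-variable requirement described in this change can be sketched in a few lines of Python. This is only an illustration, not the guardrail's implementation. The helper names `is_untrustworthy` and `api_key_configured` are hypothetical, the sketch makes no calls to the Cleanlab client, and the 0-to-1 score range is an assumption.

```python
import os

# Default cutoff described in the updated guide.
DEFAULT_THRESHOLD = 0.6


def is_untrustworthy(score: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when the score falls below the threshold (hypothetical helper)."""
    # Assumption: trustworthiness scores lie in the range [0, 1].
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    return score < threshold


def api_key_configured(env=None) -> bool:
    """Return True when CLEANLAB_API_KEY is set to a non-empty value (hypothetical helper)."""
    env = os.environ if env is None else env
    return bool(env.get("CLEANLAB_API_KEY", "").strip())


if __name__ == "__main__":
    # A score of 0.65 passes the default cutoff but is flagged at a stricter 0.7.
    print(is_untrustworthy(0.65))
    print(is_untrustworthy(0.65, threshold=0.7))
    print(api_key_configured())
```

Under this reading, a score exactly equal to the threshold is not flagged, since the guide says a response is flagged only when the score falls below the cutoff.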