From 0acf9f035d683e072cfae067b0671c000117c3d6 Mon Sep 17 00:00:00 2001
From: prezakhani <13303554+Pouyanpi@users.noreply.github.com>
Date: Thu, 29 Aug 2024 19:02:04 +0200
Subject: [PATCH 1/2] docs: move Cleanlab details to its own page at community

---
 docs/user_guides/community/cleanlab.md | 22 ++++++++++++++++++++++
 docs/user_guides/guardrails-library.md | 22 +---------------------
 2 files changed, 23 insertions(+), 21 deletions(-)
 create mode 100644 docs/user_guides/community/cleanlab.md

diff --git a/docs/user_guides/community/cleanlab.md b/docs/user_guides/community/cleanlab.md
new file mode 100644
index 000000000..aa981b893
--- /dev/null
+++ b/docs/user_guides/community/cleanlab.md
@@ -0,0 +1,22 @@
+# Cleanlab Integration
+
+The `cleanlab trustworthiness` flow uses the trustworthiness score with a default threshold of 0.6 to determine whether the output should be allowed (i.e., if the trustworthiness score is below the threshold, the response is considered "untrustworthy").
+
+A high trustworthiness score generally correlates with high-quality responses. In a question-answering application, high trustworthiness is indicative of correct responses, while in general open-ended applications, a high score corresponds to the response being helpful and informative. Trustworthiness scores are less useful for creative or open-ended requests.
+
+The mathematical derivation of the score is explained in [Cleanlab's documentation](https://help.cleanlab.ai/tutorials/tlm/#how-does-the-tlm-trustworthiness-score-work), and you can also access [trustworthiness score benchmarks](https://cleanlab.ai/blog/trustworthy-language-model/).
+
+You can easily change the cutoff value for the trustworthiness score by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, you can add the following flow to your config:
+
+```colang
+define subflow cleanlab trustworthiness
+  """Guardrail based on trustworthiness score."""
+  $result = execute call cleanlab api
+
+  if $result.trustworthiness_score < 0.7
+    bot response untrustworthy
+    stop
+
+define bot response untrustworthy
+  "Don't place much confidence in this response"
+```
diff --git a/docs/user_guides/guardrails-library.md b/docs/user_guides/guardrails-library.md
index db7ccd1e3..fcee22b8d 100644
--- a/docs/user_guides/guardrails-library.md
+++ b/docs/user_guides/guardrails-library.md
@@ -652,27 +652,7 @@ rails:
       - cleanlab trustworthiness
 ```
 
-The `cleanlab trustworthiness` flow uses trustworthiness score with a default threshold of 0.6 to determine if the output should be allowed or not (i.e., if the trustworthiness score is below the threshold, the response is considered "untrustworthy").
-
-
-A high trustworthiness score generally correlates with high-quality responses. In a question-answering application, high trustworthiness is indicative of correct responses, while in general open-ended applications, a high score corresponds to the response being helpful and informative. Trustworthiness scores are less useful for creative or open-ended requests.
-
-The mathematical derivation of the score is explained in [Cleanlab's documentation](https://help.cleanlab.ai/tutorials/tlm/#how-does-the-tlm-trustworthiness-score-work), and you can also access [trustworthiness score benchmarks](https://cleanlab.ai/blog/trustworthy-language-model/).
-
-You can easily change the cutoff value for the trustworthiness score by adjusting the threshold in the [config](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/cleanlab/flows.co). For example, to change the threshold to 0.7, you can add the following flow to your config:
-
-```colang
-define subflow cleanlab trustworthiness
-  """Guardrail based on trustworthiness score."""
-  $result = execute call cleanlab api
-
-  if $result.trustworthiness_score < 0.7
-    bot response untrustworthy
-    stop
-
-define bot response untrustworthy
-  "Don't place much confidence in this response"
-```
+For more details, check out the [Cleanlab Integration](./community/cleanlab.md) page.
 
 ## Other
 

From 3b524e6fe10fa9cb7c4c322990dc6b768f986886 Mon Sep 17 00:00:00 2001
From: prezakhani <13303554+Pouyanpi@users.noreply.github.com>
Date: Thu, 29 Aug 2024 19:04:31 +0200
Subject: [PATCH 2/2] docs: add cleanlab setup from library

---
 docs/user_guides/community/cleanlab.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/docs/user_guides/community/cleanlab.md b/docs/user_guides/community/cleanlab.md
index aa981b893..ceb74fe8c 100644
--- a/docs/user_guides/community/cleanlab.md
+++ b/docs/user_guides/community/cleanlab.md
@@ -20,3 +20,15 @@ define subflow cleanlab trustworthiness
 define bot response untrustworthy
   "Don't place much confidence in this response"
 ```
+
+## Setup
+
+Install `cleanlab-studio` to use Cleanlab's trustworthiness score:
+
+```sh
+pip install cleanlab-studio
+```
+
+Then, get a free API key by [creating a Cleanlab account](https://app.cleanlab.ai/?signup_origin=TLM), or experiment with TLM in the [playground](https://tlm.cleanlab.ai/). You can also [email Cleanlab](mailto:sales@cleanlab.ai) for any special requests or support.
+
+Lastly, set the `CLEANLAB_API_KEY` environment variable to your API key.
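As a note on the patches above: the gating logic that the `cleanlab trustworthiness` subflow expresses in Colang can be sketched in plain Python. This is an illustrative approximation only — `apply_trustworthiness_rail`, its `result` dict argument, and `FALLBACK_MESSAGE` are hypothetical names, not part of the NeMo Guardrails or Cleanlab APIs; the real flow obtains the score via the `call cleanlab api` action and branches on `$result.trustworthiness_score`.

```python
# Hypothetical sketch of the guardrail's decision: replace a bot response
# with a fallback message when its trustworthiness score falls below the
# threshold (0.6 by default; 0.7 in the patched example flow).
FALLBACK_MESSAGE = "Don't place much confidence in this response"

def apply_trustworthiness_rail(result: dict, response: str, threshold: float = 0.6) -> str:
    """Return the bot response, or the fallback message if it is untrustworthy."""
    if result["trustworthiness_score"] < threshold:
        return FALLBACK_MESSAGE
    return response

# A score of 0.65 passes the default 0.6 threshold but fails the stricter 0.7 one:
print(apply_trustworthiness_rail({"trustworthiness_score": 0.65}, "ok"))                 # ok
print(apply_trustworthiness_rail({"trustworthiness_score": 0.65}, "ok", threshold=0.7))  # fallback
```

Raising the threshold therefore trades coverage for confidence: more responses are blocked, but the ones that pass carry higher trustworthiness scores.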