Cleanlab Integration

The cleanlab trustworthiness flow uses Cleanlab's trustworthiness score, with a default threshold of 0.6, to decide whether an output should be allowed. If the score is below the threshold, the response is considered "untrustworthy".
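The threshold rule above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the flow's actual implementation: the function name `is_trustworthy` and the constant `DEFAULT_THRESHOLD` are hypothetical, and the score is assumed to be a float where higher means more trustworthy.

```python
# Illustrative sketch only: `is_trustworthy` and `DEFAULT_THRESHOLD` are
# hypothetical names, not part of Cleanlab or of the guardrails flow.
DEFAULT_THRESHOLD = 0.6  # default cutoff stated in this document


def is_trustworthy(score: float, threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when the response should be allowed.

    A response is treated as untrustworthy only when its score is
    strictly below the threshold, so a score equal to the threshold passes.
    """
    return score >= threshold
```

With the default cutoff, a score of 0.65 passes; with the cutoff raised to 0.7, the same score is flagged as untrustworthy.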

A high trustworthiness score generally correlates with high-quality responses. In a question-answering application, high trustworthiness is indicative of correct responses, while in general open-ended applications, a high score corresponds to the response being helpful and informative. Trustworthiness scores are less useful for creative or open-ended requests.

The mathematical derivation of the score is explained in Cleanlab's documentation, which also provides benchmarks for the trustworthiness score.

To change the cutoff value for the trustworthiness score, adjust the threshold in the config. For example, to raise the threshold to 0.7, add the following flow to your config:

define subflow cleanlab trustworthiness
  """Guardrail based on trustworthiness score."""
  $result = execute call cleanlab api

  if $result.trustworthiness_score < 0.7
    bot response untrustworthy
    stop

define bot response untrustworthy
  "Don't place much confidence in this response"

Setup

Install cleanlab-studio to use Cleanlab's trustworthiness score:

pip install cleanlab-studio

Then get a free API key by creating a Cleanlab account, or experiment with TLM in the playground. You can also email Cleanlab with special requests or for support.

Lastly, set the CLEANLAB_API_KEY environment variable with the API key.
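Before starting your application, it can help to confirm that the variable is actually visible to the process. The following is a minimal sketch, assuming you want to fail early with a clear message; the helper `get_cleanlab_api_key` is a hypothetical name introduced here for illustration and is not part of cleanlab-studio.

```python
import os
from typing import Mapping


def get_cleanlab_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read CLEANLAB_API_KEY from the environment, failing early if unset.

    Hypothetical helper for illustration; `env` is injectable so the
    check can be exercised without touching the real environment.
    """
    key = env.get("CLEANLAB_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "CLEANLAB_API_KEY is not set; export it before starting the app."
        )
    return key
```

Calling `get_cleanlab_api_key()` at startup surfaces a missing key immediately rather than at the first guardrail invocation.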
