We also have to add the AutoAlign guardrail endpoint in the parameters. `multi_language` is an optional parameter that enables guardrails for non-English text.

One of the advanced configs is the matching score (ranging from 0 to 1), a threshold that determines whether the guardrail blocks the input/output or not. The higher the matching score (i.e. the closer to 1), the stricter the guardrail.
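As an illustrative sketch of these parameters (the endpoint URL is a placeholder for your AutoAlign deployment, not a real address), the `autoalign` section of `config.yml` might look like:

```yaml
rails:
  config:
    autoalign:
      parameters:
        endpoint: "https://<AUTOALIGN_HOST>/guardrail"  # placeholder: your AutoAlign guardrail endpoint
        multi_language: True  # optional: enable guardrails for non-English text
```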
Some guardrails have a very different format of `matching_scores` config,
The actions `autoalign_input_api` and `autoalign_output_api` take two arguments, `show_autoalign_message` and `show_toxic_phrases`. Both arguments expect a boolean value. The default value of `show_autoalign_message` is `True`, and the default of `show_toxic_phrases` is `False`. `show_autoalign_message` controls whether we show any output from AutoAlign; the response from AutoAlign is presented as a subtext when `show_autoalign_message` is kept `True`. Details regarding the second argument can be found in the `toxicity_detection` section.
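As a hypothetical sketch (the flow name and result variable are illustrative, not taken from the source), one of these actions might be invoked from a Colang flow like so:

```colang
define subflow my autoalign check input
  # illustrative invocation: both keyword arguments are optional booleans
  $result = execute autoalign_input_api(show_autoalign_message=True, show_toxic_phrases=False)
```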
For intellectual property detection, the matching score has to be in the following format:

```yaml
"matching_scores": { "score": 0.5 }
```
### Confidential Info detection

```{warning}
Backward incompatible changes were introduced in v0.12.0 due to AutoAlign API changes.
```

The goal of the confidential info detection rail is to determine whether the text contains any kind of confidential information. This rail can be applied at both input and output. This guardrail can be added via the `confidential_info_detection` key in the dictionary under the `guardrails_config` section, which is under the `input` or `output` section of the `autoalign` section in `config.yml`.
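As a minimal sketch of enabling this rail on input (the empty dict is a placeholder; fill in `matching_scores` per the format described in this section):

```yaml
rails:
  config:
    autoalign:
      input:
        guardrails_config:
          {
            "confidential_info_detection": {}
          }
```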
For confidential info detection, the matching score has to be in the following format:

```yaml
"matching_scores": {
```
### Toxicity extraction

```{warning}
Backward incompatible changes were introduced in v0.12.0 due to AutoAlign API changes.
```

The goal of the toxicity detection rail is to determine whether the text contains any kind of toxic content. This rail can be applied at both input and output. This guardrail not only detects the toxicity of the text but also extracts toxic phrases from it. It can be added via the `toxicity_detection` key in the dictionary under the `guardrails_config` section, which is under the `input` or `output` section of the `autoalign` section in `config.yml`.
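For instance, a sketch of enabling this rail on output (the matching score value of 0.5 is illustrative, not prescribed by the source):

```yaml
rails:
  config:
    autoalign:
      output:
        guardrails_config:
          {
            "toxicity_detection": {"matching_scores": {"score": 0.5}}
          }
```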
For text toxicity detection, the matching score has to be in the following format:

The `threshold` can be changed depending upon the use case; the `output_result`
The output of the groundness check endpoint provides a factcheck score, against which we can set a threshold that determines whether the given output is factually correct or not.

The supporting documents or evidence have to be placed within a `kb` folder inside the `config` folder.
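As an illustrative layout (the document file name is hypothetical), the knowledge base sits alongside `config.yml`:

```
config/
├── config.yml
└── kb/
    └── internal_report.md
```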
### Fact Check

```{warning}
Backward incompatible changes were introduced in v0.12.0 due to AutoAlign API changes.
```

The fact check uses the bot response and the user input prompt to check the factual correctness of the bot response based on the user prompt. Unlike the groundness check, the fact check does not use a pre-existing internal knowledge base.

To use AutoAlign's fact check module, modify the `config.yml` from the example autoalign_factcheck_config. Specify the `fact_check_endpoint` to point to the correct AutoAlign environment, then set the corresponding subflows for the fact check guardrail.
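As a hedged sketch (the endpoint URL and the subflow name are placeholders, not confirmed by the source), the relevant parts of `config.yml` might look like:

```yaml
rails:
  config:
    autoalign:
      parameters:
        fact_check_endpoint: "https://<AUTOALIGN_HOST>/factcheck"  # placeholder endpoint
  output:
    flows:
      - autoalign factcheck output  # placeholder subflow name
```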
The output of the fact check endpoint provides a fact check score that combines the factual correctness of the various statements made in the bot response. Given a user-set threshold, it will log a warning if the bot response is determined to be factually incorrect.