Commit 35cc703

feat: add pii masking capability to private ai integration (#901)
* Add tests for private ai pii masking
* Add pii masking action and flow for private ai
* Update private ai docs to add pii masking
* Add private ai pii masking example config
* Add private ai pii masking example in notebook
* Apply suggestions from code review

Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
Signed-off-by: Girish Sharma <girishsharma001@gmail.com>

* Test private ai with live api instead of mocked detector
* Improve error handling for private ai

---------

Signed-off-by: Girish Sharma <girishsharma001@gmail.com>
Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
1 parent 4e5ff95 commit 35cc703

9 files changed

+577
-69
lines changed

docs/user-guides/community/privateai.md

Lines changed: 38 additions & 7 deletions
````diff
@@ -1,13 +1,15 @@
 # Private AI Integration

-[Private AI](https://docs.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails) allows you to detect and mask Personally Identifiable Information (PII) in your data. This integration enables NeMo Guardrails to use Private AI for PII detection in input, output and retrieval flows.
+[Private AI](https://docs.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails) allows you to detect and mask Personally Identifiable Information (PII) in your data. This integration enables NeMo Guardrails to use Private AI for PII detection and masking in input, output, and retrieval flows.

 ## Setup

 1. Ensure that you have access to Private AI API server running locally or in the cloud. To get started with the cloud version, you can use the [Private AI Portal](https://portal.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails). For containerized deployments, check out this [Quickstart Guide](https://docs.private-ai.com/quickstart/?utm_medium=github&utm_campaign=nemo-guardrails).

 2. Update your `config.yml` file to include the Private AI settings:

+**PII detection config**
+
 ```yaml
 rails:
   config:
@@ -31,19 +33,48 @@ rails:
       - detect pii on output
 ```

+The detection flow will not let the input/output/retrieval text pass if PII is detected.
+
+**PII masking config**
+
+```yaml
+rails:
+  config:
+    privateai:
+      server_endpoint: http://your-privateai-api-endpoint/process/text # Replace this with your Private AI process text endpoint
+      input:
+        entities: # If no entity is specified here, all supported entities will be detected by default.
+          - NAME_FAMILY
+          - LOCATION_ADDRESS_STREET
+          - EMAIL_ADDRESS
+      output:
+        entities:
+          - NAME_FAMILY
+          - LOCATION_ADDRESS_STREET
+          - EMAIL_ADDRESS
+  input:
+    flows:
+      - mask pii on input
+  output:
+    flows:
+      - mask pii on output
+```
+
+The masking flow will mask the PII in the input/output/retrieval text before they are sent to the LLM/user. For example, `Hi John Doe, my email is john.doe@example.com` will be converted to `Hi [NAME], my email is [EMAIL_ADDRESS]`.
+
 Replace `http://your-privateai-api-endpoint/process/text` with your actual Private AI process text endpoint and set the `PAI_API_KEY` environment variable if you're using the Private AI cloud API.

 3. You can customize the `entities` list under both `input` and `output` to include the PII types you want to detect. A full list of supported entities can be found [here](https://docs.private-ai.com/entities/?utm_medium=github&utm_campaign=nemo-guardrails).

 ## Usage

-Once configured, the Private AI integration will automatically:
+Once configured, the Private AI integration can automatically:

-1. Detect PII in user inputs before they are processed by the LLM.
-2. Detect PII in LLM outputs before they are sent back to the user.
-3. Detect PII in retrieved chunks before they are sent to the LLM.
+1. Detect or mask PII in user inputs before they are processed by the LLM.
+2. Detect or mask PII in LLM outputs before they are sent back to the user.
+3. Detect or mask PII in retrieved chunks before they are sent to the LLM.

-The `detect_pii` action in `nemoguardrails/library/privateai/actions.py` handles the PII detection process.
+The `detect_pii` and `mask_pii` actions in `nemoguardrails/library/privateai/actions.py` handle the PII detection and masking processes, respectively.

 ## Customization

@@ -56,6 +87,6 @@ If the Private AI detection API request fails, the system will assume PII is pre
 ## Notes

 - Ensure that your Private AI process text endpoint is properly set up and accessible from your NeMo Guardrails environment.
-- The integration currently supports PII detection only.
+- The integration currently supports PII detection and masking.

 For more information on Private AI and its capabilities, please refer to the [Private AI documentation](https://docs.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails).
````
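The masking example in the updated doc (`Hi John Doe, ...` becoming `Hi [NAME], ...`) can be illustrated with a small standalone sketch. This is not the `mask_pii` action from the commit, and it does not call Private AI. It assumes a hypothetical detector has already returned `(start, end, label)` spans, and only shows how such spans could be swapped for bracketed labels. The names `mask_spans` and `span_of` are made up for this illustration.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), end-exclusive.
# The real Private AI response schema is not shown in this commit page.
Span = Tuple[int, int, str]


def span_of(text: str, fragment: str, label: str) -> Span:
    """Locate the first occurrence of `fragment` and tag it with `label`."""
    start = text.index(fragment)
    return (start, start + len(fragment), label)


def mask_spans(text: str, spans: List[Span]) -> str:
    """Replace every span with its label in square brackets."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            raise ValueError("overlapping spans are not supported in this sketch")
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(f"[{label}]")        # placeholder for the PII itself
        cursor = end
    pieces.append(text[cursor:])           # trailing untouched text
    return "".join(pieces)


text = "Hi John Doe, my email is john.doe@example.com"
spans = [
    span_of(text, "John Doe", "NAME"),
    span_of(text, "john.doe@example.com", "EMAIL_ADDRESS"),
]
masked = mask_spans(text, spans)
print(masked)
```

Sorting the spans and walking a cursor left to right keeps the offsets valid even though each replacement changes the string length.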

docs/user-guides/guardrails-library.md

Lines changed: 8 additions & 31 deletions
````diff
@@ -750,9 +750,9 @@ For more details, check out the [GCP Text Moderation](https://github.com/NVIDIA/

 ### Private AI PII Detection

-NeMo Guardrails supports using [Private AI API](https://docs.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails) for PII detection in input, output and retrieval flows.
+NeMo Guardrails supports using [Private AI API](https://docs.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails) for PII detection and masking input, output and retrieval flows.

-To activate the PII detection, you need specify `server_endpoint`, and the entities that you want to detect. You'll also need to set the `PAI_API_KEY` environment variable if you're using the Private AI cloud API.
+To activate the PII detection or masking, you need specify `server_endpoint`, and the entities that you want to detect or mask. You'll also need to set the `PAI_API_KEY` environment variable if you're using the Private AI cloud API.

 ```yaml
 rails:
@@ -773,6 +773,8 @@ rails:

 #### Example usage

+**PII detection**
+
 ```yaml
 rails:
   input:
@@ -786,44 +788,19 @@ rails:
       - detect pii on retrieval
 ```

-For more details, check out the [Private AI Integration](https://github.com/NVIDIA/NeMo-Guardrails/blob/develop/docs/user-guides/community/privateai.md) page.
-
-### Private AI PII Detection
-
-NeMo Guardrails supports using [Private AI API](https://docs.private-ai.com/?utm_medium=github&utm_campaign=nemo-guardrails) for PII detection in input, output and retrieval flows.
-
-To activate the PII detection, you need specify `server_endpoint`, and the entities that you want to detect. You'll also need to set the `PAI_API_KEY` environment variable if you're using the Private AI cloud API.
-
-```yaml
-rails:
-  config:
-    privateai:
-      server_endpoint: http://your-privateai-api-endpoint/process/text # Replace this with your Private AI process text endpoint
-      input:
-        entities: # If no entity is specified here, all supported entities will be detected by default.
-          - NAME_FAMILY
-          - EMAIL_ADDRESS
-          ...
-      output:
-        entities:
-          - NAME_FAMILY
-          - EMAIL_ADDRESS
-          ...
-```
-
-#### Example usage
+**PII masking**

 ```yaml
 rails:
   input:
     flows:
-      - detect pii on input
+      - mask pii on input
   output:
     flows:
-      - detect pii on output
+      - mask pii on output
   retrieval:
     flows:
-      - detect pii on retrieval
+      - mask pii on retrieval
 ```

 For more details, check out the [Private AI Integration](./community/privateai.md) page.
````
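The detection and masking examples in this file differ only in the flow names, which follow the pattern `<mode> pii on <rail>`. As a rough illustration, the sketch below builds the `rails` flow section as a plain dictionary for either mode. The helper `pii_flows` is hypothetical and is not part of NeMo Guardrails. It only mirrors the flow names that appear in the diff above.

```python
from typing import Dict, Iterable, List

# Modes and rails taken from the flow names shown in the diff above.
VALID_MODES = ("detect", "mask")
VALID_RAILS = ("input", "output", "retrieval")


def pii_flows(mode: str, rails: Iterable[str] = VALID_RAILS) -> Dict[str, Dict[str, List[str]]]:
    """Build the per-rail `flows` mapping for the chosen mode (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    section: Dict[str, Dict[str, List[str]]] = {}
    for rail in rails:
        if rail not in VALID_RAILS:
            raise ValueError(f"unknown rail {rail!r}")
        section[rail] = {"flows": [f"{mode} pii on {rail}"]}
    return section


masking = pii_flows("mask")
detection_io = pii_flows("detect", rails=("input", "output"))
print(masking)
print(detection_io)
```

Dumping the result under a top-level `rails` key with any YAML library should give a structure shaped like the examples in the diff, assuming the flow names stay as shown.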
Lines changed: 26 additions & 0 deletions
````diff
@@ -0,0 +1,26 @@
+models:
+  - type: main
+    engine: openai
+    model: gpt-3.5-turbo-instruct
+
+rails:
+  config:
+    privateai:
+      server_endpoint: https://api.private-ai.com/cloud/v3/process/text
+      input:
+        entities:
+          - NAME_FAMILY
+          - LOCATION_ADDRESS_STREET
+          - EMAIL_ADDRESS
+      output:
+        entities: # If no entity is specified here, all supported entities will be masked by default.
+          - NAME_FAMILY
+          - LOCATION_ADDRESS_STREET
+          - EMAIL_ADDRESS
+  input:
+    flows:
+      - mask pii on input
+
+  output:
+    flows:
+      - mask pii on output
````
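The example config above points at the Private AI cloud endpoint, and the docs in this commit say `PAI_API_KEY` must be set when the cloud API is used. A small pre-flight check along those lines might look like the sketch below. The function name `missing_api_key` is hypothetical, and matching on the host `api.private-ai.com` is an assumption based only on the endpoint in the example config.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

# Assumption: the cloud API is identified by the host used in the example config.
CLOUD_HOST = "api.private-ai.com"


def missing_api_key(server_endpoint: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the endpoint looks like the cloud API but PAI_API_KEY is unset or empty."""
    env = os.environ if env is None else env
    host = urlparse(server_endpoint).hostname or ""
    return host == CLOUD_HOST and not env.get("PAI_API_KEY")


endpoint = "https://api.private-ai.com/cloud/v3/process/text"
if missing_api_key(endpoint):
    print("PAI_API_KEY is not set; requests to the cloud endpoint will likely be rejected.")
```

Passing `env` explicitly keeps the check easy to exercise without touching the real process environment.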
