Showing 11 changed files with 1,119 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
# Cloud Data Loss Prevention (DLP) API Samples | ||
The [Data Loss Prevention API](https://cloud.google.com/dlp/docs/) provides programmatic access to | ||
a powerful detection engine for personally identifiable information and other privacy-sensitive data | ||
in unstructured data streams. | ||
|
||
## Setup | ||
- A Google Cloud project with billing enabled | ||
- [Enable](https://console.cloud.google.com/launcher/details/google/dlp.googleapis.com) the DLP API. | ||
- (Local testing) [Create a service account](https://cloud.google.com/docs/authentication/getting-started) | ||
and set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable pointing to the downloaded credentials file. | ||
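
A missing or mistyped credentials path is a common cause of authentication failures when running locally. The following is a minimal sketch of a pre-flight check, not part of the samples in this commit: the class `CredentialsCheck` and the method `isUsableCredentialsPath` are hypothetical names, and the check only confirms that the file exists and is readable, not that the credentials are valid or authorized.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of the samples): checks the credentials variable before a run. */
class CredentialsCheck {

  /** Returns true when the value names an existing, readable regular file. */
  static boolean isUsableCredentialsPath(String value) {
    if (value == null || value.trim().isEmpty()) {
      return false;
    }
    Path path = Paths.get(value);
    return Files.isRegularFile(path) && Files.isReadable(path);
  }

  public static void main(String[] args) {
    // GOOGLE_APPLICATION_CREDENTIALS is the variable named in the Setup steps.
    String value = System.getenv("GOOGLE_APPLICATION_CREDENTIALS");
    if (isUsableCredentialsPath(value)) {
      System.out.println("Credentials file found: " + value);
    } else {
      System.err.println(
          "GOOGLE_APPLICATION_CREDENTIALS is unset or does not point to a readable file.");
      System.exit(1);
    }
  }
}
```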
|
||
## Build | ||
This project uses the [Assembly Plugin](https://maven.apache.org/plugins/maven-assembly-plugin/usage.html) to build an uber jar. | ||
Run: | ||
``` | ||
mvn clean package | ||
``` | ||
|
||
## Retrieve InfoTypes | ||
An [InfoType identifier](https://cloud.google.com/dlp/docs/infotypes-categories) represents an element of sensitive data. | ||
|
||
[Info types](https://cloud.google.com/dlp/docs/infotypes-reference#global) are updated periodically. Use the API to retrieve the most current | ||
info types for a given category, e.g. HEALTH or GOVERNMENT. | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata -category GOVERNMENT | ||
``` | ||
|
||
## Retrieve Categories | ||
[Categories](https://cloud.google.com/dlp/docs/infotypes-categories) provide a way to easily access a group of related InfoTypes. | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Metadata | ||
``` | ||
|
||
## Inspect data for sensitive elements | ||
Inspect strings, local files, files on Google Cloud Storage, and Cloud Datastore kinds with the DLP API. | ||
|
||
Note: image scanning is not currently supported on Google Cloud Storage. | ||
For more information, refer to the [API documentation](https://cloud.google.com/dlp/docs). | ||
Optional flags are explained in [this resource](https://cloud.google.com/dlp/docs/reference/rest/v2beta1/content/inspect#InspectConfig). | ||
``` | ||
Commands: | ||
-s <string> Inspect a string using the Data Loss Prevention API. | ||
-f <filepath> Inspects a local text, PNG, or JPEG file using the Data Loss Prevention API. | ||
-gcs -bucketName <bucketName> -fileName <fileName> Inspects a text file stored on Google Cloud Storage using the Data Loss | ||
Prevention API. | ||
-ds -projectId [projectId] -namespace [namespace] -kind <kind> Inspect a Datastore instance using the Data Loss Prevention API. | ||
Options: | ||
--help Show help | ||
-minLikelihood [string] [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"] | ||
[default: "LIKELIHOOD_UNSPECIFIED"] | ||
specifies the minimum reporting likelihood threshold. | ||
-f, --maxFindings [number] [default: 0] | ||
maximum number of results to retrieve | ||
-q, --includeQuote [boolean] [default: true] include matching string in results | ||
-t, --infoTypes restrict to limited set of infoTypes [ default: []] | ||
[ eg. PHONE_NUMBER US_PASSPORT] | ||
``` | ||
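
The `-minLikelihood` choices form an ordered scale, and the flag sets the lowest level that is still reported. The sketch below illustrates that idea locally, assuming the choices are ordered as listed above from `VERY_UNLIKELY` up to `VERY_LIKELY`. The `LikelihoodFilter` class and its methods are hypothetical and are not part of the samples or the client library; the real filtering is performed by the API service. Treating `LIKELIHOOD_UNSPECIFIED` as "no threshold" is an assumption made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical local illustration of a minimum-likelihood threshold. */
class LikelihoodFilter {

  /** Mirrors the choices listed for -minLikelihood, in ascending order. */
  enum Likelihood {
    LIKELIHOOD_UNSPECIFIED, VERY_UNLIKELY, UNLIKELY, POSSIBLE, LIKELY, VERY_LIKELY
  }

  /** Returns true when a finding's likelihood is at or above the threshold. */
  static boolean meetsThreshold(Likelihood finding, Likelihood min) {
    // Assumption for this sketch only: an unspecified threshold filters nothing out.
    if (min == Likelihood.LIKELIHOOD_UNSPECIFIED) {
      return true;
    }
    return finding.ordinal() >= min.ordinal();
  }

  /** Keeps only the findings that meet the threshold, preserving their order. */
  static List<Likelihood> filter(List<Likelihood> findings, Likelihood min) {
    List<Likelihood> kept = new ArrayList<>();
    for (Likelihood finding : findings) {
      if (meetsThreshold(finding, min)) {
        kept.add(finding);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Likelihood> findings =
        Arrays.asList(Likelihood.UNLIKELY, Likelihood.POSSIBLE, Likelihood.VERY_LIKELY);
    // With a LIKELY threshold only VERY_LIKELY remains.
    System.out.println(filter(findings, Likelihood.LIKELY)); // prints [VERY_LIKELY]
  }
}
```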
### Examples | ||
- Inspect a string: | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -s "My phone number is (123) 456-7890 and my email address is me@somedomain.com" | ||
``` | ||
- Inspect a local file (text / image): | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f resources/test.txt | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -f resources/test.png | ||
``` | ||
- Inspect a file on Google Cloud Storage: | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -gcs -bucketName my-bucket -fileName my-file.txt | ||
``` | ||
- Inspect a Google Cloud Datastore kind: | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Inspect -ds -kind my-kind | ||
``` | ||
|
||
## Automatic redaction of sensitive data | ||
[Automatic redaction](https://cloud.google.com/dlp/docs/classification-redaction) produces an output with sensitive data matches removed. | ||
|
||
``` | ||
Commands: | ||
-s <string> Source input string | ||
-r <replacement string> String to replace detected info types | ||
Options: | ||
--help Show help | ||
-minLikelihood [string] [choices: "LIKELIHOOD_UNSPECIFIED", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"] | ||
[default: "LIKELIHOOD_UNSPECIFIED"] | ||
specifies the minimum reporting likelihood threshold. | ||
-infoTypes restrict operation to limited set of info types [ default: []] | ||
[ eg. PHONE_NUMBER US_PASSPORT] | ||
``` | ||
|
||
### Example | ||
- Replace sensitive data in text with `_REDACTED_`: | ||
``` | ||
java -cp target/dlp-samples-1.0-jar-with-dependencies.jar com.example.dlp.Redact -s "My phone number is (123) 456-7890 and my email address is me@somedomain.com" -r "_REDACTED_" | ||
``` | ||
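
To make the replace-with-string behaviour concrete, here is a purely local sketch that substitutes two simple regular expressions for the detection step. This is not how the DLP API detects sensitive data and it is not the `Redact` sample from this commit: the class `RedactSketch`, the method `redact`, and both patterns are hypothetical, and they only recognise the exact phone and email shapes used in the example command.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical local stand-in that shows the shape of redacted output. */
class RedactSketch {

  // Illustrative patterns only; the DLP API uses its own detection engine.
  private static final Pattern PHONE = Pattern.compile("\\(\\d{3}\\) \\d{3}-\\d{4}");
  private static final Pattern EMAIL = Pattern.compile("[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+");

  /** Replaces every phone-like and email-like match with the replacement string. */
  static String redact(String source, String replacement) {
    // quoteReplacement keeps characters such as '$' in the replacement literal.
    String safe = Matcher.quoteReplacement(replacement);
    String withoutPhones = PHONE.matcher(source).replaceAll(safe);
    return EMAIL.matcher(withoutPhones).replaceAll(safe);
  }

  public static void main(String[] args) {
    String source =
        "My phone number is (123) 456-7890 and my email address is me@somedomain.com";
    // prints: My phone number is _REDACTED_ and my email address is _REDACTED_
    System.out.println(redact(source, "_REDACTED_"));
  }
}
```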
|
||
## Integration tests | ||
### Setup | ||
- [Create a Google Cloud Storage bucket](https://console.cloud.google.com/storage) and upload [test.txt](src/test/resources/test.txt). | ||
- [Create a Google Cloud Datastore](https://console.cloud.google.com/datastore) kind and add an entity with properties: | ||
- `property1` : john@doe.com | ||
- `property2` : 343-343-3435 | ||
- Update the Google Cloud Storage path and Datastore kind in [InspectIT.java](src/test/java/com/example/dlp/InspectIT.java). | ||
- Ensure that `GOOGLE_APPLICATION_CREDENTIALS` points to an authorized service account credentials file. | ||
|
||
### Run | ||
Run all tests: | ||
``` | ||
mvn clean verify | ||
``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<!-- | ||
Copyright 2017 Google Inc. | ||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--> | ||
<!-- [START pom] --> | ||
<project> | ||
<modelVersion>4.0.0</modelVersion> | ||
<packaging>jar</packaging> | ||
<groupId>com.example</groupId> | ||
<artifactId>dlp-samples</artifactId> | ||
<version>1.0</version> | ||
|
||
<!-- Parent defines config for testing & linting. --> | ||
<parent> | ||
<artifactId>doc-samples</artifactId> | ||
<groupId>com.google.cloud</groupId> | ||
<version>1.0.0</version> | ||
<relativePath>..</relativePath> | ||
</parent> | ||
|
||
<properties> | ||
<maven.compiler.source>1.8</maven.compiler.source> | ||
<maven.compiler.target>1.8</maven.compiler.target> | ||
<google.auth.version>0.7.0</google.auth.version> | ||
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> | ||
</properties> | ||
|
||
<!-- Temporary workaround for known issue : https://github.com/GoogleCloudPlatform/google-cloud-java/issues/2192 --> | ||
<dependencyManagement> | ||
<dependencies> | ||
<dependency> | ||
<groupId>com.google.auth</groupId> | ||
<artifactId>google-auth-library-credentials</artifactId> | ||
<version>${google.auth.version}</version> | ||
</dependency> | ||
<dependency> | ||
<groupId>com.google.auth</groupId> | ||
<artifactId>google-auth-library-oauth2-http</artifactId> | ||
<version>${google.auth.version}</version> | ||
</dependency> | ||
</dependencies> | ||
</dependencyManagement> | ||
<!-- End of workaround --> | ||
|
||
<dependencies> | ||
<!-- [START dlp_maven] --> | ||
<dependency> | ||
<groupId>com.google.cloud</groupId> | ||
<artifactId>google-cloud-dlp</artifactId> | ||
<version>0.20.2-alpha</version> | ||
</dependency> | ||
<!-- [END dlp_maven] --> | ||
<dependency> | ||
<groupId>commons-cli</groupId> | ||
<artifactId>commons-cli</artifactId> | ||
<version>1.4</version> | ||
</dependency> | ||
<!-- Test dependencies --> | ||
<dependency> | ||
<groupId>junit</groupId> | ||
<artifactId>junit</artifactId> | ||
<version>4.12</version> | ||
<scope>test</scope> | ||
</dependency> | ||
</dependencies> | ||
<!-- Build jar with dependencies for testing --> | ||
<build> | ||
<plugins> | ||
<plugin> | ||
<artifactId>maven-assembly-plugin</artifactId> | ||
<version>3.0.0</version> | ||
<configuration> | ||
<descriptorRefs> | ||
<descriptorRef>jar-with-dependencies</descriptorRef> | ||
</descriptorRefs> | ||
</configuration> | ||
<executions> | ||
<execution> | ||
<id>make-assembly</id> <!-- this is used for inheritance merges --> | ||
<phase>package</phase> <!-- bind to the packaging phase --> | ||
<goals> | ||
<goal>single</goal> | ||
</goals> | ||
</execution> | ||
</executions> | ||
</plugin> | ||
</plugins> | ||
</build> | ||
</project> | ||
<!-- [END pom] --> |