[TA] Samples and documentation for Opinion Mining and Pii (#14829)
* samples, readme + fix version

* update links

* update samples
maririos authored Sep 3, 2020
1 parent 11976f4 commit 385d220
Showing 13 changed files with 561 additions and 13 deletions.
40 changes: 40 additions & 0 deletions sdk/textanalytics/Azure.AI.TextAnalytics/README.md
@@ -4,6 +4,7 @@ Azure Cognitive Services Text Analytics is a cloud service that provides advance
* Sentiment Analysis
* Key Phrase Extraction
* Named Entity Recognition
* Personally Identifiable Information (PII) Recognition
* Linked Entity Recognition

[Source code][textanalytics_client_src] | [Package (NuGet)][textanalytics_nuget_package] | [API reference documentation][textanalytics_refdocs] | [Product documentation][textanalytics_docs] | [Samples][textanalytics_samples]
@@ -124,6 +125,7 @@ The following section provides several code snippets using the `client` [created
* [Analyze Sentiment](#analyze-sentiment)
* [Extract Key Phrases](#extract-key-phrases)
* [Recognize Entities](#recognize-entities)
* [Recognize PII Entities](#recognize-pii-entities)
* [Recognize Linked Entities](#recognize-linked-entities)

### Async examples
@@ -159,6 +161,8 @@ Console.WriteLine($" Negative confidence score: {docSentiment.ConfidenceScore
```
For samples on using the production recommended option `AnalyzeSentimentBatch` see [here][analyze_sentiment_sample].

To get more granular information about the opinions related to aspects of a product or service, also known as Aspect-based Sentiment Analysis in Natural Language Processing (NLP), see the sample on sentiment analysis with opinion mining [here][analyze_sentiment_opinion_mining_sample].

Please refer to the service documentation for a conceptual discussion of [sentiment analysis][sentiment_analysis].

### Extract Key Phrases
@@ -198,6 +202,33 @@ For samples on using the production recommended option `RecognizeEntitiesBatch`

Please refer to the service documentation for a conceptual discussion of [named entity recognition][named_entity_recognition].

### Recognize PII Entities
Run a predictive model to identify a collection of entities containing Personally Identifiable Information (PII) found in the passed-in document or batch of documents, and categorize those entities into categories such as US Social Security number, driver's license number, or credit card number.

```C# Snippet:RecognizePiiEntities
string document = "A developer with SSN 859-98-0987 whose phone number is 800-102-1100 is building tools with our APIs.";

PiiEntityCollection entities = client.RecognizePiiEntities(document).Value;

Console.WriteLine($"Redacted Text: {entities.RedactedText}");
if (entities.Count > 0)
{
    Console.WriteLine($"Recognized {entities.Count} PII entit{(entities.Count > 1 ? "ies" : "y")}:");
    foreach (PiiEntity entity in entities)
    {
        Console.WriteLine($"Text: {entity.Text}, Category: {entity.Category}, SubCategory: {entity.SubCategory}, Confidence score: {entity.ConfidenceScore}");
    }
}
else
{
    Console.WriteLine("No entities were found.");
}
```

For samples on using the production recommended option `RecognizePiiEntitiesBatch` see [here][recognize_pii_entities_sample].

Please refer to the service documentation for supported [PII entity types][pii_entity_type].
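
To build intuition for what a redacted string looks like, here is a minimal local sketch of span masking. It is not how the service computes `RedactedText`. The `MaskSpans` helper and the use of `IndexOf` to locate the spans are assumptions made purely for illustration, and no Text Analytics call is involved.

```C#
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the SDK): overwrite each (offset, length) span with '*'.
static string MaskSpans(string text, IEnumerable<(int Offset, int Length)> spans)
{
    char[] chars = text.ToCharArray();
    foreach (var (offset, length) in spans)
    {
        // Clamp to the text bounds so an out-of-range span cannot throw.
        int start = Math.Max(0, offset);
        int end = Math.Min(chars.Length, offset + length);
        for (int i = start; i < end; i++)
        {
            chars[i] = '*';
        }
    }
    return new string(chars);
}

string document = "A developer with SSN 859-98-0987 whose phone number is 800-102-1100 is building tools with our APIs.";
string ssn = "859-98-0987";
string phone = "800-102-1100";

// Spans are located with IndexOf only to keep the sketch self-contained.
var spans = new List<(int Offset, int Length)>
{
    (document.IndexOf(ssn, StringComparison.Ordinal), ssn.Length),
    (document.IndexOf(phone, StringComparison.Ordinal), phone.Length)
};

string masked = MaskSpans(document, spans);
Console.WriteLine(masked);
```

Masking character by character keeps the string length unchanged, so any offsets computed against the original text remain valid for the masked text.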

### Recognize Linked Entities
Run a predictive model to identify a collection of entities found in the passed-in document or batch of documents, and include information linking the entities to their corresponding entries in a well-known knowledge base.

@@ -307,7 +338,11 @@ Samples are provided for each main functional area, and for each area, samples a
- [Analyze Sentiment][analyze_sentiment_sample]
- [Extract Key Phrases][extract_key_phrases_sample]
- [Recognize Entities][recognize_entities_sample]
- [Recognize PII Entities][recognize_pii_entities_sample]
- [Recognize Linked Entities][recognize_linked_entities_sample]

### Advanced samples
- [Analyze Sentiment with Opinion Mining][analyze_sentiment_opinion_mining_sample]
- [Create a mock client][mock_client_sample] for testing using the [Moq][moq] library.

## Contributing
@@ -337,6 +372,7 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
[key_phrase_extraction]: https://docs.microsoft.com/azure/cognitive-services/Text-Analytics/how-tos/text-analytics-how-to-keyword-extraction
[named_entity_recognition]: https://docs.microsoft.com/azure/cognitive-services/Text-Analytics/how-tos/text-analytics-how-to-entity-linking
[named_entities_categories]: https://docs.microsoft.com/azure/cognitive-services/Text-Analytics/named-entity-types
[pii_entity_type]: https://docs.microsoft.com/azure/cognitive-services/text-analytics/named-entity-types?tabs=personal

[textanalytics_client_class]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/src/TextAnalyticsClient.cs
[azure_identity]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/identity/Azure.Identity
@@ -351,8 +387,12 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con

[detect_language_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample1_DetectLanguage.md
[analyze_sentiment_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2_AnalyzeSentiment.md
[analyze_sentiment_opinion_mining_sample]: https://github.com/maririos/azure-sdk-for-net/blob/cd74dde9546ea675f9289e1b4e6fd804bda2a3bc/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.md
<!--[analyze_sentiment_opinion_mining_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.md-->
[extract_key_phrases_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample3_ExtractKeyPhrases.md
[recognize_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample4_RecognizeEntities.md
[recognize_pii_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/cd74dde9546ea675f9289e1b4e6fd804bda2a3bc/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample5_RecognizePiiEntities.md
<!--[recognize_pii_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample5_RecognizePiiEntities.md-->
[recognize_linked_entities_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample6_RecognizeLinkedEntities.md
[mock_client_sample]: https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample_MockClient.md

@@ -322,13 +322,13 @@ public TextAnalyticsClient(System.Uri endpoint, Azure.Core.TokenCredential crede
     }
     public partial class TextAnalyticsClientOptions : Azure.Core.ClientOptions
     {
-        public TextAnalyticsClientOptions(Azure.AI.TextAnalytics.TextAnalyticsClientOptions.ServiceVersion version = Azure.AI.TextAnalytics.TextAnalyticsClientOptions.ServiceVersion.V3_1_Preview_1) { }
+        public TextAnalyticsClientOptions(Azure.AI.TextAnalytics.TextAnalyticsClientOptions.ServiceVersion version = Azure.AI.TextAnalytics.TextAnalyticsClientOptions.ServiceVersion.V3_1_Preview_2) { }
         public string DefaultCountryHint { get { throw null; } set { } }
         public string DefaultLanguage { get { throw null; } set { } }
         public enum ServiceVersion
         {
             V3_0 = 1,
-            V3_1_Preview_1 = 2,
+            V3_1_Preview_2 = 2,
         }
     }
[System.Runtime.InteropServices.StructLayoutAttribute(System.Runtime.InteropServices.LayoutKind.Sequential)]
11 changes: 8 additions & 3 deletions sdk/textanalytics/Azure.AI.TextAnalytics/samples/README.md
@@ -16,16 +16,21 @@ Azure Cognitive Services Text Analytics is a cloud service that provides advance
* Sentiment Analysis
* Key Phrase Extraction
* Named Entity Recognition
* Personally Identifiable Information (PII) Recognition
* Linked Entity Recognition

You can find samples for each of these main functions below, as well as a sample on how to create a mock client for testing purposes.
To get started you'll need a Text Analytics endpoint and credentials. See Text Analytics Client Library [Readme][README] for more information and instructions.

## Common scenarios samples
- [Detect Language](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample1_DetectLanguage.md)
- [Analyze Sentiment](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2_AnalyzeSentiment.md)
- [Extract Key Phrases](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample3_ExtractKeyPhrases.md)
- [Recognize Entities](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample4_RecognizeEntities.md)
- [Recognize PII Entities](https://github.com/Azure/azure-sdk-for-net/tree/cd74dde9546ea675f9289e1b4e6fd804bda2a3bc/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample5_RecognizePiiEntities.md)
<!--- [Recognize PII Entities](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample5_RecognizePiiEntities.md)-->
- [Recognize Linked Entities](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample6_RecognizeLinkedEntities.md)

## Advanced samples
- [Analyze Sentiment with Opinion Mining](https://github.com/maririos/azure-sdk-for-net/blob/cd74dde9546ea675f9289e1b4e6fd804bda2a3bc/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.md)
<!--- [Analyze Sentiment with Opinion Mining](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.md)-->
- [Mock client](https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/textanalytics/Azure.AI.TextAnalytics/samples/Sample_MockClient.md)

[README]: https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/textanalytics/Azure.AI.TextAnalytics/README.md
@@ -0,0 +1,93 @@
# Analyze sentiment with Opinion Mining

This sample demonstrates how to analyze the sentiment of documents and get more granular information about the opinions related to aspects of a product or service, also known as Aspect-based Sentiment Analysis in Natural Language Processing (NLP). This feature is only available for clients with API version v3.1-preview.1 and higher.

For the purpose of this sample, imagine we are the administrators of a hotel and have set up a system that looks at the online reviews customers post, in order to identify the major complaints about our hotel.
To do so, we will use the Sentiment Analysis feature of the Text Analytics client library. To get started you'll need a Text Analytics endpoint and credentials. See [README][README] for links and instructions.

## Creating a `TextAnalyticsClient`

To create a new `TextAnalyticsClient`, you need a Text Analytics endpoint and credentials. You can use the [DefaultAzureCredential][DefaultAzureCredential] to try a number of common authentication methods optimized for both running as a service and development. In the sample below, however, you'll use a Text Analytics API key credential by creating an `AzureKeyCredential` object, which allows you to update the API key later without creating a new client.

You can set `endpoint` and `apiKey` based on an environment variable, a configuration setting, or any way that works for your application.

```C# Snippet:TextAnalyticsSample1CreateClient
var client = new TextAnalyticsClient(new Uri(endpoint), new AzureKeyCredential(apiKey));
```

## Identify complaints

To get a deeper analysis of which aspects people considered good or bad, we need to set `AdditionalSentimentAnalyses.OpinionMining` in the `AnalyzeSentimentOptions`.

```C# Snippet:TAAnalyzeSentimentWithOpinionMining
var documents = new List<string>
{
    "The food and service were unacceptable, but the concierge were nice.",
    "The rooms were beautiful. The AC was good and quiet.",
    "The breakfast was good, but the toilet was smelly.",
    "Loved this hotel - good breakfast - nice shuttle service - clean rooms.",
    "I had a great unobstructed view of the Microsoft campus.",
    "Nice rooms but bathrooms were old and the toilet was dirty when we arrived.",
    "We changed rooms as the toilet smelled."
};

AnalyzeSentimentResultCollection reviews = client.AnalyzeSentimentBatch(documents, options: new AnalyzeSentimentOptions() { AdditionalSentimentAnalyses = AdditionalSentimentAnalyses.OpinionMining });

Dictionary<string, int> complaints = GetComplaints(reviews);

var negativeAspect = complaints.Aggregate((l, r) => l.Value > r.Value ? l : r).Key;
Console.WriteLine($"Alert! major complaint is *{negativeAspect}*");
Console.WriteLine();
Console.WriteLine("---All complaints:");
foreach (KeyValuePair<string, int> complaint in complaints)
{
    Console.WriteLine($" {complaint.Key}, {complaint.Value}");
}
```

Output:
```
Alert! major complaint is *toilet*
---All complaints:
food, 1
service, 1
toilet, 3
bathrooms, 1
rooms, 1
```
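
The `Aggregate` call used to pick the major complaint has no seed value, so it throws `InvalidOperationException` when the dictionary is empty, for example when no review contains a negative aspect. Below is a minimal sketch of a guarded variant. The `FindTopComplaint` helper and its `fallback` parameter are hypothetical names that are not part of the sample, and the counts are hard-coded to mirror the output above rather than returned by the service.

```C#
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: return the key with the highest count, or a fallback when there are none.
static string FindTopComplaint(Dictionary<string, int> counts, string fallback)
{
    if (counts.Count == 0)
    {
        return fallback;
    }
    // Same reduction as in the sample: keep whichever entry has the larger value.
    return counts.Aggregate((l, r) => l.Value > r.Value ? l : r).Key;
}

// Hard-coded counts mirroring the sample output.
var complaints = new Dictionary<string, int>
{
    ["food"] = 1,
    ["service"] = 1,
    ["toilet"] = 3,
    ["bathrooms"] = 1,
    ["rooms"] = 1
};

Console.WriteLine(FindTopComplaint(complaints, "none"));                       // prints "toilet"
Console.WriteLine(FindTopComplaint(new Dictionary<string, int>(), "none"));    // prints "none"
```

When two aspects share the highest count, the result depends on the dictionary's enumeration order, so an application that cares about ties would need an explicit tie-breaking rule.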

## Define method `GetComplaints`
Implementation for calculating complaints:

```C# Snippet:TAGetComplaints
private Dictionary<string, int> GetComplaints(AnalyzeSentimentResultCollection reviews)
{
    var complaints = new Dictionary<string, int>();
    foreach (AnalyzeSentimentResult review in reviews)
    {
        foreach (SentenceSentiment sentence in review.DocumentSentiment.Sentences)
        {
            foreach (MinedOpinion minedOpinion in sentence.MinedOpinions)
            {
                if (minedOpinion.Aspect.Sentiment == TextSentiment.Negative)
                {
                    complaints.TryGetValue(minedOpinion.Aspect.Text, out var value);
                    complaints[minedOpinion.Aspect.Text] = value + 1;
                }
            }
        }
    }
    return complaints;
}
```
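
The counting step in `GetComplaints` relies on `TryGetValue` leaving its `out` argument at `0` when the key is missing, so the first occurrence of an aspect is stored as `1`. The sketch below isolates that idiom without calling the service. The `(Aspect, IsNegative)` tuples are hypothetical stand-ins for mined opinions, not SDK types.

```C#
using System;
using System.Collections.Generic;

// Hypothetical stand-in data: aspect text plus whether its sentiment was negative.
var opinions = new List<(string Aspect, bool IsNegative)>
{
    ("food", true), ("service", true), ("concierge", false),
    ("toilet", true), ("toilet", true), ("rooms", false), ("toilet", true)
};

var counts = new Dictionary<string, int>();
foreach (var (aspect, isNegative) in opinions)
{
    if (!isNegative)
    {
        continue; // only negative aspects count as complaints
    }
    // When the key is absent, TryGetValue returns false and sets 'current' to default(int), which is 0.
    counts.TryGetValue(aspect, out int current);
    counts[aspect] = current + 1;
}

Console.WriteLine($"toilet: {counts["toilet"]}"); // prints "toilet: 3"
```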


To see the full example source files, see:
* [Synchronous Analyze Sentiment with Opinion Mining](https://github.com/maririos/azure-sdk-for-net/blob/cd74dde9546ea675f9289e1b4e6fd804bda2a3bc/sdk/textanalytics/Azure.AI.TextAnalytics/tests/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.cs)
* [Asynchronous Analyze Sentiment with Opinion Mining](https://github.com/maririos/azure-sdk-for-net/blob/cd74dde9546ea675f9289e1b4e6fd804bda2a3bc/sdk/textanalytics/Azure.AI.TextAnalytics/tests/samples/Sample2.1_AnalyzeSentimentWithOpinionMiningAsync.cs)
<!--* [Synchronous Analyze Sentiment with Opinion Mining](https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/textanalytics/Azure.AI.TextAnalytics//tests/samples/Sample2.1_AnalyzeSentimentWithOpinionMining.cs)
* [Asynchronous Analyze Sentiment with Opinion Mining](https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/textanalytics/Azure.AI.TextAnalytics//tests/samples/Sample2.1_AnalyzeSentimentWithOpinionMiningAsync.cs)-->

[DefaultAzureCredential]: https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/identity/Azure.Identity/README.md
[README]: https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/textanalytics/Azure.AI.TextAnalytics/README.md
@@ -27,7 +27,7 @@ Console.WriteLine($" Neutral confidence score: {docSentiment.ConfidenceScores
Console.WriteLine($" Negative confidence score: {docSentiment.ConfidenceScores.Negative}.");
```

-## Analyzing the sentiment of multipile documents
+## Analyzing the sentiment of multiple documents

To analyze the sentiment of a collection of documents in the same language, call `AnalyzeSentimentBatch` on an `IEnumerable` of strings. The results are returned as an `AnalyzeSentimentResultCollection`.


1 comment on commit 385d220

@callratnesh

I have tested "Sample5_RecognizePiiEntities" with the string below. I'm not sure why the credit card number, CVV, and PIN are not getting redacted. Can you please help?

string documentA = @"Good morning, everybody. My name is Van Bokhorst Serdar, and today I feel like sharing a whole lot of personal information with you.
Let's start with my Email address SerdarvanBokhorst@dayrep.com. My address is 2657 Koontz Lane, Los Angeles, CA. My phone number is 818-828-6231.
My Social security number is 548-95-6370. My Bank account number is 940517528812 and routing number 195991012. My credit card number is 5534816011668430,
Expiration Date 6/1/2022, my C V V code is 121, and my password 123456. Well, I think that's it. You know a whole lot about me. And I hope that Amazon comprehend is doing a
good job at identifying PII entities so you can redact my personal information away from this document. Let's check. IBN GB33BUKB20201555555555";
