[FR] Selectionmarks #15248

Merged: 4 commits, Oct 5, 2020
3 changes: 2 additions & 1 deletion sdk/formrecognizer/Azure.AI.FormRecognizer/CHANGELOG.md
@@ -2,7 +2,8 @@

## 3.1.0-beta.1 (Unreleased)
- It defaults to the latest supported API version, which currently is `2.1-preview.1`.
Note that new functionality hasn't been implemented in the client library.
- Added support to `StartRecognizeContent` to recognize selection marks such as check boxes and radio buttons.
- Added support to train and recognize custom forms with selection marks such as check boxes and radio buttons. This functionality is only available in train with labels scenarios.

## 3.0.0 (2020-08-20)

17 changes: 14 additions & 3 deletions sdk/formrecognizer/Azure.AI.FormRecognizer/README.md
@@ -2,7 +2,7 @@
Azure Cognitive Services Form Recognizer is a cloud service that uses machine learning to recognize form fields, text, and tables in form documents. It includes the following capabilities:

- Recognize Custom Forms - Recognize and extract form fields and other content from your custom forms, using models you trained with your own form types.
- Recognize Form Content - Recognize and extract tables, lines and words in forms documents, without the need to train a model.
- Recognize Form Content - Recognize and extract tables, lines, words, and selection marks like radio buttons and check boxes in forms documents, without the need to train a model.
- Recognize Receipts - Recognize and extract common fields from US receipts, using a pre-trained receipt model.

[Source code][formreco_client_src] | [Package (NuGet)][formreco_nuget_package] | [API reference documentation][formreco_refdocs] | [Product documentation][formreco_docs] | [Samples][formreco_samples]
@@ -101,7 +101,7 @@ var client = new FormRecognizerClient(new Uri(endpoint), new DefaultAzureCredent
`FormRecognizerClient` provides operations for:

- Recognizing form fields and content, using custom models trained to recognize your custom forms. These values are returned in a collection of `RecognizedForm` objects. See example [Recognize Custom Forms](#recognize-custom-forms).
- Recognizing form content, including tables, lines and words, without the need to train a model. Form content is returned in a collection of `FormPage` objects. See example [Recognize Content](#recognize-content).
- Recognizing form content, including tables, lines, words, and selection marks like radio buttons and check boxes, without the need to train a model. Form content is returned in a collection of `FormPage` objects. See example [Recognize Content](#recognize-content).
Review comment (Member):
I wonder if we should say something about training custom models with labels to recognize selection marks as well

Reply (Member, Author):
I thought about it, but looking at "Recognizing form fields and content, using custom models trained to recognize your custom forms." makes me think that it already includes it. Although selection marks are a new concept, the idea of recognizing fields and content still applies.

- Recognizing common fields from US receipts, using a pre-trained receipt model on the Form Recognizer service. These fields and meta-data are returned in a collection of `RecognizedForm` objects. See example [Recognize Receipts](#recognize-receipts).

### FormTrainingClient
@@ -137,7 +137,7 @@ The following section provides several code snippets illustrating common pattern
* [Manage Custom Models Synchronously](#manage-custom-models-synchronously)

### Recognize Content
Recognize text and table data, along with their bounding box coordinates, from documents.
Recognize text, tables, and selection marks such as radio buttons and check boxes, along with their bounding box coordinates, from documents.

```C# Snippet:FormRecognizerSampleRecognizeContentFromUri
FormPageCollection formPages = await client.StartRecognizeContentFromUriAsync(invoiceUri).WaitForCompletionAsync();
@@ -160,6 +160,17 @@ foreach (FormPage page in formPages)
            Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) contains text: '{cell.Text}'.");
        }
    }

    for (int i = 0; i < page.SelectionMarks.Count; i++)
    {
        FormSelectionMark selectionMark = page.SelectionMarks[i];
        Console.WriteLine($"Selection Mark {i} is {selectionMark.State.ToString()}.");
        Console.WriteLine(" Its bounding box is:");
        Console.WriteLine($" Upper left => X: {selectionMark.BoundingBox[0].X}, Y: {selectionMark.BoundingBox[0].Y}");
        Console.WriteLine($" Upper right => X: {selectionMark.BoundingBox[1].X}, Y: {selectionMark.BoundingBox[1].Y}");
        Console.WriteLine($" Lower right => X: {selectionMark.BoundingBox[2].X}, Y: {selectionMark.BoundingBox[2].Y}");
        Console.WriteLine($" Lower left => X: {selectionMark.BoundingBox[3].X}, Y: {selectionMark.BoundingBox[3].Y}");
    }
}
```
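
Beyond printing each mark, a caller will often want a summary of a page. The following is a minimal sketch, not part of this PR, that counts the marks reported as selected. It assumes the `Azure.AI.FormRecognizer` package at the version introduced here (3.1.0-beta.1) and uses only `FormPage.SelectionMarks`, `FormSelectionMark.State`, and `FormSelectionMark.Confidence` from the API listing in this diff. The `SelectionMarkSummary` class, the `CountSelected` method, and the `minConfidence` default of 0.5 are hypothetical and chosen only for illustration.

```C#
using System.Linq;
using Azure.AI.FormRecognizer.Models;

// Hypothetical helper for illustration; not part of the client library.
public static class SelectionMarkSummary
{
    // Counts the marks on a page that the service reported as selected,
    // skipping any whose confidence is below the caller's threshold.
    public static int CountSelected(FormPage page, float minConfidence = 0.5f)
    {
        return page.SelectionMarks.Count(mark =>
            mark.State == FormSelectionMarkState.Selected &&
            mark.Confidence >= minConfidence);
    }
}
```

It could be called inside the `foreach (FormPage page in formPages)` loop above, for example `int selected = SelectionMarkSummary.CountSelected(page);`.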

@@ -88,6 +88,7 @@ public readonly partial struct FieldValue
public System.DateTime AsDate() { throw null; }
public System.Collections.Generic.IReadOnlyDictionary<string, Azure.AI.FormRecognizer.Models.FormField> AsDictionary() { throw null; }
public float AsFloat() { throw null; }
public Azure.AI.FormRecognizer.Models.FormSelectionMarkState AsFormSelectionMarkState() { throw null; }
public long AsInt64() { throw null; }
public System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormField> AsList() { throw null; }
public string AsPhoneNumber() { throw null; }
@@ -143,6 +144,7 @@ internal FormPage() { }
public float Height { get { throw null; } }
public System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormLine> Lines { get { throw null; } }
public int PageNumber { get { throw null; } }
public System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormSelectionMark> SelectionMarks { get { throw null; } }
public System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormTable> Tables { get { throw null; } }
public float TextAngle { get { throw null; } }
public Azure.AI.FormRecognizer.Models.LengthUnit Unit { get { throw null; } }
@@ -180,21 +182,36 @@ public static partial class FormRecognizerModelFactory
public static Azure.AI.FormRecognizer.Models.FieldValue FieldValueWithInt64ValueType(long value) { throw null; }
public static Azure.AI.FormRecognizer.Models.FieldValue FieldValueWithListValueType(System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormField> value) { throw null; }
public static Azure.AI.FormRecognizer.Models.FieldValue FieldValueWithPhoneNumberValueType(string value) { throw null; }
public static Azure.AI.FormRecognizer.Models.FieldValue FieldValueWithSelectionMarkValueType(Azure.AI.FormRecognizer.Models.FormSelectionMarkState value) { throw null; }
public static Azure.AI.FormRecognizer.Models.FieldValue FieldValueWithStringValueType(string value) { throw null; }
public static Azure.AI.FormRecognizer.Models.FieldValue FieldValueWithTimeValueType(System.TimeSpan value) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormField FormField(string name, Azure.AI.FormRecognizer.Models.FieldData labelData, Azure.AI.FormRecognizer.Models.FieldData valueData, Azure.AI.FormRecognizer.Models.FieldValue value, float confidence) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormLine FormLine(Azure.AI.FormRecognizer.Models.FieldBoundingBox boundingBox, int pageNumber, string text, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormWord> words) { throw null; }
[System.ComponentModel.EditorBrowsableAttribute(System.ComponentModel.EditorBrowsableState.Never)]
public static Azure.AI.FormRecognizer.Models.FormPage FormPage(int pageNumber, float width, float height, float textAngle, Azure.AI.FormRecognizer.Models.LengthUnit unit, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormLine> lines, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormTable> tables) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormPage FormPage(int pageNumber, float width, float height, float textAngle, Azure.AI.FormRecognizer.Models.LengthUnit unit, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormLine> lines, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormTable> tables, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormSelectionMark> selectionMarks) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormPageCollection FormPageCollection(System.Collections.Generic.IList<Azure.AI.FormRecognizer.Models.FormPage> list) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormPageRange FormPageRange(int firstPageNumber, int lastPageNumber) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormRecognizerError FormRecognizerError(string errorCode, string message) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormSelectionMark FormSelectionMark(Azure.AI.FormRecognizer.Models.FieldBoundingBox boundingBox, int pageNumber, string text, float confidence, Azure.AI.FormRecognizer.Models.FormSelectionMarkState state) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormTable FormTable(int pageNumber, int columnCount, int rowCount, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormTableCell> cells) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormTableCell FormTableCell(Azure.AI.FormRecognizer.Models.FieldBoundingBox boundingBox, int pageNumber, string text, int columnIndex, int rowIndex, int columnSpan, int rowSpan, bool isHeader, bool isFooter, float confidence, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormElement> fieldElements) { throw null; }
public static Azure.AI.FormRecognizer.Models.FormWord FormWord(Azure.AI.FormRecognizer.Models.FieldBoundingBox boundingBox, int pageNumber, string text, float confidence) { throw null; }
public static Azure.AI.FormRecognizer.Models.RecognizedForm RecognizedForm(string formType, Azure.AI.FormRecognizer.Models.FormPageRange pageRange, System.Collections.Generic.IReadOnlyDictionary<string, Azure.AI.FormRecognizer.Models.FormField> fields, System.Collections.Generic.IReadOnlyList<Azure.AI.FormRecognizer.Models.FormPage> pages) { throw null; }
public static Azure.AI.FormRecognizer.Models.RecognizedFormCollection RecognizedFormCollection(System.Collections.Generic.IList<Azure.AI.FormRecognizer.Models.RecognizedForm> list) { throw null; }
public static Azure.AI.FormRecognizer.Training.TrainingDocumentInfo TrainingDocumentInfo(string name, int pageCount, System.Collections.Generic.IEnumerable<Azure.AI.FormRecognizer.Models.FormRecognizerError> errors, Azure.AI.FormRecognizer.Training.TrainingStatus status) { throw null; }
}
public partial class FormSelectionMark : Azure.AI.FormRecognizer.Models.FormElement
{
internal FormSelectionMark() { }
public float Confidence { get { throw null; } }
public Azure.AI.FormRecognizer.Models.FormSelectionMarkState State { get { throw null; } }
}
public enum FormSelectionMarkState
{
Selected = 0,
Unselected = 1,
}
public partial class FormTable
{
internal FormTable() { }
@@ -13,7 +13,7 @@ description: Samples for the Azure.AI.FormRecognizer client library
# Azure Form Recognizer client SDK Samples
Azure Cognitive Services Form Recognizer is a cloud service that uses machine learning to recognize form fields, text, and tables in form documents. It includes the following capabilities:

- Recognize form content - Recognize and extract tables, lines, and words in forms documents, without the need to train a model.
- Recognize form content - Recognize and extract tables, lines, words, and selection marks like radio buttons and check boxes in forms documents, without the need to train a model.
- Recognize custom forms - Recognize and extract form fields and other content from your custom forms, using models you trained with your own form types.
- Recognize receipts - Recognize and extract common fields from US receipts, using a pre-trained receipt model.

@@ -1,6 +1,6 @@
# Recognize form content

This sample demonstrates how to recognize tables, lines, and words in documents, without the need to train a model.
This sample demonstrates how to recognize tables, lines, words, and selection marks like radio buttons and check boxes in forms documents, without the need to train a model.

To get started you'll need a Cognitive Services resource or a Form Recognizer resource. See [README][README] for prerequisites and instructions.

@@ -42,6 +42,17 @@ foreach (FormPage page in formPages)
            Console.WriteLine($" Cell ({cell.RowIndex}, {cell.ColumnIndex}) contains text: '{cell.Text}'.");
        }
    }

    for (int i = 0; i < page.SelectionMarks.Count; i++)
    {
        FormSelectionMark selectionMark = page.SelectionMarks[i];
        Console.WriteLine($"Selection Mark {i} is {selectionMark.State.ToString()}.");
        Console.WriteLine(" Its bounding box is:");
        Console.WriteLine($" Upper left => X: {selectionMark.BoundingBox[0].X}, Y: {selectionMark.BoundingBox[0].Y}");
        Console.WriteLine($" Upper right => X: {selectionMark.BoundingBox[1].X}, Y: {selectionMark.BoundingBox[1].Y}");
        Console.WriteLine($" Lower right => X: {selectionMark.BoundingBox[2].X}, Y: {selectionMark.BoundingBox[2].Y}");
        Console.WriteLine($" Lower left => X: {selectionMark.BoundingBox[3].X}, Y: {selectionMark.BoundingBox[3].Y}");
    }
}
```

42 changes: 42 additions & 0 deletions sdk/formrecognizer/Azure.AI.FormRecognizer/src/FieldValue.cs
@@ -101,6 +101,17 @@ internal FieldValue(IReadOnlyDictionary<string, FormField> value)
ValueDictionary = value;
}

/// <summary>
/// Initializes a new instance of the <see cref="FieldValue"/> structure.
/// </summary>
/// <param name="value">The actual field value.</param>
internal FieldValue(FormSelectionMarkState value)
: this()
{
ValueType = FieldValueType.SelectionMark;
ValueSelectionMark = value;
}

/// <summary>
/// The data type of the field value.
/// </summary>
@@ -150,6 +161,12 @@ internal FieldValue(IReadOnlyDictionary<string, FormField> value)
/// </summary>
private IReadOnlyDictionary<string, FormField> ValueDictionary { get; }

/// <summary>
/// The <see cref="FormSelectionMarkState"/> value of this instance. Values are usually extracted from
/// <see cref="_fieldValue"/>, so this property is exclusively used for mocking.
/// </summary>
private FormSelectionMarkState ValueSelectionMark { get; }

/// <summary>
/// Gets the value of the field as a <see cref="string"/>.
/// </summary>
@@ -349,5 +366,30 @@ public IReadOnlyDictionary<string, FormField> AsDictionary()

return fieldDictionary;
}

/// <summary>
/// Gets the value of the field as a <see cref="FormSelectionMarkState"/>.
/// </summary>
/// <returns>The value of the field converted to <see cref="FormSelectionMarkState"/>.</returns>
/// <exception cref="InvalidOperationException">Thrown when <see cref="ValueType"/> is not <see cref="FieldValueType.SelectionMark"/>.</exception>
public FormSelectionMarkState AsFormSelectionMarkState()
{
if (ValueType != FieldValueType.SelectionMark)
{
throw new InvalidOperationException($"Cannot get field as SelectionMark. Field value's type is {ValueType}.");
}

if (_fieldValue == null)
{
return ValueSelectionMark;
}

if (!_fieldValue.ValueSelectionMark.HasValue)
{
throw new InvalidOperationException("Field value is null.");
}

return _fieldValue.ValueSelectionMark.Value;
}
}
}
@@ -1,6 +1,8 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

using System;
using System.Collections.Generic;
using Azure.Core;

namespace Azure.AI.FormRecognizer.Models
@@ -11,11 +13,57 @@ internal partial class FieldValue_internal
/// <summary>Integer value.</summary>
public long? ValueInteger { get; }

/// <summary> Selection mark value. </summary>
public FormSelectionMarkState? ValueSelectionMark { get; }

internal FieldValue_internal(string value)
{
Type = FieldValueType.String;
ValueString = value;
Text = value;
}

/// <summary> Initializes a new instance of FieldValue_internal. </summary>
/// <param name="type"> Type of field value. </param>
/// <param name="valueString"> String value. </param>
/// <param name="valueDate"> Date value. </param>
/// <param name="valueTime"> Time value. </param>
/// <param name="valuePhoneNumber"> Phone number value. </param>
/// <param name="valueNumber"> Floating point value. </param>
/// <param name="valueInteger"> Integer value. </param>
/// <param name="valueArray"> Array of field values. </param>
/// <param name="valueObject"> Dictionary of named field values. </param>
/// <param name="valueSelectionMark"> Selection mark value. </param>
/// <param name="text"> Text content of the extracted field. </param>
/// <param name="boundingBox"> Bounding box of the field value, if appropriate. </param>
/// <param name="confidence"> Confidence score. </param>
/// <param name="elements"> When includeTextDetails is set to true, a list of references to the text elements constituting this field. </param>
/// <param name="page"> The 1-based page number in the input document. </param>
internal FieldValue_internal(FieldValueType type, string valueString, DateTimeOffset? valueDate, TimeSpan? valueTime, string valuePhoneNumber, float? valueNumber, long? valueInteger, IReadOnlyList<FieldValue_internal> valueArray, IReadOnlyDictionary<string, FieldValue_internal> valueObject, FormSelectionMarkState? valueSelectionMark, string text, IReadOnlyList<float> boundingBox, float? confidence, IReadOnlyList<string> elements, int? page)
{
Type = type;
ValueString = valueString;
ValueDate = valueDate;
ValueTime = valueTime;
ValuePhoneNumber = valuePhoneNumber;
ValueNumber = valueNumber;
ValueInteger = valueInteger;
ValueArray = valueArray;
ValueObject = valueObject;
ValueSelectionMark = valueSelectionMark;

BoundingBox = boundingBox;
Confidence = confidence;
Elements = elements;
Page = page;

if (Type == FieldValueType.SelectionMark)
{
ValueSelectionMark = FormSelectionMarkStateExtensions.ToFormSelectionMarkState(text);
Text = ValueSelectionMark.ToString();
}
else
{
Text = text;
}
}
}
}
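
The constructor above derives the selection mark state from the field's text through `FormSelectionMarkStateExtensions.ToFormSelectionMarkState`, whose body is not part of this diff. A minimal sketch of what such a mapping could look like follows, assuming the service reports the state as the strings `selected` and `unselected`. The enum is a local stand-in mirroring the one in the API listing so the sketch compiles on its own, and `SelectionMarkStateParser` and `Parse` are hypothetical names, not the library's implementation.

```C#
using System;

// Local stand-in mirroring the enum added in this PR.
public enum FormSelectionMarkState
{
    Selected = 0,
    Unselected = 1,
}

// Hypothetical helper; the real extension method is not shown in the diff.
public static class SelectionMarkStateParser
{
    // Maps the assumed service strings to the enum, ignoring case.
    public static FormSelectionMarkState Parse(string value)
    {
        if (string.Equals(value, "selected", StringComparison.OrdinalIgnoreCase))
        {
            return FormSelectionMarkState.Selected;
        }

        if (string.Equals(value, "unselected", StringComparison.OrdinalIgnoreCase))
        {
            return FormSelectionMarkState.Unselected;
        }

        // Fail loudly on anything else (including null) rather than guessing a state.
        throw new ArgumentOutOfRangeException(nameof(value), value, "Unknown selection mark state.");
    }
}
```

Throwing on unknown input, rather than defaulting to `Unselected`, keeps an unexpected service value from being silently read as an unchecked box.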