[formrecognizer] composed model (#14029)
* initial model compose

* fix tests for composed model and repr

* mypy

* fix comments

* add fixmes

* set form_type on CustomFormSubmodel correctly

* fix training tests

* remove error map/cont token from compose model

* add composed model tests

* adding more tests for new fields

* updating readmes/changelog

* small fix

* update sample to be more consistent with .NET

* add api version table to readme

* add tests requested in feedback

* raise better error for multiapi

* some docs feedback

* testing feedback
kristapratico authored Oct 8, 2020
1 parent de390a6 commit 2fed772
Showing 48 changed files with 4,297 additions and 1,399 deletions.
9 changes: 9 additions & 0 deletions sdk/formrecognizer/azure-ai-formrecognizer/CHANGELOG.md
@@ -2,10 +2,19 @@

 ## 3.1.0b1 (Unreleased)

+This version of the SDK defaults to the latest supported API version, which currently is v2.1-preview.
+
 **New features**

 - Recognize receipt methods now take keyword argument `locale` to optionally indicate the locale of the receipt for
 improved results
+- Added ability to create a composed model from the `FormTrainingClient` by calling method `begin_create_composed_model()`
+- Added the properties `display_name` and `properties` to types `CustomFormModel` and `CustomFormModelInfo`
+- Added keyword argument `display_name` to `begin_training()` and `begin_create_composed_model()`
+- Added model type `CustomFormModelProperties` that includes information like if a model is a composed model
+- Added property `model_id` to `CustomFormSubmodel` and `TrainingDocumentInfo`
+- Added properties `model_id` and `form_type_confidence` to `RecognizedForm`
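
The call shape implied by these changelog entries can be sketched without an Azure resource. This is only an illustration: `FakeTrainingClient`, `FakePoller`, and the namedtuples are hypothetical stand-ins, not SDK classes. Only the method name `begin_create_composed_model()`, the `display_name` keyword, and the attribute names are taken from the entries above; the model IDs and form types are made up.

```python
from collections import namedtuple

# Hypothetical stand-ins for the SDK model types; only the attribute names
# (display_name, properties.is_composed_model, submodels, model_id, form_type)
# come from the changelog entries.
Properties = namedtuple("Properties", ["is_composed_model"])
Submodel = namedtuple("Submodel", ["model_id", "form_type"])
Model = namedtuple("Model", ["model_id", "display_name", "properties", "submodels"])


class FakePoller(object):
    """Mimics a poller: result() returns the finished model immediately."""

    def __init__(self, model):
        self._model = model

    def result(self):
        return self._model


class FakeTrainingClient(object):
    """Stand-in for FormTrainingClient; performs no network calls."""

    def begin_create_composed_model(self, model_ids, **kwargs):
        display_name = kwargs.pop("display_name", None)
        # One submodel per input model ID; the form_type values are invented.
        submodels = [Submodel(model_id=m, form_type="custom:" + m) for m in model_ids]
        return FakePoller(Model("composed-id", display_name, Properties(True), submodels))


client = FakeTrainingClient()
poller = client.begin_create_composed_model(
    ["model-a", "model-b"], display_name="my composed model"
)
model = poller.result()

print("Display name: {}".format(model.display_name))
print("Is composed model?: {}".format(model.properties.is_composed_model))
for submodel in model.submodels:
    print("Submodel {} has form type {}".format(submodel.model_id, submodel.form_type))
```

With the real client, the same calls would go through a `FormTrainingClient` instance and the poller would wait on the service.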


## 3.0.0 (2020-08-20)

30 changes: 24 additions & 6 deletions sdk/formrecognizer/azure-ai-formrecognizer/README.md
@@ -20,10 +20,18 @@ from form documents. It includes the following main functionalities:
 Install the Azure Form Recognizer client library for Python with [pip][pip]:

 ```bash
-pip install azure-ai-formrecognizer
+pip install azure-ai-formrecognizer --pre
 ```

-> Note: This version of the client library supports the v2.0 version of the Form Recognizer service
+> Note: This version of the client library defaults to the v2.1-preview version of the service
+
+This table shows the relationship between SDK versions and supported API versions of the service
+
+|SDK version|Supported API version of service
+|-|-
+|3.0.0 - Latest GA release (can be installed by removing the `--pre` flag)| 2.0
+|3.1.0b1 - Latest release (beta)| 2.0, 2.1-preview
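
The table above can be read as a simple lookup. A minimal sketch, assuming an illustrative dict and helper (`SUPPORTED_API_VERSIONS` and `supports` are not part of the SDK):

```python
# Mirrors the SDK/API version table above; both names are illustrative only.
SUPPORTED_API_VERSIONS = {
    "3.0.0": ("2.0",),
    "3.1.0b1": ("2.0", "2.1-preview"),
}


def supports(sdk_version, api_version):
    """Return True if the given SDK release can target the given service API version."""
    return api_version in SUPPORTED_API_VERSIONS.get(sdk_version, ())


print(supports("3.1.0b1", "2.1-preview"))
print(supports("3.0.0", "2.1-preview"))
```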


#### Create a Form Recognizer resource
Form Recognizer supports both [multi-service and single-service access][multi_and_single_service].
@@ -135,6 +143,7 @@ Sample code snippets are provided to illustrate using a FormRecognizerClient [he
 - Training custom models with labels to recognize specific fields and values you specify by labeling your custom forms. A `CustomFormModel` is returned indicating the fields the model will extract, as well as the estimated accuracy for each field. See the [service documentation][fr-train-with-labels] for a more detailed explanation.
 - Managing models created in your account.
 - Copying a custom model from one Form Recognizer resource to another.
+- Creating a composed model from a collection of existing trained models with labels.

 Please note that models can also be trained using a graphical user interface such as the [Form Recognizer Labeling Tool][fr-labeling-tool].

@@ -183,6 +192,8 @@ result = poller.result()

 for recognized_form in result:
     print("Form type: {}".format(recognized_form.form_type))
+    print("Form type confidence: {}".format(recognized_form.form_type_confidence))
+    print("Form was analyzed using model with ID: {}".format(recognized_form.model_id))
     for name, field in recognized_form.fields.items():
         print("Field '{}' has label '{}' with value '{}' and a confidence score of {}".format(
             name,
@@ -275,21 +286,23 @@ form_training_client = FormTrainingClient(endpoint, credential)

 container_sas_url = "<container-sas-url>" # training documents uploaded to blob storage
 poller = form_training_client.begin_training(
-    container_sas_url, use_training_labels=False
+    container_sas_url, use_training_labels=False, display_name="my first model"
 )
 model = poller.result()

 # Custom model information
 print("Model ID: {}".format(model.model_id))
+print("Display name: {}".format(model.display_name))
+print("Is composed model?: {}".format(model.properties.is_composed_model))
 print("Status: {}".format(model.status))
 print("Training started on: {}".format(model.training_started_on))
 print("Training completed on: {}".format(model.training_completed_on))

 print("\nRecognized fields:")
 for submodel in model.submodels:
     print(
-        "The submodel with form type '{}' has recognized the following fields: {}".format(
-            submodel.form_type,
+        "The submodel with form type '{}' and model ID '{}' has recognized the following fields: {}".format(
+            submodel.form_type, submodel.model_id,
             ", ".join(
                 [
                     field.label if field.label else name
@@ -336,6 +349,8 @@ model_id = "<model_id from the Train a Model sample>"

 custom_model = form_training_client.get_custom_model(model_id=model_id)
 print("Model ID: {}".format(custom_model.model_id))
+print("Display name: {}".format(custom_model.display_name))
+print("Is composed model?: {}".format(custom_model.properties.is_composed_model))
 print("Status: {}".format(custom_model.status))
 print("Training started on: {}".format(custom_model.training_started_on))
 print("Training completed on: {}".format(custom_model.training_completed_on))
@@ -388,6 +403,7 @@ These code samples show common scenario operations with the Azure Form Recognize
 * Train a model with labels: [sample_train_model_with_labels.py][sample_train_model_with_labels]
 * Manage custom models: [sample_manage_custom_models.py][sample_manage_custom_models]
 * Copy a model between Form Recognizer resources: [sample_copy_model.py][sample_copy_model]
+* Create a composed model from a collection of models trained with labels: [sample_create_composed_model.py][sample_create_composed_model]

 ### Async APIs
 This library also includes a complete async API supported on Python 3.5+. To use it, you must
@@ -403,7 +419,7 @@ are found under the `azure.ai.formrecognizer.aio` namespace.
 * Train a model with labels: [sample_train_model_with_labels_async.py][sample_train_model_with_labels_async]
 * Manage custom models: [sample_manage_custom_models_async.py][sample_manage_custom_models_async]
 * Copy a model between Form Recognizer resources: [sample_copy_model_async.py][sample_copy_model_async]
-
+* Create a composed model from a collection of models trained with labels: [sample_create_composed_model_async.py][sample_create_composed_model_async]

 ### Additional documentation

@@ -475,3 +491,5 @@ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_con
 [sample_train_model_without_labels_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/sample_train_model_without_labels_async.py
 [sample_copy_model]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/sample_copy_model.py
 [sample_copy_model_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/sample_copy_model_async.py
+[sample_create_composed_model]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples
+[sample_create_composed_model_async]: https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/formrecognizer/azure-ai-formrecognizer/samples/async_samples/
Original file line number Diff line number Diff line change
@@ -31,7 +31,8 @@
     CustomFormModel,
     CustomFormSubmodel,
     CustomFormModelField,
-    FieldValueType
+    FieldValueType,
+    CustomFormModelProperties,
 )
 from ._api_versions import FormRecognizerApiVersion

@@ -62,7 +63,8 @@
     'CustomFormModel',
     'CustomFormSubmodel',
     'CustomFormModelField',
-    'FieldValueType'
+    'FieldValueType',
+    'CustomFormModelProperties',
 ]

 __VERSION__ = VERSION
Original file line number Diff line number Diff line change
@@ -11,6 +11,7 @@
     Any,
     Dict,
     Union,
+    List,
     TYPE_CHECKING,
 )
 from azure.core.tracing.decorator import distributed_trace
@@ -96,6 +97,7 @@ def begin_training(self, training_files_url, use_training_labels, **kwargs):
         :keyword bool include_subfolders: A flag to indicate if subfolders within the set of prefix folders
             will also need to be included when searching for content to be preprocessed. Not supported if
             training with labels.
+        :keyword str display_name: A display name for your model.
         :keyword int polling_interval: Waiting time between two polls for LRO operations
             if no Retry-After header is present. Defaults to 5 seconds.
         :keyword str continuation_token: A continuation token to restart a poller from a saved state.
@@ -105,6 +107,8 @@ def begin_training(self, training_files_url, use_training_labels, **kwargs):
         :raises ~azure.core.exceptions.HttpResponseError:
             Note that if the training fails, the exception is raised, but a model with an
             "invalid" status is still created. You can delete this model by calling :func:`~delete_model()`
+        .. versionadded:: v2.1-preview
+            The *display_name* keyword argument

         .. admonition:: Example:
@@ -118,13 +122,16 @@ def begin_training(self, training_files_url, use_training_labels, **kwargs):

         def callback_v2_0(raw_response):
             model = self._deserialize(self._generated_models.Model, raw_response)
-            return CustomFormModel._from_generated(model)
+            return CustomFormModel._from_generated(model, api_version=self.api_version)

         def callback_v2_1(raw_response, _, headers): # pylint: disable=unused-argument
             model = self._deserialize(self._generated_models.Model, raw_response)
-            return CustomFormModel._from_generated(model)
+            return CustomFormModel._from_generated(model, api_version=self.api_version)

         cls = kwargs.pop("cls", None)
+        display_name = kwargs.pop("display_name", None)
+        if display_name and self.api_version == "2.0":
+            raise ValueError("'display_name' is only available for API version V2_1_PREVIEW and up")
         continuation_token = kwargs.pop("continuation_token", None)
         polling_interval = kwargs.pop("polling_interval", self._client._config.polling_interval)
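
The version check added in this hunk can be isolated as a small standalone sketch. The helper name `pop_display_name` is hypothetical, and the version string `"2.1-preview"` is used purely for illustration; only the condition and the error message mirror the diff.

```python
def pop_display_name(kwargs, api_version):
    """Pop 'display_name' from kwargs, rejecting it when targeting service version 2.0."""
    display_name = kwargs.pop("display_name", None)
    if display_name and api_version == "2.0":
        raise ValueError("'display_name' is only available for API version V2_1_PREVIEW and up")
    return display_name


# On a newer API version the keyword is consumed and the other options are left alone.
options = {"display_name": "my first model", "polling_interval": 5}
name = pop_display_name(options, "2.1-preview")

# On 2.0 the keyword is rejected before any request would be sent.
try:
    pop_display_name({"display_name": "my first model"}, "2.0")
    rejected = False
except ValueError:
    rejected = True

print(name, options, rejected)
```

Checking before the request is built means the caller gets a client-side `ValueError` instead of a less specific service error.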

@@ -169,6 +176,7 @@ def callback_v2_1(raw_response, _, headers): # pylint: disable=unused-argument
                     prefix=kwargs.pop("prefix", ""),
                     include_sub_folders=kwargs.pop("include_subfolders", False),
                 ),
+                model_name=display_name
             ),
             cls=deserialization_callback,
             continuation_token=continuation_token,
@@ -273,7 +281,7 @@ def get_custom_model(self, model_id, **kwargs):
             raise ValueError("model_id cannot be None or empty.")

         response = self._client.get_custom_model(model_id=model_id, include_keys=True, **kwargs)
-        return CustomFormModel._from_generated(response)
+        return CustomFormModel._from_generated(response, api_version=self.api_version)

     @distributed_trace
     def get_copy_authorization(self, resource_id, resource_region, **kwargs):
@@ -373,6 +381,55 @@ def _copy_callback(raw_response, _, headers): # pylint: disable=unused-argument
             **kwargs
         )

+    @distributed_trace
+    def begin_create_composed_model(
+        self,
+        model_ids,
+        **kwargs
+    ):
+        # type: (List[str], Any) -> LROPoller[CustomFormModel]
+        """Creates a composed model from a collection of existing trained models with labels.
+
+        :param list[str] model_ids: List of model IDs to use in the composed model.
+        :keyword str display_name: Optional model display name.
+        :keyword int polling_interval: Default waiting time between two polls for LRO operations if
+            no Retry-After header is present.
+        :keyword str continuation_token: A continuation token to restart a poller from a saved state.
+        :return: An instance of an LROPoller. Call `result()` on the poller
+            object to return a :class:`~azure.ai.formrecognizer.CustomFormModel`.
+        :rtype: ~azure.core.polling.LROPoller[~azure.ai.formrecognizer.CustomFormModel]
+        :raises ~azure.core.exceptions.HttpResponseError:
+
+        .. admonition:: Example:
+
+            .. literalinclude:: ../samples/sample_create_composed_model.py
+                :start-after: [START begin_create_composed_model]
+                :end-before: [END begin_create_composed_model]
+                :language: python
+                :dedent: 8
+                :caption: Create a composed model
+        """
+
+        def _compose_callback(raw_response, _, headers): # pylint: disable=unused-argument
+            model = self._deserialize(self._generated_models.Model, raw_response)
+            return CustomFormModel._from_generated_composed(model)
+
+        display_name = kwargs.pop("display_name", None)
+        polling_interval = kwargs.pop("polling_interval", self._client._config.polling_interval)
+        continuation_token = kwargs.pop("continuation_token", None)
+        try:
+            return self._client.begin_compose_custom_models_async(
+                {"model_ids": model_ids, "model_name": display_name},
+                cls=kwargs.pop("cls", _compose_callback),
+                polling=LROBasePolling(timeout=polling_interval, lro_algorithms=[TrainingPolling()], **kwargs),
+                continuation_token=continuation_token,
+                **kwargs
+            )
+        except ValueError:
+            raise ValueError(
+                "Method 'begin_create_composed_model' is only available for API version V2_1_PREVIEW and up"
+            )
+
     def get_form_recognizer_client(self, **kwargs):
         # type: (Any) -> FormRecognizerClient
         """Get an instance of a FormRecognizerClient from FormTrainingClient.
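
The `try`/`except ValueError` in `begin_create_composed_model` turns a generic error from the version-dispatching layer into a message that names the public method. A rough sketch of that pattern follows, assuming a hypothetical `FakeMultiApiClient` with a `begin_compose` method; the real generated client and its internal error text are not shown in this diff, so those parts are invented for illustration.

```python
class FakeMultiApiClient(object):
    """Hypothetical generated client that exposes operations per API version."""

    _OPERATIONS = {
        "2.0": {"train"},
        "2.1-preview": {"train", "compose"},
    }

    def __init__(self, api_version):
        self.api_version = api_version

    def begin_compose(self, body):
        if "compose" not in self._OPERATIONS[self.api_version]:
            # A generic error of the kind a dispatching layer might raise (assumed wording).
            raise ValueError("API version {} does not have operation 'compose'".format(self.api_version))
        return {"accepted": True, "body": body}


def begin_create_composed_model(client, model_ids, **kwargs):
    """Build the compose request and translate the generic error into a friendlier one."""
    display_name = kwargs.pop("display_name", None)
    try:
        return client.begin_compose({"model_ids": model_ids, "model_name": display_name})
    except ValueError:
        raise ValueError(
            "Method 'begin_create_composed_model' is only available for API version V2_1_PREVIEW and up"
        )


ok = begin_create_composed_model(FakeMultiApiClient("2.1-preview"), ["a", "b"], display_name="demo")

try:
    begin_create_composed_model(FakeMultiApiClient("2.0"), ["a", "b"])
    message = None
except ValueError as error:
    message = str(error)

print(ok["body"])
print(message)
```

One trade-off of catching `ValueError` broadly is that any other `ValueError` raised while building the request would be reported with the same version message.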
