Commit
refactor: Updates to Document AI Python Samples (#323)
* Updated OCR Quickstart Sample
  - Added Types to Request Creation
  - Added ClientOptions object for type safety
  - Simplified output code to print full text instead of paragraphs
  - Updated Link to Document Object v1 specification
  - Added mime_type as variable
* Updates to process_document_sample
  - Same Updates as Quickstart Sample
  - Moved Imports to top of quickstart file
* Updated Batch Process Example
  - Added typing
  - Use BatchProcessMetadata instead of Operation ID to get output files from GCS
  - Added MimeType specification
  - Added Alternatives for Directory Processing & Callbacks
  - Minor Changes to process_document/quickstart for unified style with batch
* Updates to OCR Response Handling Sample
  - Separated Online Processing Request into function
  - Added explicit typing for documentai objects
  - Converted `.format()` to f-string
  - Simplified `layout_to_text()`
* Updated Form Processing Sample
  - Updated to `v1` API
  - Separated processing request into function
  - Added explicit typing for Document AI Types
  - Separated `print_table_rows()` into function for modularity
  - Fixed Spelling error "Collumns"
* Updated Specialized Processor Sample
  - Added Extraction of Properties (Nested Entities) and Normalized Values
* Updates to Splitter/Classifier Sample
  - Updated to `v1` API
  - Changed Page Number Printout (Splitter Classifiers now output all page numbers within a subdocument, instead of just the first and last)
* Updated Test for process_document_sample
  - Added mime_type
* Updated Document Quality Processor Sample
  - Updated to `v1` API
  - Moved API Call to separate function
  - Updated `.format()` to f-strings
  - Added Handling for Multiple Page Numbers per entity
  - Reused `page_refs_to_string()` from splitter/classifier example
  - Added `mime_type` as parameter
* Updated Batch Processing Directory sample variable from CR comments
* Added Sample Input PDF Files & Output JSON Files
* Fixed Spelling Error in Invoice Parser Output filenames
* Addressed Code Review Comments
  - Changed Copyright Year back to 2020
  - Changed "property" variable to "prop" to avoid naming conflicts
* Updated Client Library Requirements versions
* Addressed Unit Test Failures
* Re-added google-api-core to requirements.txt
* Update samples/snippets/process_document_form_sample.py
  Co-authored-by: Anthonios Partheniou <partheniou@google.com>
* Update samples/snippets/requirements.txt
  Co-authored-by: Anthonios Partheniou <partheniou@google.com>
* Fixed "entirity" spelling error

Co-authored-by: Gal Zahavi <38544478+galz10@users.noreply.github.com>
Co-authored-by: Anthonios Partheniou <partheniou@google.com>
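
The commit message names two helpers, `layout_to_text()` and `page_refs_to_string()`, without showing them. The following is a minimal sketch of what such helpers might look like, not the code from this commit. The dataclasses (`TextSegment`, `TextAnchor`, `Layout`, `PageRef`) are hypothetical stand-ins for the corresponding Document AI `v1` message types so that the example runs without the client library, and the assumption that page references are zero-based is stated in the comments.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


# Hypothetical stand-ins for Document AI v1 message types, so this sketch
# runs without google-cloud-documentai installed.
@dataclass
class TextSegment:
    start_index: int = 0
    end_index: int = 0


@dataclass
class TextAnchor:
    text_segments: List[TextSegment] = field(default_factory=list)


@dataclass
class Layout:
    text_anchor: TextAnchor = field(default_factory=TextAnchor)


@dataclass
class PageRef:
    page: int = 0  # assumed to be zero-based


def layout_to_text(layout: Layout, text: str) -> str:
    """Concatenate the slices of the full document text that a layout points to."""
    return "".join(
        text[int(segment.start_index):int(segment.end_index)]
        for segment in layout.text_anchor.text_segments
    )


def page_refs_to_string(page_refs: Sequence[PageRef]) -> str:
    """Describe every referenced page, not just the first and last."""
    # Convert assumed zero-based page indices to one-based numbers for display.
    pages = [str(int(page_ref.page) + 1) for page_ref in page_refs]
    if len(pages) == 1:
        return f"page {pages[0]}"
    return f"pages {', '.join(pages)}"
```

With the real client library, the same functions would presumably take `documentai.Document.Page.Layout` and `documentai.Document.PageAnchor.PageRef` objects in place of the stand-ins.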
1 parent 35b59e6, commit bfe4ffc
Showing 58 changed files with 462,989 additions and 392 deletions.