Add OCR Service #389
Comments
Do you have some planning/sketching on this? Want to use something from Aaron?
I've got a skeleton that's wired up to the Gradle build machinery that I'll clean up and provide after I clear out stuff for #387.
We'll talk about its mini API first, before going anywhere with it.
Here is what I am using to process TIFFs to hOCR right now.
Yeah, pretty much. For a web service we should stream to stdin and receive from stdout, tacking the args from a header on at the end.
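A rough sketch of that stdin/stdout idea, in case it helps the API discussion. It is not taken from the skeleton mentioned above: the function names are made up for illustration, and it assumes a Tesseract build that accepts `stdin` as the input name and `stdout` as the output base. The header value is split shell-style and appended after those two positional arguments.

```python
import shlex
import subprocess

# Header name as proposed in the issue description.
ARGS_HEADER = "X-Islandora-Ocr-Args"


def build_command(header_value, binary="tesseract"):
    """Return the argv list: read stdin, write stdout, header args tacked on last.

    build_command is a hypothetical helper, not part of any existing code.
    """
    extra = shlex.split(header_value or "")
    return [binary, "stdin", "stdout"] + extra


def run_ocr(tiff_bytes, header_value=""):
    """Pipe TIFF bytes through Tesseract and return whatever it writes to stdout."""
    result = subprocess.run(
        build_command(header_value),
        input=tiff_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing an argv list instead of a shell string keeps the header value away from a shell, but a caller can still supply arbitrary Tesseract options, so a real service would probably want to restrict which arguments it accepts.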
This might be worth investigating:
Depends on #387
Add a simple OCR web service that executes Tesseract on any TIFF that gets POSTed to it. It should respect a header such as X-Islandora-Ocr-Args so you can pass in command line arguments.
Service will consume image/tiff.
Service will produce text/plain, text/html, and application/pdf (because that's what Tesseract produces).
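One way the three output types could be selected, sketched under assumptions the issue does not state: the client picks the format with an `Accept` header, and each media type maps to one of Tesseract's stock config names (`hocr` for HTML output, `pdf` for PDF, nothing extra for plain text). `CONFIG_FOR_TYPE` and `config_for_accept` are hypothetical names, and q-values are ignored for brevity.

```python
# Hypothetical mapping from response media type to a Tesseract config name.
CONFIG_FOR_TYPE = {
    "text/plain": None,        # plain text is Tesseract's default output
    "text/html": "hocr",       # hOCR is HTML markup
    "application/pdf": "pdf",
}


def config_for_accept(accept):
    """Return (media_type, config) for the first listed type the service produces.

    Returns None when nothing in the header matches, in which case a service
    would likely answer 406 Not Acceptable.
    """
    for part in (accept or "text/plain").split(","):
        # Drop parameters such as ";q=0.9" and normalise case.
        media_type = part.split(";")[0].strip().lower()
        if media_type == "*/*":
            return "text/plain", None
        if media_type in CONFIG_FOR_TYPE:
            return media_type, CONFIG_FOR_TYPE[media_type]
    return None
```

The returned config name, when present, would be appended to the Tesseract command line along with any arguments from X-Islandora-Ocr-Args.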