I would like to get access to a raw OCR fragment #80
Comments
From the perspective of an OCR correction platform, I (the correction tool) would like to
So far, people have been using seeAlso to link from canvas to ALTO:
The Newspaper working group have some guidelines around this: https://www.slideshare.net/kestlund/newspapers-iiif-and-alto
This could also be modelled as a service.
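As a rough illustration of the seeAlso approach, the sketch below collects ALTO links from a Presentation 2.x canvas. The canvas dict, its example.org URLs, and the helper name `alto_links` are hypothetical. The format value `application/alto+xml` is borrowed from the annotation example in this thread; a real publisher may use a different format or a `profile` instead, so treat the filter as an assumption.

```python
def alto_links(canvas, alto_format="application/alto+xml"):
    """Return the URLs of seeAlso entries on a canvas that declare the ALTO format.

    `alto_links` is a hypothetical helper, not part of any IIIF library.
    """
    see_also = canvas.get("seeAlso", [])
    # seeAlso may hold a single value or a list of values.
    if not isinstance(see_also, list):
        see_also = [see_also]

    links = []
    for entry in see_also:
        # A bare URI string carries no format, so it cannot be identified as ALTO here.
        if not isinstance(entry, dict):
            continue
        if entry.get("format") == alto_format and "@id" in entry:
            links.append(entry["@id"])
    return links


# Hypothetical canvas, for illustration only.
canvas = {
    "@id": "http://example.org/iiif/book1/canvas/p1",
    "@type": "sc:Canvas",
    "seeAlso": {
        "@id": "http://example.org/iiif/book1/res/alto.xml",
        "format": "application/alto+xml",
    },
}

print(alto_links(canvas))
```

This only gets a client to the whole ALTO file for the canvas; it says nothing about which element inside that file corresponds to a given text annotation.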
My concern is that getting from the text annotation to the right element in the OCR file is not a straightforward process (using the geometrical information?):
{
"@id":"http://dams.llgc.org.uk/iiif/3320863/annotation/5014243419640",
"@type":"oa:Annotation",
"motivation":"sc:painting",
"resource":
{
"@type":"cnt:ContentAsText",
"format":"text/plain",
"chars":"NEWS."
},
"on":"http://dams.llgc.org.uk/iiif/3320860/canvas/3320863#xywh=5014,2434,196,40"
},
I suppose that for this specific use case (getting access to the XML), we need another annotation list that references external XML segments (http://iiif.io/api/presentation/2.1/#segments):
{
"@context": "http://iiif.io/api/presentation/2/context.json",
"@id": "http://example.org/iiif/book1/annotation/anno1",
"@type": "oa:Annotation",
"motivation": "sc:painting",
"resource":{
"@id": "http://example.org/iiif/book1/res/alto.xml#xpointer(//String[@id='Str_001'])",
"@type": "dctypes:Text",
"format": "application/alto+xml"
},
"on": "http://example.org/iiif/book1/canvas/p1#xywh=100,100,500,300"
}
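To make the idea concrete, here is a minimal sketch of how a correction tool might resolve such a resource `@id` to the OCR element it points at. It is not a general XPointer implementation: it only understands the single pattern used in the example above, `xpointer(//Name[@attr='value'])`. The helper names and the ALTO snippet are made up for illustration. Since ALTO spells the attribute `ID` in upper case and puts its elements in a namespace, the sketch compares local names and treats attribute names case-insensitively, which is an assumption rather than XPointer semantics.

```python
import re
import xml.etree.ElementTree as ET

# Matches only the pattern from the example: xpointer(//Name[@attr='value'])
XPOINTER = re.compile(r"^xpointer\(//(\w+)\[@(\w+)='([^']*)'\]\)$")


def local_name(qualified):
    """Strip an ElementTree '{namespace}' prefix, if present."""
    return qualified.rsplit("}", 1)[-1]


def resolve_segment(resource_id, alto_xml):
    """Return (document URL, matching element or None) for an annotation resource @id.

    `resolve_segment` is a hypothetical helper. `alto_xml` is the already
    fetched content of the document URL; fetching is left to the caller.
    """
    url, _, fragment = resource_id.partition("#")
    match = XPOINTER.match(fragment)
    if match is None:
        raise ValueError("unsupported fragment: %r" % fragment)
    name, attr, value = match.groups()

    root = ET.fromstring(alto_xml)
    for element in root.iter():
        if local_name(element.tag) != name:
            continue
        for key, val in element.attrib.items():
            # Assumption: ignore namespace and case of the attribute name (id vs ID).
            if local_name(key).lower() == attr.lower() and val == value:
                return url, element
    return url, None


# Made-up ALTO fragment, for illustration only.
ALTO_SAMPLE = """
<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String ID="Str_001" CONTENT="NEWS." HPOS="100" VPOS="100" WIDTH="500" HEIGHT="300"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>
""".strip()

resource_id = "http://example.org/iiif/book1/res/alto.xml#xpointer(//String[@id='Str_001'])"
url, element = resolve_segment(resource_id, ALTO_SAMPLE)
print(url)
print(element.attrib if element is not None else None)
```

With a reference like this, the tool no longer has to guess the OCR element from the `xywh` region of the canvas; the identifier in the fragment selects it directly.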
Description
Some use cases need to get access to information stored in the OCR format:
For these use cases, getting access to the raw OCR objects (or reference to the...) from the IIIF annotation layer would be useful.