Name: Emma Frothingham
Email: emma.frothingham@stanford.edu
Comment re: druid:vq627fg9932
Noticing that this and some other transcripts are having some OCR issues. When you do a text search (I've been
using "interview" since I know it's in the front matter), it will take you to the correct page, but the text
that's highlighted is incorrect.
Also noticing this on the following DRUIDs: qk681gt0202, cf701vv6058, hd885gx4798, fp554yc9826, ky320sk0660,
rm620vw2642, cy927bt6518, cb635cj8418, zy935dw5016, tv486ws8292, st067nr2181, yq520nc5196, vc245bc2056,
fr376sh7526, cc039qv8268, wg274vq1218, mt661sw7493, wd772fk6025
In a discussion with @anarchivist, they indicate this seems to be either a bug in ABBYY or a bug in content search. For druid:vq627fg9932, the ABBYY-generated page size in the XML differs from the actual page size recorded in the technicalMetadata, which explains why the hit highlighting is drawn in the wrong place.
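To illustrate why a page-size mismatch shifts the highlight, here is a minimal sketch. It assumes the word coordinates in the OCR XML are expressed relative to the page size the OCR engine recorded, and that the viewer draws them on an image with different pixel dimensions. The `Box` type, the `scale_box` helper, and the numbers in the usage note are hypothetical, not taken from the affected items.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """A word bounding box: top-left corner plus width and height."""
    x: float
    y: float
    w: float
    h: float


def scale_box(box: Box,
              ocr_page_size: Tuple[float, float],
              image_size: Tuple[float, float]) -> Box:
    """Map a box from OCR page coordinates to image pixel coordinates.

    ocr_page_size and image_size are (width, height) pairs. If the two
    sizes are equal, the box is returned unchanged.
    """
    sx = image_size[0] / ocr_page_size[0]  # horizontal scale factor
    sy = image_size[1] / ocr_page_size[1]  # vertical scale factor
    return Box(box.x * sx, box.y * sy, box.w * sx, box.h * sy)
```

For example, with a hypothetical OCR page size of 2550 x 3300 and a delivered image of 1275 x 1650, `scale_box(Box(1000, 500, 200, 40), (2550, 3300), (1275, 1650))` gives a box at (500, 250) of size 100 x 20. Drawing the unscaled box on the smaller image would put the highlight well away from the matched word, which is consistent with the behavior reported here.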
@calavano do you have any thoughts here? Mark says this is not about ALTO 3.1.
These items were accessioned around 2019, and while I haven't checked all of them, it does not look like the files were ever changed after accessioning. So they've had mismatched ALTO and image dimensions for a couple of years. Since the SDR accessioning process doesn't change the image size, I wonder if something happened during the OCR process: perhaps a set of images was OCR'd and then resized after OCR but before going into SDR?
Looking at more recent oral history accessions, I don't see the same image dimension mismatches, so it doesn't look like a current, ongoing problem.
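If it helps to audit the DRUIDs listed above, a rough sketch of the comparison might look like the following. It assumes the ALTO `Page` element carries `WIDTH` and `HEIGHT` attributes in pixels (ALTO also allows other measurement units, which this sketch does not convert), and that the image dimensions are supplied by the caller, for example from technicalMetadata. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def alto_page_size(alto_xml: str) -> Tuple[float, float]:
    """Return (WIDTH, HEIGHT) of the first Page element in an ALTO document.

    The namespace is stripped from each tag so the same code works
    regardless of which ALTO version namespace the file declares.
    """
    root = ET.fromstring(alto_xml)
    for el in root.iter():
        if el.tag.rsplit("}", 1)[-1] == "Page":
            return float(el.get("WIDTH")), float(el.get("HEIGHT"))
    raise ValueError("no Page element found")


def dimensions_match(alto_size: Tuple[float, float],
                     image_size: Tuple[float, float],
                     tolerance: float = 1.0) -> bool:
    """True if width and height each agree within `tolerance` units."""
    return all(abs(a - i) <= tolerance for a, i in zip(alto_size, image_size))


# Hypothetical sample, not taken from any of the affected items.
SAMPLE_ALTO = (
    '<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">'
    '<Layout><Page ID="p1" WIDTH="2550" HEIGHT="3300"/></Layout>'
    '</alto>'
)
```

With the sample above, `alto_page_size(SAMPLE_ALTO)` returns `(2550.0, 3300.0)`, and `dimensions_match` against a hypothetical image size of `(1275, 1650)` returns `False`, flagging the page as one where highlights would be misplaced.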
In reply to the purl-feedback message from feedback@purl.stanford.edu, sent 11/4/21, 2:55 PM.