Replies: 2 comments
-
I will implement this shortly. It will set the document's default language to the language used for OCR, unless the field was already set to something else. If someone needs to distinguish the OCR language from the document language, I imagine they need code to manage other things too. Or is there a common use case for making this distinction?
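
To make the proposed fallback concrete, here is a minimal sketch of the rule as I read it. It is not the actual implementation: `resolve_document_language` and its parameters are hypothetical names, and using the first OCR language when several are given is my own assumption.

```python
from typing import Optional, Sequence


def resolve_document_language(
    existing_lang: Optional[str],
    ocr_languages: Sequence[str],
) -> Optional[str]:
    """Pick the document's default language (hypothetical helper).

    existing_lang: language already set on the document or requested
                   explicitly by the user, if any.
    ocr_languages: languages passed to OCR, in the order given.
    """
    # An explicit or pre-existing value always wins.
    if existing_lang:
        return existing_lang
    # Assumption: with several OCR languages, fall back to the first one.
    if ocr_languages:
        return ocr_languages[0]
    # Nothing to go on, so leave the field unset.
    return None


# Field not set yet: the OCR language becomes the default.
print(resolve_document_language(None, ["eng"]))
# Field already set: the existing value is kept.
print(resolve_document_language("fr", ["eng"]))
```

Note that OCR language codes and document language tags may use different formats, so a real implementation would probably need a conversion step that this sketch leaves out.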
-
Hi James, Regarding the suggested solution, I think it's better not to conflate the OCR language and the document language. I propose that we make it clear that the OCR language parameter (-l) defines the character set(s) used for recognition, while the document language is for accessibility purposes. I can imagine languages that share a character set but are pronounced differently. In that case I could use the same OCR language but would need different document languages.
-
Hi,
I would like to propose expanding the metadata options with a '--language' parameter to set the document's default language.
This addition would be very useful for specifying the reading language of the resulting document, and it aligns with the guidelines at https://www.w3.org/WAI/WCAG22/Techniques/pdf/PDF16.
Thank you for considering this suggestion.