Tesseract Version: 4.1.3, but affects latest main branch as well
Platform: Linux localhost.localdomain 5.3.18-150300.59.68-default #1 SMP Wed May 4 11:29:09 UTC 2022 (ea30951) x86_64 x86_64 x86_64 GNU/Linux
Current Behavior
Using the training tools from src/training as a library/shared object is impossible or requires lots of manual modifications for each upstream change in this repository.
Current limitations include (but are not limited to):
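Duplicate symbol names across the tools (main() and parameter flags), which clash when several tools are linked into one binary.
Where a tool handles its parameters without flags, they have to be passed as argc and argv; flag-based parameters are awkward to set from calling code as well.
In some source files exit(1) is called inside main() instead of return 1. The two are generally equivalent in a standalone executable, but exit(1) terminates the whole host process when the code is called as a library.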
Expected Behavior
The training tools can be used as a library/shared object with a well-defined API like it is possible with the regular Tesseract code. This allows developers to use the training tools in their own projects, for example by wrapping the native C++ code into a language-specific package (like a Python module) to avoid subprocess calls.
Suggested Fix
Introduce a header for the training functionality.
Make implementations independent from main() and allow calling them with regular parameters.
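To illustrate the suggested fix, here is a minimal sketch of how one tool could be split into a library function and a thin command-line adapter. Everything in it is hypothetical: the namespace `training_sketch`, the `CombineOptions` struct, and the functions `CombineData` and `CombineDataMain` are illustrative names, not existing Tesseract API, and the body only stands in for the real work.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical API sketch; none of these names exist in Tesseract.
namespace training_sketch {

// Plain options struct that replaces global command-line flags.
struct CombineOptions {
  std::string lang_prefix;          // prefix of the language data files
  std::vector<std::string> inputs;  // component files to process
};

// Library entry point. Failure is reported through the return value and
// an error string instead of exit(), so the host process keeps running.
int CombineData(const CombineOptions &opts, std::string *error) {
  if (opts.lang_prefix.empty()) {
    if (error != nullptr) {
      *error = "lang_prefix must not be empty";
    }
    return EXIT_FAILURE;
  }
  // Placeholder for the real work of the tool.
  return EXIT_SUCCESS;
}

// Thin command-line adapter that maps argc/argv onto the options struct.
// The tool's main() would only forward to this function, so no logic
// lives in main() itself and no symbol names collide between tools.
int CombineDataMain(int argc, char **argv) {
  assert(argv != nullptr);
  if (argc < 2) {
    std::cerr << "usage: " << (argc > 0 ? argv[0] : "tool")
              << " prefix [files...]\n";
    return EXIT_FAILURE;
  }
  CombineOptions opts;
  opts.lang_prefix = argv[1];
  opts.inputs.assign(argv + 2, argv + argc);
  std::string error;
  const int rc = CombineData(opts, &error);
  if (rc != EXIT_SUCCESS) {
    std::cerr << "error: " << error << "\n";
  }
  return rc;
}

}  // namespace training_sketch
```

With a layout like this, a language binding could call `CombineData` directly with a filled-in struct, while the existing executable keeps its command-line interface through a one-line `main()` that returns `CombineDataMain(argc, argv)`.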
The current build already creates libtesseract_training.a. Is that library sufficient (maybe if in addition a shared library libtesseract_training.so is built)?
Depending on the use case, either a static or a shared library may make sense, so supporting both (if feasible) probably is the way to go in my opinion.
As for the original request, a suitable header file still seems to be missing. The duplicate names and the awkward parameter handling also appear to be unresolved.