Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check if datadir exists: fixes #4364 #4365

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions src/api/baseapi.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,10 @@ static void ExtractFontName(const char* filename, std::string* fontname) {
*/
static void addAvailableLanguages(const std::string &datadir,
std::vector<std::string> *langs) {
if (!std::filesystem::exists(datadir)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not sufficient. It passes if datadir exists and is a file (which still raises an exception). I suggest to catch the exception which is thrown by std::filesystem::recursive_directory_iterator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

touch a.txt
tesseract --list-langs --tessdata-dir a.txt
Error: The directory 'a.txt/' does not exist.
List of available languages in "a.txt/" (0):

@stweil: Can you please provide a scenario / OS where this patch fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On macOS:

% touch a.txt
% tesseract --list-langs --tessdata-dir a.txt                       
libc++abi: terminating due to uncaught exception of type std::__1::__fs::filesystem::filesystem_error: filesystem error: in recursive_directory_iterator: Not a directory ["a.txt/"]
zsh: abort      tesseract --list-langs --tessdata-dir a.txt

Copy link
Contributor

@egorpugin egorpugin Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have main() wrapped into try catch?
Error text is ok, we need only catch std::exception globally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed that in 0a9df94

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stweil : I tested this PR on linux (g++ 11.3.0) and it does not produce "uncaught exception" (without egorpugin commit). Do you still need to add something like this?

diff --git a/src/api/baseapi.cpp b/src/api/baseapi.cpp
index 667ead67..e684c7fe 100644
--- a/src/api/baseapi.cpp
+++ b/src/api/baseapi.cpp
@@ -146,6 +146,10 @@ static void ExtractFontName(const char* filename, std::string* fontname) {
  */
 static void addAvailableLanguages(const std::string &datadir,
                                   std::vector<std::string> *langs) {
+  if (!std::filesystem::is_directory(datadir)) {
+    std::cerr << "Error: '" << datadir << "' is not directory.\n";
+    return;
+  }
   if (!std::filesystem::exists(datadir)) {
     std::cerr << "Error: The directory '" << datadir << "' does not exist.\n";
     return;

Or should be make these checks at ccutils.cpp and fail if directory not exists?

Copy link
Member

@stweil stweil Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the error messages.

Old

libc++abi: terminating due to uncaught exception of type std::__1::__fs::filesystem::filesystem_error: filesystem error: in recursive_directory_iterator: No such file or directory ["tessdata/"]
libc++abi: terminating due to uncaught exception of type std::__1::__fs::filesystem::filesystem_error: filesystem error: in recursive_directory_iterator: Not a directory ["tessdata/"]

New:

exception: filesystem error: in recursive_directory_iterator: Not a directory ["tessdata/"]
exception: filesystem error: in recursive_directory_iterator: No such file or directory ["tessdata/"]

The new error messages are nicer, but are they user friendly? And do we want to introduce C++ exceptions in the C / C++ API? If the answers are no, we must either catch or avoid the exception in addAvailableLanguages and print a user friendly error message.

Copy link
Contributor

@egorpugin egorpugin Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. C API should be removed later.
  2. We want c++ exceptions in api. Nothing wrong here.
  3. Errors are fine.

std::cerr << "Error: The directory '" << datadir << "' does not exist.\n";
return;
}
for (const auto& entry :
std::filesystem::recursive_directory_iterator(datadir,
std::filesystem::directory_options::follow_directory_symlink |
Expand Down
Loading