-
Notifications
You must be signed in to change notification settings - Fork 426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FIX] Fix possible segfaults in hardsubx_classifier.c due to strdup #963
Conversation
src/lib_ccx/hardsubx_classifier.c
Outdated
if(word==NULL || strlen(word)==0) | ||
continue; | ||
word = strdup(word); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem correct to me (but the previous one isn't either).
word = strdup(word);
We're losing the original pointer - assuming TessResultIteratorGetUTF8Text() allocates something (I assume it does, but I don't know for sure) we now have a leak.
Using strdup() seems OK to me IF we're going to be messing with the data later, but in this case we should immediately free the tesseract resource after making a copy, since we no longer need it.
TessBaseAPIGetIterator
TessAPI.TessResultIterator TessBaseAPIGetIterator(TessAPI.TessBaseAPI handle)
Get a reading-order iterator to the results of LayoutAnalysis and/or Recognize. ****. WARNING! This class points to data held within the TessBaseAPI class, and therefore can only be used while the TessBaseAPI class still exists and has not been subjected to a call of Init, SetImage, Recognize, Clear, End, DetectOS, or anything else that changes the internal PAGE_RES.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, I didn't realize that I was creating a memory leak. So something like:
char *word_ = TessResultIteratorGetUTF8Text(it, level);
if(word_ == NULL || !strlen(word_))
continue;
char *word = strdup(word_);
free(word_);
This should work fine right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No... you can't use free() to unallocate resources returned by TessResultIteratorGetUTF8Text() - take a look at the link I sent. Most likely (but check - I just browsed the API) it's TessResultIteratorDelete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment on the affected line.
ba426d8
to
a43900d
Compare
strdup will give a segmentation fault if the argument passed to it is NULL. TessResultIteratorGetUTF8Text returns a char* which can be NULL and we should not call strdup directly over it. Once we check if the value returned is not NULL, then we can call strdup.
In raising this pull request, I confirm the following (please check boxes):
My familiarity with the project is as follows (check one):
strdup
will give a segmentation fault if the argument passed to it isNULL
.TessResultIteratorGetUTF8Text
returns achar*
which can beNULL
and we should not call
strdup
directly over it. Once we check if thevalue returned is not
NULL
, then we can callstrdup
.This also fixes #928.