Request: Have .extract_text() return an empty string ('') instead of None in the case of no text found in a PDF
#482
Comments
Thank you for this proposal, @tungph. I think it makes sense, but I want to be careful that we're not overlooking important use-cases. Especially: Are there instances where it would be important to distinguish between
My preference would be to keep the existing An important question should also be whether pdfplumber can even return an empty string. I think so, yes: there exists a Unicode code for an empty string, and if a PDF consists of that single character, an empty string will be returned. That would be ambiguous, as it could mean either that no character exists or that a null character exists.
Thanks @samkit-jain. That (needing to distinguish between
Still, I'm not quite 100% there. Is And, as you point out: This, if implemented, would constitute a breaking change for many workflows. It would probably have to wait for
pdfplumber/pdfplumber/utils.py
Lines 372 to 374 in 002500a
I would like to make a small suggestion here: to avoid checking for None in the calling code, the extract_text util should return an empty string ('') instead of None when no text is found in a PDF.
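
Until this is settled, calling code can normalize the result itself. Here is a minimal sketch, assuming the behaviour discussed in this thread, where extract_text() may return None. Note that text_or_empty is a hypothetical helper written for illustration, not part of pdfplumber:

```python
from typing import Optional


def text_or_empty(extracted: Optional[str]) -> str:
    """Return the extracted text, or '' when extraction yielded None.

    Hypothetical helper (not a pdfplumber API). It only replaces None,
    so a genuinely empty string passes through unchanged.
    """
    return extracted if extracted is not None else ""


# Stand-in values for the two cases discussed in this thread.
no_text_found = None           # what the caller gets when no text is found
some_text = "Hello, world"     # an ordinary extraction result

print(repr(text_or_empty(no_text_found)))
print(repr(text_or_empty(some_text)))
```

With a real page object, the call would presumably look like text_or_empty(page.extract_text()). The shorter idiom page.extract_text() or "" has the same effect for strings, since None and '' are both falsy. Callers that need to tell "no text" apart from "empty text" should keep checking for None explicitly.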