Type check for obj in PDFPageInterpreter #441

Closed
tongbaojia opened this issue Jun 9, 2020 · 2 comments · Fixed by #451

tongbaojia commented Jun 9, 2020

Bug report

Traceback (most recent call last):
  File "pdf2txt.py", line 188, in <module>
    sys.exit(main())
  File "pdf2txt.py", line 182, in main
    outfp = extract_text(**vars(A))
  File "pdf2txt.py", line 56, in extract_text
    pdfminer.high_level.extract_text_to_fp(fp, **locals())
  File "***/lib/python3.6/site-packages/pdfminer/high_level.py", line 86, in extract_text_to_fp
    interpreter.process_page(page)
  File "***/lib/python3.6/site-packages/pdfminer/pdfinterp.py", line 895, in process_page
    self.render_contents(page.resources, page.contents, ctm=ctm)
  File "***/lib/python3.6/site-packages/pdfminer/pdfinterp.py", line 908, in render_contents
    self.execute(list_value(streams))
  File "***/lib/python3.6/site-packages/pdfminer/pdfinterp.py", line 933, in execute
    func(*args)
  File "***/lib/python3.6/site-packages/pdfminer/pdfinterp.py", line 840, in do_EI
    if 'W' in obj and 'H' in obj:
TypeError: a bytes-like object is required, not 'str'
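
The last frame can be reproduced in isolation. Below is a minimal sketch, assuming `obj` reached `do_EI` as a plain `bytes` value rather than a stream object (the traceback does not show the value itself). `FakeStream`, `reproduce_error`, and `has_dimensions` are hypothetical names used only for illustration; they are not pdfminer.six API, and the guard shown is not necessarily the change made in #451.

```python
class FakeStream:
    """Hypothetical stand-in for a stream-like object that supports `key in obj`."""

    def __init__(self, attrs):
        self.attrs = attrs

    def __contains__(self, name):
        return name in self.attrs


def reproduce_error(obj):
    """Run the same membership test as the failing line; return the error text, if any."""
    try:
        'W' in obj  # a str key tested against bytes raises TypeError in Python 3
    except TypeError as exc:
        return str(exc)
    return None


def has_dimensions(obj):
    """Check the type first, so non-stream operands are skipped instead of raising."""
    return isinstance(obj, FakeStream) and 'W' in obj and 'H' in obj


if __name__ == "__main__":
    print(reproduce_error(b"EI"))
    print(has_dimensions(b"EI"))
    print(has_dimensions(FakeStream({"W": 1, "H": 1})))
```

Checking the type before the `in` tests works because `and` short-circuits: when the `isinstance` check fails, the membership tests are never evaluated.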

@tongbaojia (Contributor, Author):

sample.pdf

@pietermarsman (Member):

I can replicate this issue using the newest pdfminer.six.
