ROB: Raise PdfReadError when missing /Root in trailer. #2808

Merged: 2 commits, Aug 23, 2024

@BertrandBordage BertrandBordage commented Aug 22, 2024

Fixes #2806.

Running the code from #2806 against the same PDF now raises a PdfReadError with an explicit message:

>>> list(reader.pages)
Object 493 0 not defined.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/pypdf/pypdf/_page.py", line 2356, in __len__
    return self.length_function()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/pypdf/pypdf/_doc_common.py", line 352, in get_num_pages
    self._flatten()
  File "/pypdf/pypdf/_doc_common.py", line 1100, in _flatten
    catalog = self.root_object
              ^^^^^^^^^^^^^^^^
  File "/pypdf/pypdf/_reader.py", line 195, in root_object
    raise PdfReadError('Cannot find "/Root" key in trailer')
pypdf.errors.PdfReadError: Cannot find "/Root" key in trailer
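
For readers who want to see the pattern in isolation, here is a minimal stdlib-only sketch of what the traceback above suggests: the `/Root` lookup fails early with a dedicated error instead of letting `None` travel further. `MiniReader`, `describe`, and the local `PdfReadError` class are hypothetical stand-ins written for this illustration; they are not pypdf's actual implementation, and only the error message is taken from the traceback.

```python
class PdfReadError(Exception):
    """Local stand-in for pypdf.errors.PdfReadError (assumption, not the real class)."""


class MiniReader:
    """Toy reader that holds only a trailer dictionary; not pypdf's PdfReader."""

    def __init__(self, trailer):
        self.trailer = trailer

    @property
    def root_object(self):
        # Look the catalog up explicitly so a missing key fails loudly here,
        # rather than returning None and failing later with an AttributeError.
        root = self.trailer.get("/Root")
        if root is None:
            raise PdfReadError('Cannot find "/Root" key in trailer')
        return root


def describe(reader):
    """Return a short status string instead of crashing on a damaged trailer."""
    try:
        catalog = reader.root_object
    except PdfReadError as exc:
        return f"unreadable: {exc}"
    return f"ok: catalog has {len(catalog)} entries"


if __name__ == "__main__":
    print(describe(MiniReader({})))
    print(describe(MiniReader({"/Root": {"/Type": "/Catalog"}})))
```

Callers that process many files can then catch one exception type for damaged input, which is the practical benefit of raising a library-specific error here.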

@BertrandBordage changed the title from "[Robustness] Raise PdfReadError when missing /Root in trailer." to "ROB: Raise PdfReadError when missing /Root in trailer." on Aug 22, 2024
@BertrandBordage marked this pull request as draft on August 22, 2024 23:16

codecov bot commented Aug 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.86%. Comparing base (d2d520b) to head (9e73fb9).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2808   +/-   ##
=======================================
  Coverage   95.85%   95.86%           
=======================================
  Files          51       51           
  Lines        8573     8576    +3     
  Branches     1695     1696    +1     
=======================================
+ Hits         8218     8221    +3     
  Misses        212      212           
  Partials      143      143           
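
The percentages in the table can be checked from the hit and line counts. The sketch below is only an illustration: `coverage_percent` is a hypothetical helper, and truncating to two decimals (rather than rounding) is an assumption inferred from these numbers, not documented Codecov behavior.

```python
def coverage_percent(hits: int, lines: int) -> str:
    """Format hits/lines as a percentage truncated to two decimals."""
    # Integer arithmetic avoids floating-point surprises.
    # ASSUMPTION: truncation, chosen because it reproduces the table above.
    basis_points = hits * 10_000 // lines
    return f"{basis_points // 100}.{basis_points % 100:02d}%"


# Figures copied from the coverage diff: (hits, misses, partials, lines).
base = (8218, 212, 143, 8573)
head = (8221, 212, 143, 8576)

for hits, misses, partials, lines in (base, head):
    # Every line is counted as exactly one of hit, miss, or partial.
    assert hits + misses + partials == lines
    print(coverage_percent(hits, lines))
```

Under that assumption, the three added lines are all covered, so the ratio moves from 8218/8573 to 8221/8576.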


@BertrandBordage marked this pull request as ready for review on August 22, 2024 23:31
@stefan6419846 merged commit 9f08cd0 into py-pdf:main on Aug 23, 2024 (16 checks passed)
@pubpub-zz mentioned this pull request on Sep 15, 2024
@pubpub-zz added a commit that referenced this pull request on Sep 17, 2024:
## Version 5.0.0, 2024-09-15

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).


### Deprecations (DEP)
- Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (#2813)
- Drop Python 3.7 support (#2793)

### New Features (ENH)
- Add capability to remove /Info from PDF (#2820)
- Add incremental capability to PdfWriter (#2811)
- Add UniGB-UTF16 encodings (#2819)
- Accept utf strings for metadata (#2802)
- Report PdfReadError instead of RecursionError (#2800)
- Compress PDF files merging identical objects (#2795)

### Bug Fixes (BUG)
- Fix sheared image (#2801)

### Robustness (ROB)
- Robustify .set_data() (#2821)
- Raise PdfReadError when missing /Root in trailer (#2808)
- Fix extract_text() issues on damaged PDFs (#2760)
- Handle images with empty data when processing an image from bytes (#2786)

### Developer Experience (DEV)
- Fix coverage uploads (#2832)
- Test against Python 3.13 (#2776)


[Full Changelog](4.3.1...5.0.0)
Linked issue: Missing root object raising: 'NoneType' object has no attribute 'get_object' (different from #1295 & #1689)