Closed
Labels: is-bug (From a user's perspective, this is a bug - a violation of the expected behavior with a compliant PDF)
Description
I was attempting to read the outline of a PDF (link here), but I found an unexpected AssertionError.
Environment
I ran this in Google Colab with a plain pip installation:
!pip install pypdf
Here is the debug output:
$ python -m platform
Linux-6.1.123+-x86_64-with-glibc2.35
$ python -c "import pypdf;print(pypdf._debug_versions)"
pypdf==6.0.0, crypt_provider=('cryptography', '43.0.3'), PIL=11.3.0
Code + PDF
This is a minimal, complete example that shows the issue:
from pypdf import PdfReader
pdf_reader = PdfReader("e371fffe0b_a7cccde95a.pdf")
outline = pdf_reader.outline
Download the PDF file here (14 MB). Sorry, it's not a small PDF.
Traceback
This is the complete traceback I see:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
[/tmp/ipython-input-381717387.py](https://localhost:8080/#) in <cell line: 0>()
1 from pypdf import PdfReader
2 pdf_reader = PdfReader("e371fffe0b_a7cccde95a.pdf")
----> 3 outline = pdf_reader.outline
6 frames
[/usr/local/lib/python3.12/dist-packages/pypdf/_doc_common.py](https://localhost:8080/#) in outline(self)
830 'bookmarks').
831 """
--> 832 return self._get_outline()
833
834 def _get_outline(
[/usr/local/lib/python3.12/dist-packages/pypdf/_doc_common.py](https://localhost:8080/#) in _get_outline(self, node, outline)
849 if not is_null_or_none(lines) and "/First" in lines:
850 node = cast(DictionaryObject, lines["/First"])
--> 851 self._named_destinations = self._get_named_destinations()
852
853 if node is None:
[/usr/local/lib/python3.12/dist-packages/pypdf/_doc_common.py](https://localhost:8080/#) in _get_named_destinations(self, tree, retval)
480 # recurse down the tree
481 for kid in cast(ArrayObject, tree[PagesAttributes.KIDS]):
--> 482 self._get_named_destinations(kid.get_object(), retval)
483 # §7.9.6, entries in a name tree node dictionary
484 elif CA.NAMES in tree: # /Kids and /Names are exclusives (§7.9.6)
[/usr/local/lib/python3.12/dist-packages/pypdf/_doc_common.py](https://localhost:8080/#) in _get_named_destinations(self, tree, retval)
501 else:
502 continue
--> 503 dest = self._build_destination(key, value)
504 if dest is not None:
505 retval[key] = dest
[/usr/local/lib/python3.12/dist-packages/pypdf/_doc_common.py](https://localhost:8080/#) in _build_destination(self, title, array)
947 page, typ, *array = array # type: ignore
948 try:
--> 949 return Destination(title, page, Fit(fit_type=typ, fit_args=array)) # type: ignore
950 except PdfReadError:
951 logger_warning(f"Unknown destination: {title} {array}", __name__)
[/usr/local/lib/python3.12/dist-packages/pypdf/generic/_data_structures.py](https://localhost:8080/#) in __init__(self, title, page, fit)
1611
1612 DictionaryObject.__init__(self)
-> 1613 self[NameObject("/Title")] = TextStringObject(title)
1614 self[NameObject("/Page")] = page
1615 self[NameObject("/Type")] = typ
[/usr/local/lib/python3.12/dist-packages/pypdf/generic/_base.py](https://localhost:8080/#) in __new__(cls, value)
670 o.utf16_bom = b""
671 if o.startswith(("\xfe\xff", "\xff\xfe")):
--> 672 assert org is not None, "mypy"
673 try:
674 o = str.__new__(cls, org.decode("utf-16"))
AssertionError: mypy
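Judging only from the last frame above, the assertion seems to fire when a destination title reaches `TextStringObject` as a `str` that still starts with the UTF-16 BOM characters, so there are no original bytes (`org`) left to decode. Below is a minimal standalone sketch of one way such a value could be handled. It is not pypdf's actual fix, and `decode_title` is a hypothetical helper that does not exist in pypdf.

```python
def decode_title(value):
    """Hypothetical helper: return readable text from bytes, or from a str
    whose characters are really raw UTF-16 bytes (BOM included)."""
    if isinstance(value, bytes):
        raw = value
    elif value.startswith(("\xfe\xff", "\xff\xfe")):
        # Assumption: each character stands for one raw byte, so map it back.
        try:
            raw = value.encode("latin-1")
        except UnicodeEncodeError:
            # Characters above U+00FF: this was real text, leave it alone.
            return value
    else:
        # Ordinary str without a BOM prefix: nothing to do.
        return value

    if raw.startswith((b"\xfe\xff", b"\xff\xfe")):
        try:
            # The utf-16 codec reads the BOM to pick the byte order.
            return raw.decode("utf-16")
        except UnicodeDecodeError:
            # Malformed UTF-16 (e.g. odd length): fall back to a lossless view.
            return raw.decode("latin-1")
    return raw.decode("latin-1")


title = decode_title("\xfe\xff\x00A")  # BOM + big-endian "A", held in a str
```

The point of the sketch is only that a BOM-prefixed `str` can be round-tripped through `latin-1` to recover bytes instead of asserting; whether that is the right behavior inside pypdf is for the maintainers to decide.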