Commit fa6c927

committed
Recent changes
1 parent 5a06e96 commit fa6c927

7 files changed

+110
-91
lines changed

docs/index.rst

Lines changed: 0 additions & 1 deletion
@@ -67,7 +67,6 @@ You can contribute to `pypdf on GitHub <https://github.com/py-pdf/pypdf>`_.
6767
modules/errors
6868
modules/generic
6969
modules/PdfDocCommon
70-
7170

7271
.. toctree::
7372
:caption: Developer Guide

docs/user/adding-pdf-outlines.md

Lines changed: 62 additions & 64 deletions
@@ -7,81 +7,85 @@ With pypdf, you can create simple or deeply nested outlines programmatically.
77

88
## `PdfWriter.add_outline_item()`
99

10-
**Source:** `pypdf/_writer.py`
10+
**Source:** `pypdf/_writer.py`
1111
Adds an outline (bookmark) entry to the PDF document.
1212

13-
1413
## **Syntax**
1514

 ```python
 add_outline_item(
+    self,
     title: str,
-    page_number: int | None = None,
-    parent: Any | None = None,
-    color: tuple | None = None,
+    page_number: Union[None, PageObject, IndirectObject, int],
+    parent: Union[None, TreeObject, IndirectObject] = None,
+    before: Union[None, TreeObject, IndirectObject] = None,
+    color: Optional[Union[tuple[float, float, float], str]] = None,
     bold: bool = False,
     italic: bool = False,
+    fit: Fit = PAGE_FIT,
     is_open: bool = True,
-    fit: str | None = None,
-    zoom: float | None = None
-) -> Any
+) -> IndirectObject:
 ```
2929

30-
3130
## Parameters
3231

3332
The following parameters are available for `add_outline_item()`:
3433

35-
| Name | Type | Default | Description |
36-
|--------------|---------------------------|---------|-------------|
37-
| `title` | `str` || The visible text label shown in the PDF outline panel. |
38-
| `page_number`| `int`, optional | `None` | Zero-based target page index. If `None`, the item becomes a non-clickable parent/group header. |
39-
| `parent` | `Any`, optional | `None` | The parent outline item under which this one will be nested. If omitted, this becomes a top-level outline. |
40-
| `color` | `tuple`, optional | `None` | RGB color tuple with values between `0–1`. Example: `(1, 0, 0)` for red. |
41-
| `bold` | `bool` | `False` | If `True`, the outline title is displayed in bold. |
42-
| `italic` | `bool` | `False` | If `True`, the outline title is displayed in italic. |
43-
| `is_open` | `bool` | `True` | Whether the outline node is expanded when the PDF opens. |
44-
| `fit` | `str`, optional | `None` | Controls how the destination page is displayed (Fit, FitH, FitV, FitR, XYZ). |
45-
| `zoom` | `float`, optional | `None` | Used only when `fit="XYZ"`. Example: `1.0` = 100% zoom. |
46-
47-
### Fit Mode Options
34+
| Name | Type | Default | Description |
35+
| ------------- | ------------------------------------------------ | ---------- | ---------------------------------------------------------------------------------------------------------------------------------- |
36+
| `title` | `str` || The visible text label shown in the PDF outline panel. |
37+
| `page_number` | `None`, `int`, `PageObject`, or `IndirectObject` || Destination page for the outline item. May be set to `None`, making the entry non-clickable and usable as a parent/group node. |
38+
| `parent` | `None`, `TreeObject`, or `IndirectObject` | `None` | Makes the outline item a child of the given parent outline node. If omitted, it becomes a top-level entry. |
39+
| `before` | `None`, `TreeObject`, or `IndirectObject` | `None` | Inserts the outline item before another existing outline item at the same level. Used to control ordering. |
40+
| `color` | `tuple[float, float, float]` or `str`, optional | `None` | Sets the outline text color. Tuples must use 0–1 float values (e.g., `(1, 0, 0)` for red). Some named colors may also be accepted. |
41+
| `bold` | `bool` | `False` | Displays the outline title in bold. |
42+
| `italic` | `bool` | `False` | Displays the outline title in italic. |
43+
| `fit` | `Fit` | `PAGE_FIT` | Determines how the destination page is displayed (Fit, FitH, FitV, FitR, XYZ, etc.). |
44+
| `is_open` | `bool` | `True` | Controls whether the outline node appears expanded in the PDF viewer when opened. |
4845

49-
| Value | Meaning |
50-
|--------|---------|
51-
| `"Fit"` | Display the entire page. |
52-
| `"FitH"` | Fit to width, aligned at the top. |
53-
| `"FitV"` | Fit to height. |
54-
| `"FitR"` | Fit a specific rectangle region. |
55-
| `"XYZ"` | Use a custom zoom level (`zoom=` required). |
5646

47+
### Fit Mode Options
5748

49+
| Fit Method | Meaning |
50+
| --------------------------------------------- | ------------------------------------------ |
51+
| `Fit.fit()` | Display the entire page. |
52+
| `Fit.fit_horizontally(top)` | Fit page width, aligned at the given top. |
53+
| `Fit.fit_vertically(left)` | Fit page height, aligned at the given left.|
54+
| `Fit.fit_rectangle(left, bottom, right, top)` | Fit a specific rectangular region. |
55+
| `Fit.xyz(left, top, zoom)` | Custom position and zoom level. |
5856

59-
## **Return Type:** `Any`
57+
## **Return Type:** `IndirectObject`
6058

61-
The method returns a reference to the created outline item.
62-
This reference is typically used when creating nested (child) outline items.
59+
Returns a reference to the created outline item, which can be used when adding nested children.
6360

6461
### Example
6562
```python
66-
parent = writer.add_outline_item("Chapter 1", page_number=0)
67-
writer.add_outline_item("Section 1.1", page_number=1, parent=parent)
63+
chapter = writer.add_outline_item("Chapter 1", page_number=0)
64+
writer.add_outline_item("Section 1.1", page_number=1, parent=chapter)
6865
```
6966

70-
7167
## Exceptions
7268

7369
The `add_outline_item()` method may raise the following exceptions:
7470

75-
| Exception | When it occurs |
76-
|-----------------|----------------|
77-
| `ValueError` | Raised when `page_number` is out of range, `fit` is invalid, or `color` is not a valid `(r, g, b)` tuple (each value must be a float between 0–1). |
78-
| Internal errors | Occur if an invalid `parent` reference is passed, or if the outline tree becomes corrupted internally. |
71+
| Exception | When it occurs |
72+
|---------------|----------------|
73+
| `ValueError` | Raised when `page_number` is out of range, the `fit` argument is invalid, or when the `color` tuple contains values outside the 0–1 float range. |
74+
| `TypeError` | Raised when arguments such as `parent`, `before`, or `color` are provided using unsupported types. |
75+
| `IndexError` | May occur if a referenced page index is not available in the document. |
7976

77+
## Example: Full PDF Outline with All Parameters
8078

79+
```{testsetup}
80+
pypdf_test_setup("user/adding-pdf-outlines", {
81+
"output.pdf": "../resources/output.pdf",
82+
"example1.pdf": "../resources/example1.pdf",
83+
"input.pdf": "../resources/input.pdf"
8184
82-
## Example: Full PDF Outline with All Parameters
85+
})
86+
```
8387

84-
```python
88+
```{testcode}
8589
from pypdf import PdfReader, PdfWriter
8690
from pypdf.generic import Fit  # public import path for Fit
8791
@@ -105,7 +109,7 @@ chapter1 = writer.add_outline_item(
105109
106110
# Section under Chapter 1 (dark green, italic, collapsed)
107111
section1_1 = writer.add_outline_item(
108-
title="Section 1.1: Overview",
112+
title="Section 1.1: Getting Started",
109113
page_number=1,
110114
parent=chapter1,
111115
color=(0, 0.5, 0),
@@ -117,7 +121,7 @@ section1_1 = writer.add_outline_item(
117121
118122
# Section with custom zoom
119123
section1_2 = writer.add_outline_item(
120-
title="Section 1.2: Details",
124+
title="Section 1.2: Printing a Test Page",
121125
page_number=2,
122126
parent=chapter1,
123127
color=(1, 0, 0),
@@ -153,8 +157,6 @@ writer.add_outline_item(
153157
output_path = "output.pdf"
154158
with open(output_path, "wb") as f:
155159
writer.write(f)
156-
157-
print(f"PDF with outlines created successfully: {output_path}")
158160
```
159161

160162
### What this code demonstrates
@@ -165,11 +167,11 @@ print(f"PDF with outlines created successfully: {output_path}")
165167
* Using different page view modes: `Fit`, `FitH`, `FitV`, `XYZ`.
166168
* Produces a navigable outline tree in the PDF reader.
167169

168-
169170
## Adding a Simple Outline
170171

171172
Use this when you want a single top-level bookmark pointing to a page.
172-
```python
173+
174+
```{testcode}
173175
from pypdf import PdfReader, PdfWriter
174176
175177
reader = PdfReader("input.pdf")
@@ -198,18 +200,17 @@ with open("output.pdf", "wb") as f:
198200

199201
![PDF outline output](simple-outline.png)
200202

201-
202203
## Adding Nested (Hierarchical) Outlines
203204

204205
Nested outlines create structures like:
205206

206207
```text
207-
Chapter 1
208+
Introduction
208209
├── Section 1
209210
└── Section 2
210211
```
211212

212-
```python
213+
```{testcode}
213214
from pypdf import PdfReader, PdfWriter
214215
215216
reader = PdfReader("input.pdf")
@@ -219,35 +220,36 @@ writer = PdfWriter()
219220
for page in reader.pages:
220221
writer.add_page(page)
221222
222-
# Add parent (chapter)
223-
chapter = writer.add_outline_item(
224-
title="Chapter 1",
223+
# Add parent (Introduction)
224+
introduction = writer.add_outline_item(
225+
title="Introduction",
225226
page_number=0
226227
)
227228
228229
# Add children (sections)
229230
writer.add_outline_item(
230-
title="Section 1.1",
231+
title="Section 1",
231232
page_number=1,
232-
parent=chapter
233+
parent=introduction
233234
)
234235
235236
writer.add_outline_item(
236-
title="Section 1.2",
237+
title="Section 2",
237238
page_number=2,
238-
parent=chapter
239+
parent=introduction
239240
)
240241
241242
with open("output.pdf", "wb") as f:
242243
writer.write(f)
243244
```
244245

246+
245247
### What the nested outline code does
246248

247249
* Copies all pages into the new PDF
248-
* Creates a top-level outline called Chapter 1
249-
* Adds Section 1.1 under that chapter
250-
* Adds Section 1.2 under the same chapter
250+
* Creates a top-level outline called Introduction
251+
* Adds Section 1 under Introduction
252+
* Adds Section 2 under the same parent
251253
* Produces an outline tree like:
252254

253255
![PDF outline output](nested-outline.png)
@@ -260,7 +262,3 @@ with open("output.pdf", "wb") as f:
260262
- You can build multiple hierarchical levels (chapter → section → subsection → etc.).
261263
- A bookmark must point to a valid page, or the PDF reader may hide or ignore it.
262264
- Nested outlines improve navigation for large PDFs.
263-
264-
265-
266-

docs/user/nested-outline.png

100 KB

docs/user/reading-pdf-outlines.md

Lines changed: 48 additions & 26 deletions
@@ -1,10 +1,4 @@
1-
```{testsetup}
2-
pypdf_test_setup("user/reading-pdf-outlines", {
3-
"example.pdf": "../resources/example.pdf",
4-
})
5-
```
6-
7-
# Reading PDF Outlines
1+
# Reading PDF Outlines
82

93
PDF outlines (also called *bookmarks*) help users navigate through a document.
104
pypdf allows you to read both simple and nested outlines from any PDF.
@@ -13,19 +7,19 @@ pypdf allows you to read both simple and nested outlines from any PDF.
137
```{note}
148
* **`outline` parameter:** It represents a single bookmark in the PDF. Each bookmark has a title and points to a specific page.
159
* **`get_destination_page_number(outline)`:** It returns the destination page as a **0-indexed integer** (first page = 0).
16-
* It supports both **simple (flat)** outlines and **nested (hierarchical)** outlines where Parent bookmarks contain child bookmarks.
10+
* It supports both **simple (flat)** outlines and **nested (hierarchical)** outlines where Parent bookmarks contain child bookmarks.
1711
```
1812

1913
## How nested outlines are represented internally
2014

21-
When a PDF contains hierarchical bookmarks (Chapter → Section → Topic),
22-
pypdf stores them using **lists inside lists**.
15+
When a PDF contains hierarchical bookmarks (Chapter → Section → Topic),
16+
pypdf stores them using **lists inside lists**.
2317
Each list represents a deeper level of nesting.
2418

2519
### How to interpret the structure
2620

27-
- **A `Destination`** → represents a single bookmark
28-
- **A list** → represents a group of child bookmarks
21+
- **A `Destination`** → represents a single bookmark
22+
- **A list** → represents a group of child bookmarks
2923
- Nesting can be **multiple levels deep** (there is no limit)
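
To make the "lists inside lists" shape concrete, here is a small pure-Python sketch that does not touch pypdf at all. `Bookmark` is a hypothetical stand-in for pypdf's `Destination` and models only the `title` attribute; `flatten_outline` is likewise an illustrative helper, not a pypdf function.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    """Hypothetical stand-in for a pypdf Destination (title only)."""
    title: str


def flatten_outline(items, depth=0):
    """Yield (depth, title) pairs; a nested list means one level deeper."""
    for item in items:
        if isinstance(item, list):
            yield from flatten_outline(item, depth + 1)
        else:
            yield depth, item.title


# Same shape as a nested outline: children follow their parent as a list.
outline = [
    Bookmark("Chapter 1"),
    [Bookmark("Section 1.1"), [Bookmark("Topic 1.1.1")], Bookmark("Section 1.2")],
    Bookmark("Chapter 2"),
]

flat = list(flatten_outline(outline))
```

The same traversal should apply to a real `reader.outline`, since only `isinstance(item, list)` and `item.title` are used.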
3024

3125
### Visual Interpretation (Tree Format)
@@ -71,17 +65,29 @@ This is how pypdf internally represents nested outlines:
7165

7266
Use this method if your PDF has a flat list of outlines.
7367

74-
7568
```{testsetup}
69+
pypdf_test_setup("user/reading-pdf-outlines", {
70+
"example1.pdf": "../resources/example1.pdf",
71+
"output.pdf": "../resources/output.pdf"
72+
})
73+
```
74+
75+
```{testcode}
7676
from pypdf import PdfReader
7777
78-
reader = PdfReader("example.pdf")
78+
reader = PdfReader("example1.pdf")
7979
8080
for outline in reader.outline:
8181
page_num = reader.get_destination_page_number(outline)
8282
print(f"{outline.title} -> page {page_num + 1}")
8383
```
8484

85+
```{testoutput}
86+
:hide:
87+
88+
Introduction -> page 1
89+
```
90+
8591
### What this simple code does:
8692
* Loops through each top-level outline item
8793
* Gets the page number for that outline
@@ -92,26 +98,42 @@ for outline in reader.outline:
 Use this method when your PDF contains parent → child outline items
 
-```python
+```{testcode}
 from pypdf import PdfReader
 
-def print_outline(outlines, level=0):
+def print_outline(outlines, reader, level=0):
     """Recursively print all outline items with indentation."""
     for item in outlines:
-        if isinstance(item, list): # This is a nested list of bookmarks
-            print_outline(item, level + 1)
+        if isinstance(item, list): # nested children
+            print_outline(item, reader, level + 1)
         else:
-            page_num = reader.get_destination_page_number(item)
+            try:
+                page_num = reader.get_destination_page_number(item)
+            except Exception:
+                page_num = None
+
             indent = " " * level
-            print(f"{indent}- {item.title} (Page {page_num + 1})")
 
-reader = PdfReader("example.pdf")
-print_outline(reader.outline)
+            if page_num is None:
+                print(f"{indent}- {item.title} (No page destination)")
+            else:
+                print(f"{indent}- {item.title} (Page {page_num + 1})")
+
+reader = PdfReader("output.pdf")
+print_outline(reader.outline, reader)
 ```
 
-### What this nested code does:
-* Recursively handles nested outline structures
-* Adds indentation for each depth level
-* Prints a clean hierarchical view with page numbers
+```{testoutput}
+:hide:
 
+- Introduction (Page 1)
+- Section 1 (Page 2)
+- Section 2 (Page 3)
+```
 
+
+### What this nested code does:
+* Recursively handles hierarchical outline structures
+* Indents output according to nesting depth
+* Prints page numbers when available
+* Safely handles undefined or invalid destinations

resources/example1.pdf

7.91 MB
Binary file not shown.

resources/input.pdf

7.91 MB
Binary file not shown.

resources/output.pdf

7.91 MB
Binary file not shown.
