Clone Objects #1194

Closed
pubpub-zz opened this issue Aug 1, 2022 · 2 comments
Labels
is-feature A feature request

Comments

@pubpub-zz
Collaborator

Explanation

I've noticed that some (many?) issues come down to PyPDF2 lacking the ability to clone objects (along with all the objects beneath them).
For example, cloning would make it possible to insert the same page multiple times while modifying each copy independently.

Code Example

The proposed interface would be (for all types of objects):
object.clone(pdf_dest, force_duplicate=False) -> duplicated_object
pdf_dest would be a PdfWriter or PdfMerger.
By default, an indirect object that has already been duplicated is kept unique (the existing duplicate is reused), but setting force_duplicate to True forces a new duplication.
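
To make the intended semantics concrete, here is a minimal sketch that does not use PyPDF2 at all. `Destination` and `Node` are hypothetical stand-ins for a PdfWriter/PdfMerger and an indirect object, and the `_memo` argument is an assumption added only so that cyclic references terminate; none of this is the library's actual implementation.

```python
class Destination:
    """Hypothetical stand-in for a PdfWriter/PdfMerger."""

    def __init__(self):
        self.objects = []          # every object stored in the destination
        self.already_cloned = {}   # source object -> its duplicate in this destination


class Node:
    """Hypothetical stand-in for an indirect object that references other objects."""

    def __init__(self, name, refs=None):
        self.name = name
        self.refs = list(refs or [])

    def clone(self, pdf_dest, force_duplicate=False, _memo=None):
        # Default: reuse the destination-wide map, so duplicates stay unique.
        # force_duplicate: use a fresh map, so this call builds brand-new copies.
        if _memo is None:
            _memo = {} if force_duplicate else pdf_dest.already_cloned
        if self in _memo:
            return _memo[self]
        duplicate = Node(self.name)
        _memo[self] = duplicate    # register before recursing so cycles terminate
        pdf_dest.objects.append(duplicate)
        duplicate.refs = [ref.clone(pdf_dest, force_duplicate, _memo) for ref in self.refs]
        return duplicate


font = Node("font")
page = Node("page", [font])
dest = Destination()

first = page.clone(dest)                         # duplicates page and font
second = page.clone(dest)                        # reuses the existing duplicates
third = page.clone(dest, force_duplicate=True)   # builds a new, independent copy
```

In this sketch `first` and `second` are the same object, while `third` and the font it references are separate copies.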

Any feedback?

MartinThoma added the is-feature (A feature request) label Aug 6, 2022
@MartinThoma
Member

Sounds good!

I couldn't pin-point specific issues that were caused by this missing feature, but I also have the impression it would be helpful in some cases.

I'm uncertain what a good interface would be. Would the proposed interface mean we could take an object from a PdfReader and clone it to a PdfWriter? What would happen to (indirect) objects referenced within the cloned object? Do we maybe need a deep parameter (where False would mean a shallow copy of only the cloned object, and deep would mean a recursive approach)?

MartinThoma pushed a commit that referenced this issue Dec 11, 2022
The method `.clone(pdf_dest,[force_duplicate])` clones an object and all the objects it references.

If an object has already been cloned, the existing clone is returned (unless force_duplicate is set).
The method is mainly for internal use but can also be called on a page.
For PageObject/DictionaryObject/[Encoded/Decoded/Content]Stream, an extra parameter, ignore_fields, lists the fields that should not be cloned.

When available, the pointer to an object is exposed through the `indirect_obj` attribute.

New API for add_page/insert_page:

* they return the cloned page object
* ignore_fields can be provided as a parameter
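
As a rough illustration of why returning the clone is useful, the sketch below models a page as a plain dict. `Writer` and `clone_dict` are hypothetical stand-ins, not the library's classes, and applying ignore_fields only to the top-level keys is an assumption made for this sketch.

```python
def clone_dict(source, ignore_fields=()):
    """Copy a dict recursively, skipping the top-level keys listed in ignore_fields."""
    result = {}
    for key, value in source.items():
        if key in ignore_fields:
            continue
        # Nested dicts are copied; any other value is shared as-is.
        result[key] = clone_dict(value) if isinstance(value, dict) else value
    return result


class Writer:
    """Hypothetical stand-in for a PdfWriter that only keeps a list of pages."""

    def __init__(self):
        self.pages = []

    def add_page(self, page, ignore_fields=()):
        cloned = clone_dict(page, ignore_fields)
        self.pages.append(cloned)
        return cloned  # the caller can modify the clone without touching the source


source_page = {
    "/Type": "/Page",
    "/Annots": ["note"],
    "/Resources": {"/Font": {"/F1": "Helvetica"}},
}

writer = Writer()
stamped = writer.add_page(source_page, ignore_fields=("/Annots",))
stamped["/Resources"]["/Font"]["/F1"] = "Courier"   # only this copy changes
plain = writer.add_page(source_page)                # same source page, inserted again
```

Because each call returns its own clone, the same source page can be inserted twice and the two copies edited independently.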

## Others

* The file is closed at the end of PdfWriter.write when a filename is provided
* Breaking Change: `add_outline_item` now has a `before` parameter, which is not the last parameter

## Update
* The public API of PdfMerger has been added to PdfWriter (so PdfMerger is ready to become an alias of it)
* Outline merging is processed properly
* Named destinations are processed properly

Deals with #1194, #1322, #471, #1337
@pubpub-zz
Collaborator Author

Cloning has been introduced.
