Cloning entire PDF along with catalog properties #159

Closed
sdfereday opened this issue Aug 9, 2019 · 1 comment

Comments

@sdfereday

Hello,
Is it possible to clone an entire PDF and include all of the original PDF's properties, such as the catalog with its AcroForm, etc.? I'm attempting to construct a new PDF from an old PDF, only with fewer pages. At present I've just got a little test function:

import { PDFDocument } from 'pdf-lib'

const includePages = async originalDoc => {
  // Create a new document to house the pages 'still' wanted
  const modifiedDoc = await PDFDocument.create()

  // Copy in the properties from the original (catalog, etc)
  // ... ¯\_(ツ)_/¯
  
  // Copy pages 0 and 1 only for example (say our original PDF has like 5 pages)
  const pages = await modifiedDoc.copyPages(originalDoc, [0, 1])
  pages.forEach(page => modifiedDoc.addPage(page))

  // Return the modified document with pages 0 and 1 only
  return modifiedDoc
}

At present the modified document's catalog only carries across 'Type' and 'Pages' in its dictionary, and it's missing a few other entries such as 'AcroForm'. So when I go to modify some of the interactive fields later on, it produces an error, since the form no longer exists.

Is this a thing anyone's come across and successfully implemented?

Thanks very much :)

@Hopding
Owner

Hopding commented Dec 27, 2019

Hello @sdfereday!

I think that copying specific structures from a donor PDF makes sense for certain use cases (e.g. #218). But this can be tricky, because automatically copying all the catalog structures will probably result in undesired/unexpected behavior most of the time (e.g. nonsensically merged outlines, conflicting AcroFields, etc.). So this would need to be implemented carefully, perhaps on a case-by-case basis.

However, for the specific use case you're asking about here (constructing a new PDF from an old PDF, only with fewer pages), it seems to me that you could just load the PDF and remove pages with PDFDocument.removePage. For example:

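import { PDFDocument } from 'pdf-lib';

const pdfDoc = await PDFDocument.load(pdfBytes);
pdfDoc.removePage(0);
// Page indices shift after each removal, so this removes the original page at index 3
pdfDoc.removePage(2);
const newPdf = await pdfDoc.save();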
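
Because page indices shift after every removePage call, it can be easier to say which pages to keep and remove the rest from the end backwards. Here is a minimal sketch of that idea. keepPages is a hypothetical helper, not part of pdf-lib, and it assumes only that the document exposes getPageCount() and removePage(index), as PDFDocument does:

```javascript
// Hypothetical helper (not part of pdf-lib): keep only the listed zero-based
// page indices of a loaded document by removing every other page.
// `doc` only needs getPageCount() and removePage(index).
function keepPages(doc, keepIndices) {
  const keep = new Set(keepIndices);
  const total = doc.getPageCount();
  const removed = [];
  // Walk from the last page to the first so that removing a page never
  // shifts the index of a page that has not been visited yet.
  for (let i = total - 1; i >= 0; i--) {
    if (!keep.has(i)) {
      doc.removePage(i);
      removed.push(i);
    }
  }
  // Indices that were removed, in the order they were removed.
  return removed;
}
```

For the five-page example from the question, calling keepPages(pdfDoc, [0, 1]) on the loaded document before pdfDoc.save() should leave only the first two pages, with the original catalog still in place.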

I hope this helps. Please let me know if you have any additional questions! And my apologies for the (very) delayed response.
