Cloning entire PDF along with catalog properties #159

sdfereday · 2019-08-09T15:06:59Z

Hello,
Is it possible to clone an entire PDF and include all of the original PDF's properties, such as the catalog with its acroForm, etc.? I'm attempting to reconstruct a new PDF from an old PDF, only with fewer pages. At present I've just got a little test function:

import { PDFDocument } from 'pdf-lib'

const includePages = async originalDoc => {
  // Create a new document to house the pages 'still' wanted
  const modifiedDoc = await PDFDocument.create()

  // Copy in the properties from the original (catalog, etc)
  // ... ¯\_(ツ)_/¯
  
  // Copy pages 0 and 1 only for example (say our original PDF has like 5 pages)
  const pages = await modifiedDoc.copyPages(originalDoc, [0, 1])
  pages.forEach(page => modifiedDoc.addPage(page))

  // Return the modified document with pages 0 and 1 only
  return modifiedDoc
}

At present the modified document's catalog only carries across 'type' and 'pages' in its dictionary. It's missing other entries such as the 'acroForm', so when I go to modify some of the interactive fields later on, I get an error because the form no longer exists.

Is this a thing anyone's come across and successfully implemented?

Thanks very much :)

Hopding · 2019-12-27T18:09:22Z

Hello @sdfereday!

I think that copying specific structures from a donor PDF makes sense for certain use cases (e.g. #218). But this can be tricky, because automatically copying all the catalog structures will probably result in undesired or unexpected behavior most of the time (e.g. nonsensically merged outlines, conflicting AcroFields, etc.). So this would need to be implemented carefully, perhaps on a case-by-case basis.

However, for the specific use case you're asking about here (constructing a new PDF from an old PDF, only with fewer pages), it seems to me that you could just load the PDF and remove pages with PDFDocument.removePage. For example:

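import { PDFDocument } from 'pdf-lib';

const pdfDoc = await PDFDocument.load(pdfBytes);
pdfDoc.removePage(0); // removes the original first page
pdfDoc.removePage(2); // indices have shifted: this is the original page at index 3
const newPdf = await pdfDoc.save();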
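
Since each removePage call shifts the indices of the pages after it, it can be easier to say which pages to keep and remove the rest from the end backwards. The following is only a rough sketch: keepPages is a hypothetical helper, not part of pdf-lib, and it assumes nothing about the document beyond its getPageCount and removePage methods.

```javascript
// Hypothetical helper (not part of pdf-lib): keep only the pages whose
// zero-based indices are listed in `keepIndices`, removing all others.
function keepPages(pdfDoc, keepIndices) {
  const keep = new Set(keepIndices);
  // Walk from the last page to the first so that removing a page never
  // shifts the index of a page that has not been visited yet.
  for (let i = pdfDoc.getPageCount() - 1; i >= 0; i--) {
    if (!keep.has(i)) pdfDoc.removePage(i);
  }
  return pdfDoc;
}

// Possible usage, assuming `pdfBytes` holds the original file:
//   const pdfDoc = await PDFDocument.load(pdfBytes);
//   keepPages(pdfDoc, [0, 1]);
//   const newPdf = await pdfDoc.save();
```

Because the document is modified in place rather than rebuilt with copyPages, the catalog entries of the original (including the form) should stay attached to it.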

I hope this helps. Please let me know if you have any additional questions! And my apologies for the (very) delayed response.

Hopding mentioned this issue Oct 15, 2019

[Feature Request]: Preserve bookmarks when copying pages #218

Closed

Hopding closed this as completed Dec 27, 2019

Hopding mentioned this issue Dec 28, 2019

[Feature Request]: PDF should decrease in file size after calling removePage #140

Closed

Hopding mentioned this issue Feb 9, 2020

Links are lost after combining PDFs #341

Closed

Hopding mentioned this issue Apr 11, 2020

Existing tag structures are not maintained when copying pages. #402

Closed

Hopding mentioned this issue Apr 28, 2020

How to join different PDFs with AcroForms? #417

Closed

Hopding mentioned this issue Sep 28, 2020

Layer names lost when embedding pdf #521

Closed
