Error when trying to write multi-page PDF file #143

Closed
slishak opened this issue Oct 31, 2018 · 5 comments

slishak commented Oct 31, 2018

I appreciate that this is not necessarily an issue with Plotly or Orca, but I'm trying to use PyPDF2 to write a multiple page PDF file, as per the following Python example:

import plotly.io as pio
import plotly.graph_objs as go
import io
from PyPDF2 import PdfFileMerger
import numpy as np


def make_figure():
    """Function to make random figure"""
    x = np.linspace(0, 10, 100)
    y = np.random.randint(1, 100, 100)

    trace = go.Scatter(x=x, y=y, mode='markers')
    return go.Figure(data=[trace])


use_stream = False
merger = PdfFileMerger()

for i in range(3):
    print('Page {}'.format(i+1))

    if use_stream:
        pdf_file = io.BytesIO()
    else:
        pdf_file = '{}.pdf'.format(i)

    pio.write_image(make_figure(), pdf_file, 'pdf')

    if use_stream:
        pdf_file.seek(0)
    merger.append(pdf_file)

merger.write('merged.pdf')

All of the individual PDFs are written correctly, but PyPDF2 fails to merge them with the following error:
PyPDF2.utils.PdfReadError: Multiple definitions in dictionary at byte 0x8f0 for key /Type

This happens whether I write the files to disk or use an in-memory BytesIO stream.

Just wondering whether you think this is an issue with PyPDF2 or with Orca? I'm leaning towards thinking there's actually a bug in both, because PyPDF2 has this open issue: py-pdf/pypdf#325, but looking in the raw bytes of the PDF files written by Orca I can see repeated /Type /ExtGState lines which are maybe the root cause.

I'd also be interested in being able to append a new page to a PDF document directly from Plotly/Orca so that I don't have to use another library!
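The repeated /Type entries mentioned above can be looked for without any PDF library. The following is a rough sketch only: `find_repeated_key` is a hypothetical helper, not part of Plotly, Orca or PyPDF2, and it relies on a naive regular-expression scan of the raw bytes. It assumes the dictionaries of interest are stored uncompressed, and it can be confused by `<<` sequences inside string literals or binary streams.

```python
import re

# Innermost "<< ... >>" dictionaries, i.e. ones with no nested "<<" or ">>".
_DICT_RE = re.compile(rb"<<((?:(?!<<|>>).)*)>>", re.DOTALL)
# A PDF name token: "/" followed by characters that are neither whitespace
# nor PDF delimiters.
_NAME_RE = re.compile(rb"/[^\s/<>\[\](){}%]+")


def find_repeated_key(pdf_bytes, key=b"/Type"):
    """Return (byte_offset, count) for each innermost dictionary in which
    the name token `key` appears more than once.

    Heuristic: every occurrence of `key` is counted, whether it is used as
    a dictionary key or as a value.
    """
    hits = []
    for match in _DICT_RE.finditer(pdf_bytes):
        count = sum(1 for name in _NAME_RE.findall(match.group(1)) if name == key)
        if count > 1:
            hits.append((match.start(), count))
    return hits
```

Calling `find_repeated_key(open('0.pdf', 'rb').read())` on one of the files written by the script above would list the byte offsets of any suspicious dictionaries, which can then be compared with the offset reported in the PyPDF2 error.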

Author

slishak commented Nov 2, 2018

Can be fixed by using:

merger = PdfFileMerger(strict=False)

Contributor

etpinard commented Nov 19, 2018

Have you tried merging orca PDF exports with another PDF utility, e.g. pdftk (https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/)?
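For reference, a minimal sketch of what such a test could look like from Python, assuming the `pdftk` executable is installed and on the PATH. The helper names `build_pdftk_command` and `merge_with_pdftk` are made up for illustration; the command itself uses pdftk's `cat` operation, which concatenates the input files in the order given.

```python
import subprocess


def build_pdftk_command(inputs, output):
    """Build the argument list for: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return ["pdftk", *inputs, "cat", "output", output]


def merge_with_pdftk(inputs, output):
    """Run pdftk to merge the inputs; raises CalledProcessError if pdftk fails."""
    subprocess.run(build_pdftk_command(inputs, output), check=True)
```

With the files from the original script, the call would be `merge_with_pdftk(['0.pdf', '1.pdf', '2.pdf'], 'merged.pdf')`.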

Author

slishak commented Nov 19, 2018

Yes, that seems to work fine. I also found that merging works if you disable strict mode in PyPDF2.PdfFileMerger, so this issue isn't actually limiting me at the moment. I just get a lot of warnings printed to the console now!

Contributor

etpinard commented Nov 19, 2018

Thanks for the reply. It sounds like orca PDF exports are ok.

I'll close this issue. We aren't planning on adding multi-page (multi-graph?) exports for the moment.

Author

slishak commented Nov 19, 2018

Agreed that they're "OK" and happy to have this issue closed - although looking at what's written in the issue linked in the original post, they do technically seem to be violating the PDF standard!
