Find headings in pdfs #329

Closed
vekunz opened this issue Jan 23, 2020 · 1 comment

Comments


vekunz commented Jan 23, 2020

Hello,
I'm not sure whether this is even possible, so I'm asking. I'm using pdf-lib in a pipeline where I first create PDFs from markdown and then merge some of them together. Currently, I have to create the table of contents manually and update it every time I change the source.
So my question is: would it be possible to add a feature that detects which page a specific heading is on?

vekunz changed the title from "Find headingin pdfs" to "Find headings in pdfs" on Jan 23, 2020
Owner

Hopding commented Jan 26, 2020

Hello @vekunz!

I'm afraid this is not really possible to do with pdf-lib today. It would require a lot of custom code to extract and parse text, which is not an easy or straightforward thing to do with PDFs (see #93 and #137).

To be clear, it is technically possible, and libraries like pdf.js that are designed for reading PDF documents (as opposed to creating/editing them) can do it. It's just that pdf-lib doesn't have the facilities to make this easy today.
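
For illustration only, here is a minimal sketch of the lookup step, assuming the text of each page has already been extracted by a reader library (for example, with pdf.js, by joining the `str` fields of the items returned from `page.getTextContent()`). The names `normalize` and `findHeadingPages` are hypothetical helpers, not part of pdf-lib or pdf.js, and plain substring matching is an assumption that may be too naive for real documents.

```typescript
// Hypothetical helpers; neither function exists in pdf-lib or pdf.js.
// Assumption: pageTexts[i] holds the already-extracted text of page i + 1.

/** Collapse whitespace and lowercase so small extraction differences don't break matching. */
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

/**
 * Return the 1-based page number of the first page whose text contains each heading.
 * Headings that are not found on any page are left out of the result.
 */
function findHeadingPages(pageTexts: string[], headings: string[]): Map<string, number> {
  const normalizedPages = pageTexts.map(normalize);
  const result = new Map<string, number>();
  for (const heading of headings) {
    const needle = normalize(heading);
    if (needle === "") continue; // an empty heading would match every page
    const index = normalizedPages.findIndex((page) => page.includes(needle));
    if (index !== -1) result.set(heading, index + 1);
  }
  return result;
}

// Usage with made-up page text:
const pages = [
  "Introduction\nThis document describes the pipeline.",
  "More introduction text.",
  "Merging   PDFs\nHow the files are combined.",
];
const toc = findHeadingPages(pages, ["Introduction", "Merging PDFs", "Appendix"]);
console.log(toc);
```

Note that this matches any occurrence of the heading text, so a heading that is also quoted in body text on an earlier page would be reported on that earlier page. Telling headings apart from body text would likely need the font or position data that a reader library exposes, which is where most of the custom work lies.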
