Every line is transformed as header element in markdown #74

Open
saravmajestic opened this issue Sep 22, 2023 · 1 comment

@saravmajestic

Describe the bug
Thanks for this library; it is very helpful. I am seeing an odd issue: when parsing this PDF, every line is converted to a header element instead of being kept as sentences or paragraphs. The same issue occurs with the original repo, tested here: https://pdf2md.morethan.io/

To Reproduce
Steps to reproduce the behavior:

  1. Call @opendocsg/pdf2md from the command line with the above file as input
  2. Check the output: every line is emitted as a header

Expected behavior
Body text in the PDF should not be treated as a header, for example: Selecting the “right” amount of information to include in a summary is a difficult task. A good...
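
One rough way to quantify the symptom is to measure what share of the non-empty lines in the generated Markdown are ATX headings (`#` through `######`). The sketch below is only an illustration: `heading_ratio` and the sample string are hypothetical and not part of pdf2md, and the check does not skip fenced code blocks or recognise underline-style (setext) headings.

```python
import re

# ATX heading: up to 3 leading spaces, 1-6 '#', then whitespace or end of line.
ATX_HEADING = re.compile(r"^ {0,3}#{1,6}(?:\s|$)")


def heading_ratio(markdown: str) -> float:
    """Return the fraction of non-empty lines that are ATX headings."""
    lines = [line for line in markdown.splitlines() if line.strip()]
    if not lines:
        return 0.0
    headings = sum(1 for line in lines if ATX_HEADING.match(line))
    return headings / len(lines)


# Hypothetical sample resembling the reported symptom, not real pdf2md output.
sample = "# Title\n\n## Selecting the right amount\n\n## of information to include\n"
print(heading_ratio(sample))  # prints 1.0
```

A ratio close to 1.0 on a document that is mostly running text would match the behavior described here; an ordinary article should land well below that.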

Desktop (please complete the following information):

  • OS: macOS
  • Browser: N/A
  • Version: N/A

Additional context
Since this issue also exists in the original repo, it would be great if you could point me to how to resolve it. Much appreciated!

@graylewis

I'm having the same issue on the latest version.
