tweak README, add demo/banner image

0xabu · Feb 20, 2022 · 0e68261
1 parent 3075620
commit 0e68261

Showing 2 changed files with 15 additions and 14 deletions.
diff --git a/README.md b/README.md
@@ -4,10 +4,12 @@
 [![PyPI version](https://img.shields.io/pypi/v/pdfannots)](https://pypi.org/project/pdfannots/)
 
 This program extracts annotations (highlights, comments, etc.) from a PDF file,
-and formats them in a variety of ways. It is primarily intended for use in
-reviewing submissions to scientific conferences/journals.
+and formats them as Markdown or exports them to JSON. It is primarily intended
+for use in reviewing submissions to scientific conferences/journals.
 
-For the default markdown format, the output is as follows:
+![Sample/demo of pdfannots extracting Markdown from an annotated PDF](doc/demo.png)
+
+For the default Markdown format, the output is as follows:
 
  * Highlights without an attached comment are output first, as
  "highlights" with just the highlighted text included. Note that
@@ -25,11 +27,11 @@ For the default markdown format, the output is as follows:
  of this is to easily separate formatting or grammatical corrections
  from more substantial comments about the content of the document.
 
-For each annotation, the page number is given, along with the
-associated (highlighted/underlined) text, if any. Additionally, if the
-document embeds outlines (aka bookmarks), such as those generated by
-the LaTeX hyperref package, they are printed to help identify to which
-section in the document the annotation refers.
+For each annotation, the page number is given, along with the associated
+(highlighted/underlined) text, if any. Additionally, if the document embeds
+outlines (aka bookmarks), such as those generated by the LaTeX
+[hyperref](https://ctan.org/pkg/hyperref) package, they are printed to help
+identify to which section in the document the annotation refers.
 
 
 ### Usage
@@ -47,8 +49,8 @@ options and invocation.
 ### Known issues and limitations
 
  * While it is generally reliable, pdfminer (the underlying PDF parser) is
- less accurate than other tools (Poppler's pdftotext) at extracting text
- from a PDF. It has been known to fail in several different ways:
+ not infallible at extracting text from a PDF. It has been known to fail
+ in several different ways:
 
  * Sometimes it misses or misplaces individual characters, resulting in
  annotations with some or all of the text missing (in the latter case,
@@ -83,8 +85,8 @@ options and invocation.
 
  1. I'd like to change how the output is formatted.
 
- Some minor tweaks (e.g.: word wrap, skipping sections) can be accomplished
- via command-line arguments.
+ Some minor tweaks (e.g.: word wrap, skipping or reordering output sections)
+ can be accomplished via command-line arguments.
 
  All of the output comes from the relevant `Printer` subclass; more elaborate
  changes can be accomplished there. Pull requests to introduce new output
@@ -95,5 +97,4 @@ options and invocation.
  I hope that it was a constructive review, and that the annotations
  helped the reviewer give you more detailed feedback so you can improve
  your paper. This is, after all, just a tool, and it should not be an
- excuse for reviewer sloppiness. Note that I am not the only user of
- this script.
+ excuse for reviewer sloppiness.
diff --git a/doc/demo.png b/doc/demo.png