From 04bc965eda6944753e1158241042b5575f8d29f2 Mon Sep 17 00:00:00 2001
From: Ritchie Poh
Date: Tue, 9 May 2023 17:06:29 +0800
Subject: [PATCH 1/6] Update comparisons.md with pdfplumber

Added my take on the `pdfplumber` library compared to PyPDF.
---
 docs/meta/comparisons.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/docs/meta/comparisons.md b/docs/meta/comparisons.md
index ccf04134c..46ac34b43 100644
--- a/docs/meta/comparisons.md
+++ b/docs/meta/comparisons.md
@@ -50,13 +50,18 @@ than PyPDF2. See [history of pypdf](history.md).
 
 [QPDF]: https://github.com/qpdf/qpdf
 
-## pdfminer
+## pdfminer.six
 
 [`pdfminer.six`](https://pypi.org/project/pdfminer.six/) is capable of
 extracting the [font size](https://stackoverflow.com/a/69962459/562769)
 / font weight (bold-ness). It has no capabilities for writing PDF files.
 
-## pdfrw / pdfminer / pdfplumber
+## pdfplumber
+[`pdfplumber`](https://pypi.org/project/pdfplumber/) is a library focused mainly on extracting data from the file. Since `pdfplumber` is built on top of `pdfminer.six`, there are **no capabilities of exporting or modifying a PDF file**. It was also mentioned in [#440 (discussions)](https://github.com/jsvine/pdfplumber/discussions/440#discussioncomment-803880). However, `pdfplumber` is capable to convert a PDF file into an image, draw lines and rectangles on the image, and save it as an image file.
+
+The community over at `pdfplumber` is also active in answering questions and the library is actively maintained as of now.
+
+## pdfrw / pdfminer
 
 I don't have experience with any of those libraries. Please add a comparison if
 you know pypdf and [`pdfrw`](https://pypi.org/project/pdfrw/)!
@@ -66,8 +71,6 @@ Please be aware that there is also
 Then there is [`pdfrw2`](https://pypi.org/project/pdfrw2/) which doesn't have
 a large community behind it.
 
-And there is also [`pdfplumber`](https://pypi.org/project/pdfplumber/)
-
 ## Document Generation
 
 There are (Python) [tools to generate PDF documents](https://github.com/py-pdf/awesome-pdf#generators).

From 996564f8ccde9fce510bb93e5a8cf73136c108c9 Mon Sep 17 00:00:00 2001
From: Ritchie Poh
Date: Fri, 19 May 2023 09:21:43 +0800
Subject: [PATCH 2/6] Update docs/meta/comparisons.md

Co-authored-by: Martin Thoma
---
 docs/meta/comparisons.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/meta/comparisons.md b/docs/meta/comparisons.md
index 46ac34b43..eb806c027 100644
--- a/docs/meta/comparisons.md
+++ b/docs/meta/comparisons.md
@@ -50,7 +50,7 @@ than PyPDF2. See [history of pypdf](history.md).
 
 [QPDF]: https://github.com/qpdf/qpdf
 
-## pdfminer.six
+## pdfminer.six and pdfplumber
 
 [`pdfminer.six`](https://pypi.org/project/pdfminer.six/) is capable of
 extracting the [font size](https://stackoverflow.com/a/69962459/562769)

From 720a70790fd451a0e7bf4ec701ca3f2a404df22a Mon Sep 17 00:00:00 2001
From: Ritchie Poh
Date: Fri, 19 May 2023 09:21:54 +0800
Subject: [PATCH 3/6] Update docs/meta/comparisons.md

Co-authored-by: Martin Thoma
---
 docs/meta/comparisons.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/meta/comparisons.md b/docs/meta/comparisons.md
index eb806c027..92f8f5560 100644
--- a/docs/meta/comparisons.md
+++ b/docs/meta/comparisons.md
@@ -56,7 +56,6 @@ than PyPDF2. See [history of pypdf](history.md).
 extracting the [font size](https://stackoverflow.com/a/69962459/562769)
 / font weight (bold-ness). It has no capabilities for writing PDF files.
 
-## pdfplumber
 [`pdfplumber`](https://pypi.org/project/pdfplumber/) is a library focused mainly on extracting data from the file. Since `pdfplumber` is built on top of `pdfminer.six`, there are **no capabilities of exporting or modifying a PDF file**. It was also mentioned in [#440 (discussions)](https://github.com/jsvine/pdfplumber/discussions/440#discussioncomment-803880). However, `pdfplumber` is capable to convert a PDF file into an image, draw lines and rectangles on the image, and save it as an image file.
 
 The community over at `pdfplumber` is also active in answering questions and the library is actively maintained as of now.

From b9d5e076b75f28516b952ee80a1a1677333bb9cb Mon Sep 17 00:00:00 2001
From: Martin Thoma
Date: Sat, 20 May 2023 07:00:52 +0200
Subject: [PATCH 4/6] Update docs/meta/comparisons.md

---
 docs/meta/comparisons.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/meta/comparisons.md b/docs/meta/comparisons.md
index 92f8f5560..36b366749 100644
--- a/docs/meta/comparisons.md
+++ b/docs/meta/comparisons.md
@@ -56,7 +56,7 @@ than PyPDF2. See [history of pypdf](history.md).
 extracting the [font size](https://stackoverflow.com/a/69962459/562769)
 / font weight (bold-ness). It has no capabilities for writing PDF files.
 
-[`pdfplumber`](https://pypi.org/project/pdfplumber/) is a library focused mainly on extracting data from the file. Since `pdfplumber` is built on top of `pdfminer.six`, there are **no capabilities of exporting or modifying a PDF file**. It was also mentioned in [#440 (discussions)](https://github.com/jsvine/pdfplumber/discussions/440#discussioncomment-803880). However, `pdfplumber` is capable to convert a PDF file into an image, draw lines and rectangles on the image, and save it as an image file.
+[`pdfplumber`](https://pypi.org/project/pdfplumber/) is a library focused on extracting data from PDF documents. Since `pdfplumber` is built on top of `pdfminer.six`, there are **no capabilities of exporting or modifying a PDF file** (see [#440 (discussions)](https://github.com/jsvine/pdfplumber/discussions/440#discussioncomment-803880)). However, `pdfplumber` is capable of converting a PDF file into an image, [draw lines and rectangles on the image](https://github.com/jsvine/pdfplumber#drawing-methods), and save it as an image file.
 
 The community over at `pdfplumber` is also active in answering questions and the library is actively maintained as of now.
 

From bbd1f260e3227773dd3d3a60a000bcfb40501b4c Mon Sep 17 00:00:00 2001
From: Martin Thoma
Date: Sat, 20 May 2023 07:02:05 +0200
Subject: [PATCH 5/6] Update docs/meta/comparisons.md

---
 docs/meta/comparisons.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/meta/comparisons.md b/docs/meta/comparisons.md
index 36b366749..cac39e4f2 100644
--- a/docs/meta/comparisons.md
+++ b/docs/meta/comparisons.md
@@ -58,7 +58,7 @@ extracting the [font size](https://stackoverflow.com/a/69962459/562769)
 
 [`pdfplumber`](https://pypi.org/project/pdfplumber/) is a library focused on extracting data from PDF documents. Since `pdfplumber` is built on top of `pdfminer.six`, there are **no capabilities of exporting or modifying a PDF file** (see [#440 (discussions)](https://github.com/jsvine/pdfplumber/discussions/440#discussioncomment-803880)). However, `pdfplumber` is capable of converting a PDF file into an image, [draw lines and rectangles on the image](https://github.com/jsvine/pdfplumber#drawing-methods), and save it as an image file.
 
-The community over at `pdfplumber` is also active in answering questions and the library is actively maintained as of now.
+The `pdfplumber` community is active in answering questions and the library is maintained as of May 2023.
 
 ## pdfrw / pdfminer
 

From 2b6a7b24f17d45a3dcaab28ad4c250421bfacf96 Mon Sep 17 00:00:00 2001
From: Martin Thoma
Date: Sat, 20 May 2023 07:02:56 +0200
Subject: [PATCH 6/6] Update docs/meta/comparisons.md

---
 docs/meta/comparisons.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/meta/comparisons.md b/docs/meta/comparisons.md
index cac39e4f2..d38861495 100644
--- a/docs/meta/comparisons.md
+++ b/docs/meta/comparisons.md
@@ -60,7 +60,7 @@ extracting the [font size](https://stackoverflow.com/a/69962459/562769)
 
 The `pdfplumber` community is active in answering questions and the library is maintained as of May 2023.
 
-## pdfrw / pdfminer
+## pdfrw / pdfrw2
 
 I don't have experience with any of those libraries. Please add a comparison if
 you know pypdf and [`pdfrw`](https://pypi.org/project/pdfrw/)!