[Monster PR] Upgrade to PDFBox 2.0 (#150)
* Starting with upgrade to PDFBox 2.0 (#52)

* 2.0

* little progress in upgrading to pdfbox 2

* upgrade to pdfbox 2 starting to show signs of life

* Fix TextElement creation

* fix tabs

* Use the code from LegacyPDFStreamEngine to create the TextElements

* Fix removeText function using the example:

org.apache.pdfbox.examples.util.RemoveAllText

* close the document

* close removed text document

* fix array serialization

* add spanning cells test with CSV format

* - Remove capheight calculation
- Temporarily set height

* Test the writer with two tables by checking the JSON result object instead of the string

Add a two-table writer test for CSV output

* Fix pageTransform when there is a rotation
Add more csv tests

* fix path iterator

* update json tests

* update json outputs

* upgrade pdfbox version

* back to the old implementation and catch the IndexOutOfBoundsException

* Remove hardcoded code

* Remove more hardcoded code

* test all the elements of the detected table

* Change the expected table top value

* Increase the threshold factor to support larger headings

* Fix rectangle comparator.

* fix wrong expected column size, 5 instead of 6.

add more tests

* update expected table; more spaces are expected in order to preserve the alignment.

* when the text value has length > 1, clean the spaces.

* clean code

* remove stack trace

* add error logging

* upgrade all dependencies

* code formatting

* setting pom to snapshot version
jazzido authored Mar 27, 2017
1 parent fc0eff4 commit f4c094e
Showing 28 changed files with 2,055 additions and 1,513 deletions.
28 changes: 17 additions & 11 deletions pom.xml
@@ -2,7 +2,7 @@
<modelVersion>4.0.0</modelVersion>
<groupId>technology.tabula</groupId>
<artifactId>tabula</artifactId>
-<version>0.9.2</version>
+<version>1.0.0-SNAPSHOT</version>
<name>Tabula</name>
<description>Extract tables from PDF files</description>
<url>http://github.com/tabulapdf/tabula-java</url>
@@ -36,7 +36,7 @@
<connection>scm:git:git@github.com:tabulapdf/tabula-java.git</connection>
<developerConnection>scm:git:git@github.com:tabulapdf/tabula-java.git</developerConnection>
<url>git@github.com:tabulapdf/tabula-java.git</url>
-<tag>tabula-0.9.2</tag>
+<tag>tabula-1.0.0-SNAPSHOT</tag>
</scm>

<repositories>
@@ -134,8 +134,8 @@
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
-<source>1.6</source>
-<target>1.6</target>
+<source>1.7</source>
+<target>1.7</target>
</configuration>
</plugin>
<plugin>
@@ -222,31 +222,37 @@
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
-<version>1.7.21</version>
+<version>1.7.25</version>
</dependency>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
-<version>1.7.21</version>
+<version>1.7.25</version>
</dependency>

<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
-<version>1.8.12</version>
+<version>2.0.5</version>
</dependency>

+<dependency>
+<groupId>org.apache.pdfbox</groupId>
+<artifactId>pdfbox-tools</artifactId>
+<version>2.0.5</version>
+</dependency>
+
<dependency>
<groupId>org.bouncycastle</groupId>
<artifactId>bcprov-jdk15on</artifactId>
-<version>1.55</version>
+<version>1.56</version>
</dependency>

<dependency>
<groupId>org.bouncycastle</groupId>
<artifactId>bcmail-jdk15on</artifactId>
-<version>1.55</version>
+<version>1.56</version>
</dependency>

<dependency>
@@ -259,7 +265,7 @@
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
-<version>1.3.1</version>
+<version>1.4</version>
</dependency>

<dependency>
@@ -271,7 +277,7 @@
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
-<version>2.7</version>
+<version>2.8.0</version>
</dependency>
</dependencies>
