petermr edited this page Aug 21, 2020 · 8 revisions

AMI SVG

Analyzes the svg/ directory of each CTree for text and paths.

creation of SVG

	@Test
	public void testExtractVectors() {
		File projectDir = TEST_VECTOR;
		File treeDir = new File(projectDir, "PMC4491181");
		File targetDir = new File(TARGET_VECTOR, "create/");
		CMineTestFixtures.cleanAndCopyDir(projectDir, targetDir);

		// Build the "ami pdfbox" command line, which creates the SVG files from the PDFs
		String cmd = ""
				+ " -vv"
				+ " --forcemake"
//				+ " -t " + treeDir
				+ " -p " + targetDir
				+ " pdfbox"
				+ " --maxprimitives=100000"
//				+ " --pages=4 5"
				;
//		AMI.execute(cmd);

creation of paths

Not sure whether --panels is required.

		// Build the "ami svg" command line, which writes paths.svg for each page
		cmd = ""
				+ " -vv"
				+ " --forcemake"
//				+ " -t " + treeDir
				+ " -p " + projectDir
				+ " svg"
				+ " --panels "
				;
		AMI.execute(cmd);
	}

output

Generic values (AMISVGTool)
================================
input basename      null
input basename list null
cproject            /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10
ctree               
cTreeList           10 trees [/Users/pm286/workspace/cmdev/ami3/src/test/resour
excludeBase         {}
excludeTrees        {}
forceMake           true
includeBase         {}
includeTrees        null
log4j               {}
verbose             2

Specific values (AMISVGTool)
================================
Command line options for 'ami svg':
--caches            : d      null
--pages             : d      null
--panels            : m {xwidth=200,ywidth=100}
--regex             : d      null
--regexfile         : d      null
--tidysvg           : d      null
--vectorlog         : d vectors.log
--vectordir         : d  vectors/
--logfile           : d      null
--help              : d     false
--version           : d     false
AMISVGTool cTree: PMC4491181
cTree: PMC4491181
PAGE: p.0: 5
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.0/paths.svg
PAGE: p.1: 35
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.1/paths.svg
PAGE: p.2: 168
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.2/paths.svg
PAGE: p.3: 560
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.3/paths.svg
PAGE: p.4: 1521
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.4/paths.svg
PAGE: p.5: 409
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.5/paths.svg
PAGE: p.6: 0
PAGE: p.7: 10
writing paths to /Users/pm286/workspace/cmdev/ami3/src/test/resources/org/contentmine/ami/vector10/PMC4491181/svg/fulltext-page.7/paths.svg
AMISVGTool cTree: PMC4503998
cTree: PMC4503998
no svg/ dir
...

output tree

├── PMC4491181
│   ├── eupmc_result.json
│   ├── fulltext.pdf
│   ├── fulltext.xml
│   ├── pdfimages
│   │   ├── image.2.1.124_235.587_697.png
│   │   ├── image.2.2.246_357.587_697.png
│   │   ├── image.2.3.368_478.586_638.png
│   │   ├── image.2.4.368_474.649_698.png
│   │   ├── image.3.1.248_359.617_728.png
│   │   ├── image.4.1.70_275.70_186.png
│   │   ├── image.4.2.304_418.69_184.png
│   │   ├── image.4.3.429_532.82_173.png
│   │   ├── image.4.4.70_533.195_312.png
│   │   ├── image.4.5.69_534.318_434.png
│   │   ├── image.5.1.78_236.198_328.png
│   │   ├── image.5.2.250_371.198_328.png
│   │   └── image.6.1.121_475.205_280.png
│   ├── scholarly.html
│   └── svg
│       ├── fulltext-page.0
│       │   └── paths.svg
│       ├── fulltext-page.0.svg
│       ├── fulltext-page.1
│       │   └── paths.svg
│       ├── fulltext-page.1.svg
│       ├── fulltext-page.2
│       │   └── paths.svg
│       ├── fulltext-page.2.svg
│       ├── fulltext-page.3
│       │   └── paths.svg
│       ├── fulltext-page.3.svg
│       ├── fulltext-page.4
│       │   └── paths.svg
│       ├── fulltext-page.4.svg
│       ├── fulltext-page.5
│       │   └── paths.svg
│       ├── fulltext-page.5.svg
│       ├── fulltext-page.6.svg
│       ├── fulltext-page.7
│       │   └── paths.svg
│       ├── fulltext-page.7.svg
│       └── vectors.log
├── PMC4503998

aggregation of paths

Paths can be extracted from the SVG and will often represent diagrams. Example:

NOTE:

  • The PDF contains small rectangles, which are added by appendRectangle(). This has not been debugged yet, so some rectangles are missing.
  • The letters and digits are not included. The bold letters are not coded characters but stroked glyphs, which will need decoding.
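
To inspect what ended up in a paths.svg file, the path data can be read back with the standard Java XML APIs. The following is only a minimal sketch, not AMI code: the class name PathLister and the method pathData are hypothetical, and it assumes that the file uses the standard SVG namespace and stores each path as a `<path>` element with a `d` attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper, not part of AMI: lists the path data in an SVG document. */
class PathLister {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Returns the "d" attribute of every svg:path element, in document order. */
    static List<String> pathData(String svgXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(svgXml.getBytes(StandardCharsets.UTF_8)));
            NodeList paths = doc.getElementsByTagNameNS(SVG_NS, "path");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < paths.getLength(); i++) {
                result.add(((Element) paths.item(i)).getAttribute("d"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse SVG", e);
        }
    }

    public static void main(String[] args) {
        // Tiny inline document standing in for the contents of a paths.svg file
        String svg = "<svg xmlns=\"http://www.w3.org/2000/svg\">"
                + "<g><path d=\"M0 0 L10 0 L10 10 Z\"/><path d=\"M5 5 L20 20\"/></g></svg>";
        for (String d : pathData(svg)) {
            System.out.println(d);
        }
    }
}
```

For a real file, the string could be read with `new String(Files.readAllBytes(path), StandardCharsets.UTF_8)`. Stroked glyphs such as the bold letters would presumably show up here as ordinary paths, which is why they still need decoding.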

bugs

2020-08-21: appendRectangle() is used both to draw rectangles and to define clipping regions. Because I don't fully understand the clipping, I disabled it. Occasionally the call does represent a real rectangle, so rectangles may be missing from the output. Hmmm... Maybe make all rects unfilled...
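
One possible way to separate the two uses, offered only as a sketch and not as the fix adopted in AMI: in a PDF content stream a rectangle is first appended to the current path, and only the operator that follows decides its role. A clip operator followed by "end path" means the rectangle was used for clipping only, while a stroke or fill means it was really drawn. The class below is hypothetical. Its method names mirror the callbacks of PDFBox's PDFGraphicsStreamEngine, but the signatures are simplified, and it is an assumption that AMI's appendRectangle() is that callback.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of AMI or PDFBox: defers the draw/clip decision. */
class RectangleSorter {
    /** Simple axis-aligned rectangle (illustrative stand-in). */
    public static final class Rect {
        public final double x, y, width, height;
        public Rect(double x, double y, double width, double height) {
            this.x = x; this.y = y; this.width = width; this.height = height;
        }
    }

    private final List<Rect> currentPath = new ArrayList<>();
    private final List<Rect> drawn = new ArrayList<>();
    private final List<Rect> clips = new ArrayList<>();
    private boolean clipPending = false;

    /** Path construction: remember the rectangle, but do not classify it yet. */
    public void appendRectangle(double x, double y, double width, double height) {
        currentPath.add(new Rect(x, y, width, height));
    }

    /** Clip operator: the current path will become a clipping region. */
    public void clip() { clipPending = true; }

    /** Painting operators: the current path is really drawn. */
    public void strokePath() { finishPath(true); }
    public void fillPath() { finishPath(true); }

    /** End path without painting: nothing visible is drawn. */
    public void endPath() { finishPath(false); }

    private void finishPath(boolean painted) {
        if (painted) drawn.addAll(currentPath);
        if (clipPending) clips.addAll(currentPath);
        currentPath.clear();
        clipPending = false;
    }

    public List<Rect> getDrawn() { return drawn; }
    public List<Rect> getClips() { return clips; }
}
```

With this bookkeeping, the sequence appendRectangle, clip, endPath lands only in getClips(), whereas appendRectangle followed by strokePath lands in getDrawn(). A path that is both clipped and painted appears in both lists.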
