Fix .paint_path handling of single line segments #530
- Fixes typo ("ml" should have been "mlh")
- Removes if-statement that required individual line segments to be strictly horizontal or vertical.
Although 'mlh' is the canonical implementation for a single line segment, 'ml' is fairly common. Adds tests and sample PDF.
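For readers following along, here is a minimal sketch of the single-segment check described in the commits above. It is not the pdfminer.six code itself: the helper names `path_shape` and `single_segment_endpoints` are hypothetical, and the path is assumed to be a list of tuples such as `("m", x, y)`, `("l", x, y)` and `("h",)`.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def path_shape(path: List[tuple]) -> str:
    """Concatenate the operator letters of a path, e.g. 'ml' or 'mlh'."""
    return "".join(cmd[0] for cmd in path)


def single_segment_endpoints(path: List[tuple]) -> Optional[Tuple[Point, Point]]:
    """Return (start, end) if the path is one line segment, else None.

    Both 'ml' and 'mlh' are accepted, and the segment may be diagonal:
    there is no requirement that it be horizontal or vertical.
    """
    if path_shape(path) not in ("ml", "mlh"):
        return None
    start = (path[0][1], path[0][2])  # operands of the "m" command
    end = (path[1][1], path[1][2])    # operands of the "l" command
    return start, end
```

With this sketch, `[("m", 0, 0), ("l", 3, 4)]` and `[("m", 0, 0), ("l", 3, 4), ("h",)]` both yield the same diagonal segment, while a longer path such as "mll" is rejected.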
I'm a user of both pdfminer.six and pdfplumber and appreciate all your great work!
This commit corrects the manner in which "pts" are extracted from Bézier path commands. See Table 4.9 of the PDF reference manual, and the new comments in the code, for details. Previously, depending on the command (c, v, or y), the code was extracting some combination of control points (not on the curve) and the actual points on the curve. This commit also refactors .paint_path so that apply_matrix_pt is only called in one place, and treats the "h" command in a manner more consistent with other path commands.
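As an illustration of the point-extraction rule, here is a rough sketch rather than the actual patch. For the c, v, and y operators the final operand pair is the curve's endpoint, so taking the last two operands of each command yields only points on the curve. The function name `on_curve_points` is hypothetical, and mapping "h" back to the first point of the path is an assumption about how the refactor treats it; the sketch also assumes a single subpath.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def on_curve_points(path: List[tuple]) -> List[Point]:
    """Collect the on-curve point contributed by each path command.

    m/l carry (x, y); c carries (x1, y1, x2, y2, x3, y3); v carries
    (x2, y2, x3, y3); y carries (x1, y1, x3, y3). In every case the
    last two operands are the point that lies on the path.
    """
    pts: List[Point] = []
    for cmd in path:
        op, operands = cmd[0], cmd[1:]
        if op == "h":
            # Assumption: closing the subpath returns to its first point.
            if pts:
                pts.append(pts[0])
        else:
            pts.append((operands[-2], operands[-1]))
    return pts
```

For example, a path made of `m 0 0`, `c 1 2 3 4 5 6`, `v 7 8 9 10`, `y 11 12 13 14`, `h` gives the points (0, 0), (5, 6), (9, 10), (13, 14) and then (0, 0) again; none of the control points appear.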
Now that .paint_path has been refactored, adding support for rect-forming mllll paths requires no extra code, beyond a minor tweak to the relevant elif statement.
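To make the "mllll" case concrete, here is a hedged sketch of how a rect-forming path might be recognized once every command contributes exactly one point. The function `is_axis_aligned_rect` is a hypothetical helper, not the library's API, and it assumes that `pts` already holds five points, with the "h" in "mlllh" having been mapped back to the starting point.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def is_axis_aligned_rect(shape: str, pts: List[Point]) -> bool:
    """True if a five-point 'mlllh' or 'mllll' path traces a rectangle."""
    if shape not in ("mlllh", "mllll") or len(pts) != 5:
        return False
    (x0, y0), (x1, y1), (x2, y2), (x3, y3), last = pts
    # The path must end where it began, whether via "h" or a final "l".
    is_closed_loop = last == (x0, y0)
    # Edges must alternate vertical/horizontal (either orientation).
    has_square_coordinates = (
        x0 == x1 and y1 == y2 and x2 == x3 and y3 == y0
    ) or (
        y0 == y1 and x1 == x2 and y2 == y3 and x3 == x0
    )
    return is_closed_loop and has_square_coordinates
```

Because "mlllh" and "mllll" both reduce to the same five-point list, the only difference between them is the membership test on `shape`, which is the kind of minor tweak the commit message describes.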
The new commits above do three main things, described below.
Fixing the point-extraction from Bézier path commands
I spent some time reading the PDF spec and
You're probably familiar with that table, but I'm pasting it here for others who might not be, and for ease of reference:
Previously,
(Relatedly: Are the maintainers open to adding a
Refactoring
(Note: The initial post-commit/push Travis build appears to have failed, but because of a problem installing the
…to fix-paint-path
…ome trivial due to refactoring
Travis seems to have been migrated and is therefore no longer working. I'm trying to figure out how to get the tests working again.
This reverts commit 41c0518
Many thanks for the earlier review @pietermarsman, and for merging! Much appreciated.
Thank you for merging my previous PR (#512) re. PDFLayoutAnalyzer.paint_path(...). That PR focused on the handling of non-rectangular quadrilaterals and of subpaths. I neglected, however, to closely examine (or test) the handling of individual line segments, and largely carried over the previous logic. This PR attempts to fix that logic:
- Fixes typo ("ml" should be "mlh"), which was causing all single-line segments to be ignored.
- Removes the if-statement that required individual line segments to be strictly horizontal or vertical. (If the maintainers believe that diagonal lines should be instances of LTCurve instead of LTLine, however, that's an easy adjustment.)
How Has This Been Tested?
I have added an assertion to tests/test_converter.py: jsvine@4ef21ec#diff-f56d97c6216a37d8ca841f31c63b704d05b7e3a42d6dbe783411d6fef8204615R98-R112