Skip to content

Enhancements to addHTML parser #1902

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Feb 7, 2021
Merged

Enhancements to addHTML parser #1902

merged 11 commits into from
Feb 7, 2021

Conversation

lubosdz
Copy link
Contributor

@lubosdz lubosdz commented Jul 11, 2020

Description

Currently importing HTML via PhpWord\Shared\Html::addHtml() has couple of bugs and misses support for some at least basic HTML tags & attributes. This PR attempts to fix these issues without BC break (at least no existing unit test break):

Fixed legacy issues:

  • fix passing width property into cells, allowing also to control colum widths inside tables. Following would currently not work style="width: 35%" - because width is passed as a wrong argument. Once fixed, it will also allow to control columns width inside tables. Since all widths are currently ignored (not recognized) they are set to width = "auto".
  • fix unit conversion for margins/spacing - instead of converting to points, it should convert to twips. Currently passing 10px results into 7.5 twips therefore value is seemingly ignored, unless users sets some absurd value like width: 600px.

New enhancements:

  • support width also as an attribute e.g. <td width="25%">
  • support cellspacing for tables e.g. <table cellspacing="3">
  • support all border-* variants e.g. border-bottom: 1px #DDDDDD solid; (top, right, left, bottom) - useful for tables & cells
  • support bgColor for rows, cells, e.g. <tr bgColor="#DDDDDD">
  • support horiz. rule <hr /> with limited custom styling
  • support both inline vertical-align and valign for cells e.g. <th valign="bottom"> or style="vertical-align: middle;"
  • support start & numbering type in ordered lists e.g. <ol type="A" start="3">

Checklist:

  • I have run composer run-script check --timeout=0 and no errors were reported
  • The new code is covered by unit tests (check build/coverage for coverage report)
  • I have updated the documentation to describe the changes

More enhancements possible (with BC break)

A two more fixes should be fixed to make current method addHtml() more usable and compatible:

  • fix cell property inheritance - currently all elements always inherit all properties from parent node. This however should not apply to cells. Cells should not inherit most of parent properties e.g. borders, margins etc. because it produces unexpected output. In spite of BC break this should be fixed ASAP. Current behaviour is quite incompatible with legacy HTML output. Moreover, inheriting widths from parent nodes causes unpredictable results in table layouts.
  • add support for Headings <h1> .. <h6> out of the box. Current implementation relies that user will somewhere define his own heading style Heading1, Heading2 etc., which adds extra code boilerplate for developers and also requires to study how to do that (increases learning curve). Since H1 .. H6 are very basic elements in any HTML, these should work out of the box.

Unfortunatelly, both improvements above will cause BC break, and therefore have been excluded from this PR in order to make space for discussion.

@lubosdz
Copy link
Contributor Author

lubosdz commented Jul 22, 2020

There is 1 failing job - not sure what to do about it, does not seem related and repeatedly shows up with other PRs. Log:

Loading composer repositories with package information
Updating dependencies (including require-dev)
PHP Fatal error:  Allowed memory size of 1610612736 bytes exhausted (tried to allocate 72 bytes) in phar:///home/travis/.phpenv/versions/5.6.40/bin/composer/src/Composer/DependencyResolver/Rule2Literals.php on line 48
Fatal error: Allowed memory size of 1610612736 bytes exhausted (tried to allocate 72 bytes) in phar:///home/travis/.phpenv/versions/5.6.40/bin/composer/src/Composer/DependencyResolver/Rule2Literals.php on line 48

@lubosdz
Copy link
Contributor Author

lubosdz commented Jul 28, 2020

Is this repo dead? No feedback since 3 weeks ago ... ?

Julien1138 added a commit to naept/PHPWord that referenced this pull request Nov 10, 2020
@devfake
Copy link

devfake commented Dec 21, 2020

What's up with this PR? This adds some good features :|

@lubosdz
Copy link
Contributor Author

lubosdz commented Dec 22, 2020

If you need a quick enhancement, you may want to use composer repo https://github.com/lubosdz/PHPWord - which holds these unmerged changes. But there is no guarantee it will not change. Or make clon for yourself.

This repo looks truly like abandoned, even though not officially. Probably time & money issue, like with most open sources ..

@troosan troosan added this to the v0.18.0 milestone Feb 7, 2021
@troosan troosan added the HTML label Feb 7, 2021
@troosan troosan merged commit 9e322dd into PHPOffice:develop Feb 7, 2021
@lubosdz
Copy link
Contributor Author

lubosdz commented Feb 8, 2021

@troosan
Thank you for merging.
Do you think it makes sense to make another PR with more serious BC break - as mentioned above? (support for headers H1 .. H6, fix for table cells inheritance).

@troosan
Copy link
Contributor

troosan commented Feb 12, 2021

@lubosdz feel free to create another PR with those changes. I won't be merging them in this release though, it's been waiting for a long time :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

3 participants