Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-fs-table-paginate tends to leave only thead and tfoot before page break #399

Open
VictorAtPL opened this issue Oct 28, 2019 · 7 comments
Open
Assignees
Labels

Comments

@VictorAtPL
Copy link

I need to convert the html with multiple tables, which has thead, tbody, tfoot. The tables should be paginated if needed.

Here is the result I could achieve:
openhtmltopdf_bug

I've already tried #354 with no success.

Could you please guide me what should I try and/or where can I find the code responsible for paginated tables? I will try to do some dirt fix for myself.

@danfickle
Copy link
Owner

Yes, nested paginated tables are definitely buggy. I think it is related to the ContentLimitContainer (in the render package) and its use, but I don't fully understand the table code.

Another issue, which you may be able to solve is we don't have a minimal example showing the problem. An absolutely minimal sample would help a lot.

@danfickle danfickle added the bug label Nov 1, 2019
@VictorAtPL
Copy link
Author

Yes, nested paginated tables are definitely buggy. I think it is related to the ContentLimitContainer (in the render package) and its use, but I don't fully understand the table code.

Do you know how this library works in overall? There is lack of information of it. I am debugging the code for the second consecutive day and still not sure how it works.

It looks like: HTML -> BlockBox -> Layer -> Ops -> PdfBox renderer? I still can't find the place in code where table is split across to pages. For me it looks like the table is queued to render on both pages, but there are some overlays/constraints which prevents the insufficient part of table to be rendered on proper page.

If I understand it right, there is one TableBox for each Table in html and later if there is need to split the page on multiple pages then more than one layer is created. Am I right?

Another issue, which you may be able to solve is we don't have a minimal example showing the problem. An absolutely minimal sample would help a lot.

If I don't succeed to do it using my private code, I will try to prepare a MWE.

@VictorAtPL
Copy link
Author

VictorAtPL commented Nov 20, 2019

Here is MWE:
https://gist.github.com/VictorAtPL/82e035e5f4e306aa692cb206d6624347

If you think I should make it even more minimal, I will try.
background-color and border statements can be of course removed if it will make debugging easier.

Preview:
image

VictorAtPL added a commit to VictorAtPL/openhtmltopdf that referenced this issue Nov 21, 2019
@VictorAtPL
Copy link
Author

I made a dirty fix in #416, although still hiding the border should be added and some test cases of other users should be performed.

For my purpose - it works perfectly, but it rather hides the symptom than treat the cause.
From what I understood, it can be problem with ContentLimitContainer as @danfickle said. Maybe the ContentLimitContainer doesn't take into account the header and footer, or one of them? Not sure, but unfortunately do not have time for digging more in this issue.

danfickle added a commit that referenced this issue Nov 21, 2019
@VictorAtPL
Copy link
Author

@danfickle
Any help on how to run this test: 884194c
And how does it work? It only asserts that the execution has succeeded or somehow check if the header and footer are hidden and I am missing something?

@danfickle
Copy link
Owner

Ok, after a couple of hours of messing around in the code, it dawned on me that the core problem is that when the first table row is moved to the next page, the actual table is not also being moved.

So looking in TableRowBox, about 98 we see that this is meant to happen already.

It turns out the problem is the next function, isShouldMoveToNextPage, is returning false because it is not taking into account the page-break-inside: avoid but later on the table row is being moved anyway as page break algorithm for block like elements (in BlockBoxing) does take into account CSS page break properties.

So, in essence, the problem is one of duplicate functionality.

A quality fix which I'll work on tomorrow would factor out the duplicate functionality into a method. In the meantime a simple fix could be adding the following code at L118:

else if (getStyle().isAvoidPageBreakInside()) {
            return true;
}

P.S The idea of the visual regression tests is to output a PDF, render it to an image with PDFBOX and then compare it to a render of a proof PDF.

Since for this test there is no proof yet, it simple outputs a PDF to your target directory under /openhtmltopdf-examples/target/test/visual-tests/test-output/issue-399-table-header-with-no-rows.pdf.

@VictorAtPL
Copy link
Author

Nice,
good to hear you found some more quality solution than mine to fix the core of the problem, not the result.

Looking forward for your PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants