How to get the resulting height of the rendered html? #427

Open
DSW-AK opened this issue Jan 9, 2020 · 12 comments

@DSW-AK

DSW-AK commented Jan 9, 2020

Hi Dan,

I need the resulting height of the rendered html because other text is placed directly after the rendered html text.
But what is the correct way to get the rendered height?

Currently I collect the BlockBoxes and the LineBoxes of each BlockBox.
To calculate the height, I add up the height of each LineBox.
For each BlockBox, I also add its margin-bottom.
The height of the rendered HTML is then the sum of the LineBox heights plus the sum of the margin-bottoms.
The result is exactly right, as I verified by comparing it with an existing PDF from another system.
But I cannot find my calculated value in any property or method of the renderer.
I tried childrenHeight, height, absY, paintinginfo.aggregateBounds, paintinginfo.outerMarginCorner, etc., but without success.
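
The summation described above can be sketched independently of the library. This is only a minimal illustration: the `Block` class is a hypothetical stand-in for a laid-out block (it is not an openhtmltopdf type), and the numbers are made up.

```java
import java.util.Arrays;
import java.util.List;

public class RenderedHeightSketch {
    /** Hypothetical stand-in for a laid-out block: its line heights and its bottom margin. */
    static final class Block {
        final List<Integer> lineHeights;
        final int marginBottom;

        Block(List<Integer> lineHeights, int marginBottom) {
            this.lineHeights = lineHeights;
            this.marginBottom = marginBottom;
        }
    }

    /** Sum of all line heights plus the bottom margin of every block. */
    static int totalHeight(List<Block> blocks) {
        int sum = 0;
        for (Block block : blocks) {
            for (int lineHeight : block.lineHeights) {
                sum += lineHeight;
            }
            sum += block.marginBottom;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
                new Block(Arrays.asList(15, 15), 10), // two lines (one wrapped), 10 units of margin
                new Block(Arrays.asList(15), 0));     // a single line, no margin
        System.out.println(totalHeight(blocks)); // prints 55
    }
}
```

Note that this simple sum counts only content in normal flow; it says nothing about absolutely positioned elements or page breaks.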

What am I doing wrong? My algorithm is very complicated, and I cannot believe that this is the right way.

I use the current released version 1.0.1.

Thanks

@danfickle
Owner

There is no formal API for this. However, you should be able to get the height (in CSS pixels, of which there are 96 to the inch) of the body element using a variation of this code. Note that the renderer now requires closing.

Thinking more on this, you'll also need to take into account the page margins. I'll try to have a look at this when I'm at my computer tomorrow.

@DSW-AK
Author

DSW-AK commented Jan 14, 2020

Hi Dan,

thanks for the reply.

I checked your suggestion: the height is only correct if no word wrap occurs.
If I add the height of the additional line generated by the word wrap, the resulting height is correct again.

Here is an example, the last <p> results in a word wrap:

<!DOCTYPE html PUBLIC
"-//OPENHTMLTOPDF//MATH XHTML Character Entities With MathML 1.0//EN" "">

<html>
<head><style type="text/css">p.MsoNormal, li.MsoNormal, div.MsoNormal{ margin-top:0cm; margin-right:0cm; margin-bottom: 0cm; margin-left:0cm;  font-size:10pt;}  
    @page { size: A4 portrait; margin: 141.7323pt 26.275635pt 85.03938pt 84.75pt; }
    @page:first { size: A4 portrait; margin: 411.01953pt 26.275635pt 85.03938pt 84.75pt; }
  </style>
</head>
<body style="margin:0px;  font-size:10pt;  font-family:'Arial'; line-height:112%">
  <p class="MsoNormal">
    <span style="font-family: 'Arial';font-size: 10pt;">Xx-Xxx:<span style="font-family:'Arial';  font-size:10pt;">&nbsp; &nbsp; </span>
    <span style="position:absolute;  left:123px;  font-size:10pt;  font-family:'Arial;">22.10.2019</span>
    </span>
  </p>
  <p class="MsoNormal">
    <span style="font-family: 'Arial';font-size: 10pt;">Xxxxxx:<span style="font-family:'Arial';  font-size:10pt;">&nbsp; &nbsp; </span>
    <span style="position:absolute;  left:123px;  font-size:10pt;  font-family:'Arial';">23.10.2019</span>
    </span>
  </p>
  <p class="MsoNormal">
    <span style="font-family: 'Arial';font-size: 10pt;">Xxxxxxxxxxxxx:<span style="font-family:'Arial';  font-size:10pt;">&nbsp; &nbsp; </span>
    <span style="position:absolute;  left:123px;  font-size:10pt;  font-family:'Arial';">x xxxx (XXxxxxxxxxxx XXXX XXXXX XXXX) xxx xx x xxxx (XXxxxxxxxxxx XXXX XXXXX XXXX)</span>
    </span>
  </p>
</body>
</html>

Thanks

@danfickle
Owner

Hi @DSW-AK,

I think the problem is the absolutely positioned elements. position: absolute takes an element out of normal flow and puts it into its own layer. Therefore, I think the following code will work:

import com.openhtmltopdf.layout.Layer;
import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import com.openhtmltopdf.render.Box;

public class AdvancedTestBed {
    // An internal number, should stay the same.
    private static final int PDF_DOTS_PER_PIXEL = 20;
    
    private static int findMaxLayerY(Layer layer) {
        int maxLayerY = layer.getMaster().getAbsY() + layer.getMaster().getHeight();
        
        int maxChildY = layer.getChildren()
                .stream()
                .mapToInt(AdvancedTestBed::findMaxLayerY)
                .max()
                .orElse(0);
             
        return Math.max(maxChildY, maxLayerY);
    }
    
    
    public static void main(String... args) throws Exception {
        PdfRendererBuilder builder = new PdfRendererBuilder();

        String html = "<!DOCTYPE html PUBLIC\n" + 
                "\"-//OPENHTMLTOPDF//MATH XHTML Character Entities With MathML 1.0//EN\" \"\">\n" + 
                "\n" + 
                "<html>\n" + 
                "<head><style type=\"text/css\">p.MsoNormal, li.MsoNormal, div.MsoNormal{ margin-top:0cm; margin-right:0cm; margin-bottom: 0cm; margin-left:0cm;  font-size:10pt;}  \n" + 
                "    @page { size: A4 portrait; margin: 141.7323pt 26.275635pt 85.03938pt 84.75pt; }\n" + 
                "    @page:first { size: A4 portrait; margin: 411.01953pt 26.275635pt 85.03938pt 84.75pt; }\n" + 
                "  </style>\n" + 
                "</head>\n" + 
                "<body style=\"margin:0px;  font-size:10pt;  font-family:'Arial'; line-height:112%\">\n" + 
                "  <p class=\"MsoNormal\">\n" + 
                "    <span style=\"font-family: 'Arial';font-size: 10pt;\">Xx-Xxx:<span style=\"font-family:'Arial';  font-size:10pt;\">&nbsp; &nbsp; </span>\n" + 
                "    <span style=\"position:absolute;  left:123px;  font-size:10pt;  font-family:'Arial;\">22.10.2019</span>\n" + 
                "    </span>\n" + 
                "  </p>\n" + 
                "  <p class=\"MsoNormal\">\n" + 
                "    <span style=\"font-family: 'Arial';font-size: 10pt;\">Xxxxxx:<span style=\"font-family:'Arial';  font-size:10pt;\">&nbsp; &nbsp; </span>\n" + 
                "    <span style=\"position:absolute;  left:123px;  font-size:10pt;  font-family:'Arial';\">23.10.2019</span>\n" + 
                "    </span>\n" + 
                "  </p>\n" + 
                "  <p class=\"MsoNormal\">\n" + 
                "    <span style=\"font-family: 'Arial';font-size: 10pt;\">Xxxxxxxxxxxxx:<span style=\"font-family:'Arial';  font-size:10pt;\">&nbsp; &nbsp; </span>\n" + 
                "    <span style=\"position:absolute;  left:123px;  font-size:10pt;  font-family:'Arial';\">x xxxx (XXxxxxxxxxxx XXXX XXXXX XXXX) xxx xx x xxxx (XXxxxxxxxxxx XXXX XXXXX XXXX)</span>\n" + 
                "    </span>\n" + 
                "  </p>\n" + 
                "</body>\n" + 
                "</html>";
        
        builder.withHtmlContent(html, /* Base url */ null);
        try (PdfBoxRenderer renderer = builder.buildPdfRenderer()) {
            renderer.layout();

            // The root box is <html>
            Box box = renderer.getRootBox();
            
            Layer layer = renderer.getRootBox().getContainingLayer();
            int layerBottom = findMaxLayerY(layer);
            
            System.out.println("layer = " + (layerBottom / PDF_DOTS_PER_PIXEL));
           
            // Optional: Print box size to console.
            System.out.println("Box size = " + (box.getWidth() / PDF_DOTS_PER_PIXEL) + "x"
                    + (box.getHeight() / PDF_DOTS_PER_PIXEL));
            System.out.println("Box element = " + box.getElement().getTagName());
        }
   }
}

This gives a height for the html element of 44, but a max bottom Y for the layers of 59, which I'm guessing is the correct answer.
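
For readers who need the result in PDF points rather than CSS pixels, here is a small unit-conversion sketch. It assumes the 20 dots per pixel used in the code above and the standard 96 px and 72 pt per inch; the class name and the input value of 1180 dots are made up for illustration. Note that the integer division in the sample above truncates any fractional pixel, which this sketch avoids by using doubles.

```java
public class UnitConversionSketch {
    // Taken from the sample above: internal layout dots per CSS pixel.
    static final int DOTS_PER_PIXEL = 20;
    // CSS defines 96 px per inch; PDF user space uses 72 pt per inch.
    static final double POINTS_PER_PIXEL = 72.0 / 96.0;

    /** Converts internal layout dots to CSS pixels without truncating. */
    static double dotsToPixels(int dots) {
        return dots / (double) DOTS_PER_PIXEL;
    }

    /** Converts CSS pixels to PDF points. */
    static double pixelsToPoints(double pixels) {
        return pixels * POINTS_PER_PIXEL;
    }

    public static void main(String[] args) {
        int layerBottomDots = 1180; // hypothetical value, equal to 59 px
        double px = dotsToPixels(layerBottomDots);
        double pt = pixelsToPoints(px);
        System.out.println(px + " px = " + pt + " pt"); // prints "59.0 px = 44.25 pt"
    }
}
```

The 44.25 pt printed here is just 59 px expressed in points; it is unrelated to the 44 px height reported for the html element.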

@DSW-AK
Author

DSW-AK commented Jan 28, 2020

Hi Dan,

thanks for your reply.

I now use your solution, which calculates the rendered height correctly if no page break occurs.
So the result can be seen as a "total height" if a page break happens, and as a "required height" without a page break.
Based on this, I have to do some extra calculation if a page break occurs.
I use the painting top of the last page and a diff calculated from the painting info (I really don't know why I need this diff, but I need it for the correct height in rare cases).

For the moment I use this method and the result is always exactly right (findMaxLayerY is your original):

  private static float[] calcRequiredHeight(final PdfBoxRenderer renderer) {

    float[] result = new float[2];

    Layer layer = renderer.getRootBox().getContainingLayer();
    int totalHeight = findMaxLayerY(layer);

    float requiredHeight;

    if (renderer.getRootBox().getContainingLayer().getPages().size() > 1) {
      // painting top on last rendered page
      float paintingTop = renderer.getRootBox().getContainingLayer().getLastPage().getPaintingTop();

      // I don't know why I need this, but I need this for correct height
      float heightDiff = (float) Math.abs(renderer.getRootBox().getChild(0).getPaintingInfo().getOuterMarginCorner().getHeight()
          - renderer.getRootBox().getChild(0).getPaintingInfo().getAggregateBounds().getHeight());
      totalHeight += heightDiff;

      // required height on last rendered page
      requiredHeight = totalHeight - paintingTop;
    } else {
      // no page break, total height is the required height
      requiredHeight = totalHeight;
    }
    // calculate to pt
    result[0] = renderer.getOutputDevice().getDeviceLength(requiredHeight);
    result[1] = renderer.getOutputDevice().getDeviceLength(totalHeight);

    return result;
  }

The method returns the required height and the total height in pt.
Do you have a better idea in case of a page break?

Thank you very much

@DSW-AK DSW-AK closed this as completed Jan 28, 2020
@DSW-AK DSW-AK reopened this Jan 28, 2020
@DSW-AK
Author

DSW-AK commented Jan 28, 2020

Sorry I closed this by mistake

@stechio

stechio commented Mar 4, 2021

[version: 1.0.7-SNAPSHOT; commit: 02719e80f85b5bcf7c8e627018cda9fceb62a3cd]

Needing to retrieve the current vertical offset after content rendering (that is, the bottom position of the last graphic element appended), I gave your code a try. (BTW, IMO it's a bit weird that a nice rendering library like this doesn't provide, out of the box, a robust function to reliably calculate such a fundamental property.) Strangely, the result was off by a constant amount, whatever the height of the appended contents (the horizontal line represents the calculated current vertical offset):

[Images: LastPageYOffset_less, LastPageYOffset_more]

@danfickle, do you have any explanation about this issue?

Here is the generation code:

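import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;

import com.openhtmltopdf.layout.Layer;
import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;

public class LastPageYOffsetTest {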
    public static void main(String[] args) throws Exception {
        // Content rendering.
        PDDocument doc = new PDDocument();
        PdfBoxRenderer renderer = new PdfRendererBuilder()
            .withFile(new java.io.File("source.html"))
            .usePDDocument(doc)
            .buildPdfRenderer();
        renderer.createPDFWithoutClosing();

        // Current-vertical-offset line drawing.
        PDPage page = renderer.getPdfDocument().getPage(renderer.getPdfDocument().getNumberOfPages() - 1);
        try (PDPageContentStream cs = new PDPageContentStream(
                renderer.getPdfDocument(), page, AppendMode.APPEND, false, false)) {
            float lastPageYOffset = renderer.getOutputDevice().normalizeY(
                getLastPageYOffset(renderer), page.getMediaBox().getHeight()) ;
            System.out.println("LastPageYOffset: " + lastPageYOffset);

            cs.drawLine(0, lastPageYOffset, page.getMediaBox().getWidth(), lastPageYOffset);
        }
        renderer.close();

        OutputStream os = new FileOutputStream("LastPageYOffset.pdf");
        doc.save(os);
        os.close();
    }

    private static float getLastPageYOffset(PdfBoxRenderer r) {
        Layer layer = r.getRootBox().getContainingLayer();
        int totalHeight = findMaxLayerY(layer);
        if (r.getRootBox().getContainingLayer().getPages().size() > 1) {
            /* NOTE: In my test executions this conditional block is never used. */
            totalHeight += ((float) Math.abs(
                r.getRootBox().getChild(0).getPaintingInfo().getOuterMarginCorner().getHeight()
                    - r.getRootBox().getChild(0).getPaintingInfo().getAggregateBounds().getHeight()));
        }
        return r.getOutputDevice().getDeviceLength(totalHeight);
    }

    private static int findMaxLayerY(Layer layer) {
        int maxLayerY = layer.getMaster().getAbsY() + layer.getMaster().getHeight();
        int maxChildY = layer.getChildren().stream().mapToInt(LastPageYOffsetTest::findMaxLayerY)
            .max().orElse(0);
        return Math.max(maxChildY, maxLayerY);
    }
}

Appended HTML source (source.html):

<html>
<head>
</head>
<body>
    <p>** BEGIN APPEND **</p>
    <h1>Trying to calculate the vertical offset of rendered
        contents</h1>
    <p>Some text to fill the space Some text to fill the space Some
        text to fill the space Some text to fill the space Some text to
[cut]
        space Some Some text to fill the space Some Some text to fill
        the space Some Some text to fill the space Some</p>
    <img alt="" src="../images/tree.jpg"/>
    <p>** END APPEND **</p>
</body>
</html>

Result with less contents: LastPageYOffset_less.pdf
Result with more contents: LastPageYOffset_more.pdf

@stechio

stechio commented Mar 5, 2021

[version: 1.0.7-SNAPSHOT; commit: 02719e80f85b5bcf7c8e627018cda9fceb62a3cd]

Delving further into the issue, I extended my generation code to reveal the underlying layout boxes. To my great surprise, I discovered that the master layer's coordinates are reported as if it were anchored to the top-left corner (absX == 0, absY == 0), ignoring the margins actually applied -- that's the reason why the horizontal line shown in my previous comment to represent the current vertical offset was shifted upwards by a constant amount! In the following screenshot, you can see that the horizontal line (in red) overlaps the bottom of the (misplaced) master layer box:

[Image: LastPageYOffset_boxes]

@danfickle, do you know why the master layer of the root box returns wrong coordinates (absX == 0, absY == 0) instead of the actual margins-aware placement? How can I work around this to obtain the correct coordinates?

Here is the generation code:

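import java.awt.Color;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;

import com.openhtmltopdf.layout.Layer;
import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import com.openhtmltopdf.render.Box;

public class LastPageYOffsetTest {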
    public static void main(String[] args) throws Exception {
        PDDocument doc = new PDDocument();
        try (PdfBoxRenderer renderer = new PdfRendererBuilder()
                .withFile(new java.io.File("source.html"))
                .usePDDocument(doc)
                .buildPdfRenderer()) {
            // Content rendering.
            renderer.createPDFWithoutClosing();

            PDPage page = renderer.getPdfDocument().getPage(
                renderer.getPdfDocument().getNumberOfPages() - 1);
            try (PDPageContentStream cs = new PDPageContentStream(
                    renderer.getPdfDocument(), page, AppendMode.APPEND, false, false)) {
                // Current-vertical-offset line drawing.
                float lastPageYOffset = renderer.getOutputDevice().normalizeY(
                    getLastPageYOffset(renderer), page.getMediaBox().getHeight()) ;
                cs.setStrokingColor(new Color(255, 0, 0));
                cs.drawLine(0, lastPageYOffset, page.getMediaBox().getWidth(), lastPageYOffset);
                System.out.println("LastPageYOffset: " + lastPageYOffset);

                // Layout boxes drawing.
                cs.setStrokingColor(new Color(0, 0, 255));
                showBoxTree(renderer.getRootBox(), "", renderer, cs, page.getMediaBox().getHeight());
            }
        }
        try (OutputStream os = new FileOutputStream("LastPageYOffset.pdf")) {
            doc.save(os);
        }
    }

    private static int findMaxLayerY(Layer layer) {
        int maxLayerY = layer.getMaster().getAbsY() + layer.getMaster().getHeight();
        int maxChildY = layer.getChildren().stream().mapToInt(LastPageYOffsetTest::findMaxLayerY)
            .max().orElse(0);
        return Math.max(maxChildY, maxLayerY);
    }

    private static float getDeviceLength(float length, PdfBoxRenderer r) {
        return r.getOutputDevice().getDeviceLength(length);
    }

    private static float getLastPageYOffset(PdfBoxRenderer r) {
        Layer layer = r.getRootBox().getContainingLayer();
        int totalHeight = findMaxLayerY(layer);
        if (r.getRootBox().getContainingLayer().getPages().size() > 1) {
            /* NOTE: In my test executions this conditional block is never used. */
            totalHeight += ((float) Math.abs(
                r.getRootBox().getChild(0).getPaintingInfo().getOuterMarginCorner().getHeight()
                    - r.getRootBox().getChild(0).getPaintingInfo().getAggregateBounds().getHeight()));
        }
        return r.getOutputDevice().getDeviceLength(totalHeight);
    }

    private static float getNativeY(float y, PdfBoxRenderer r, float pageHeight) {
        return r.getOutputDevice().normalizeY(y, pageHeight);
    }

    private static void showBoxTree(Box box, String indent, PdfBoxRenderer renderer,
            PDPageContentStream cs, float pageHeight) throws IOException {
        float nativeWidth = getDeviceLength(box.getWidth(), renderer),
                nativeHeight = getDeviceLength(box.getHeight(), renderer),
                nativeX = getDeviceLength(box.getAbsX(), renderer),
                nativeY = getNativeY(getDeviceLength(box.getAbsY(), renderer), renderer, pageHeight)
                        - nativeHeight;
        cs.addRect(nativeX, nativeY, nativeWidth, nativeHeight);
        cs.stroke();

        System.out.printf("%sabs:(%d,%d) rel:(%d,%d) size:(%dx%d); native:(%f,%f)-(%fx%f) %n",
                indent, box.getAbsX(), box.getAbsY(), box.getX(), box.getY(), box.getWidth(),
                box.getHeight(), nativeX, nativeY, nativeWidth, nativeHeight);

        indent += "  ";
        for (Box child : box.getChildren()) {
            showBoxTree(child, indent, renderer, cs, pageHeight);
        }
    }
}

Appended HTML source (source.html):

<html>
<head>
</head>
<body>
    <p>** BEGIN APPEND **</p>
    <h1>Trying to calculate the vertical offset of rendered
        contents</h1>
    <p>Some text to fill the space Some text to fill the space Some
        text to fill the space Some text to fill the space Some text to
[cut]
        space Some Some text to fill the space Some Some text to fill
        the space Some Some text to fill the space Some</p>
    <img alt="" src="../images/tree.jpg"/>
    <p>** END APPEND **</p>
</body>
</html>

Result with layout boxes: LastPageYOffset_boxes.pdf

@stechio

stechio commented Mar 6, 2021

[version: 1.0.7-SNAPSHOT; commit: 02719e80f85b5bcf7c8e627018cda9fceb62a3cd]

Debugging the code flow, I noticed that during the layout phase all the layout boxes seem to be defined RELATIVELY to the top-left corner of the containing layer (x=0,y=0); it's only at the start of the subsequent rendering phase that page top and left margins are applied as a global translation (!!) that affects all the (relatively-positioned) drawn contents, shifting them to their proper locations! Therefore, despite their misleading names, absX and absY coordinates of Box objects seem NOT to be absolute at all.

A question immediately arises: if the positions of the layout boxes don't represent their actual page locations, how can the layout engine properly place them within the page margins? I think the trick is this: when the PageBox is instantiated, its height is calculated taking the page margins into account, through this call stack:

    CalculatedStyle.valueByName(CSSName) line: 523 (where CSSName(propName=margin-top, FS_ID=109))
    ...
    CalculatedStyle.getMarginRect(float, CssContext, boolean) line: 466	
    ...
    PageBox.getMarginBorderPadding(CssContext, int) line: 464
    PageBox.getContentHeight(CssContext) line: 207	
    PageBox.setTopAndBottom(CssContext, int) line: 245	
    Layer.addPage(CssContext) line: 1065
    BlockBox.layout(LayoutContext, int) line: 988
    BlockBox.layout(LayoutContext) line: 973
    PdfBoxRenderer.layout() line: 344

Neither the layout boxes nor their related PageBox cares at all where they are in absolute space; their sole concern is to fit inside an area defined by its dimensions (width and height). As mentioned previously, it's only after the layout phase that the absolute adjustment occurs, through this translation:

    int top = -page.getPaintingTop() + page.getMarginBorderPadding(c, CalculatedStyle.TOP);
    int left = page.getMarginBorderPadding(c, CalculatedStyle.LEFT);
    int translateX = left + additionalTranslateX;    
    _outputDevice.translate(translateX, top);

which happens in PdfBoxRenderer.paintPageFast(..) on this call stack:

    PdfBoxRenderer.paintPageFast(RenderingContext, PageBox, DisplayListContainer$DisplayListPageContainer, int) line: 888	
    PdfBoxRenderer.writePDFFast(List, RenderingContext, Rectangle2D, PDDocument) line: 624	
    PdfBoxRenderer.createPdfFast(boolean) line: 559	

Although effective, this strategy looks somewhat convoluted...
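
The translation above can be illustrated with a small, library-independent sketch. It is only a model under stated assumptions: all names are hypothetical, the inputs are made-up values in internal layout dots, and the conversion assumes 20 dots per CSS pixel with 96 px and 72 pt per inch. It maps a layout-space Y coordinate to a distance from the top edge of its page, then to a bottom-up PDF coordinate in points.

```java
public class PageTranslationSketch {
    // Assumptions: 20 layout dots per CSS pixel, 96 px per inch, 72 pt per inch.
    static final double DOTS_PER_PIXEL = 20.0;
    static final double PIXELS_PER_INCH = 96.0;
    static final double POINTS_PER_INCH = 72.0;

    /** Layout-space Y (dots) to the distance from the top edge of the page (dots). */
    static int toPageY(int layoutY, int paintingTop, int marginTop) {
        // Mirrors "top = -paintingTop + marginTop" applied as a translation.
        return layoutY - paintingTop + marginTop;
    }

    /** Distance from the page top (dots) to a bottom-up PDF Y coordinate (points). */
    static double toPdfY(int pageYDots, double pageHeightPoints) {
        double pageYPoints = pageYDots * POINTS_PER_INCH / (DOTS_PER_PIXEL * PIXELS_PER_INCH);
        return pageHeightPoints - pageYPoints;
    }

    public static void main(String[] args) {
        // First page: content bottom at 4000 dots, page starts at 0, top margin 1000 dots.
        int pageY = toPageY(4000, 0, 1000);
        System.out.println(pageY);                // prints 5000
        System.out.println(toPdfY(pageY, 842.0)); // prints 654.5

        // A later page: the same formula subtracts that page's painting top.
        System.out.println(toPageY(25000, 20000, 1000)); // prints 6000
    }
}
```

The point of the sketch is that a box's layout coordinates alone are not page coordinates; the page's top margin and painting top both have to be applied before converting to PDF space.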

Applying my deductions to the test case, the positioning is finally solved :)

[Image: LastPageYOffset_correct]

Here is the updated generation code:

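import java.awt.Color;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;

import com.openhtmltopdf.css.style.CalculatedStyle;
import com.openhtmltopdf.layout.Layer;
import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import com.openhtmltopdf.render.BlockBox;
import com.openhtmltopdf.render.Box;

public class LastPageYOffsetTest {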
    public static void main(String[] args) throws Exception {
        PDDocument doc = new PDDocument();
        try (PdfBoxRenderer renderer = new PdfRendererBuilder()
                .withFile(new java.io.File("source.html"))
                .usePDDocument(doc)
                .buildPdfRenderer()) {
            // Content rendering.
            renderer.createPDFWithoutClosing();

            PDPage page = renderer.getPdfDocument().getPage(
                renderer.getPdfDocument().getNumberOfPages() - 1);
            try (PDPageContentStream cs = new PDPageContentStream(
                    renderer.getPdfDocument(), page, AppendMode.APPEND, false, false)) {
                // Current-vertical-offset line drawing.
                float lastPageYOffset = renderer.getOutputDevice().normalizeY(
                    getLastPageYOffset(renderer), page.getMediaBox().getHeight()) ;
                cs.setStrokingColor(new Color(255, 0, 0));
                cs.drawLine(0, lastPageYOffset, page.getMediaBox().getWidth(), lastPageYOffset);
                System.out.println("LastPageYOffset: " + lastPageYOffset);

                // Layout boxes drawing.
                cs.setStrokingColor(new Color(0, 0, 255));
                showBoxTree(renderer.getRootBox(), "", renderer, cs,
                        page.getMediaBox().getHeight(), getAbsX(renderer.getRootBox()),
                        getAbsY(renderer.getRootBox()));
            }
        }
        try (OutputStream os = new FileOutputStream("LastPageYOffset.pdf")) {
            doc.save(os);
        }
    }

    private static int findMaxLayerY(Layer layer) {
        int maxLayerY = layer.getMaster().getAbsY() + layer.getMaster().getHeight();
        int maxChildY = layer.getChildren().stream().mapToInt(LastPageYOffsetTest::findMaxLayerY)
            .max().orElse(0);
        return Math.max(maxChildY, maxLayerY);
    }

    private static int getAbsX(BlockBox box) {
        return box.getAbsX() + box.getContainingLayer().getPages().get(0)
                .getMarginBorderPadding(null, CalculatedStyle.LEFT);
    }

    private static int getAbsY(BlockBox box) {
        return box.getAbsY()
                + box.getContainingLayer().getPages().get(0).getMarginBorderPadding(null,
                        CalculatedStyle.TOP)
                - box.getContainingLayer().getPages().get(0).getPaintingTop();
    }

    private static float getDeviceLength(float length, PdfBoxRenderer r) {
        return r.getOutputDevice().getDeviceLength(length);
    }

    private static float getLastPageYOffset(PdfBoxRenderer r) {
        Layer layer = r.getRootBox().getContainingLayer();
        int totalHeight = findMaxLayerY(layer) + getAbsY(r.getRootBox());
        if (r.getRootBox().getContainingLayer().getPages().size() > 1) {
            /* NOTE: In my test executions this conditional block is never used. */
            totalHeight += ((float) Math.abs(
                r.getRootBox().getChild(0).getPaintingInfo().getOuterMarginCorner().getHeight()
                    - r.getRootBox().getChild(0).getPaintingInfo().getAggregateBounds().getHeight()));
        }
        return r.getOutputDevice().getDeviceLength(totalHeight);
    }

    private static float getNativeY(float y, PdfBoxRenderer r, float pageHeight) {
        return r.getOutputDevice().normalizeY(y, pageHeight);
    }

    private static void showBoxTree(Box box, String indent, PdfBoxRenderer renderer,
            PDPageContentStream cs, float pageHeight, int absX, int absY) throws IOException {
        float nativeWidth = getDeviceLength(box.getWidth(), renderer),
                nativeHeight = getDeviceLength(box.getHeight(), renderer),
                nativeX = getDeviceLength(box.getAbsX() + absX, renderer),
                nativeY = getNativeY(getDeviceLength(box.getAbsY() + absY, renderer), renderer, pageHeight)
                        - nativeHeight;
        cs.addRect(nativeX, nativeY, nativeWidth, nativeHeight);
        cs.stroke();

        System.out.printf("%sabs:(%d,%d) rel:(%d,%d) size:(%dx%d); native:(%f,%f)-(%fx%f) %n",
                indent, box.getAbsX(), box.getAbsY(), box.getX(), box.getY(), box.getWidth(),
                box.getHeight(), nativeX, nativeY, nativeWidth, nativeHeight);

        indent += "  ";
        for (Box child : box.getChildren()) {
            showBoxTree(child, indent, renderer, cs, pageHeight, absX, absY);
        }
    }
}

@danfickle what do you think about this solution? Is there a more robust & straightforward way to achieve my goal?

Appended HTML source (source.html):

<html>
<head>
</head>
<body>
    <p>** BEGIN APPEND **</p>
    <h1>Trying to calculate the vertical offset of rendered
        contents</h1>
    <p>Some text to fill the space Some text to fill the space Some
        text to fill the space Some text to fill the space Some text to
[cut]
        space Some Some text to fill the space Some Some text to fill
        the space Some Some text to fill the space Some</p>
    <img alt="" src="../images/tree.jpg"/>
    <p>** END APPEND **</p>
</body>
</html>

Result with correct layout boxes: LastPageYOffset_correct.pdf

danfickle added a commit that referenced this issue Mar 9, 2021
Including the page positions of all layers. By popular demand!
@danfickle
Owner

I added the PR #666 for your perusal. Hopefully this means everyone doesn't need to write their own helper!

danfickle added a commit that referenced this issue Mar 10, 2021
As kindly suggested by @stechio.
Also:
+ Make PagePosition immutable.
+ Rename getPagePositions to getAllLayerPagePositions in case we later want to add methods for boxes, etc.
danfickle added a commit that referenced this issue Mar 12, 2021
Implementing excellent suggestions made by @stechio in #666.

Note: I did not rename findPagePositionsByID as others may be using.
danfickle added a commit that referenced this issue Mar 12, 2021
@stechio

stechio commented Mar 19, 2021

[version: 1.0.7; commit: 48a9d8c5018c67d5df86d13c0311076a27d4feb1; case: AbsoluteYOffset]

Thanks to your implementation (#666), the vertical offset now works well with static positioning. However, applying it to absolutely-positioned elements (whose behavior I was exploring in #668), I got this (the red horizontal line represents the calculated vertical offset returned by PdfBoxRenderer.getLastContentBottom(), while the blue rectangle represents the box tree):

[Image: absolute-yoffset]

absolute-yoffset.pdf
absolute-yoffset.html

Since PdfBoxRenderer.getLastContentBottom() is expected to return the actual bottom of the last content, that red line should obviously stick to the green box, NOT to the bottom of the absolute layer containing it. I understand that absolute positioning takes elements out of the (static) layout flow; nonetheless, their contents are right there on the page too, so there must be a way to address them when calculating the vertical offset of the last content.

This is the generating code:

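import java.awt.Color;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;

import com.openhtmltopdf.css.style.CalculatedStyle;
import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import com.openhtmltopdf.render.BlockBox;
import com.openhtmltopdf.render.Box;

public class AbsoluteYOffsetTest {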
    public static void main(String[] args) throws Exception {
        PDDocument doc = new PDDocument();
        try (PdfBoxRenderer renderer = new PdfRendererBuilder()
                .withFile(new java.io.File("absolute-yoffset.html"))
                .usePDDocument(doc)
                .buildPdfRenderer()) {
            // Content rendering.
            renderer.createPDFWithoutClosing();

            PDPage page = renderer.getPdfDocument().getPage(
                renderer.getPdfDocument().getNumberOfPages() - 1);
            try (PDPageContentStream cs = new PDPageContentStream(
                    renderer.getPdfDocument(), page, AppendMode.APPEND, false, false)) {
                float builtinLastPageYOffset = renderer.getLastContentBottom();

                System.out.println("LastPageYOffset: " + builtinLastPageYOffset);
                cs.setStrokingColor(new Color(255, 0, 0));
                cs.drawLine(0, builtinLastPageYOffset, page.getMediaBox().getWidth(), builtinLastPageYOffset);

                cs.setStrokingColor(new Color(0, 0, 255));
                showBoxTree(renderer.getRootBox(), "", renderer, cs,
                        page.getMediaBox().getHeight(), getAbsX(renderer.getRootBox()),
                        getAbsY(renderer.getRootBox()));
            }
        }
        try (OutputStream os = new FileOutputStream("absolute-yoffset.pdf")) {
            doc.save(os);
        }
    }

    private static int getAbsX(BlockBox box) {
        return box.getAbsX() + box.getContainingLayer().getPages().get(0)
                .getMarginBorderPadding(null, CalculatedStyle.LEFT);
    }

    private static int getAbsY(BlockBox box) {
        return box.getAbsY()
                + box.getContainingLayer().getPages().get(0).getMarginBorderPadding(null,
                        CalculatedStyle.TOP)
                - box.getContainingLayer().getPages().get(0).getPaintingTop();
    }

    private static float getDeviceLength(float length, PdfBoxRenderer r) {
        return r.getOutputDevice().getDeviceLength(length);
    }

    private static float getNativeY(float y, PdfBoxRenderer r, float pageHeight) {
        return r.getOutputDevice().normalizeY(y, pageHeight);
    }

    private static void showBoxTree(Box box, String indent, PdfBoxRenderer renderer,
            PDPageContentStream cs, float pageHeight, int absX, int absY) throws IOException {
        float nativeWidth = getDeviceLength(box.getWidth(), renderer),
                nativeHeight = getDeviceLength(box.getHeight(), renderer),
                nativeX = getDeviceLength(box.getAbsX() + absX, renderer),
                nativeY = getNativeY(getDeviceLength(box.getAbsY() + absY, renderer), renderer, pageHeight)
                        - nativeHeight;
        cs.addRect(nativeX, nativeY, nativeWidth, nativeHeight);
        cs.stroke();

        System.out.printf("%sabs:(%d,%d) rel:(%d,%d) size:(%dx%d); native:(%f,%f)-(%fx%f) %n",
                indent, box.getAbsX(), box.getAbsY(), box.getX(), box.getY(), box.getWidth(),
                box.getHeight(), nativeX, nativeY, nativeWidth, nativeHeight);

        indent += "  ";
        for (Box child : box.getChildren()) {
            showBoxTree(child, indent, renderer, cs, pageHeight, absX, absY);
        }
    }
}

@danfickle
Owner

Hi @stechio,

I did a little investigating and found that, in my tests, it is setting a height on the html element that breaks everything, not the absolutely positioned elements. I have an absolute element in the official test and it seems to work. I'm not sure yet why adding a height to html destroys the world.

@danfickle danfickle reopened this Mar 22, 2021
@danfickle danfickle self-assigned this Mar 22, 2021
@stechio

stechio commented Mar 26, 2021

[version: 1.0.9-SNAPSHOT; commit: ccd29f03ede2aecadac9c39fda95a5fedfb23645; case: AbsoluteYOffset]

Hi @danfickle,

I re-ran my test case without the html element style that supposedly conflicts with the current implementation, but nothing changed: the vertical offset returned by PdfBoxRenderer.getLastContentBottom() is still wrong (at the far bottom of the page). In the same run I also applied my alternate algorithm (getLastPageYOffset(..), already used in an earlier test) and, surprisingly, it worked well. In the following image, the red horizontal line is the vertical offset returned by PdfBoxRenderer.getLastContentBottom(), the green line is the offset computed by my alternate algorithm, and the blue rectangles outline the box tree:

absolute-yoffset_2

absolute-yoffset_2.pdf
absolute-yoffset_2.html

Here is the generating code:

import java.awt.Color;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.common.PDRectangle;

import com.openhtmltopdf.css.style.CalculatedStyle;
import com.openhtmltopdf.layout.Layer;
import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import com.openhtmltopdf.render.BlockBox;
import com.openhtmltopdf.render.Box;

public class AbsoluteYOffsetTest2 {
    public static void main(String[] args) throws Exception {
        PDDocument doc = new PDDocument();
        try (PdfBoxRenderer renderer = new PdfRendererBuilder()
                .withFile(new java.io.File("absolute-yoffset_2.html"))
                .usePDDocument(doc)
                .buildPdfRenderer()) {
            // Content rendering.
            renderer.createPDFWithoutClosing();

            PDPage page = renderer.getPdfDocument().getPage(
                renderer.getPdfDocument().getNumberOfPages() - 1);
            try (PDPageContentStream cs = new PDPageContentStream(
                    renderer.getPdfDocument(), page, AppendMode.APPEND, false, false)) {
                // Current-vertical-offset line drawing:
                cs.setLineWidth(1);
                // - OHTP implementation
                drawBottomLine(renderer, cs, renderer.getLastContentBottom(),
                    page.getMediaBox(), new Color(255, 0, 0));
                // - my implementation
                drawBottomLine(renderer, cs, getLastPageYOffset(renderer), page.getMediaBox(),
                    new Color(0, 255, 0));

                // Layout boxes drawing.
                cs.setLineWidth(.2f);
                cs.setStrokingColor(new Color(0, 0, 255));
                showBoxTree(renderer.getRootBox(), "", renderer, cs,
                        page.getMediaBox().getHeight(), getAbsX(renderer.getRootBox()),
                        getAbsY(renderer.getRootBox()));
            }
        }
        try (OutputStream os = new FileOutputStream("absolute-yoffset_2.pdf")) {
            doc.save(os);
        }
    }

    private static void drawBottomLine(PdfBoxRenderer r, PDPageContentStream cs, float y,
            PDRectangle mediaBox, Color color) throws IOException {
        float lastContentBottom = r.getOutputDevice().normalizeY(y, mediaBox.getHeight());
        cs.setStrokingColor(color);
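        // drawLine(..) is deprecated in PDFBox 2.x; build and stroke the path explicitly.
        cs.moveTo(0, lastContentBottom);
        cs.lineTo(mediaBox.getWidth(), lastContentBottom);
        cs.stroke();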

        System.out.println("LastContentBottom: " + lastContentBottom);
    }

    private static int findMaxLayerY(Layer layer) {
        int maxLayerY = layer.getMaster().getAbsY() + layer.getMaster().getHeight();
        // Recurse into child layers (absolutely positioned content lives there).
        int maxChildY = layer.getChildren().stream().mapToInt(AbsoluteYOffsetTest2::findMaxLayerY)
            .max().orElse(0);
        return Math.max(maxChildY, maxLayerY);
    }

    private static int getAbsX(BlockBox box) {
        return box.getAbsX() + box.getContainingLayer().getPages().get(0)
                .getMarginBorderPadding(null, CalculatedStyle.LEFT);
    }

    private static int getAbsY(BlockBox box) {
        return box.getAbsY()
                + box.getContainingLayer().getPages().get(0).getMarginBorderPadding(null,
                        CalculatedStyle.TOP)
                - box.getContainingLayer().getPages().get(0).getPaintingTop();
    }

    private static float getDeviceLength(float length, PdfBoxRenderer r) {
        return r.getOutputDevice().getDeviceLength(length);
    }

    private static float getLastPageYOffset(PdfBoxRenderer r) {
        Layer layer = r.getRootBox().getContainingLayer();
        int totalHeight = findMaxLayerY(layer) + getAbsY(r.getRootBox());
        if (r.getRootBox().getContainingLayer().getPages().size() > 1) {
            /* NOTE: In my test executions this conditional block is never used. */
            totalHeight += ((float) Math.abs(
                r.getRootBox().getChild(0).getPaintingInfo().getOuterMarginCorner().getHeight()
                    - r.getRootBox().getChild(0).getPaintingInfo().getAggregateBounds().getHeight()));
        }
        return r.getOutputDevice().getDeviceLength(totalHeight);
    }

    private static float getNativeY(float y, PdfBoxRenderer r, float pageHeight) {
        return r.getOutputDevice().normalizeY(y, pageHeight);
    }

    private static void showBoxTree(Box box, String indent, PdfBoxRenderer renderer,
            PDPageContentStream cs, float pageHeight, int absX, int absY) throws IOException {
        float nativeWidth = getDeviceLength(box.getWidth(), renderer),
                nativeHeight = getDeviceLength(box.getHeight(), renderer),
                nativeX = getDeviceLength(box.getAbsX() + absX, renderer),
                nativeY = getNativeY(getDeviceLength(box.getAbsY() + absY, renderer), renderer, pageHeight)
                        - nativeHeight;
        cs.addRect(nativeX, nativeY, nativeWidth, nativeHeight);
        cs.stroke();

        System.out.printf("%sabs:(%d,%d) rel:(%d,%d) size:(%dx%d); native:(%f,%f)-(%fx%f) %n",
                indent, box.getAbsX(), box.getAbsY(), box.getX(), box.getY(), box.getWidth(),
                box.getHeight(), nativeX, nativeY, nativeWidth, nativeHeight);

        indent += "  ";
        for (Box child : box.getChildren()) {
            showBoxTree(child, indent, renderer, cs, pageHeight, absX, absY);
        }
    }
}

I experienced a similar problem even with static positioning:

form-yoffset

form-yoffset.pdf
form-yoffset.html

The generating code is the same as above.
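
For anyone skimming the listing, the core of the alternate algorithm is a recursive maximum over the layer tree: take the bottom edge of each layer's master box, then the largest bottom among its child layers, and keep the greater of the two. Below is a minimal, library-free sketch of that idea. `Node`, `findMaxBottom` and `toBottomUpY` are hypothetical names introduced only for illustration (they are not openhtmltopdf API), and the top-down to bottom-up flip is assumed to be a plain `pageHeight - y`, which may not match exactly what the output device does.

```java
import java.util.List;

public class LayerBottomSketch {
    /** Hypothetical stand-in for a layer: top edge, height and child layers. */
    static final class Node {
        final int absY;
        final int height;
        final List<Node> children;

        Node(int absY, int height, List<Node> children) {
            this.absY = absY;
            this.height = height;
            this.children = children;
        }
    }

    /** Returns the lowest bottom edge (top-down units) found in the tree. */
    static int findMaxBottom(Node node) {
        int ownBottom = node.absY + node.height;
        int childBottom = node.children.stream()
                .mapToInt(LayerBottomSketch::findMaxBottom)
                .max()
                .orElse(0); // no children: only the node's own bottom counts
        return Math.max(ownBottom, childBottom);
    }

    /** Assumed flip from a top-down offset to a bottom-up page coordinate. */
    static float toBottomUpY(float topDownY, float pageHeight) {
        return pageHeight - topDownY;
    }

    public static void main(String[] args) {
        // An in-flow block plus an "absolute" block that extends further down.
        Node inFlow = new Node(0, 300, List.of());
        Node absolute = new Node(500, 200, List.of());
        Node root = new Node(0, 300, List.of(inFlow, absolute));

        int bottom = findMaxBottom(root);
        System.out.println("max bottom (top-down): " + bottom);
        System.out.println("bottom-up y: " + toBottomUpY(bottom, 842f));
    }
}
```

The point of the recursion is that a child layer can end below its parent's own box, so looking at the root box alone is not enough to find where the content really stops.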
