Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #259 - (Properly this time!) #781

Merged
merged 4 commits into from Mar 27, 2017
Merged

Fixes #259 - (Properly this time!) #781

merged 4 commits into from Mar 27, 2017

Conversation

ghost
Copy link

@ghost ghost commented Mar 21, 2017

So I figured out there is already a "dry run" pass in asciidoctor-pdf, and how to use it properly - would be nice if this was documented more obviously, but I guess it's another item on the long list of TODO's? This is definitely stable for multi-page sidebars; verified against full Vulkan spec (and this time checked the right build...)

Happy to take code edits, I just did what seemed sane and clean at the time.

@ghost
Copy link
Author

ghost commented Mar 21, 2017

NB: This code should be more generally applicable to other block types, should someone be so inclined.

@ghost
Copy link
Author

ghost commented Mar 22, 2017

Also FYI, the root cause of most of my woes seems to be something in prawn itself: prawnpdf/prawn#1014

I've been experimenting with layout options and extending them, and this issue basically blocks doing certain things in asciidoctor-pdf at the moment.

@mojavelinux
Copy link
Member

I'll admit that dealing with dry run layouts to estimate rendering length in Prawn is extremely complex. It's probably the hardest thing I've had to do in Asciidoctor PDF. As you discovered, I figured out the trick of using a separate document to perform a dry run to measure how much space a region of content will take. However, I've recently discovered that this logic has flaws since I'm not always accounting for behavior at the boundaries of a page.

There are also very important reasons why we do the things that we do, such as how new pages are created. I know it seems a bit insane and complex, but trust me when I say that there are reasons behind why I do things this way....and it's far from perfect.

The root cause is that Prawn uses a write forward system. Once it writes, it can't unwrite. I did figure out how to delete pages, but there isn't a good way to tell when we created a page intentionally or because the bottom margin of the content pushed us to a new page.

Prawn is great for simple things, but it has become clear to me that in order to keep going on Asciidoctor PDF, we're probably going to need to start creating a layer above Prawn that provides a better model for these sorts of things. That's my sense, anyway.

@mojavelinux
Copy link
Member

I have yet to document dry_run because technically, it still has too many flaws and I really don't want anyone who doesn't understand those flaws to be using it until we get that sorted out.

@mojavelinux
Copy link
Member

One first glance, this patch does look correct. Good job sorting it out! I'll try to get this into alpha.15, but if I run out of time, definitely alpha.16.

@mojavelinux
Copy link
Member

I integrated your logic and also added continuation borders at the page boundaries within the sidebar (just like I had done for the listing block). Give it a try.

There's one known issue with this implementation...but the problem is not in this code. It's in the dry_run logic. If the sidebar does not begin on the first line of the page, the calculated height may end up being too short. This happens because we're not starting at the same y offset in the draft document. As a result, we hit page boundaries at different places and it can cause a different result. This bug has to be dealt with separately. The convert_sidebar method should just assume it was given the correct height.

@mojavelinux
Copy link
Member

If we didn't add the continuation boundaries, the logic is much, much simpler. See #259 (comment).

@ghost
Copy link
Author

ghost commented Mar 27, 2017

@mojavelinux Thanks for the feedback! I did spot a similar height issue whilst looking at some other bits around here, and switching from using total_height to actually returning and using the first height/last height and full pages in between as separate values seems to fix it. Pretty sure it was page-border related, maybe a similar fix would be useful here, but (almost) every conversion function uses "keep_together" at the moment so that's a bunch of extra work that's possibly not worth it just now.

I'm working on the side at other fixes to some of the other layout issues, mostly to do with #578. Nothing as complex as an AST at the moment, but it should at least add a bit more flexibility and get all blocks rendering correctly across multiple pages. I'll have to re-evaluate in light of your continuation boundary logic, but I'm hoping I'll be able to sort the height issue whilst I'm there.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

Ok so I tried your version out, it almost works, but not quite. So this is where my adventures into layout get to help, I guess.

The reason my fix always started these blocks on a new page was partly the height issue, but more awkwardly is the "start_new_page" issue (which I have no idea why it works around, though I only figured this out later...). If you jump back to an earlier page (including via float) and then call start_new_page afterwards, it completely confuses prawn. start_new_page unconditionally starts a new page, even if there's a new page right after the current one - so it's sort of like a (really broken) "insert new page". If a sidebar block includes content that calls start_new_page, which for example a bullet list does, then it tries to insert a new page even though it shouldn't. See prawnpdf/prawn#1014 for some amount of detail on this...

Luckily the fix for this is simply - replace all instances of "start_new_page" in converter.rb with "bounds.move_past_bottom" - which does feel a bit like overkill, but basically all that function does is do the check for an existing page, and moves to it if it does, and only starts a new one if it doesn't - so it's a completely safe change, it just introduces a small extra boolean check which might be unnecessary in some cases. With that change on top of your current one, your patch works for me.

@mojavelinux
Copy link
Member

mojavelinux commented Mar 27, 2017

switching from using total_height to actually returning and using the first height/last height and full pages in between as separate values seems to fix it.

That fixes it for some cases but breaks it for other cases. The only way to truly fix it is to change where the cursor starts in the dry run document. However, there's a catch. We need to start at the top of the page so we know if it can fit on a single page first. So there are actually two steps. I pretty much have the logic worked out. I'll open an issue and detail how I'm going to implement it there.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

That fixes it for some cases but breaks it for other cases. The only way to truly fix it is to change where the cursor starts in the dry run document. However, there's a catch. We need to start at the top of the page so we know if it can fit on a single page first. So there are actually two steps. I pretty much have the logic worked out. I'll open an issue and detail how I'm going to implement it there.

Right, I missed out that bit - but you do need both AFAICT. I was working on the same issue, and actually have some code that I can probably share if it gels with what your approach is...

@mojavelinux
Copy link
Member

Luckily the fix for this is simply - replace all instances of "start_new_page" in converter.rb with "bounds.move_past_bottom"

I'm familiar with that difference and I actually use that approach in various places already. I'm in full agreement that this aspect of Prawn is very brittle. I'd be open to adding a method named advance_page that basically does what we want, but gives us contextual awareness of what's going on and we can document it.

@mojavelinux
Copy link
Member

I'll open an issue regarding the page height so we can capture the discussion there.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

I'm familiar with that difference and I actually use that approach in various places already. I'm in full agreement that this aspect of Prawn is very brittle. I'd be open to adding a method named advance_page that basically does what we want, but gives us contextual awareness of what's going on and we can document it.

There's a commented-out "advance_or_start_new_page" function in prawn_ext\extensions.rb. This should do the trick. I can uncomment this function and switch everything to using it, and add a code comment describing what it actually does. Anywhere else this function needs documenting? Should it live somewhere else?

@mojavelinux
Copy link
Member

Here's my thesis: #789

(I feel better now that I've written that because it's been stuck in my head for a while now).

@mojavelinux
Copy link
Member

I think I just prefer "advance_page" since that's clear enough that it's going to go to the next page, whether it has to create it or not.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

I think I just prefer "advance_page" since that's clear enough that it's going to go to the next page, whether it has to create it or not.

Fine by me, the extra words don't really tell you anything interesting. I'll uncomment it, rename it, add a code comment, then commit it. I'll add the motivation for its existence in the comment too so people know why it exists.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

Here's my thesis: #789

(I feel better now that I've written that because it's been stuck in my head for a while now).

Interesting. Your approach is not dissimilar to at least one iteration I had. I have to go look back at what I've coded before I respond though, which I'm not sure I have time to do right this minute - but I'll definitely do so in the next day or two.

@mojavelinux
Copy link
Member

I was able to reproduce the scenario you gave. I think somewhere in my notes I had said that if a new page was created while doing a dry run, all hell would break loose. You found that hell.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

You found that hell.

Oh boy did I...

@mojavelinux
Copy link
Member

And I can confirm that your proposed fix works. I think I commented it out because I wasn't sure when the function was needed. Now we understand.

@mojavelinux
Copy link
Member

I need to release alpha.15 very imminently, so there are two options at this point:

  1. merge this as is, but replace start_new_page in the list converters to use advance_page (we'll study and replace more cases later)
  2. defer this all until alpha.16 with everything included

wdyt? If you're happy with (1), I'm glad to do it.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

  1. would fix my particular use case for right now, so I'd be happier with that for right now if it can make alpha.15

@mojavelinux
Copy link
Member

I was able to find a few more places we need advance_page...basically anywhere that we start a new page in block-level content. (We don't have to worry about a section atm since a block can't contain a section).

However, I discovered another problem (which I also remember hitting before). If we have a keep together block inside of a keep together block, we get that same crazy behavior. I'll have to take a look at it with a fresh mind to see if we can figure out what to do in that case. I think the problem is that we are using the same scratch document for two parallel operations...so it might just be a matter of creating a new scratch document if we're already in the scratch document.

@mojavelinux
Copy link
Member

Great! I'll push an update for you to test.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

Great! I'll push an update for you to test.

Oops, already jumped the gun on that one, feel free to tweak it. It looks like I need advance_page in a few more places than that anyway.... shouldn't take more than a few minutes to debug (I hope)

@mojavelinux
Copy link
Member

In addition to the list converters, I found the same case with page break and block image. Both attempted to create a new page when there already was one.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

Yea I did a test with blanket replacing all "start_new_page" instances (except in one place where the check is included for other reasons) in converter.rb and extension.rb with advance_page and that definitely works and fixes all my issues. But I don't want to throw it into alpha.15 if you're not happy with it. I can push that to a separate branch if you want to take a look...?

@mojavelinux
Copy link
Member

Looks like we wrote just about the same thing. I take that as a great sign. Sync!

Technically, there is no benefit to calling advance_page in sidebar and listing since we have other problems with those methods, as I mentioned. I'm going to leave those off until we address them.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

Technically, there is no benefit to calling advance_page in sidebar and listing since we have other problems with those methods, as I mentioned. I'm going to leave those off until we address them.

You'd run into that only if open blocks exhibited the keep_together behaviour, for instance, and then put a sidebar in that. But since that's not the case right now, that's fine.

@mojavelinux
Copy link
Member

You'd run into that only if open blocks exhibited the keep_together behaviour

Exactly. I've added a note to add it and I'll likely file an issue to track it.

@mojavelinux
Copy link
Member

Double check the PR. If it works for you, we'll proceed with it for alpha.15 and continue the other improvements for alpha.16.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

Ah shoot - well it works, but I completely forgot about the height thing. So it fixes the page jumping, but the fix is inferior in my use case to my monkey patching because of the height issue. If you'd be willing to, at least for this release, force the sidebar to start on a new page if it spans multiple pages - then I can just use this. Otherwise it's up to you whether you want to take this now or if you want take the time to look for more "advance_page" usage before committing this.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

(I guess this would apply to anyone, so it kind of makes sense to push the change but comment it as to why)

@mojavelinux
Copy link
Member

because of the height issue.

You're referring to the case when the background doesn't always cover the last few lines of the sidebar content, correct? (I want to make sure I'm looking at the right case).

I can go ahead and add a hack to force the sidebar to start on a new page if it spans more than a page and we're not already at the top of the page (the later is the case you had missed).

@ghost
Copy link
Author

ghost commented Mar 27, 2017

You're referring to the case when the background doesn't always cover the last few lines of the sidebar content, correct? (I want to make sure I'm looking at the right case).

Correct.

I can go ahead and add a hack to force the sidebar to start on a new page if it spans more than a page and we're not already at the top of the page (the later is the case you had missed).

Yes please - and you're right I hadn't factored that in. And there was me thinking I was testing all the corner cases...

@mojavelinux
Copy link
Member

I'm quite impressed by how many corner cases you found. Only between the two of us were we able to cover them all. Amazing.

Updating coming.

@ghost
Copy link
Author

ghost commented Mar 27, 2017

I'm quite impressed by how many corner cases you found. Only between the two of us were we able to cover them all. Amazing.

We have a big doc to build (700+ pages!), and I've stared at this problem for a number of days now so it's not completely unsurprising 😄

@mojavelinux
Copy link
Member

Amazing!

Okay, that reintroduces the original fix you made about starting a new page. I went ahead and just changed the other line to advance_page too to be consistent. Sometimes, we have to go around the block just to know why we are standing where we are standing ;)

@mojavelinux
Copy link
Member

Now that we understand this problem, I'm really looking forward to getting #789 solved so that we can get proper page splitting behavior that the author can control. First, we had to write the code to behave kind of the wrong way so that we could understand how to write it the right way.

@mojavelinux
Copy link
Member

Thanks for all your help. You don't know how good it feels to have someone else who actually understands Asciidoctor PDF at this level. Writing PDFs ain't easy...but we'll be able to move forward faster with more heads in the game.

@mojavelinux mojavelinux merged commit 3ee6fcf into asciidoctor:master Mar 27, 2017
@ghost
Copy link
Author

ghost commented Mar 27, 2017

Thanks for all your help. You don't know how good it feels to have someone else who actually understands Asciidoctor PDF at this level. Writing PDFs ain't easy...but we'll be able to move forward faster with more heads in the game.

No problem - thanks for taking the time to look at all of this! Glad I could be of help. I will definitely be finding time to help with #789 and #578 as well at some point. Not sure what I'll do beyond that just yet, but we'll see :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant