#pageN anchor not working on first page load #11499

Closed
jtotht opened this issue Jan 10, 2020 · 7 comments · Fixed by #11503

@jtotht

jtotht commented Jan 10, 2020

Attach (recommended) or Link to PDF file here: https://epa.oszk.hu/00800/00861/00032/pdf/02_kristo.pdf#page2

Configuration:

  • Web browser and its version: Firefox ESR 68.4.1 & Developer Edition 73.0b3
  • Operating system and its version: Debian 9 “stretch”
  • PDF.js version: 2.2.178, 2.4.254 (the ones built in these versions of Firefox)
  • Is a browser extension: no, core Firefox

Steps to reproduce the problem:

  1. Open the above link

What is the expected behavior?
The PDF opens with page 2.
[Screenshot: PDF showing second page]

What went wrong?
The PDF opens with page 1.
[Screenshot: PDF showing first page]

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension): N/A

Additional details:
The function works after page load (when I enter a different anchor name in the URL bar, Firefox goes straight there, without reloading the page; that’s OK). When the anchor is present in the URL on page load, the following error pops up on the browser console:

PDFLinkService.navigateTo: "null" is not a valid destination array, for dest="page2". viewer.js:6570:17

(the line number is 6507 in v2.2.178, otherwise the same error message). No further stack trace is provided.

@Snuffleupagus
Collaborator

Snuffleupagus commented Jan 10, 2020

> https://epa.oszk.hu/00800/00861/00032/pdf/02_kristo.pdf#page2

The hash needs to read #page=2 in order for this to work, and https://epa.oszk.hu/00800/00861/00032/pdf/02_kristo.pdf#page=2 works for me.
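To illustrate why the equals sign matters, here is a minimal sketch of key/value hash parsing. The helper name `pageFromHash` is hypothetical and this is not PDF.js's actual parser; it only assumes that the hash is read as `key=value` pairs, so `page2` yields a key named `page2` and no `page` key at all.

```javascript
// Hypothetical helper for illustration only; not the PDF.js implementation.
// Reads the "page" parameter from a URL hash such as "#page=2".
function pageFromHash(hash) {
  // Strip the leading "#" and parse the rest as key=value pairs.
  const params = new URLSearchParams(hash.replace(/^#/, ""));
  const raw = params.get("page");
  if (raw === null) {
    return null; // "#page2" ends up here: the only key present is "page2"
  }
  const page = Number.parseInt(raw, 10);
  return Number.isInteger(page) && page > 0 ? page : null;
}

console.log(pageFromHash("#page=2"));
console.log(pageFromHash("#page2"));
```

Under these assumptions the first call yields a page number and the second yields `null`, which matches the "not a valid destination" outcome reported above.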

@jtotht
Author

jtotht commented Jan 10, 2020

Thanks for the fast reply! Yes, it works for me, too, with the equals sign. However, it’s still inconsistent: the short version (#page2) doesn’t work on page load, but it does work after that. I don’t know whether it should be supported on page load or whether support should be dropped for the after-load case, but one of the two should be done.

@Snuffleupagus
Collaborator

> However, it’s still inconsistent: the short version (#page2) doesn’t work on page load, but it does work after that.

Yes, however that format shouldn't actually be working at all :-)
So, please make sure that you use the correct #page=2 format, since the other one probably won't be "supported" for long...

@jtotht
Author

jtotht commented Jan 10, 2020

OK, thanks. Maybe this issue can be repurposed to track the removal of this undocumented feature from PDF.js? If not, feel free to close it.

@Snuffleupagus
Collaborator

Snuffleupagus commented Jan 10, 2020

> Maybe this issue can be repurposed to track the removal of this undocumented feature from PDF.js?

That was my intention, but some quick debugging would suggest that there's no bug in PDF.js and rather that the browser itself is somehow "helpfully" interpreting those incorrect hashes.
That's based on the fact that this only works for pages that have already been loaded/rendered, and if the PDF.js logic was actually invoked it would work for every page (as the #page=n format does).

I'm thoroughly confused now...

Edit: And it even works the exact same way in the simpleviewer example, see https://github.com/mozilla/pdf.js/tree/master/examples/components, and that one doesn't even contain any code for handling URLs/hashes at all.

@jtotht
Author

jtotht commented Jan 11, 2020

I found out what happens: pageN is the actual HTML ID of the page canvas. There’s something that seems to be in connection with this in web/pdf_page_view.js, L78, but if you search for #page2 in Firefox Developer Tools’ inspector, it goes straight to the canvas in the DOM tree. HTML4 is quite restrictive about what an ID can contain ([A-Za-z][A-Za-z0-9_:.-]*), but HTML5 seems to be much more permissive, denying only ASCII whitespace (as far as I understand). As PDF.js targets only HTML5-compliant browsers anyway, maybe page=N could be used as actual HTML ID? I don’t know whether this confuses the JS code responsible for navigation.
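The two ID rules mentioned above can be compared with a short sketch. The HTML4 pattern is the one quoted in the comment; the HTML5 pattern assumes the rule is "at least one character, no ASCII whitespace" (tab, LF, FF, CR, space). The function name `isValidId` is made up for this example.

```javascript
// Illustration only: compares the HTML4 and HTML5 rules for id values.
const HTML4_ID = /^[A-Za-z][A-Za-z0-9_:.-]*$/;
// Assumed HTML5 rule: non-empty and free of ASCII whitespace.
const HTML5_ID = /^[^\t\n\f\r ]+$/;

// Hypothetical helper; version is either "html4" or "html5".
function isValidId(id, version) {
  const pattern = version === "html4" ? HTML4_ID : HTML5_ID;
  return pattern.test(id);
}

for (const id of ["page2", "page=2", "page 2"]) {
  console.log(id, isValidId(id, "html4"), isValidId(id, "html5"));
}
```

Under these rules `page2` is a valid ID in both versions, while `page=2` is rejected by the HTML4 pattern and accepted by the HTML5 one.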

@Snuffleupagus
Collaborator

Snuffleupagus commented Jan 11, 2020

> There’s something that seems to be in connection with this in web/pdf_page_view.js, L78,

That property is only used in a JavaScript object, and not actually appended to the DOM.
The actual error rather seems to stem from this line, which we'll have to look into changing/removing in some way.

> As PDF.js targets only HTML5-compliant browsers anyway, maybe page=N could be used as actual HTML ID? I don’t know whether this confuses the JS code responsible for navigation.

That would end up "fighting" with the general PDF.js navigation, as you suspected, and there's also the issue of that only working for loaded/rendered pages.
Hence preventing this browser navigation from working really seems like the best solution.


Anyway, thanks for helping out with getting to the bottom of all of this!
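The "only works for loaded/rendered pages" behaviour discussed in this thread can be modelled with a small sketch. It assumes, as found above, that each rendered page canvas carries the ID `pageN`, and it simplifies the browser's fragment navigation to a lookup of the fragment among the IDs currently in the DOM. The function `resolveFragment` and the `Set` standing in for the DOM are hypothetical.

```javascript
// Simplified model of native fragment navigation; not browser or PDF.js code.
// Returns the matching element id, or null when no such element exists yet.
function resolveFragment(hash, idsInDom) {
  const fragment = decodeURIComponent(hash.replace(/^#/, ""));
  return idsInDom.has(fragment) ? fragment : null;
}

// On first load no page canvas has been rendered yet.
const onLoad = new Set();
// Later, only the pages rendered so far have canvases in the DOM.
const afterRender = new Set(["page1", "page2"]);

console.log(resolveFragment("#page2", onLoad));
console.log(resolveFragment("#page2", afterRender));
console.log(resolveFragment("#page7", afterRender));
```

In this model the lookup fails on first load and for pages that have not been rendered, and succeeds only once a canvas with the matching ID exists, which is consistent with the inconsistency reported in the issue.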
