[api-minor] Allow loading pdf fonts into another document. #12131

jsg2021 · 2020-07-27T17:30:44Z

This adds support to specify a document context the pdf will render into when getting the pdf document. I wanted to automatically infer the document, based on the container element, but the fonts are loaded in a way that is detached from the viewer. So this makes the getDocument support a param ownerDocumentContext (better name suggestions are welcome)

Fixes #8271

Snuffleupagus

(better name suggestions are welcome)

How about simply ownerDocument, which seems like slightly less of a mouthful :-)

Also, please note that all getDocument parameters should be documented in

pdf.js/src/display/api.js

Lines 100 to 170 in 4038160

    
           /** 
        
            * Document initialization / loading parameters object. 
        
            * 
        
            * @typedef {Object} DocumentInitParameters 
        
            * @property {string}     [url] - The URL of the PDF. 
        
            * @property {TypedArray|Array|string} [data] - Binary PDF data. Use typed 
        
            *    arrays (Uint8Array) to improve the memory usage. If PDF data is 
        
            *    BASE64-encoded, use atob() to convert it to a binary string first. 
        
            * @property {Object}     [httpHeaders] - Basic authentication headers. 
        
            * @property {boolean}    [withCredentials] - Indicates whether or not 
        
            *   cross-site Access-Control requests should be made using credentials such 
        
            *   as cookies or authorization headers. The default is false. 
        
            * @property {string}     [password] - For decrypting password-protected PDFs. 
        
            * @property {TypedArray} [initialData] - A typed array with the first portion 
        
            *   or all of the pdf data. Used by the extension since some data is already 
        
            *   loaded before the switch to range requests. 
        
            * @property {number}     [length] - The PDF file length. It's used for 
        
            *   progress reports and range requests operations. 
        
            * @property {PDFDataRangeTransport} [range] 
        
            * @property {number}     [rangeChunkSize] - Specify maximum number of bytes 
        
            *   fetched per range request. The default value is 2^16 = 65536. 
        
            * @property {PDFWorker}  [worker] - The worker that will be used for 
        
            *   the loading and parsing of the PDF data. 
        
            * @property {number} [verbosity] - Controls the logging level; the 
        
            *   constants from {VerbosityLevel} should be used. 
        
            * @property {string} [docBaseUrl] - The base URL of the document, 
        
            *   used when attempting to recover valid absolute URLs for annotations, and 
        
            *   outline items, that (incorrectly) only specify relative URLs. 
        
            * @property {string} [cMapUrl] - The URL where the predefined 
        
            *   Adobe CMaps are located. Include trailing slash. 
        
            * @property {boolean} [cMapPacked] - Specifies if the Adobe CMaps are 
        
            *   binary packed. 
        
            * @property {Object} [CMapReaderFactory] - The factory that will be 
        
            *   used when reading built-in CMap files. Providing a custom factory is useful 
        
            *   for environments without `XMLHttpRequest` support, such as e.g. Node.js. 
        
            *   The default value is {DOMCMapReaderFactory}. 
        
            * @property {boolean} [stopAtErrors] - Reject certain promises, e.g. 
        
            *   `getOperatorList`, `getTextContent`, and `RenderTask`, when the associated 
        
            *   PDF data cannot be successfully parsed, instead of attempting to recover 
        
            *   whatever possible of the data. The default value is `false`. 
        
            * @property {number} [maxImageSize] - The maximum allowed image size 
        
            *   in total pixels, i.e. width * height. Images above this value will not be 
        
            *   rendered. Use -1 for no limit, which is also the default value. 
        
            * @property {boolean} [isEvalSupported] - Determines if we can eval 
        
            *   strings as JS. Primarily used to improve performance of font rendering, 
        
            *   and when parsing PDF functions. The default value is `true`. 
        
            * @property {boolean} [disableFontFace] - By default fonts are 
        
            *   converted to OpenType fonts and loaded via font face rules. If disabled, 
        
            *   fonts will be rendered using a built-in font renderer that constructs the 
        
            *   glyphs with primitive path commands. The default value is `false`. 
        
            * @property {boolean} [fontExtraProperties] - Include additional properties, 
        
            *   which are unused during rendering of PDF documents, when exporting the 
        
            *   parsed font data from the worker-thread. This may be useful for debugging 
        
            *   purposes (and backwards compatibility), but note that it will lead to 
        
            *   increased memory usage. The default value is `false`. 
        
            * @property {boolean} [disableRange] - Disable range request loading 
        
            *   of PDF files. When enabled, and if the server supports partial content 
        
            *   requests, then the PDF will be fetched in chunks. 
        
            *   The default value is `false`. 
        
            * @property {boolean} [disableStream] - Disable streaming of PDF file 
        
            *   data. By default PDF.js attempts to load PDFs in chunks. 
        
            *   The default value is `false`. 
        
            * @property {boolean} [disableAutoFetch] - Disable pre-fetching of PDF 
        
            *   file data. When range requests are enabled PDF.js will automatically keep 
        
            *   fetching more data even if it isn't needed to display the current page. 
        
            *   The default value is `false`. 
        
            *   NOTE: It is also necessary to disable streaming, see above, 
        
            *         in order for disabling of pre-fetching to work correctly. 
        
            * @property {boolean} [pdfBug] - Enables special hooks for debugging 
        
            *   PDF.js (see `web/debugger.js`). The default value is `false`. 
        
            */

Furthermore, would it be possible to add a unit-test to ensure that this works (and continues to work) correctly?
A suitable place for the test might be in https://github.com/mozilla/pdf.js/blob/master/test/unit/custom_spec.js

Finally, please see https://github.com/mozilla/pdf.js/wiki/Squashing-Commits

src/display/api.js

src/display/font_loader.js

jsg2021 · 2020-07-28T15:07:49Z

I'll make those changes 😊

I was avoiding _document and ownerDocument to attempt to avoid confusion with the pdf document. though, that's not really referenced throughout the fontloader. I'll use those names. 😊

src/display/api.js

Snuffleupagus · 2020-07-28T19:38:53Z

src/display/api.js

+ * @property {HTMLDocument} [ownerDocument] - Specify an explicit document
+ *   context to create elements for and to load resouces into, such as
+ *   fonts. Defaults to the current document.


@timvandermeij Do you think this documentation is clear enough for users, or do you have any suggestions for e.g. improved wording here (or any other suggests about the overall implementation)?

test/unit/custom_spec.js

timvandermeij

Looks good to me aside from these comments.

test/unit/custom_spec.js

src/display/display_utils.js

test/unit/custom_spec.js

timvandermeij

Looks good to me with these final comments addressed. If @Snuffleupagus is also happy with it, we can merge it.

test/unit/custom_spec.js

Snuffleupagus

Looks OK to me, with the final comments addressed and all test passing.

src/display/api.js

src/display/display_utils.js

src/display/api.js

timvandermeij · 2020-07-31T22:36:13Z

/botio test

pdfjsbot · 2020-07-31T22:36:14Z

From: Bot.io (Linux m4)

Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://54.67.70.0:8877/df8ab94d3ef2257/output.txt

pdfjsbot · 2020-07-31T22:36:14Z

From: Bot.io (Windows)

Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://54.215.176.217:8877/27c8c6af555fd78/output.txt

pdfjsbot · 2020-07-31T23:03:02Z

From: Bot.io (Linux m4)

Failed

Full output at http://54.67.70.0:8877/df8ab94d3ef2257/output.txt

Total script time: 26.79 mins

Font tests: Passed
Unit tests: Passed
Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/df8ab94d3ef2257/reftest-analyzer.html#web=eq.log

pdfjsbot · 2020-07-31T23:05:31Z

From: Bot.io (Windows)

Failed

Full output at http://54.215.176.217:8877/27c8c6af555fd78/output.txt

Total script time: 29.27 mins

Font tests: Passed
Unit tests: Passed
Regression tests: FAILED

Image differences available at: http://54.215.176.217:8877/27c8c6af555fd78/reftest-analyzer.html#web=eq.log

timvandermeij · 2020-07-31T23:14:02Z

Thank you for your contribution! (The reference tests failures are unrelated intermittents.)

jsg2021 · 2020-08-01T00:45:04Z

awesome 😊

Snuffleupagus · 2020-08-01T10:49:31Z

test/unit/custom_spec.js

@@ -107,3 +107,119 @@ describe("custom canvas rendering", function () {
      .catch(done.fail);
  });
 });
+
+describe("alternate document context", function () {
+  const FontFace = global.FontFace;


This is completely wrong, since global isn't actually defined other than in Node.js and these unit-tests now fail in browsers with ReferenceError: global is not defined ...

Fixing that locally, however, still causes these tests to fail, now with TypeError: canvas.getContext is not a function ... in browsers.

When reviewing this, I simply assumed that the new unit-tests were actually tested and thus working, but looking at them again I'm no longer sure that they're doing the right thing in general unfortunately. (E.g. the ownerDocument probably needs to be a valid HTMLDocument-instance, rather than a regular object.)

It's strange that this didn't cause the bots to fail, but it's clear from the logs that these new tests didn't even run. Unfortunately I cannot see an immediate fix for the unit-tests, and just removing the unit-tests seem like the wrong approach.

@timvandermeij Can you please back out this patch, since this shouldn't have landed as-is.

This patch is now reverted. It's indeed really not great that Jasmine doesn't even run the test and therefore the bots think all is fine. I'll try to find out why that happens since we want to get a failed test if that ever happens.

timvandermeij added the core label Jul 27, 2020

Snuffleupagus requested changes Jul 28, 2020

View reviewed changes

src/display/api.js Outdated Show resolved Hide resolved

src/display/font_loader.js Outdated Show resolved Hide resolved

Snuffleupagus requested changes Jul 28, 2020

View reviewed changes

timvandermeij reviewed Jul 28, 2020

View reviewed changes

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

This comment has been minimized.

Sign in to view

Snuffleupagus requested changes Jul 29, 2020

View reviewed changes

src/display/display_utils.js Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

This comment has been minimized.

Sign in to view

timvandermeij reviewed Jul 29, 2020

View reviewed changes

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

test/unit/custom_spec.js Outdated Show resolved Hide resolved

Snuffleupagus requested changes Jul 31, 2020

View reviewed changes

src/display/api.js Outdated Show resolved Hide resolved

src/display/display_utils.js Outdated Show resolved Hide resolved

src/display/api.js Outdated Show resolved Hide resolved

Allow loading pdf fonts into another document.

9b16b8e

timvandermeij changed the title ~~Allow loading pdf fonts into another document.~~ [api-minor] Allow loading pdf fonts into another document. Jul 31, 2020

timvandermeij approved these changes Jul 31, 2020

View reviewed changes

timvandermeij merged commit 173b92a into mozilla:master Jul 31, 2020

jsg2021 deleted the issue-8271 branch August 1, 2020 00:45

Snuffleupagus reviewed Aug 1, 2020

View reviewed changes

Snuffleupagus mentioned this pull request Aug 1, 2020

Revert "[api-minor] Allow loading pdf fonts into another document." #12151

Merged

timvandermeij mentioned this pull request Aug 1, 2020

Fix pull request #12131 so it can be merged again #12152

Closed

jsg2021 restored the issue-8271 branch August 1, 2020 17:54

jsg2021 mentioned this pull request Aug 1, 2020

[api-minor] Allow loading pdf fonts into another document. #12154

Merged

hennadiipro pushed a commit to hennadiipro/react-pdf-highlighter that referenced this pull request Jan 13, 2024

Support rendering to alternet windowSee: mozilla/pdf.js#12131

28ba33f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[api-minor] Allow loading pdf fonts into another document. #12131

[api-minor] Allow loading pdf fonts into another document. #12131

jsg2021 commented Jul 27, 2020

Snuffleupagus left a comment

jsg2021 commented Jul 28, 2020

Snuffleupagus Jul 28, 2020

timvandermeij left a comment

This comment has been minimized.

This comment has been minimized.

timvandermeij left a comment

Snuffleupagus left a comment

timvandermeij commented Jul 31, 2020

pdfjsbot commented Jul 31, 2020

pdfjsbot commented Jul 31, 2020

pdfjsbot commented Jul 31, 2020

pdfjsbot commented Jul 31, 2020

timvandermeij commented Jul 31, 2020

jsg2021 commented Aug 1, 2020

Snuffleupagus Aug 1, 2020

timvandermeij Aug 1, 2020

	/**
	* Document initialization / loading parameters object.
	*
	* @typedef {Object} DocumentInitParameters
	* @property {string} [url] - The URL of the PDF.
	* @property {TypedArray\|Array\|string} [data] - Binary PDF data. Use typed
	* arrays (Uint8Array) to improve the memory usage. If PDF data is
	* BASE64-encoded, use atob() to convert it to a binary string first.
	* @property {Object} [httpHeaders] - Basic authentication headers.
	* @property {boolean} [withCredentials] - Indicates whether or not
	* cross-site Access-Control requests should be made using credentials such
	* as cookies or authorization headers. The default is false.
	* @property {string} [password] - For decrypting password-protected PDFs.
	* @property {TypedArray} [initialData] - A typed array with the first portion
	* or all of the pdf data. Used by the extension since some data is already
	* loaded before the switch to range requests.
	* @property {number} [length] - The PDF file length. It's used for
	* progress reports and range requests operations.
	* @property {PDFDataRangeTransport} [range]
	* @property {number} [rangeChunkSize] - Specify maximum number of bytes
	* fetched per range request. The default value is 2^16 = 65536.
	* @property {PDFWorker} [worker] - The worker that will be used for
	* the loading and parsing of the PDF data.
	* @property {number} [verbosity] - Controls the logging level; the
	* constants from {VerbosityLevel} should be used.
	* @property {string} [docBaseUrl] - The base URL of the document,
	* used when attempting to recover valid absolute URLs for annotations, and
	* outline items, that (incorrectly) only specify relative URLs.
	* @property {string} [cMapUrl] - The URL where the predefined
	* Adobe CMaps are located. Include trailing slash.
	* @property {boolean} [cMapPacked] - Specifies if the Adobe CMaps are
	* binary packed.
	* @property {Object} [CMapReaderFactory] - The factory that will be
	* used when reading built-in CMap files. Providing a custom factory is useful
	* for environments without `XMLHttpRequest` support, such as e.g. Node.js.
	* The default value is {DOMCMapReaderFactory}.
	* @property {boolean} [stopAtErrors] - Reject certain promises, e.g.
	* `getOperatorList`, `getTextContent`, and `RenderTask`, when the associated
	* PDF data cannot be successfully parsed, instead of attempting to recover
	* whatever possible of the data. The default value is `false`.
	* @property {number} [maxImageSize] - The maximum allowed image size
	* in total pixels, i.e. width * height. Images above this value will not be
	* rendered. Use -1 for no limit, which is also the default value.
	* @property {boolean} [isEvalSupported] - Determines if we can eval
	* strings as JS. Primarily used to improve performance of font rendering,
	* and when parsing PDF functions. The default value is `true`.
	* @property {boolean} [disableFontFace] - By default fonts are
	* converted to OpenType fonts and loaded via font face rules. If disabled,
	* fonts will be rendered using a built-in font renderer that constructs the
	* glyphs with primitive path commands. The default value is `false`.
	* @property {boolean} [fontExtraProperties] - Include additional properties,
	* which are unused during rendering of PDF documents, when exporting the
	* parsed font data from the worker-thread. This may be useful for debugging
	* purposes (and backwards compatibility), but note that it will lead to
	* increased memory usage. The default value is `false`.
	* @property {boolean} [disableRange] - Disable range request loading
	* of PDF files. When enabled, and if the server supports partial content
	* requests, then the PDF will be fetched in chunks.
	* The default value is `false`.
	* @property {boolean} [disableStream] - Disable streaming of PDF file
	* data. By default PDF.js attempts to load PDFs in chunks.
	* The default value is `false`.
	* @property {boolean} [disableAutoFetch] - Disable pre-fetching of PDF
	* file data. When range requests are enabled PDF.js will automatically keep
	* fetching more data even if it isn't needed to display the current page.
	* The default value is `false`.
	* NOTE: It is also necessary to disable streaming, see above,
	* in order for disabling of pre-fetching to work correctly.
	* @property {boolean} [pdfBug] - Enables special hooks for debugging
	* PDF.js (see `web/debugger.js`). The default value is `false`.
	*/

[api-minor] Allow loading pdf fonts into another document. #12131

[api-minor] Allow loading pdf fonts into another document. #12131

Conversation

jsg2021 commented Jul 27, 2020

Snuffleupagus left a comment

Choose a reason for hiding this comment

jsg2021 commented Jul 28, 2020

Snuffleupagus Jul 28, 2020

Choose a reason for hiding this comment

timvandermeij left a comment

Choose a reason for hiding this comment

This comment has been minimized.

This comment has been minimized.

timvandermeij left a comment

Choose a reason for hiding this comment

Snuffleupagus left a comment

Choose a reason for hiding this comment

timvandermeij commented Jul 31, 2020

pdfjsbot commented Jul 31, 2020

From: Bot.io (Linux m4)

Received

pdfjsbot commented Jul 31, 2020

From: Bot.io (Windows)

Received

pdfjsbot commented Jul 31, 2020

From: Bot.io (Linux m4)

Failed

pdfjsbot commented Jul 31, 2020

From: Bot.io (Windows)

Failed

timvandermeij commented Jul 31, 2020

jsg2021 commented Aug 1, 2020

Snuffleupagus Aug 1, 2020

Choose a reason for hiding this comment

timvandermeij Aug 1, 2020

Choose a reason for hiding this comment