Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Annotation] Add a div containing the text of a FreeText annotation (bug 1780375) #15267

Merged
merged 1 commit into from
Aug 4, 2022

Conversation

calixteman
Copy link
Contributor

An annotation doesn't have to be in the text flow, hence it's likely a bad idea
to insert its text in the text layer. But the text must be visible from a screen
reader point of view so it must somewhere in the DOM.
So with this patch, the text from a FreeText annotation is extracted and added in
a div in its HTML counterpart, and with the patch #15237 the text should be visible
and positioned relatively to the text flow.

includeMarkedContent: true,
combineTextItems: true,
sink,
viewBox: [-Infinity, -Infinity, Infinity, Infinity],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it simply not matter what these values are, or do you perhaps want to pass in the view-property (on the Page-instance) similar to how the "normal" text-extraction does it?

Comment on lines 583 to 594
const partialEvaluator = new PartialEvaluator({
xref: this.xref,
handler,
pageIndex: this.pageIndex,
idFactory: this._localIdFactory,
fontCache: this.fontCache,
builtInCMapCache: this.builtInCMapCache,
standardFontDataCache: this.standardFontDataCache,
globalImageCache: this.globalImageCache,
options: this.evaluatorOptions,
});
const textContentPromises = [];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this down to after the length === 0 early return, to avoid unconditionally initializing this data.

@@ -578,8 +578,20 @@ class Page {
return tree;
}

getAnnotationsData(intent) {
return this._parsedAnnotations.then(function (annotations) {
getAnnotationsData(handler, task, intent) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that this is pre-existing code, but would things become simpler if you change this method to be async instead?

textContentPromises.push(
annotation.extractTextContent(partialEvaluator, task).then(text => {
if (text) {
annotation.data.text = text;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work, and if so make sense, to instead have the extractTextContent-method itself assign the data.text-property when a non-empty string exists?

@Snuffleupagus
Copy link
Collaborator

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/428131fe9a90327/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/fa9ce29dd7eb747/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/fa9ce29dd7eb747/output.txt

Total script time: 25.83 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 14

Image differences available at: http://54.241.84.105:8877/fa9ce29dd7eb747/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/428131fe9a90327/output.txt

Total script time: 29.22 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 6

Image differences available at: http://54.193.163.58:8877/428131fe9a90327/reftest-analyzer.html#web=eq.log

@calixteman
Copy link
Contributor Author

My test is not ideal... There isn't enough transparency, I'll fix.

}

if (text.length > 0) {
this.data.text = text;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurred to me that this property name is very generic, so how about the following instead?

Suggested change
this.data.text = text;
this.data.textContent = text;

@@ -941,6 +941,56 @@ class Annotation {
return null;
}

get hasSomeText() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, with the new property name:

Suggested change
get hasSomeText() {
get hasTextContent() {

) {
annotationsData.push(annotation.data);
}
if (intentAny || (intentDisplay && annotation.viewable)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce the amount of parsing and asynchronicity in this method, since most Annotations don't have textContent, how about we limit this to only the affected Annotations instead?

Suggested change
if (intentAny || (intentDisplay && annotation.viewable)) {
if (
annotation.hasTextContent &&
(intentAny || (intentDisplay && annotation.viewable))
) {

}
if (intentAny || (intentDisplay && annotation.viewable)) {
textContentPromises.push(
annotation.extractTextContent(partialEvaluator, task, this.view)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to ensure that a possible Error in the new method won't break all Annotation-parsing here, so something like the following is probably necessary.

Suggested change
annotation.extractTextContent(partialEvaluator, task, this.view)
annotation
.extractTextContent(partialEvaluator, task, this.view)
.catch(function (reason) {
warn(
`getAnnotationsData - ignoring textContent during "${task.name}" task: "${reason}".`
);
})

Comment on lines +279 to +280
opacity: 0;
color: transparent;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that this is necessary for the a11y-engine to actually "find" the elements, and that e.g. display: none; doesn't work; is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is my understanding of:
#15237 (comment)

In the future, we could create a specific text layer for the annotation and then make the text selectable, highlightable and then searchable.

@calixteman
Copy link
Contributor Author

My test is not ideal... There isn't enough transparency, I'll fix.

In fact it isn't just a matter of taste but then rendering on the canvas was broken.
I had to add:
https://github.com/mozilla/pdf.js/pull/15267/files#diff-a7e409abd3d3209d32d3adc8457c5e23feb4c06113a2e065b0c6a9322a3a3369R984

in order to have a correct rendering.

@Snuffleupagus
Copy link
Collaborator

In fact it isn't just a matter of taste but then rendering on the canvas was broken. I had to add: https://github.com/mozilla/pdf.js/pull/15267/files#diff-a7e409abd3d3209d32d3adc8457c5e23feb4c06113a2e065b0c6a9322a3a3369R984

in order to have a correct rendering.

Oh, that makes a lot of sense!


What happened to the test/test_manifest.json file?

sink,
viewBox,
});
this.appearance.reset();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the rest of this file, don't you actually want this.reset() here?

@calixteman
Copy link
Contributor Author

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Windows)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/d6a1b1ce9b0f69c/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Linux m4)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/905f60bf2a3adf6/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/905f60bf2a3adf6/output.txt

Total script time: 26.00 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 11

Image differences available at: http://54.241.84.105:8877/905f60bf2a3adf6/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Aug 3, 2022

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/d6a1b1ce9b0f69c/output.txt

Total script time: 28.98 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 3

Image differences available at: http://54.193.163.58:8877/d6a1b1ce9b0f69c/reftest-analyzer.html#web=eq.log

Comment on lines 587 to 588
const partialEvaluator = new PartialEvaluator({
xref: this.xref,
handler,
pageIndex: this.pageIndex,
idFactory: this._localIdFactory,
fontCache: this.fontCache,
builtInCMapCache: this.builtInCMapCache,
standardFontDataCache: this.standardFontDataCache,
globalImageCache: this.globalImageCache,
options: this.evaluatorOptions,
});
const textContentPromises = [];
const annotationsData = [];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My final suggestion would be that we don't create a PartialEvaluator-instance unconditionally, since most pages won't actually need that:

Suggested change
const partialEvaluator = new PartialEvaluator({
xref: this.xref,
handler,
pageIndex: this.pageIndex,
idFactory: this._localIdFactory,
fontCache: this.fontCache,
builtInCMapCache: this.builtInCMapCache,
standardFontDataCache: this.standardFontDataCache,
globalImageCache: this.globalImageCache,
options: this.evaluatorOptions,
});
const textContentPromises = [];
const annotationsData = [];
const annotationsData = [],
textContentPromises = [];
let partialEvaluator;

annotationsData.push(annotation.data);
}
if (annotation.hasTextContent && isVisible) {
textContentPromises.push(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the comment above, this would become:

Suggested change
textContentPromises.push(
if (!partialEvaluator) {
partialEvaluator = new PartialEvaluator({
xref: this.xref,
handler,
pageIndex: this.pageIndex,
idFactory: this._localIdFactory,
fontCache: this.fontCache,
builtInCMapCache: this.builtInCMapCache,
standardFontDataCache: this.standardFontDataCache,
globalImageCache: this.globalImageCache,
options: this.evaluatorOptions,
});
}
textContentPromises.push(

…bug 1780375)

An annotation doesn't have to be in the text flow, hence it's likely a bad idea
to insert its text in the text layer. But the text must be visible from a screen
reader point of view so it must somewhere in the DOM.
So with this patch, the text from a FreeText annotation is extracted and added in
a div in its HTML counterpart, and with the patch mozilla#15237 the text should be visible
and positioned relatively to the text flow.
@calixteman
Copy link
Contributor Author

/botio test

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Linux m4)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/1f2e41b6c917329/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Windows)


Received

Command cmd_test from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/adc609ea372b366/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/1f2e41b6c917329/output.txt

Total script time: 25.96 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: Passed
  • Regression tests: FAILED
  different ref/snapshot: 13
  different first/second rendering: 1

Image differences available at: http://54.241.84.105:8877/1f2e41b6c917329/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/adc609ea372b366/output.txt

Total script time: 28.95 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 4

Image differences available at: http://54.193.163.58:8877/adc609ea372b366/reftest-analyzer.html#web=eq.log

@calixteman calixteman merged commit b985eaa into mozilla:master Aug 4, 2022
@calixteman
Copy link
Contributor Author

/botio makeref

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Linux m4)


Received

Command cmd_makeref from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/f37ab75057429d0/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Windows)


Received

Command cmd_makeref from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/03aadcccfd048a9/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/f37ab75057429d0/output.txt

Total script time: 22.09 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

pdfjsbot commented Aug 4, 2022

From: Bot.io (Windows)


Success

Full output at http://54.193.163.58:8877/03aadcccfd048a9/output.txt

Total script time: 22.80 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants