Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Editor] Implement the new alt text flow (bug 1909604) #18492

Merged
merged 1 commit into from
Jul 30, 2024

Conversation

calixteman
Copy link
Contributor

For the Firefox pdf viewer, we want to use AI to guess an alt-text when adding an image to a pdf. For now the telemtry stuff is not implemented and will come soon.
In order to test it locally:

  • set enableAltText, enableFakeMLManager and enableUpdatedAddImage to true.
    or in Firefox:
  • set browser.ml.enable, pdfjs.enableAltText and pdfjs.enableUpdatedAddImage to true.

@calixteman calixteman requested a review from a team as a code owner July 24, 2024 13:31
@calixteman calixteman requested review from Snuffleupagus and removed request for a team July 24, 2024 13:31
@calixteman
Copy link
Contributor Author

/botio-linux preview

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_preview from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/f773a2bf93be13e/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/f773a2bf93be13e/output.txt

Total script time: 1.05 mins

Published

@calixteman calixteman requested a review from a team July 24, 2024 13:38
l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
@calixteman calixteman force-pushed the new_stamp_dialog branch 8 times, most recently from 64cc39b to d83aa2a Compare July 26, 2024 08:09
@Snuffleupagus
Copy link
Collaborator

Snuffleupagus commented Jul 26, 2024

Given the sheer size of this PR, it'll take some time for me to properly review all of this. Leaving some high level observations and questions for now:

  • Since we have an upcoming release (maybe this weekend), this should not land just before that anyway.

  • There's been some changes/updates of the PR since it's opened, and I'd like to know that the patch has "stabilized" before doing a full review.

  • These changes mean that we now have two separate implementations for the altText-handling in the UI, what's the eventual plan for consolidating this functionality?
    Will the new dialog be usable in the GENERIC viewer, but obviously with AI generation switched off there, since it'd seem quite unfortunate if we'd need to maintain two separate implementations going forward?

  • The FakeMLManager-code should probably be limited, using the pre-processor, to just development- and testing-mode.

  • Looking at the preview above it no longer seems possible to mark an image as decorative, is that intended?

  • Will the COMPONENTS-build work correctly with the new image-resources added here, note this code?

  • There seems to be a little bit more leading padding before the icon in the inline toolbar now, is this intended since it leads to more empty space there (see the arrow in the screen-shot below)?

    padding

@calixteman
Copy link
Contributor Author

Given the sheer size of this PR, it'll take some time for me to properly review all of this. Leaving some high level observations and questions for now:

* Since we have an upcoming release (maybe this weekend), this should _not_ land just before that anyway.

The soft freeze in m-c is next week (thursday) and we really want it asap.
The integration tests are passing locally so hopefully nothing is broken.
That said if it causes a problem for the release, I'd prefer we postpone the release itself.

* There's been some changes/updates of the PR since it's opened, and I'd like to know that the patch has "stabilized" before doing a full review.

It should be stabilized now.

* These changes mean that we now have _two separate_ implementations for the altText-handling in the UI, what's the eventual plan for consolidating this functionality?

We want to experiment the new flow + the ai stuff in the next release and we'll decide what we want to do then.

  Will the new dialog be usable in the GENERIC viewer, but obviously with AI generation switched off there, since it'd seem quite unfortunate if we'd need to maintain two separate implementations going forward?

Whatever we decide, I'll do the cleanup, but for now we must live with I guess. But for sure, we (/I) don't want to have a duplicated flow...

* The `FakeMLManager`-code should probably be limited, using the pre-processor, to just development- and testing-mode.

Fixed.

* Looking at the preview above it no longer seems possible to mark an image as `decorative`, is that intended?

Yes it's.

* Will the COMPONENTS-build work correctly with the new image-resources added here, note [this code](https://github.com/mozilla/pdf.js/blob/2efa3e460c94ad6ad48541e2ec44af6e22153bda/gulpfile.mjs#L1095-L1102)?

Fixed.

* There seems to be a little bit more leading padding before the icon in the inline toolbar now, is this intended since it leads to more empty space there (see the arrow in the screen-shot below)?

Fixed.

@Snuffleupagus
Copy link
Collaborator

The soft freeze in m-c is next week (thursday) and we really want it asap.

Well, I hope you understand that I cannot review well over a thousand lines of new code immediately (and I'm still not at a 100% unfortunately).
Especially since a lot of this is stuff is asynchronous, which requires more time to consider how the various pieces interact.

One quickly observed bug, related to the asynchronicity of the code:

  1. Open the preview in [Editor] Implement the new alt text flow (bug 1909604) #18492 (comment).
  2. Add a new Stamp-editor.
  3. Immediately close the altText dialog and delete the editor.

In the console the following is printed (after a short wait):

TypeError: can't access property "guessedAltText", this[#currentEditor] is null
    #mlGuessAltText new_alt_text_manager.js:248
new_alt_text_manager.js:257:14

@calixteman
Copy link
Contributor Author

The soft freeze in m-c is next week (thursday) and we really want it asap.

Well, I hope you understand that I cannot review well over a thousand lines of new code immediately (and I'm still not at a 100% unfortunately). Especially since a lot of this is stuff is asynchronous, which requires more time to consider how the various pieces interact.

I completely understand your point, but we want to make the experiment in 130.
If you see something obviously wrong I'll fix it immediately, but if you don't notice any bad things, we could merge it and fix the issues we may discover (especially when I'll begin to write tests or if you find them in playing with this new UI) in follow-up patches. Of course, if we discover something bad we can't fix easily, we won't ship it and we can back out it.
I'm a bit in a rush but I really understand you aren't and I don't want to rush you...
And I've few other things to implement... (telemetry and image settings mostly).

One quickly observed bug, related to the asynchronicity of the code:

1. Open the preview in [[Editor] Implement the new alt text flow (bug 1909604) #18492 (comment)](https://github.com/mozilla/pdf.js/pull/18492#issuecomment-2247947991).

2. Add a new Stamp-editor.

3. Immediately close the altText dialog _and_ delete the editor.

In the console the following is printed (after a short wait):

TypeError: can't access property "guessedAltText", this[#currentEditor] is null
    #mlGuessAltText new_alt_text_manager.js:248
new_alt_text_manager.js:257:14

I fixed it two days ago.
I'll push a new version, with the enabled options in order to have a new viewer, and I'll set the options back, once we're more or less happy with patch.

@calixteman
Copy link
Contributor Author

/botio-linux preview

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_preview from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/e9a1e6bcba1d50c/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/e9a1e6bcba1d50c/output.txt

Total script time: 1.08 mins

Published

@timvandermeij timvandermeij self-requested a review July 26, 2024 16:38
@timvandermeij
Copy link
Contributor

/botio-windows preview

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://54.193.163.58:8877/4fcb10597d5cda9/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Success

Full output at http://54.193.163.58:8877/4fcb10597d5cda9/output.txt

Total script time: 6.03 mins

Published

Copy link
Collaborator

@Snuffleupagus Snuffleupagus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation in NewAltTextManager feels a bit complicated, based on an initial look, but maybe it's not really possible to simplify it?

l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
l10n/en-US/viewer.ftl Show resolved Hide resolved
src/display/editor/alt_text.js Outdated Show resolved Hide resolved
src/display/editor/alt_text.js Outdated Show resolved Hide resolved
web/viewer.html Outdated Show resolved Hide resolved
web/new_alt_text_manager.js Outdated Show resolved Hide resolved
web/new_alt_text_manager.js Outdated Show resolved Hide resolved
web/new_alt_text_manager.js Outdated Show resolved Hide resolved
web/firefoxcom.js Outdated Show resolved Hide resolved
Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall. The comments below are everything I've found going over the entire patch, but I'd like to see the updated patch and play with the preview a bit more before approving this.

l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
l10n/en-US/viewer.ftl Outdated Show resolved Hide resolved
src/display/editor/alt_text.js Outdated Show resolved Hide resolved
src/display/editor/alt_text.js Outdated Show resolved Hide resolved
web/viewer.html Outdated Show resolved Hide resolved
web/viewer.html Outdated Show resolved Hide resolved
web/viewer.html Outdated Show resolved Hide resolved
web/viewer.html Outdated Show resolved Hide resolved
web/viewer.html Outdated Show resolved Hide resolved
@calixteman calixteman force-pushed the new_stamp_dialog branch 2 times, most recently from 34630b6 to c989111 Compare July 27, 2024 21:08
@timvandermeij
Copy link
Contributor

/botio-linux preview

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 1

Live output at: http://54.241.84.105:8877/a29b2029c3a97fd/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/a29b2029c3a97fd/output.txt

Total script time: 1.05 mins

Published

Copy link
Contributor

@timvandermeij timvandermeij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, with:

  • the comments below and the two open tracking comments above addressed;
  • passing CI and tests;
  • a follow-up issue to create integration tests for the new alt text flow because currently we only test the current/old flow (note that we can wait with that until the new alt text flow is accepted and it's combined with removing the current/old flow).

src/display/editor/alt_text.js Outdated Show resolved Hide resolved
web/new_alt_text_manager.js Show resolved Hide resolved
web/new_alt_text_manager.js Show resolved Hide resolved
@calixteman
Copy link
Contributor Author

/botio integrationtest

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_integrationtest from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/7e80beda939a6ac/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_integrationtest from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/5dcbb8e369b3d69/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/7e80beda939a6ac/output.txt

Total script time: 10.17 mins

  • Integration Tests: FAILED

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/5dcbb8e369b3d69/output.txt

Total script time: 22.65 mins

  • Integration Tests: FAILED

For the Firefox pdf viewer, we want to use AI to guess an alt-text when adding an image to a pdf.
For now the telemtry stuff is not implemented and will come soon.
In order to test it locally:
 - set enableAltText, enableFakeMLManager and enableUpdatedAddImage to true.
or in Firefox:
 - set browser.ml.enable, pdfjs.enableAltText and pdfjs.enableUpdatedAddImage to true.
@calixteman
Copy link
Contributor Author

/botio integrationtest

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_integrationtest from @calixteman received. Current queue size: 0

Live output at: http://54.241.84.105:8877/2a72847ce482545/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_integrationtest from @calixteman received. Current queue size: 0

Live output at: http://54.193.163.58:8877/40c2572b4b1f093/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/2a72847ce482545/output.txt

Total script time: 8.43 mins

  • Integration Tests: Passed

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/40c2572b4b1f093/output.txt

Total script time: 20.06 mins

  • Integration Tests: FAILED

@calixteman
Copy link
Contributor Author

@Snuffleupagus is there anything blocking ?

@calixteman calixteman merged commit 8f45374 into mozilla:master Jul 30, 2024
9 checks passed
@calixteman calixteman deleted the new_stamp_dialog branch July 30, 2024 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants