Supporting Multithreading in Web Extensions #462

Open
wants to merge 33 commits into main

Conversation

BrennanBarker

Per discussion on microsoft/onnxruntime, I took a stab at implementing the extension example using an offscreen document in order to support multiple threads. There are a few serious drawbacks to this approach that I lay out below, including that the offscreen document seems to interfere with the popup window, so only the context menu workflow works in this PR. But I thought I'd post and give y'all a chance to review and think through next steps.

Overall, things turned out to be a bit more complicated than I thought:

  • While the initial issue with multithreading seemed to be that the background.js service worker couldn't support onnxruntime's calls to URL.createObjectURL, it turned out that the background service worker also won't allow the creation of new web Workers, which onnx spawns when entering multithreading mode. So even if we convinced microsoft/onnxruntime to avoid calls to URL.createObjectURL, it still wouldn't be possible to run multithreaded inference in the background.js service worker. Fortunately, the creation of Workers is another valid reason for using an offscreen document.
  • Even within an offscreen document, the way onnx creates new threads throws errors for violating Chrome extensions' Content Security Policy. To get around that I load and run the model in a sandboxed iframe within the offscreen page (a rough sketch of that relay is below).
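A rough sketch of that relay (file names and message shapes here are illustrative, not the literal code in this PR):

```js
// offscreen.js -- runs in the offscreen document; it only relays messages,
// since the model itself lives in the sandboxed iframe where extension CSP doesn't apply.
const sandbox = document.getElementById('sandbox'); // <iframe id="sandbox" src="sandbox.html">

chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
  if (message.target !== 'offscreen') return;
  // Forward the request into the sandbox...
  sandbox.contentWindow.postMessage({ type: 'classify', text: message.text }, '*');
  // ...and wait for the sandbox to post the result back.
  const onResult = (event) => {
    if (event.data?.type !== 'result') return;
    window.removeEventListener('message', onResult);
    sendResponse(event.data.output);
  };
  window.addEventListener('message', onResult);
  return true; // keep the message channel open for the async response
});
```

```js
// sandbox.js -- runs inside the sandboxed iframe, where onnxruntime-web is free
// to call URL.createObjectURL and spawn its worker threads.
import { pipeline, env } from '@xenova/transformers';

env.useBrowserCache = false; // the sandbox has no access to the Cache API

const classifierPromise = pipeline('text-classification');

window.addEventListener('message', async (event) => {
  if (event.data?.type !== 'classify') return;
  const classifier = await classifierPromise;
  const output = await classifier(event.data.text);
  event.source.postMessage({ type: 'result', output }, '*');
});
```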

This offscreen page+sandbox iframe strategy does seem to allow for multithreading, but it has a couple of undesirable properties that might mean needing to try something else entirely.

  • Offscreen documents don't really seem intended for staying open for long periods, whereas in this case the offscreen document needs to stay open pretty much indefinitely so that it can load the model once and have it available for multiple inferences. One very serious implication of having the offscreen document open all the time is that it seems to break the popup action -- at least I couldn't get the popup to appear after I had initialized the offscreen document. So this PR only works for the workflow where a user selects text and hits "classify" in the context window. One way forward here might be to see if the popup window (and potentially the offscreen document?) can be replaced by a Chrome Side Panel.
  • The sandbox does not have access to the Cache API, which I dealt with by setting env.useBrowserCache = false. There might be ways around this (see the sketch after this list) but I haven't given it much thought.
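For reference, the cache knobs in question look roughly like this (a sketch; the custom-cache part is just a possible direction, not something I've tried):

```js
// sandbox.js (cache configuration) -- sketch only.
import { env } from '@xenova/transformers';

// The sandboxed iframe has no access to the Cache API, so disable the
// built-in browser cache to avoid errors:
env.useBrowserCache = false;

// A possible workaround (untested): Transformers.js also accepts a custom
// cache object exposing match()/put(), which could proxy storage to a
// context that does have cache access.
// env.useCustomCache = true;
// env.customCache = {
//   async match(request) { /* look up a cached response elsewhere */ },
//   async put(request, response) { /* store it elsewhere */ },
// };
```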

Anyway, interested to hear your thoughts. If there's interest in exploring the SidePanel approach I'd be happy to take a look and post another PR.

@xenova
Collaborator

xenova commented Dec 19, 2023

Thanks for your investigation! I must admit, I also had quite a few issues when originally creating the web extension template, as there were a ton of hoops to jump through to get something working. 😅

Ideally, this would be solved by onnxruntime-web itself, since - as you point out - that bug has been present in the library for quite some time now (nearly a year). Though it would seem they are focusing more of their attention on WebGPU support.

I'd be interested in seeing what performance benefits you are able to achieve with this approach - and if the difference is substantial, it would be nice to create a good example for it.

@BrennanBarker
Author

I agree, it's a bummer. An inability to create new Workers in the background script--which was masked by the createObjectURL error--seems like something that will be even harder for the onnxruntime-web folks to work around; in that case we might be waiting on the folks at Chrome to allow service workers to create workers, which might never happen. And even then we'd have to worry about keeping the service worker alive on long-running calculations!

In any case, I'll try replicating the same basic functionality of the current extension example using a Side Panel, which I hope will end up much cleaner than the offscreen document setup.

And yes, when I'm done I'm happy to run some benchmarks to show to what extent (if any!) performance varies between a single thread, multithread-in-offscreen, and multithread-in-sidepanel. I can think of a few experiments to run, but if you have some suggested test models+datasets I'm happy to run those too.
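Something along these lines is what I had in mind for timing it (a rough sketch; the model is just the Transformers.js default for text classification, and the inputs are placeholders):

```js
// benchmark.js -- sketch: time the same pipeline at different wasm thread counts.
import { pipeline, env } from '@xenova/transformers';

async function msPerInput(numThreads, texts) {
  env.backends.onnx.wasm.numThreads = numThreads; // must be set before the session is created
  const classify = await pipeline(
    'text-classification',
    'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
  );
  const start = performance.now();
  for (const text of texts) await classify(text);
  return (performance.now() - start) / texts.length;
}

// e.g. compare 1 thread vs. all available cores:
// await msPerInput(1, samples);
// await msPerInput(navigator.hardwareConcurrency, samples);
```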

@lxfater

lxfater commented Dec 20, 2023

I've tried this architecture and it doesn't seem to affect my popup page. @BrennanBarker

@lxfater

lxfater commented Dec 20, 2023

I think chrome.offscreen.Reason.BLOBS and chrome.offscreen.Reason.WORKERS are not necessary, because onnxruntime mainly runs inside the iframe. My code isn't in production yet, so if you can, I'd like you to elaborate on why the offscreen document affects your popup page. @BrennanBarker

@BrennanBarker
Author

@lxfater if I'm understanding correctly, you're suggesting moving the inference code into a sandboxed iframe within the popup page? That might work, but since the popup page only stays open in the context of one tab/window (and closes when changing tabs or focusing on the content of the current tab), I don't think it would be possible to send data from the rest of the browser into the extension, as @xenova intended with the (right-click) context menu action. It might work for just the popup action, although I'd be worried that any click outside the popup would kill the popup page and require reloading the model. A Side Panel would work similarly, though it would be available to all pages, and likewise would not require an offscreen page at all; I'll submit a PR for this soon.

The reason the initial attempt here used an offscreen page was to have another way for the calculation to be done in a background process, which like a Side Panel could stay open across pages and be shared between tabs. I suppose it is unclear whether those offscreen.Reasons need to be given if the calls to URL.createObjectURL and new Worker are happening in a sandbox, but as I understand the offscreen documentation, one needs to supply some Reason for opening the offscreen page in the first place, and these are the two reasons we use it in this case.
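Concretely, the call looks roughly like this (a sketch from memory rather than the literal diff):

```js
// background.js -- create the offscreen document once, declaring why it exists.
async function ensureOffscreenDocument() {
  if (await chrome.offscreen.hasDocument()) return;
  await chrome.offscreen.createDocument({
    url: 'offscreen.html',
    reasons: [chrome.offscreen.Reason.BLOBS, chrome.offscreen.Reason.WORKERS],
    justification:
      'onnxruntime-web calls URL.createObjectURL and spawns Workers for multithreaded inference',
  });
}
```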

@lxfater

lxfater commented Dec 20, 2023

No, I think the code should be in the iframe of the offscreen document; I just think these two reasons are redundant.
This may invalidate your popup action. Probably. hahaha

@BrennanBarker
Author

I went ahead and reimplemented this example using the Chrome Sidepanel.

I was able to avoid the use of the offscreen document, and both example input workflows (context menu and directly via the sidepanel UI) now work correctly. I found the sidepanel had several nice UI advantages as well.

I still needed to make use of a sandboxed page running in an iframe on the sidepanel UI to handle the way ONNX loads code for additional worker threads. This pattern is suggested by the Chrome Extensions docs, and it seems to work fine, with the one complication that, as mentioned above, the sandbox does not have access to the browser cache API, so models need to be reloaded any time the sidepanel is closed. I suspect there might be a way around this using a custom cache (thanks for those!), but I still haven't thought about it much.
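For context, the side panel wiring is roughly the following (a sketch; actual file names and message shapes in the PR may differ):

```js
// background.js -- sketch of the side panel wiring.
// Open the side panel when the toolbar icon is clicked...
chrome.sidePanel.setPanelBehavior({ openPanelOnActionClick: true });

// ...and also when the user picks "Classify" from the context menu.
chrome.contextMenus.onClicked.addListener((info, tab) => {
  chrome.sidePanel.open({ windowId: tab.windowId });
  // Hand the selected text to the side panel UI, which forwards it into its
  // sandboxed iframe for inference. (In practice the panel should signal
  // readiness first, since it may still be loading at this point.)
  chrome.runtime.sendMessage({ type: 'classify', text: info.selectionText });
});
```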

As far as benchmarking -- I'm happy to take this on but might need some guidance as to what a good test case would be.
I think we should only expect to see a performance increase under certain kinds of inference loads, but being less familiar with how ONNX and Transformers.js take advantage of multiple threads under the hood, I don't have a good sense of when that would be. I'm also not sure if the (very old) hardware I happen to have access to at the moment is a good candidate for showcasing multithreading or running a benchmarking routine efficiently.

@BrennanBarker
Author

I also took the opportunity to implement the model loading progress bar, as that was slightly complicated and involved some message passing between the sidepanel UI and the sandbox. It works though!
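The gist of that message passing (a sketch, not the literal code):

```js
// sandbox.js -- report download progress out of the iframe while the model loads.
import { pipeline } from '@xenova/transformers';

const classifierPromise = pipeline('text-classification', null, {
  progress_callback: (p) => {
    // p includes fields like { status, file, progress } during the download.
    window.parent.postMessage({ type: 'progress', data: p }, '*');
  },
});
```

```js
// sidepanel.js -- update the <progress> element as updates arrive from the sandbox.
window.addEventListener('message', (event) => {
  if (event.data?.type !== 'progress') return;
  const bar = document.getElementById('progress-bar'); // <progress id="progress-bar" max="100">
  if (typeof event.data.data.progress === 'number') {
    bar.value = event.data.data.progress; // reported as a percentage
  }
});
```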

@xenova
Collaborator

xenova commented Jan 10, 2024

Great stuff @BrennanBarker! I'll do a full review soon, but in the meantime, would you mind moving your code to a separate example folder (instead of overwriting the existing example)? Perhaps named something like ./examples/extension-multithreaded?

@lxfater

lxfater commented Jan 17, 2024

SharedArrayBuffer does not work in a sandboxed page!! @BrennanBarker
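If that is because the sandboxed page is not cross-origin isolated, one defensive option (a sketch, not tested) is to check for it and fall back to a single thread so inference still works:

```js
// sandbox.js -- sketch: fall back to single-threaded wasm when SharedArrayBuffer
// is unavailable (wasm multithreading requires a cross-origin isolated context).
import { env } from '@xenova/transformers';

if (!self.crossOriginIsolated || typeof SharedArrayBuffer === 'undefined') {
  env.backends.onnx.wasm.numThreads = 1;
}
```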
