A Google Chrome extension that runs an LLM in the extension background service worker to support the Magda web application frontend chatbot.
Thanks to the MLC LLM & WebLLM projects. By leveraging the WebGPU API, this extension enables web applications to run an LLM engine within the web browser.
On request from the web application frontend code, the extension downloads the requested LLM model (around a 2GB download, depending on the model), caches it locally, and runs it in an extension background service worker. The service worker only remains "active" while web pages that require it remain open.
The extension has not been published to Google Chrome Web Store yet.
To install, you need to follow the manual steps below:
- 1> Download a release version from the Releases page of this repo, e.g. `magda-llm-service-worker-extension-v1.0.0.zip`
- 2> Unzip the downloaded zip file
- 3> Open the URL `chrome://extensions` in a Chrome tab to open the Chrome extension management UI.
- 4> Tick the "Developer mode" toggle in the top-right corner.
- 5> Click the "Load unpacked" button in the top-left corner and select the previously downloaded & unzipped folder.
If you want to restrict LLM model access to specified domains, modify the `manifest.json` in the unzipped folder after step 2 above, as shown in the sketch below. Set the `externally_connectable.matches` array to your preferred domain(s), e.g. `"matches": ["https://*.mydomain.com/*"]`. See here for more details.
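For reference, the relevant portion of `manifest.json` would then look roughly like the sketch below; the domain pattern is an illustrative placeholder and should be replaced with your own domain(s):

```json
{
  "externally_connectable": {
    "matches": ["https://*.mydomain.com/*"]
  }
}
```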
- Install Node.js & yarn
- Run `yarn install` at the project root
- Run `yarn build` to build. You can find the built files in the `dist` folder.
The WebGPU API can be accessed from web pages directly without the help of extensions. However, in Google Chrome, when the setting "Clear cookies and site data when you close all windows" is selected, there is only around 300MB of local storage available, which is not sufficient to store a 7B LLM model locally. Moreover, on organisation-managed devices (e.g. company laptops), this setting is often turned on and cannot be changed by the user. See this issue for more details.
As we can request the `unlimitedStorage` permission via this extension, the LLM model can always be downloaded and cached locally without requiring users to change any browser settings.
When you install this extension, it will request the following permissions:
- `"unlimitedStorage"`: allows the extension to save the LLM model locally so that it doesn't need to be re-downloaded after the web page is closed.
  - The cached local data will be removed when you remove this extension from Google Chrome.
- `externally_connectable`: allows web applications to connect to the extension background service worker from a web page via long-lived connections (see the sketch after this list).
  - By default, the extension's `manifest.json` allows connections from any domain. You can modify the `manifest.json` to restrict access for your own project. See the "How to install" section for more details.
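As a rough illustration of the long-lived connection approach, a web page on an allowed domain could open a port to the extension roughly as follows. This is a minimal sketch only: the extension ID, port name, and message payload below are illustrative placeholders, not the actual protocol used by the Magda frontend.

```typescript
// Runs in the web page (not in the extension), on a domain allowed by
// externally_connectable.matches. Requires the Chrome API typings
// (e.g. @types/chrome) to compile.

// Placeholder: substitute the real ID of the installed extension.
const EXTENSION_ID = "abcdefghijklmnopabcdefghijklmnop";

// Open a long-lived connection to the extension's background service worker.
const port = chrome.runtime.connect(EXTENSION_ID, { name: "llm" });

// Listen for messages coming back from the service worker.
port.onMessage.addListener((message) => {
  console.log("Message from extension service worker:", message);
});

// Hypothetical request payload; the actual message format is defined by
// the extension and the Magda frontend code.
port.postMessage({ type: "ping" });
```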
Please find the list of supported LLM models here: https://mlc.ai/models