merge: local to remote
brewcoua committed Jun 10, 2024
2 parents 9c2af3b + 0c86ea1 commit 480a237
Showing 25 changed files with 927 additions and 1,392 deletions.
71 changes: 71 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,71 @@
name: CI

on:
  push:
    branches:
      - master
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Bun
        uses: oven-sh/setup-bun@v1

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Build project
        run: bun run build

  publish:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      contents: write # Publish GitHub Releases
      issues: write # Comment on issues
      pull-requests: write # Comment on PRs
      id-token: write # OIDC token for npm provenance
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Bun
        uses: oven-sh/setup-bun@v1
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: 'lts/iron'

      - name: Import GPG
        uses: crazy-max/ghaction-import-gpg@v6
        with:
          gpg_private_key: ${{ secrets.GPG_PRIVATE_KEY }}
          passphrase: ${{ secrets.GPG_PASSPHRASE }}
          git_user_signingkey: true
          git_commit_gpgsign: true

      - name: Install dependencies
        run: bun install --frozen-lockfile

      - name: Build project
        run: bun run build

      # Workaround for verifyConditions step of @semantic-release/npm
      - run: cp ../package.json .
        working-directory: dist

      - name: Release
        run: bunx semantic-release
        env:
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GIT_AUTHOR_NAME: ${{ github.actor }}
          GIT_AUTHOR_EMAIL: ${{ github.actor_id }}+${{ github.actor }}@users.noreply.github.com
          GIT_COMMITTER_NAME: brewcoua-bot
          GIT_COMMITTER_EMAIL: 151367391+brewcoua-bot@users.noreply.github.com
5 changes: 4 additions & 1 deletion .gitignore
@@ -1 +1,4 @@
-node_modules/
+node_modules/
+.rollup.cache
+*.tsbuildinfo
+dist/
39 changes: 39 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,39 @@
## [1.1.3](https://github.com/brewcoua/web-som/compare/v1.1.2...v1.1.3) (2024-06-09)


### Bug Fixes

* **ci:** remove signed push to avoid github failure ([8e2893e](https://github.com/brewcoua/web-som/commit/8e2893ec6c824647308870fcc44c0fdfc5c26a17))
* **ci:** remove tag sign since it forces EDITOR to be set ([d5d6cc9](https://github.com/brewcoua/web-som/commit/d5d6cc9b0d35bed2f0938001fe323caed3265ed5))
* **ci:** update author and committer & sign action commits ([842605a](https://github.com/brewcoua/web-som/commit/842605a65dc9c7eb07e8a48f1c62115c39fa9175))

## [1.1.2](https://github.com/brewcoua/web-som/compare/v1.1.1...v1.1.2) (2024-06-09)


### Bug Fixes

* **release:** add double exec to commit bumped version ([0cb3dcb](https://github.com/brewcoua/web-som/commit/0cb3dcbac61d0860698dac829fd346eba8c31b75))

## [1.1.1](https://github.com/brewcoua/web-som/compare/v1.1.0...v1.1.1) (2024-06-09)


### Bug Fixes

* **rollup:** update built format to UMD & fix usage info ([e027208](https://github.com/brewcoua/web-som/commit/e027208ef1f3f7ac069dddbcb9a55619c0360e49))

# [1.1.0](https://github.com/brewcoua/web-som/compare/v1.0.0...v1.1.0) (2024-06-08)


### Bug Fixes

* **visibility:** missing candidates from QT query & parent zIndex relationship for isAbove ([428a342](https://github.com/brewcoua/web-som/commit/428a3423b605dd7b0004eb9013dc965fd9abac58))


### Features

* **colors:** set box colors contrasted from surrounding colors ([a45bde1](https://github.com/brewcoua/web-som/commit/a45bde1d1f70f782b77d709b873d76e408a597e6))


### Performance Improvements

* use quadtree to optimize element mapping ([32660db](https://github.com/brewcoua/web-som/commit/32660dbda7394a0e8719f8a7ad10daaaadd7e710))
47 changes: 42 additions & 5 deletions README.md
@@ -1,20 +1,57 @@
-# `web-som`
+# `@brewcoua/web-som`

![NPM Version](https://img.shields.io/npm/v/%40brewcoua%2Fweb-som)
![NPM License](https://img.shields.io/npm/l/%40brewcoua%2Fweb-som)
![GitHub branch check runs](https://img.shields.io/github/check-runs/brewcoua/web-som/master)

A Set-of-Marks script for web grounding, suitable for web agent automation.
When using this script, the web page should not have any animations or dynamic content that could interfere with the
-script's operation.
+script's operation. Additionally, since the script relies on many promises, avoid running it with too few
+resources, as this could lead to a deadlock.

## Usage

-Include both the script (`SoM.js` or `SoM.min.js`) and the style (`SoM.css` or `SoM.min.css`) in the web page.
-You can then use the `SoM` object in the `window` object to interact with the script.
+Include the script in your web page (`SoM.js` or `SoM.min.js`), then call the `display` method on the `window.SoM` object.
+Alternatively, you can load the script from [unpkg](https://unpkg.com/):

```html
<!-- latest -->
<script src="https://unpkg.com/@brewcoua/web-som"></script>
<!-- specific version -->
<script src="https://unpkg.com/@brewcoua/web-som@1.0.0"></script>
```

### Example

```js
-window.SoM.display().then(() => console.log("Set-of-Marks displayed"));
+(async () => {
+  await window.SoM.display();
+  console.log('Set-of-Marks displayed');
+  const mark = window.SoM.resolve(4); // Resolve the mark labeled '4'
+  mark.click(); // Click the mark
+})();
```

### How it works

This is a step-by-step guide on how the script works:

#### 1. Elements loading

The script first queries all elements on the page (including those inside shadow roots) that match specific selectors (e.g. `a`, `button`, `input`; see [src/constants.ts](src/constants.ts)). It then walks through all elements on the page and collects those that display a pointer cursor. These go into a second list of clickable candidates, which are less likely to be genuine targets than the selector-matched elements.
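
The two-pass collection described above can be sketched roughly as follows. This is a minimal illustration on a simplified node model, not the package's code: `NodeLike`, `collectCandidates`, and the short `SELECTOR_TAGS` set are hypothetical names, and a real implementation would work on actual DOM elements (for instance with `querySelectorAll` and `getComputedStyle`).

```typescript
// Simplified stand-in for a DOM element (hypothetical; not the package's types).
interface NodeLike {
  tag: string;
  cursor: string; // computed `cursor` style of the element
  children: NodeLike[];
  shadowChildren?: NodeLike[]; // children of an attached shadow root, if any
}

// Assumed subset of interactive tags; the real selectors live in src/constants.ts.
const SELECTOR_TAGS = new Set(["a", "button", "input"]);

function collectCandidates(root: NodeLike): {
  bySelector: NodeLike[];
  byCursor: NodeLike[];
} {
  const bySelector: NodeLike[] = [];
  const byCursor: NodeLike[] = [];
  const stack: NodeLike[] = [root];

  while (stack.length > 0) {
    const node = stack.pop()!;
    if (SELECTOR_TAGS.has(node.tag)) {
      bySelector.push(node); // first list: matched a known selector
    } else if (node.cursor === "pointer") {
      byCursor.push(node); // second list: clickable-looking, less reliable
    }
    // Descend into regular children and into the shadow root, if present.
    stack.push(...node.children, ...(node.shadowChildren ?? []));
  }
  return { bySelector, byCursor };
}
```

Keeping the two lists separate lets later steps treat selector matches as more trustworthy than pointer-cursor matches.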

#### 2. Elements filtering

The script then filters out, from both lists, the elements that are not visible enough (see [src/constants.ts](src/constants.ts) for the threshold values, e.g. `0.7`). It first uses an [Intersection Observer](https://developer.mozilla.org/en-US/docs/Web/API/Intersection_Observer_API) to check whether the element is visible enough in the viewport. If it is, the script queries a QuadTree, built beforehand from the bounding boxes of all elements on the page, for the elements that may intersect it. The original element is drawn on a canvas, followed by those candidate elements, and the pixel-by-pixel visibility ratio is the share of the element's pixels that were not covered by other elements. If the ratio is above the threshold, the element is considered visible enough.
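
The pixel-counting step can be illustrated with a small sketch. It makes several assumptions: it uses a plain byte mask in place of a canvas, it takes the occluding rectangles as already known (in the description above they come from the QuadTree query), and it expects the target's width and height to be whole pixels. `Rect` and `visibilityRatio` are hypothetical names, not the package's API.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Fraction of the target's pixels that no occluder covers.
// Assumes target.width and target.height are non-negative integers.
function visibilityRatio(target: Rect, occluders: Rect[]): number {
  const total = target.width * target.height;
  if (total <= 0) return 0;

  const covered = new Uint8Array(total); // 1 = pixel hidden by some occluder
  for (const o of occluders) {
    // Clip the occluder to the target, in target-local pixel coordinates.
    const x0 = Math.max(0, Math.floor(o.x - target.x));
    const y0 = Math.max(0, Math.floor(o.y - target.y));
    const x1 = Math.min(target.width, Math.ceil(o.x + o.width - target.x));
    const y1 = Math.min(target.height, Math.ceil(o.y + o.height - target.y));
    for (let y = y0; y < y1; y++) {
      for (let x = x0; x < x1; x++) {
        covered[y * target.width + x] = 1;
      }
    }
  }

  let hidden = 0;
  for (let i = 0; i < covered.length; i++) hidden += covered[i];
  return (total - hidden) / total;
}
```

An element would then be kept when `visibilityRatio(rect, occluders)` is above the configured threshold. Marking pixels in a mask, rather than summing overlap areas, avoids counting a pixel twice when occluders overlap each other.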

After that, the elements in the second list (the ones that display a pointer cursor) go through a nesting filter. This filter removes every element that is either inside a prioritized element (e.g. a button) or has too many clickable children. Additionally, elements are considered disjoint if their sizes differ enough (see [src/constants.ts](src/constants.ts) for the threshold value, e.g. `0.7`).
The first list is used as a reference when applying this filter, but no element is removed from it.
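
One possible reading of the size rule is an area-ratio test, sketched below. This is an assumption for illustration only: the README does not say how size difference is measured, and `Size` and `isDisjointBySize` are hypothetical names. The default `0.7` simply mirrors the example threshold mentioned above.

```typescript
interface Size {
  width: number;
  height: number;
}

// Treats two elements as disjoint when the smaller area is less than
// `threshold` times the larger one (assumed interpretation of the size rule).
function isDisjointBySize(a: Size, b: Size, threshold = 0.7): boolean {
  const areaA = a.width * a.height;
  const areaB = b.width * b.height;
  const larger = Math.max(areaA, areaB);
  if (larger === 0) return false; // two empty elements: nothing to compare
  return Math.min(areaA, areaB) / larger < threshold;
}
```

Under this reading, a small clickable icon inside a large clickable card would count as a separate target, while a wrapper that is almost the same size as its child would not.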

#### 3. Elements rendering

Finally, the script renders boxes over the elements that passed the filters. It first renders all the boxes, each with a contrasting color computed from the element's background color and the colors of all surrounding boxes (with a min/max luminance and a minimum saturation applied). It then renders a label for each box, placing it where it overlaps least with other labels and boxes, while ignoring any box that fully overlaps that label's box (since some buttons may be completely inside cards, for example). If an element is editable, its box has a striped pattern and a border to make it more visible.
All boxes ignore pointer events, so the user can still interact with the page.
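
The label placement idea can be sketched as a search over a few candidate positions, keeping the one with the least total overlap. The four candidates used here (above or below the box, left- or right-aligned) and the names `Box`, `overlapArea`, `containsBox`, and `pickLabelPosition` are assumptions made for this example; the package may consider different positions and costs.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Area of the intersection of two boxes (0 when they do not overlap).
function overlapArea(a: Box, b: Box): number {
  const w = Math.min(a.x + a.width, b.x + b.width) - Math.max(a.x, b.x);
  const h = Math.min(a.y + a.height, b.y + b.height) - Math.max(a.y, b.y);
  return w > 0 && h > 0 ? w * h : 0;
}

// True when `outer` fully covers `inner`.
function containsBox(outer: Box, inner: Box): boolean {
  return (
    outer.x <= inner.x &&
    outer.y <= inner.y &&
    outer.x + outer.width >= inner.x + inner.width &&
    outer.y + outer.height >= inner.y + inner.height
  );
}

function pickLabelPosition(
  box: Box,
  labelWidth: number,
  labelHeight: number,
  others: Box[],
): Box {
  // Ignore boxes that fully cover this one (e.g. a card around a button).
  const relevant = others.filter((o) => !containsBox(o, box));
  const right = box.x + box.width - labelWidth;
  const size = { width: labelWidth, height: labelHeight };
  const candidates: Box[] = [
    { x: box.x, y: box.y - labelHeight, ...size }, // above, left-aligned
    { x: right, y: box.y - labelHeight, ...size }, // above, right-aligned
    { x: box.x, y: box.y + box.height, ...size }, // below, left-aligned
    { x: right, y: box.y + box.height, ...size }, // below, right-aligned
  ];

  let best = candidates[0];
  let bestCost = Infinity;
  for (const candidate of candidates) {
    const cost = relevant.reduce((sum, o) => sum + overlapArea(candidate, o), 0);
    if (cost < bestCost) {
      best = candidate;
      bestCost = cost;
    }
  }
  return best; // ties keep the earliest candidate
}
```

Skipping fully covering boxes keeps a surrounding card from pushing the labels of every element inside it to the same side.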

## License

This project is licensed under either of the following, at your option:
42 changes: 0 additions & 42 deletions build.ts

This file was deleted.

Binary file modified bun.lockb
Binary file not shown.