Remove latest in cron job #18

Merged on Feb 11, 2024 (16 commits)
38 changes: 7 additions & 31 deletions .github/workflows/selenium-web.yml
@@ -1,6 +1,7 @@
name: Aliexpress Product Checker

on:
pull_request:
workflow_dispatch:
schedule:
- cron: 0 9 * * 1 # 9AM every monday
@@ -9,43 +10,18 @@ jobs:
check-aliexpress-links:
runs-on: ubuntu-latest

container:
image: node:19

services:
selenium:
image: selenium/standalone-firefox
options: --shm-size=2gb

steps:
- uses: actions/checkout@v4

- uses: actions/setup-node@v3
with:
node-version: 19

- uses: abhi1693/setup-browser@v0.3.4
with:
browser: firefox
version: latest
node-version: 18

# Install some selenium dependencies...
- run: |
apt-get update -y &&
apt-get install --no-install-recommends --no-install-suggests -y tzdata ca-certificates bzip2 curl wget libc-dev libxt6 &&
apt-get install --no-install-recommends --no-install-suggests -y `apt-cache depends firefox-esr | awk '/Depends:/{print$2}'` &&
update-ca-certificates &&

# Cleanup unnecessary stuff
apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false &&
rm -rf /var/lib/apt/lists/* /tmp/*

# Install some more selenium dependencies...
- run: |
wget https://github.com/mozilla/geckodriver/releases/download/v0.31.0/geckodriver-v0.31.0-linux64.tar.gz &&
tar -zxf geckodriver-v0.31.0-linux64.tar.gz -C /usr/local/bin &&
chmod +x /usr/local/bin/geckodriver &&
rm geckodriver-v0.31.0-linux64.tar.gz
# - run: sudo apt-get install firefox

- run: npm i selenium-webdriver

- run: npx node .github/workflows/test-ali-links.mjs
env:
DISPLAY: :0
MOZ_HEADLESS: 1
19 changes: 15 additions & 4 deletions .github/workflows/test-ali-links.mjs
@@ -1,6 +1,6 @@
import { readFile } from 'fs/promises';
import { By, Builder } from 'selenium-webdriver';
-import firefox from 'selenium-webdriver/firefox.js'
+import Firefox from 'selenium-webdriver/firefox.js'

const links = [];
let driver;
@@ -14,30 +14,41 @@ const content = async(file) => {

const search = async (item) => {
await driver.get('https://s.click.aliexpress.com/e/_' + item);
-return await driver.findElements(By.className('not-found-page'))
+let notFound = await driver.findElements(By.className('not-found-page')) // Product not found
+let homepage = await driver.findElements(By.className('new-affiliate')) // Link points to homepage
+const results = await Promise.all([notFound, homepage])
+
+return results.flat(Infinity);
}

content(bomDocument)
.then(
result => {
const sanitizedMD = result.split('/_')
-sanitizedMD.shift(); // remove stuff that doesn't include links
+// remove stuff that doesn't include links
+sanitizedMD.shift();
sanitizedMD.forEach((el) => {
links.push(el.split(')')[0])
})
}
)
.finally(
async () => {

+const options = new Firefox.Options();
+options.addArguments('--headless');

driver = new Builder()
.forBrowser('firefox')
-.setFirefoxOptions(new firefox.Options().addArguments('--headless'))
+.setFirefoxOptions(options)
.build();

// Test scraped aliexpress links from bom-doc
links.every(async el => {
const result = await search(el);

+console.log(result)

if (result.length > 0) {
throw new Error(`product ${el} is a broken link`, content)
}
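A note on the link-checking loop in `test-ali-links.mjs`: `Array.prototype.every` does not wait for an async callback. Each call returns a promise, which is truthy, so the loop moves on immediately and an error thrown inside the callback becomes a promise rejection instead of propagating from the loop itself. The sketch below is one possible alternative, not what this PR does. It pulls the link ids out with the same `split('/_')` approach and then checks them one at a time in a `for...of` loop. The helper names `extractLinkIds` and `findBrokenLinks`, the injected `search` parameter, and the sample ids are all hypothetical and only serve the illustration.

```javascript
// Hypothetical helper: pull the short-link ids out of Markdown text.
// Assumes every link looks like https://s.click.aliexpress.com/e/_<id>
// and is closed by ")" as in a Markdown link target.
function extractLinkIds(markdown) {
  const parts = markdown.split('/_');
  parts.shift(); // the text before the first "/_" holds no link id
  return parts.map((part) => part.split(')')[0]);
}

// Hypothetical helper: check ids sequentially and collect the broken ones.
// `search` is injected; it should resolve to an array that is non-empty
// when the page shows a "not found" or homepage marker.
async function findBrokenLinks(ids, search) {
  const broken = [];
  for (const id of ids) {
    // Awaited inside for...of, so a failure in search() rejects
    // the promise returned by findBrokenLinks.
    const matches = await search(id);
    if (matches.length > 0) {
      broken.push(id);
    }
  }
  return broken;
}

// Usage with a stubbed search (no browser needed); the ids are made up.
const sampleMarkdown =
  '| Fan | [link](https://s.click.aliexpress.com/e/_abc123) |\n' +
  '| PSU | [link](https://s.click.aliexpress.com/e/_xyz789) |';
const ids = extractLinkIds(sampleMarkdown);
const stubSearch = async (id) => (id === 'xyz789' ? ['not-found-page'] : []);

findBrokenLinks(ids, stubSearch).then((broken) => {
  console.log(ids);    // [ 'abc123', 'xyz789' ]
  console.log(broken); // [ 'xyz789' ]
});
```

In the real script, `search` would be the Selenium-backed function from the diff. Collecting every broken id before failing would also let a single workflow run report all dead links at once, at the cost of a longer run.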