Scrapers, parsers, data wrangling and utilities for TikTok and YouTube.
We store large files with Git LFS, manage our monorepo with Turborepo, and publish new releases with Changesets.
TikTok utilities for DataSkop
Running the scraper on the server requires a specific setup:
- a Mullvad subscription (you need to change the code if you choose another VPN provider)
- a 'Logs Data Platform' instance in Gravelines (GRA) on OVH
NPM_GITHUB_AUTH
Token used to read private packages on GitHub.
# schaufel / DataSkop
PLATFORM_URL=https://dataskop-platform-url.net
SERIOUS_PROTECTION=basic-auth-pw
API_KEY=drf-api-key
# gluetun
VPN_SERVICE_PROVIDER=mullvad
VPN_TYPE=wireguard
WIREGUARD_PRIVATE_KEY=private-key
WIREGUARD_ADDRESSES=ip-address
SERVER_CITIES=a-city
DOT=off
# Send Logs to OVH
_X-OVH-TOKEN=ovh-logs-data-stream-token
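
Before deploying, it can help to confirm that every variable listed above has a value. The following is a minimal sketch, not part of the repository: the function name `check_env_file` is hypothetical, and the path of the env file is whatever you pass in (this README does not state where the file lives).

```shell
#!/usr/bin/env bash
# Sketch: report keys that are missing or empty in an env file.
# The key list mirrors the variables listed above; the function name is hypothetical.

required_keys="PLATFORM_URL SERIOUS_PROTECTION API_KEY
VPN_SERVICE_PROVIDER VPN_TYPE WIREGUARD_PRIVATE_KEY
WIREGUARD_ADDRESSES SERVER_CITIES DOT _X-OVH-TOKEN"

# Prints "missing: KEY" for each absent or empty key; returns 1 if any is missing.
check_env_file() {
  local env_file="$1" key missing=0
  for key in $required_keys; do
    # A key counts as set when a line starts with KEY= followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_env_file path/to/env-file` prints one line per missing key and returns a non-zero status when anything is absent, so a deploy script could stop on it.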
# `deploy.sh`
#!/usr/bin/env bash
rsync -avz --exclude node_modules --exclude .git --exclude docker/volume --exclude docker/gluetun-volume --exclude test . sshlocation:~/code/schaufel
ssh awlab1 "cd code/schaufel && NPM_GITHUB_AUTH=the_token docker-compose up --detach --build"
cd packages/schaufel-cli
npm run merge-lookups ~/Library/Application\ Support/Electron/databases/lookup.json ~/Library/Application\ Support/dataskop-electron/databases/lookup.json
Connect to any Mullvad server and then do the following:
cd packages/schaufel-cli
npm run scrape-meta https://www.tiktok.com/@newmartina/video/7232019489674562842 https://www.tiktok.com/@victordemartrin/video/7228575676335443226
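
When the URLs come from a longer list, a small filter can drop entries that do not look like video links before they reach `scrape-meta`. This is a sketch under assumptions: the pattern is inferred only from the two example URLs above, not from what `scrape-meta` actually accepts, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: keep only arguments shaped like the TikTok video URLs shown above.
# The pattern is an assumption derived from the example URLs, not from scrape-meta.

# Succeeds when the argument matches https://www.tiktok.com/@<user>/video/<digits>.
is_tiktok_video_url() {
  printf '%s\n' "$1" | grep -Eq '^https://www\.tiktok\.com/@[^/]+/video/[0-9]+$'
}

# Prints valid URLs to stdout, one per line, and rejected arguments to stderr.
filter_video_urls() {
  local url
  for url in "$@"; do
    if is_tiktok_video_url "$url"; then
      echo "$url"
    else
      echo "skipping: $url" >&2
    fi
  done
}
```

The output of `filter_video_urls` is one URL per line, so it can be reviewed or passed on as arguments to `npm run scrape-meta`.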
MIT