Figma Files Scraper for Research & Studies
As of April 2023, this archive contains 100GB of minified Figma files (550GB raw), 3TB of top-level images corresponding to Figma layers and used image fills (3MB, optimized), and 30TB of all images (including layers) in the files.
Demo of steps 2-4 running concurrently
With the Node.js client - @figma-api/community
import { Client } from "@figma-api/community";
const client = Client();
// a file id is the :id from figma.com/community/file/:id
// e.g. - https://www.figma.com/community/file/1035203688168086460
const fileid = "1035203688168086460";
// fetch file
const { data: document } = await client.file(fileid);
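If you prefer Python, the same file JSON can be fetched from the Figma REST API once a copy of the file exists in your account (see step 2 below). A minimal sketch, where the token and file key are placeholders:

import requests

FIGMA_TOKEN = "your-personal-access-token"  # placeholder: create one in your Figma account settings
FILE_KEY = "your-copied-file-key"  # placeholder: key of the copy in your drafts

# GET /v1/files/:key returns the full document tree as JSON
res = requests.get(
    f"https://api.figma.com/v1/files/{FILE_KEY}",
    headers={"X-Figma-Token": FIGMA_TOKEN},
)
res.raise_for_status()
file_json = res.json()
print(file_json["name"], "-", len(file_json["document"]["children"]), "pages")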
This scraper is a combination of
- Selenium scraper to crawl the Figma community files (Takes about 5 hours) - You can skip this step and use our latest data
- Selenium automator to copy (duplicate) the files to your account (Takes about 3 days)
- Figma File Archiver to download the File content as JSON (Takes about 5 hours)
- And optionally, Figma Image Archiver to download the in-design images and layers exported as PNGs to your local machine (Takes about 6 days for top-frame layers, and about 1 month for all layers)
pip3 install -r requirements.txt
# step 1. (Skip and use pre-crawled data if you want as mentioned above)
cd figma_archiver
scrapy crawl figma_spider --nolog -a target=popular
# this will output an output.popular.json file
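# optional sanity check: assumes output.popular.json is a JSON array of crawled entries
python3 -c "import json; print(len(json.load(open('output.popular.json'))), 'files crawled')"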
# step 2. You'll need a new Figma account since it copies 30,000+ files to your drafts
# setup .env following the README at figma_copy
cd figma_copy
python3 main.py --file='../data/latest/index.json' --batch-size=10000
# this will output a community : your-file mapping under progress/your-email@example.com.copies.json
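# optional: check copy progress (assumes the mapping is a flat JSON object of community-id : your-file-key pairs)
python3 -c "import json; print(len(json.load(open('progress/your-email@example.com.copies.json'))), 'files copied')"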
# step 3. you can run this script with figma_copy in parallel
cd figma_archiver
# fetch files
python3 files.py -f ../figma_copy/progress/your-email@example.com.copies.json
# fetch images (this uses the output directory from the step above)
python3 images.py --src='./downloads/*.json'
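Each JSON that files.py downloads mirrors the Figma document tree, so the archive can be post-processed offline. A minimal sketch for walking the node tree of a downloaded file (the exact layout under ./downloads is an assumption based on the commands above):

import glob
import json

def walk(node, depth=0):
    # every Figma node carries a type and a name; container nodes also have children
    print("  " * depth + f"{node.get('type')}: {node.get('name', '')}")
    for child in node.get("children", []):
        walk(child, depth + 1)

for path in glob.glob("./downloads/*.json"):
    with open(path) as f:
        data = json.load(f)
    # file-level JSON keeps the page tree under the "document" key
    walk(data.get("document", data))
    break  # inspect only the first file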
This is a brief example of how to use the tools. For the full setup, please read the README for each automator; script arguments may differ depending on your configuration, for example if you use an external drive.
Requirements
- About 1TB of free space on your local machine. (Minimum, for full scraping without images)
- About 100TB of free space on your external drive. (If you are collecting images as well; full setup. A quick programmatic check is sketched below.)
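Before kicking off a full run, it can help to verify free space programmatically. A small sketch using the Python standard library (the external-drive mount point is a placeholder):

import shutil

# placeholder paths: adjust to your local working directory and external drive
for label, path in [("local", "."), ("external", "/mnt/external")]:
    free_tb = shutil.disk_usage(path).free / 1e12
    print(f"{label}: {free_tb:.2f} TB free at {path}")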
Todo
- Docker image for easy deployment and running on the cloud
- Official CDN server with the latest data
This repository contains a Figma community crawler that collects and processes data from Figma community files. Some of the files used in this project are licensed under the Creative Commons Attribution 4.0 International License (CC-By 4.0). In accordance with this license, the following attribution is provided:
This work includes material that is derived from or based on [Title of the original work] ([URL_to_original_work]) by [Author's Name], which is licensed under the "Creative Commons Attribution 4.0 International License."
If you use or redistribute the data generated by this crawler, you must also adhere to the terms of the CC-By 4.0 license by providing appropriate credit to the original authors, linking to the license, and indicating if any changes were made.
Please note that this repository is provided "as-is" without warranty of any kind, express or implied. The creators of this repository are not responsible for any errors or omissions, or for the results obtained from the use of the data. Users are solely responsible for complying with the CC-By 4.0 license and any other applicable laws and regulations.
Remember to replace the placeholders ([Title of the original work], [URL_to_original_work], and [Author's Name]) with the relevant information for each work you include in your dataset.
Learn more about the Figma community license here.