A collection of tools for collecting, processing, and searching Bluesky starter packs. These tools help you collect, process, and search through Bluesky's starter packs and their user data.
They also build a MongoDB index that can easily be searched. For a prototype of an index frontend of BlueSky Starterpacks see the demo currently running on vercel, github repo.
- Node.js (v16 or higher): Download Node.js
- MongoDB (required for data storage): Download MongoDB
- Git (for cloning the repository)
git clone https://github.com/CrispStrobe/bluesky-starterpacks-index.git
cd bsky-starter-pack-tools
npm install
- Create a
.env
file:
cp .env.example .env
- Edit the
.env
file with your credentials:
BSKY_USERNAME=your.username.bsky.social
BSKY_PASSWORD=your_password
MONGODB_URI=mongodb://localhost:27017/starterpacks
# Regular processing with MongoDB storage
node data-processor.js
# Process without MongoDB storage (file output only)
node data-processor.js --nomongodb
# Update MongoDB from existing files
node data-processor.js --updatemongodb
# Purge all data and start fresh
node data-processor.js --purge
The search tool supports both pack-centric and user-centric views, with various data sources:
# Search in MongoDB (default)
node search-packs.js "search term"
# Show results organized by users
node search-packs.js --users "search term"
# Show results organized by packs (default)
node search-packs.js --packs "search term"
# Search in JSON file
node search-packs.js -s json "search term"
# Search in YAML file
node search-packs.js -s yaml "search term"
--source
or-s
: Specify data source (mongodb, json, yaml)--file
or-f
: Input file path for JSON/YAML--exact
or-e
: Perform exact match--case-sensitive
or-c
: Make search case-sensitive--users
: Show results organized by users--packs
: Show results organized by packs (default)--help
: Show help message
Examples:
# Exact match search
node search-packs.js --exact "javascript"
# Case-sensitive search with user-centric view
node search-packs.js --users --case-sensitive "JavaScript"
# Search in JSON with pack-centric view
node search-packs.js -s json --packs "javascript"
The tools generate several files:
starter_packs.json
: JSON output of processed starter packsstarter_packs.yaml
: YAML output of processed starter packscheckpoints.json
: Progress tracking and resumption datadata_processor.log
: Processing logs
- Collects and processes Bluesky starter packs
- Stores data in MongoDB and/or files (JSON/YAML)
- Handles incremental updates
- Tracks processing progress
- Supports resume after interruption
- Searches by handle, display name, pack name, or description
- Supports exact and partial matching
- Case-sensitive and case-insensitive options
- Two view formats: pack-centric and user-centric
- Multiple data source options (MongoDB/JSON/YAML)
- Implements Bluesky's rate limit of 3000 requests per 5 minutes
- Uses exponential backoff for rate limit handling
- Automatically retries failed requests
- Handles network errors gracefully
{
did: String, // User's DID
handle: String, // User's handle
display_name: String, // User's display name
pack_ids: [String], // Array of pack IDs user belongs to
last_updated: Date, // Last profile update time
profile_check_needed: Boolean // Profile update flag
}
{
rkey: String, // Pack's unique key
name: String, // Pack name
creator: String, // Pack creator's handle
description: String, // Pack description
user_count: Number, // Number of users in pack
created_at: Date, // Creation timestamp
updated_at: Date // Last update timestamp
}
- MongoDB connection errors: Verify MongoDB is running and connection string is correct
- Rate limit errors: The script will automatically handle these with backoff
- Network timeouts: Check your internet connection
- Interrupted processing: Will automatically resume from last checkpoint
- Check
data_processor.log
for detailed processing information - Use
--nomongodb
flag to test file processing without database - MongoDB errors will be logged with full stack traces
bsky-starter-pack-tools/
├── data-processor.js # Main processing script
├── search-packs.js # Search interface
├── .env # Configuration (not in git)
├── .env.example # Example configuration
└── package.json # Project dependencies
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request